We have a Hadoop 0.20.2 + HBase 0.20.6 setup with three data nodes (12 GB RAM, 1.5 TB disk each) and one master node (24 GB RAM, 1.5 TB disk). We store a relatively simple table in HBase (one column family, 5 columns, row keys of about 100 characters).
In order to better understand the load behavior, I wanted to put 5*10^8 rows into that table. I wrote an M/R job that uses a custom InputFormat to split the 5*10^8 logical row keys (essentially just counting from 0 to 5*10^8 - 1) into 1000 chunks of 500,000 keys each, and then lets the map tasks do the actual work of writing the corresponding rows (with some random column values) into HBase. So there are 1000 map tasks and no reducer; each task writes 500,000 rows into HBase. We have 6 mapper slots, i.e. 24 mappers running in parallel.

The whole job runs for approximately 48 hours. Initially each map task takes around 30 minutes. After a while, tasks take longer and longer, eventually exceeding 2 hours each. This peaks at around task 850, after which things speed up again, improving to about 48 minutes per task by the end, until the job completes.

These are all dedicated machines and nothing else is running on them. The map tasks have a 200 MB heap, and when checking with vmstat in between, I cannot observe any swapping. On the master, heap utilization also does not appear to be at its limit, and there is no swapping there either. All Hadoop and HBase processes have a 1 GB heap.

Any idea what would cause this strong variation (or degradation) in write performance? Is there a way to find out where the time gets lost?

Thanks,
Henning
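P.S. For concreteness, the split arithmetic I described can be sketched in plain Java roughly as follows. This is only an illustration of the chunking, not the actual InputFormat; the class and method names are made up:

```java
// Illustrative sketch of partitioning 5*10^8 logical row keys into
// 1000 contiguous ranges of 500,000 keys each (names are hypothetical).
public class KeyRangeSplitter {
    static final long TOTAL_KEYS = 500_000_000L; // 5*10^8 logical row keys
    static final int NUM_SPLITS = 1000;          // one per map task

    /** Returns {firstKey, lastKey} (both inclusive) for split i. */
    static long[] range(int i) {
        long chunk = TOTAL_KEYS / NUM_SPLITS;    // 500,000 keys per split
        long start = (long) i * chunk;
        return new long[] { start, start + chunk - 1 };
    }

    public static void main(String[] args) {
        long[] first = range(0);
        long[] last = range(NUM_SPLITS - 1);
        System.out.println("split 0:   " + first[0] + ".." + first[1]);
        System.out.println("split 999: " + last[0] + ".." + last[1]);
    }
}
```

In the real job, each such range becomes one InputSplit, and the mapper iterates the range and issues the HBase puts for those keys.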
