I was using the importtsv tool to bulk import an 8 GB text file with 100M records into HBase. The job ran quickly until the reduce phase reached 33%, where it was stuck for half an hour before jumping to 66%. During that half hour, only one disk on one node was in use, which may be the bottleneck.
Can I increase the number of reduce tasks for the job, which was 1 by default? Thanks. Sean 2011.09.09
