Hello,
I am writing an MR job in which each reducer outputs one HFile containing some 
of the rows of the table that will be created.
At first I thought of using the HashPartitioner to achieve load balancing, but 
this would mix the rows, and the output of each reducer would not be a 
contiguous part of the HBase table that will be created by combining all these 
files.
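
To illustrate the concern, here is a minimal sketch in plain Java with no
Hadoop dependency. It assumes String row keys, three reducers and two made-up
split points; the class and method names are hypothetical. hashPartition uses
the same formula as Hadoop's HashPartitioner, with String.hashCode standing in
for the real key type. rangePartition is only a simplified stand-in for what a
range partitioner such as TotalOrderPartitioner does.

```java
import java.util.Arrays;

public class PartitionSketch {

    // Same formula as Hadoop's HashPartitioner; String.hashCode is an
    // assumption standing in for the job's real key class.
    public static int hashPartition(String rowKey, int numReducers) {
        return (rowKey.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Range partitioning against sorted split points: keys below the first
    // split go to reducer 0, keys at or above split i go to reducer i + 1.
    public static int rangePartition(String rowKey, String[] sortedSplits) {
        int pos = Arrays.binarySearch(sortedSplits, rowKey);
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    // For keys listed in sorted order, the reducers hold contiguous key
    // ranges exactly when the assigned reducer index never decreases.
    public static boolean isContiguous(int[] assignment) {
        for (int i = 1; i < assignment.length; i++) {
            if (assignment[i] < assignment[i - 1]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] splits = {"row-04", "row-08"}; // hypothetical split points
        int numReducers = splits.length + 1;
        int[] byHash = new int[12];
        int[] byRange = new int[12];
        for (int i = 0; i < 12; i++) {
            String key = String.format("row-%02d", i);
            byHash[i] = hashPartition(key, numReducers);
            byRange[i] = rangePartition(key, splits);
            System.out.println(key + "  hash -> reducer " + byHash[i]
                    + "  range -> reducer " + byRange[i]);
        }
        System.out.println("hash contiguous:  " + isContiguous(byHash));
        System.out.println("range contiguous: " + isContiguous(byRange));
    }
}
```

With hash partitioning, neighbouring row keys are scattered across the
reducers, so every output HFile spans nearly the whole key range. With range
partitioning, each reducer receives one contiguous slice.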

So I would like to ask whether it is important to use a Partitioner 
(TotalOrderPartitioner, for example) that gives each reducer a contiguous 
part of the table.

If I do not do that, will it hurt HBase's performance when executing queries 
or running compactions, since rows that are supposed to be next to each other 
will end up in different HFiles and the number of disk seeks will increase?

Thank you for your help!
Panagiotis
                                          
