Hi all,

How do I make use of a TableSplit or a region split? How is it used in TableInputFormatBase#getSplits()?
I have 6 region servers in the cluster that runs my map-reduce job. How can I leverage them so that the table is split across the cluster and the map-reduce application finishes faster?

Right now it is very slow. The job aggregates values from 3 tables: one with 100,000 rows, and two others that I only read with Get operations, passing the key. With this setup it takes 40-50 minutes, which is far too slow. The first table will eventually grow to around 20-25 million rows.

Please point me in the right direction. I can paste the code if anybody is interested.

--
Regards,
Pavan
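
P.S. To give a sense of scale, here is a rough extrapolation of the numbers above. It assumes run time grows linearly with the row count of the first table, which is only an assumption and not something I have measured. The class and method names are made up for illustration.

```java
public class ScanEstimate {
    // Estimated run time in minutes for targetRows, ASSUMING run time
    // scales linearly with row count (hypothetical model, not measured).
    static double estimateMinutes(long measuredRows, double measuredMinutes, long targetRows) {
        return measuredMinutes * ((double) targetRows / measuredRows);
    }

    public static void main(String[] args) {
        long measuredRows = 100_000L;                 // current size of the first table
        // Best case: 40 minutes today, table grows to 20 million rows.
        double low = estimateMinutes(measuredRows, 40.0, 20_000_000L);
        // Worst case: 50 minutes today, table grows to 25 million rows.
        double high = estimateMinutes(measuredRows, 50.0, 25_000_000L);
        System.out.printf("Estimated run time: %.0f to %.0f hours%n", low / 60, high / 60);
    }
}
```

Under that assumption the job would take on the order of 133 to 208 hours at the final table size, which is why I need the work spread across all the region servers.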
