Hi, the setup consists of Hadoop 1.0.1 and HBase 0.94.x. Loading 10 TB of raw data into HDFS and then into HBase takes a considerable amount of time (using the Hadoop shell's -copyFromLocal and a Pig script to load HBase).
1. Will moving to Hadoop 2.x improve performance? If yes, please provide relevant links or docs that explain how this is achieved.
2. I do not need my data sorted while loading into HBase, so how can I disable sorting at the Mapper and at the Reducer?

Any other thoughts are welcome. Thanks in advance.
