Hi Bharath! One of the main benefits of HBase is that it gives you random access to your data. It is not primarily meant for big batch processing jobs that go through all, or a large part, of your data, even though its hooks into MapReduce give you that option.
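To make the difference between the two access patterns concrete, here is a rough toy model in plain Java. It is not the HBase client API: the class `AccessPatternSketch` and its methods are hypothetical, and a sorted in-memory map simply stands in for a table ordered by row key. The range lookup treats the start row as inclusive and the stop row as exclusive, which is an assumption chosen to mirror how a scan range is usually described.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model only: a sorted in-memory map standing in for a table ordered by row key.
// This is NOT the HBase client API; all names here are hypothetical.
public class AccessPatternSketch {
    private final TreeMap<String, String> rows = new TreeMap<>();

    public void put(String rowKey, String value) {
        rows.put(rowKey, value);
    }

    // Random access: fetch a single row by its key without touching other rows.
    public String get(String rowKey) {
        return rows.get(rowKey);
    }

    // Range read: walk the rows in key order from startRow (inclusive)
    // up to stopRow (exclusive) and hand the values back to the caller.
    public List<String> scan(String startRow, String stopRow) {
        return new ArrayList<>(rows.subMap(startRow, true, stopRow, false).values());
    }

    public static void main(String[] args) {
        AccessPatternSketch table = new AccessPatternSketch();
        table.put("row1", "a");
        table.put("row2", "b");
        table.put("row3", "c");

        // One row, located directly by key.
        System.out.println(table.get("row2"));
        // A contiguous key range, returned to the caller for processing.
        System.out.println(table.scan("row1", "row3"));
    }
}
```

In both calls the result ends up with the caller, which is the point of the sketch: the caller does the processing, whether it asked for one row or a range.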
So whenever you fetch data using get or scan, that data is brought to the client for you to process there. When you use HBase as the source or sink in a MapReduce job, this is not the case: the data is processed by the job's tasks rather than in your client application. What access patterns do you have to your data? Are you doing a lot of random reads, or mostly batch processing?

Regards, Erik
