Actually, I mean how to do random gets in MapReduce, not a scan. Let me describe my requirement in detail:

There is an HBase table that contains all the users (about 2G) we have collected, and the rowkey is the user id. Every hour some user info arrives (5M~10M). For every incoming user, get (HBase Get) the info from HBase, merge it with the current hour's info, and put it back into HBase. (If the user does not exist in HBase, just use this hour's info.)
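To make the per-user step concrete, here is a minimal sketch of the get / merge / put logic described above. It is not HBase code: a plain in-memory `Map` stands in for the table, and the class name `UserMergeSketch`, the methods `merge` and `applyHour`, and the rule "values from the current hour overwrite stored ones" are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for the get / merge / put step; this is NOT the HBase API. */
public class UserMergeSketch {

    /** Merge rule (an assumption): fields from the current hour overwrite stored ones. */
    public static Map<String, String> merge(Map<String, String> stored,
                                            Map<String, String> incoming) {
        Map<String, String> merged = new HashMap<>();
        if (stored != null) {
            merged.putAll(stored);   // user already exists: start from the stored info
        }
        merged.putAll(incoming);     // current-hour values win on conflict
        return merged;
    }

    /** Applies one hour of user info: one random get and one put per incoming user. */
    public static void applyHour(Map<String, Map<String, String>> table,
                                 Map<String, Map<String, String>> hourly) {
        for (Map.Entry<String, Map<String, String>> e : hourly.entrySet()) {
            String userId = e.getKey();                       // the rowkey
            Map<String, String> stored = table.get(userId);   // stands in for an HBase Get
            table.put(userId, merge(stored, e.getValue()));   // stands in for an HBase Put
        }
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> table = new HashMap<>();
        Map<String, String> existing = new HashMap<>();
        existing.put("city", "Beijing");
        existing.put("age", "30");
        table.put("u1", existing);

        Map<String, Map<String, String>> hourly = new HashMap<>();
        Map<String, String> update = new HashMap<>();
        update.put("city", "Shanghai");
        hourly.put("u1", update);    // existing user: merged with stored info
        Map<String, String> fresh = new HashMap<>();
        fresh.put("city", "Hangzhou");
        hourly.put("u2", fresh);     // new user: only this hour's info is kept

        applyHour(table, hourly);
        System.out.println(table.get("u1"));
        System.out.println(table.get("u2"));
    }
}
```

In a MapReduce job, each map task would presumably run the body of this loop over its own slice of the hourly input, issuing real Gets and Puts against the table instead of the `Map` calls shown here.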
Currently the get step runs on one machine; I want to distribute it with MapReduce.

[email protected]

From: Shahab Yunus
Date: 2014-08-11 20:10
To: [email protected]
Subject: Re: How to get specific rowkey from hbase

You can use the util classes provided already. Note that it won't be very fast, and you might want to try out bulk import as well (especially if it is a one-time or rare occurrence). It depends on your use case. Check out the documentation below:

For the MapReduce HBase util:
http://hbase.apache.org/book/mapreduce.example.html
http://bigdataprocessing.wordpress.com/2012/07/27/hadoop-hbase-mapreduce-examples/

For HBase bulk import:
http://blog.cloudera.com/blog/2013/09/how-to-use-hbase-bulk-loading-and-why/

Regards,
Shahab

On Mon, Aug 11, 2014 at 7:14 AM, [email protected] <[email protected]> wrote:
> Hi,
>
> I have an input of about 10M records; each record is a rowkey in HBase.
> How can I get this data from HBase with a MapReduce job?
>
> Thanks,
> Lei
>
> [email protected]
