Actually, I mean how to do random Gets in MapReduce, not a scan.

Let me give a detailed description of my requirement:
There's an HBase table that contains all the users (about 2G) we have collected, and the
rowkey is the user id.
Every hour some new user info arrives (5M~10M).
For every incoming user, get (HBase Get) the info from HBase, merge it with the
current hour's info, and put the result back to HBase. (If the user does not exist in HBase,
just use this hour's info.)

Right now the Get step runs on a single machine; I want to distribute it with
MapReduce.
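
To make the per-record step concrete, here is a minimal sketch of the get / merge / put logic that each map task would run for one incoming user. It is only an illustration under assumptions: UserStore, InMemoryStore, merge and processRecord are hypothetical names, the HBase table is replaced by an in-memory map so the sketch runs on its own, and the real merge rule is not described here, so merge() just concatenates as a placeholder.

```java
import java.util.HashMap;
import java.util.Map;

public class HourlyMergeSketch {

    /** Hypothetical stand-in for the HBase table: rowkey (user id) -> stored info. */
    interface UserStore {
        String get(String userId);            // returns null when the row is absent
        void put(String userId, String info);
    }

    /** In-memory implementation, used only so the sketch is self-contained. */
    static class InMemoryStore implements UserStore {
        private final Map<String, String> rows = new HashMap<>();

        @Override
        public String get(String userId) {
            return rows.get(userId);
        }

        @Override
        public void put(String userId, String info) {
            rows.put(userId, info);
        }
    }

    /** Placeholder merge rule (an assumption): append the hourly info to the stored info. */
    static String merge(String existing, String hourly) {
        return existing + "|" + hourly;
    }

    /** The step applied to one incoming record: get, merge if present, put back. */
    static void processRecord(UserStore store, String userId, String hourlyInfo) {
        String existing = store.get(userId);
        // If the user is not in the table yet, keep only this hour's info.
        String result = (existing == null) ? hourlyInfo : merge(existing, hourlyInfo);
        store.put(userId, result);
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        store.put("user1", "old");
        processRecord(store, "user1", "new"); // existing user: merged
        processRecord(store, "user2", "new"); // unknown user: hourly info only
        System.out.println(store.get("user1"));
        System.out.println(store.get("user2"));
    }
}
```

In an actual job, processRecord would presumably be called from the mapper's map() method, with get and put backed by the HBase client's get(Get) and put(Put) calls instead of the in-memory map.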



[email protected]
 
From: Shahab Yunus
Date: 2014-08-11 20:10
To: [email protected]
Subject: Re: How to get specific rowkey from hbase
You can use the utility classes that are already provided. Note that it won't be
very fast, and you might want to try bulk import as well (especially if it is
a one-time or rare occurrence). It depends on your use case. Check out the
documentation below:
 
For the HBase MapReduce utilities:
http://hbase.apache.org/book/mapreduce.example.html
http://bigdataprocessing.wordpress.com/2012/07/27/hadoop-hbase-mapreduce-examples/
 
For HBase bulk import:
http://blog.cloudera.com/blog/2013/09/how-to-use-hbase-bulk-loading-and-why/
 
Regards,
Shahab
 
 
On Mon, Aug 11, 2014 at 7:14 AM, [email protected] <[email protected]>
wrote:
 
>
> Hi,
>
>     I have an input of about 10M records; each record is a rowkey
> in HBase.
>     How can I get this data from HBase with a MapReduce job?
>
> Thanks,
> Lei
>
>
> [email protected]
>
