You kind of have two threads along the same lines. See my response in your other thread...
On Aug 22, 2013, at 10:41 AM, Pavan Sudheendra <[email protected]> wrote:

> scan.setCaching(500);
>
> I really don't understand this purpose though..
>
>
> On Thu, Aug 22, 2013 at 9:09 PM, Kevin O'dell <[email protected]> wrote:
>
>> QQ what is your caching set to?
>> On Aug 22, 2013 11:25 AM, "Pavan Sudheendra" <[email protected]> wrote:
>>
>>> Hi all,
>>>
>>> A serious question.. I know this isn't one of the best hbase practices but
>>> I really want to know..
>>>
>>> I am doing a join across 3 table in hbase.. One table contain 19m records,
>>> one contains 2m and another contains 1m records.
>>>
>>> I'm doing this inside the mapper function.. I know this can be done with
>>> pig and hive etc. Leaving the specifics out, how long would experts think
>>> it would take for the mapper to finish aggregating them across a 6 node
>>> cluster.. One is the job tracker and 5 are task trackers.. By the time I
>>> see the map reduce job status for input records reach 600,000 it's taking
>>> an hour.. It can't be right..
>>>
>>> Any tips? Please help.
>>>
>>> Thanks.
>>>
>>> --
>>> Regards-
>>> Pavan
>>>
>>
>
>
>
> --
> Regards-
> Pavan

The opinions expressed here are mine, while they may reflect a cognitive thought, that is purely accidental.
Use at your own risk.
Michael Segel
michael_segel (AT) hotmail.com
