Ted, can you elaborate a little on why this issue boosts performance? I couldn't figure out from the issue comments whether execCoprocessor scans the entire .META. table or an entire table, so I can't tell what the actual improvement is.
Thanks!

On Fri, Aug 9, 2013 at 8:44 AM, Ted Yu <[email protected]> wrote:

> I think you need HBASE-6870 which went into 0.94.8
>
> Upgrading should boost coprocessor performance.
>
> Cheers
>
> On Aug 8, 2013, at 10:21 PM, Kiru Pakkirisamy <[email protected]> wrote:
>
> > Ted,
> > Here is the method signature/protocol
> > public Map<String, Double> getFooMap<String, Double> input,
> > int topN) throws IOException;
> >
> > There are 31 regions on 4 nodes X 8 CPU.
> > I am on 0.94.6 (from Hortonworks).
> > I think it seems to behave like what linwukang says, - it is almost a full table scan in the coprocessor.
> > Actually, when I set more specific ColumnPrefixFilters performance went down.
> > I want to do things on the server side because, I dont want to be sending 500K column/values to the client.
> > I cannot believe a single-threaded client which does some calculations and group-by beats the coprocessor running in 31 regions.
> >
> > Regards,
> > - kiru
> >
> >
> > Kiru Pakkirisamy | webcloudtech.wordpress.com
> >
> >
> > ________________________________
> > From: Ted Yu <[email protected]>
> > To: [email protected]; Kiru Pakkirisamy <[email protected]>
> > Sent: Thursday, August 8, 2013 8:40 PM
> > Subject: Re: Client Get vs Coprocessor scan performance
> >
> >
> > Can you give us a bit more information ?
> >
> > How do you deliver the 55 rowkeys to your endpoint ?
> > How many regions do you have for this table ?
> >
> > What HBase version are you using ?
> >
> > Thanks
> >
> > On Thu, Aug 8, 2013 at 6:43 PM, Kiru Pakkirisamy <[email protected]> wrote:
> >
> >> Hi,
> >> I am finding an odd behavior with the Coprocessor performance lagging a client side Get.
> >> I have a table with 500000 rows. Each have variable # of columns in one column family (in this case about 600000 columns in total are processed)
> >> When I try to get specific 55 rows, the client side completes in half-the time as the coprocessor endpoint.
> >> I am using 55 RowFilters on the Coprocessor scan side. The rows are processed are exactly the same way in both the cases.
> >> Any pointers on how to debug this scenario ?
> >>
> >> Regards,
> >> - kiru
> >>
> >>
> >> Kiru Pakkirisamy | webcloudtech.wordpress.com
