Ted,

Here is the method signature/protocol:

public Map<String, Double> getFooMap(Map<String, Double> input, int topN) throws IOException;
There are 31 regions on 4 nodes x 8 CPUs. I am on 0.94.6 (from Hortonworks).

I think it behaves the way linwukang describes: it is almost a full table scan in the coprocessor. In fact, when I set more specific ColumnPrefixFilters, performance went down.

I want to do the work on the server side because I don't want to send 500K columns/values to the client. I cannot believe that a single-threaded client doing some calculations and a group-by beats the coprocessor running in 31 regions.

Regards,
- kiru

Kiru Pakkirisamy | webcloudtech.wordpress.com

________________________________
From: Ted Yu <[email protected]>
To: [email protected]; Kiru Pakkirisamy <[email protected]>
Sent: Thursday, August 8, 2013 8:40 PM
Subject: Re: Client Get vs Coprocessor scan performance

Can you give us a bit more information?

How do you deliver the 55 rowkeys to your endpoint?
How many regions do you have for this table?
What HBase version are you using?

Thanks

On Thu, Aug 8, 2013 at 6:43 PM, Kiru Pakkirisamy <[email protected]> wrote:

> Hi,
> I am seeing odd behavior, with coprocessor performance lagging behind a
> client-side Get.
> I have a table with 500000 rows. Each has a variable # of columns in one
> column family (in this case about 600000 columns in total are processed).
> When I try to get 55 specific rows, the client side completes in half the
> time of the coprocessor endpoint.
> I am using 55 RowFilters on the coprocessor scan side. The rows are
> processed in exactly the same way in both cases.
> Any pointers on how to debug this scenario?
>
> Regards,
> - kiru
>
>
> Kiru Pakkirisamy | webcloudtech.wordpress.com
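On the "almost a full table scan" point in the reply above, here is a minimal sketch of why a scan that filters on row keys can lose to direct Gets. It does not use the HBase API at all: a TreeMap stands in for one region's sorted rows, and the class and method names (ScanVsGetSketch, filteredScan, pointGets) are made up for illustration. The assumption is that a filter-only scan has to visit every row and test it, while a Get seeks straight to its key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical stand-in for one region; this is NOT HBase code.
public class ScanVsGetSketch {

    /** Matched row keys plus the number of rows the strategy had to examine. */
    static final class Result {
        final List<String> matched = new ArrayList<>();
        long rowsExamined = 0;
    }

    /** Builds a sorted in-memory "region" with keys row000000 .. row(n-1). */
    static TreeMap<String, Double> buildRegion(int n) {
        TreeMap<String, Double> region = new TreeMap<>();
        for (int i = 0; i < n; i++) {
            region.put(String.format("row%06d", i), (double) i);
        }
        return region;
    }

    /** Filter-style scan: every row is visited and tested against the wanted keys. */
    static Result filteredScan(TreeMap<String, Double> region, Set<String> wanted) {
        Result r = new Result();
        for (String key : region.keySet()) {
            r.rowsExamined++;
            if (wanted.contains(key)) {
                r.matched.add(key);
            }
        }
        return r;
    }

    /** Get-style lookup: seek directly to each wanted key, touching nothing else. */
    static Result pointGets(TreeMap<String, Double> region, Set<String> wanted) {
        Result r = new Result();
        for (String key : wanted) {
            r.rowsExamined++;
            if (region.containsKey(key)) {
                r.matched.add(key);
            }
        }
        return r;
    }

    public static void main(String[] args) {
        TreeMap<String, Double> region = buildRegion(500_000);

        // 55 wanted keys spread across the key space, as in the thread.
        Set<String> wanted = new TreeSet<>();
        for (int i = 0; i < 55; i++) {
            wanted.add(String.format("row%06d", i * 9000));
        }

        Result scan = filteredScan(region, wanted);
        Result gets = pointGets(region, wanted);
        System.out.println("filtered scan: examined " + scan.rowsExamined
                + " rows, matched " + scan.matched.size());
        System.out.println("point gets:    examined " + gets.rowsExamined
                + " rows, matched " + gets.matched.size());
    }
}
```

Both strategies return the same 55 rows, but the filtered scan examines every row in the map while the point lookups examine only the keys asked for. If the endpoint behaves like the first method in each region, then doing Gets (or narrowing the scan's start/stop rows) inside the endpoint may be worth trying, though that is a guess from this model and not a measurement.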
