Ted,
Here is the method signature/protocol:
public Map<String, Double> getFoo(Map<String, Double> input,
int topN) throws IOException;

There are 31 regions on 4 nodes x 8 CPUs.
I am on 0.94.6 (from Hortonworks).
I think it behaves the way linwukang describes: it is almost a full
table scan in the coprocessor.
Actually, when I set more specific ColumnPrefixFilters, performance went down.
I want to do this on the server side because I don't want to send 500K
columns/values to the client.
I cannot believe that a single-threaded client, which does some calculations
and a group-by, beats the coprocessor running in 31 regions.
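
As a rough illustration of the full-scan suspicion above, here is a toy Java model. It is not HBase code: a TreeMap stands in for the sorted table, and the class and method names (ScanVsGetSketch, buildTable, pickKeys, filteredScan, pointGets) are made up for this sketch. It assumes the row filters are evaluated row by row, with nothing narrowing the scan range, and simply counts how many rows each approach has to touch to return the same 55 rows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model only: a TreeMap stands in for a sorted table. No HBase classes are used.
public class ScanVsGetSketch {

    // Hypothetical helper: builds a table of `rows` sorted keys with one value per row.
    public static TreeMap<String, Double> buildTable(int rows) {
        TreeMap<String, Double> table = new TreeMap<>();
        for (int i = 0; i < rows; i++) {
            table.put(String.format("row%06d", i), (double) i);
        }
        return table;
    }

    // Hypothetical helper: picks `count` evenly spaced row keys out of `rows`.
    public static Set<String> pickKeys(int rows, int count) {
        Set<String> keys = new TreeSet<>();
        int step = rows / count;
        for (int i = 0; i < count; i++) {
            keys.add(String.format("row%06d", i * step));
        }
        return keys;
    }

    // Filter-style scan: every row is read first, then kept or rejected.
    // Returns the number of rows touched; matching values are appended to `out`.
    public static int filteredScan(TreeMap<String, Double> table,
                                   Set<String> wanted, List<Double> out) {
        int touched = 0;
        for (Map.Entry<String, Double> e : table.entrySet()) {
            touched++;
            if (wanted.contains(e.getKey())) {
                out.add(e.getValue());
            }
        }
        return touched;
    }

    // Get-style lookup: only the wanted keys are looked up directly.
    // Returns the number of lookups; found values are appended to `out`.
    public static int pointGets(TreeMap<String, Double> table,
                                Set<String> wanted, List<Double> out) {
        int touched = 0;
        for (String key : wanted) {
            touched++;
            Double value = table.get(key);
            if (value != null) {
                out.add(value);
            }
        }
        return touched;
    }

    public static void main(String[] args) {
        TreeMap<String, Double> table = buildTable(500000);
        Set<String> wanted = pickKeys(500000, 55);
        List<Double> fromScan = new ArrayList<>();
        List<Double> fromGets = new ArrayList<>();
        System.out.println("filtered scan touched: " + filteredScan(table, wanted, fromScan));
        System.out.println("point gets touched:    " + pointGets(table, wanted, fromGets));
        System.out.println("same results: " + fromScan.equals(fromGets));
    }
}
```

In this model both approaches return identical rows, but the filtered scan touches every row in the table while the point lookups touch only the requested keys. Whether the real endpoint behaves this way depends on how its Scan is built, so this is only a way to reason about the numbers, not a diagnosis.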
 
Regards,
- kiru


Kiru Pakkirisamy | webcloudtech.wordpress.com


________________________________
 From: Ted Yu <[email protected]>
To: [email protected]; Kiru Pakkirisamy <[email protected]> 
Sent: Thursday, August 8, 2013 8:40 PM
Subject: Re: Client Get vs Coprocessor scan performance
 

Can you give us a bit more information ?

How do you deliver the 55 rowkeys to your endpoint ?
How many regions do you have for this table ?

What HBase version are you using ?

Thanks

On Thu, Aug 8, 2013 at 6:43 PM, Kiru Pakkirisamy
<[email protected]> wrote:

> Hi,
> I am seeing odd behavior, with coprocessor performance lagging behind a
> client-side Get.
> I have a table with 500000 rows. Each has a variable number of columns in one
> column family (in this case about 600000 columns in total are processed).
> When I try to get 55 specific rows, the client side completes in half the
> time of the coprocessor endpoint.
> I am using 55 RowFilters on the coprocessor scan side. The rows are
> processed in exactly the same way in both cases.
> Any pointers on how to debug this scenario ?
>
> Regards,
> - kiru
>
>
> Kiru Pakkirisamy | webcloudtech.wordpress.com
