Thanks for reporting back!
On Fri, Mar 8, 2013 at 9:02 AM, Kim Hamilton <[email protected]> wrote:

> I profiled it and getStartKeysInRange is taking all the time. Recall I'm
> running 0.92.1. I think these factors are consistent with
> https://issues.apache.org/jira/browse/HBASE-5492, which was fixed in
> 0.92.3.
>
> We'll be upgrading soon, so I'll be able to verify the perf issue is gone.
>
> Thanks for the help everyone!
>
>
> On Tue, Mar 5, 2013 at 8:54 PM, Kimdhamilton <[email protected]> wrote:
>
> > Yes, definitely. I'm following up tomorrow with more testing and will
> > report back. I'm definitely seeing significant load on .META. but want to
> > see what I can determine about the root cause
> >
> >
> > Sent from my Samsung smartphone on AT&T
> >
> >
> > -------- Original message --------
> > Subject: RE: endpoint coprocessor performance
> > From: Anoop Sam John <[email protected]>
> > To: "[email protected]" <[email protected]>
> > CC:
> >
> >
> > Yes agree with Andrew here... I checked the 94 code base yday. I also
> > feel that the efficiency should be on the higher side.. And there is no
> > whole table scan. The HBase client issues scan for only those regions which
> > come under the start/stop keys that app specified. Yes it is contacting
> > .META. to know the regions coming within the start/stop rows. But that
> > should not be a big efficiency issue IMHO also.
> >
> > @Kim - Can you do some profiling and let us know which area of code is
> > eating up time in your case?
> >
> > HBASE-6877 also I am seeing.
> >
> > -Anoop-
> > ________________________________________
> > From: Andrew Purtell [[email protected]]
> > Sent: Wednesday, March 06, 2013 7:28 AM
> > To: [email protected]
> > Subject: Re: endpoint coprocessor performance
> >
> > > In current logic, HTable#coprocessorExec always scan the whole table, its
> > > efficiency is low
> >
> > No, I don't think that is correct.
> >
> > In its current logic, coprocessorExec always scans the META table for all
> > regions of the target table, to find the up to date locations, and then
> > dispatches the exec in parallel to all regions of the target table. The
> > efficiency of the exec is actually high because invocations happen in
> > parallel across the cluster, with results reassembled back at the client as
> > they come in.
> >
> > The increased setup latency relative to a Scan and the load on META is
> > because of the initial scan on META to find the up to date locations of all
> > regions of the target table. For a Scan, the cached locations of regions
> > are used, and relocations are handled transparently by the client. Exec
> > could be updated to do this as well.
> >
> >
> > On Wed, Mar 6, 2013 at 5:13 AM, Kim Hamilton <[email protected]> wrote:
> >
> > > Thanks so much! This describes exactly what I'm seeing. I did notice
> > > extremely heavy load on the region server carrying .META., as described in
> > > HBASE-6870:
> > >
> > > In current logic, HTable#coprocessorExec always scan the whole table,
> > > its efficiency
> > > is low and will affect the Regionserver carrying .META. under large
> > > coprocessorExec requests
> > >
> > >
> > > Thanks again,
> > > Kim
> > > On Mon, Mar 4, 2013 at 8:08 PM, Stephen Boesch <[email protected]> wrote:
> > >
> > > > great question from Kim and follow-up/answers.
> > > >
> > > >
> > > > 2013/3/4 Gary Helmling <[email protected]>
> > > >
> > > > > I see this is HBASE-6870. I thought that sounded familiar.
> > > > >
> > > > >
> > > > > On Mon, Mar 4, 2013 at 6:23 PM, Gary Helmling <[email protected]> wrote:
> > > > >
> > > > > >
> > > > > >> Check your logs for whether your end-point coprocessor is hitting
> > > > > >> zookeeper on every invocation to figure out the region start key.
> > > > > >> Unfortunately (at least last time I checked), the default way of invoking
> > > > > >> an end point coprocessor doesn't use the meta cache. You can go through a
> > > > > >> combination of the following instead:
> > > > > >> HRegionLocation regionLocation = retried ?
> > > > > >> connection.relocateRegion(tableName, tableKey) :
> > > > > >> connection.locateRegion(tableName, tableKey);
> > > > > >> ...
> > > > > >> Then call HConnection.processExecs call, passing in the regionKeys from
> > > > > >> above.
> > > > > >> You can trap the error case of the region being relocated and try again
> > > > > >> with retried = true and it'll update the meta data cache when
> > > > > >> relocateRegion is called.
> > > > > >>
> > > > > >
> > > > > > Any idea if we have an improvement logged in JIRA for this? This is
> > > > > > definitely something we should improve on.
> > > > > >
> >
> >
> > --
> > Best regards,
> >
> > - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
>

--
Best regards,

- Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
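
The workaround quoted in this thread (use the cached region location first, and only go back to .META. via relocateRegion after a call fails) can be illustrated in isolation. What follows is a minimal, self-contained Java sketch of that retry pattern and not HBase client code: Region, FakeMeta, StaleLocationException, RegionCall, CachingLocator and its exec method are all hypothetical names invented for the illustration, and the in-memory FakeMeta only stands in for .META. so that lookups can be counted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical region descriptor: rows in [startKey, endKey) live on one server. */
class Region {
    final String startKey;
    final String endKey;   // exclusive; null means "end of table"
    final String server;

    Region(String startKey, String endKey, String server) {
        this.startKey = startKey;
        this.endKey = endKey;
        this.server = server;
    }

    boolean contains(String row) {
        return row.compareTo(startKey) >= 0
            && (endKey == null || row.compareTo(endKey) < 0);
    }
}

/** Hypothetical stand-in for .META.; counts how often it is consulted. */
class FakeMeta {
    final List<Region> regions = new ArrayList<>();
    int lookups = 0;

    Region lookup(String row) {
        lookups++;
        for (Region r : regions) {
            if (r.contains(row)) return r;
        }
        throw new IllegalArgumentException("no region for row " + row);
    }
}

/** Signals that the contacted server no longer hosts the region. */
class StaleLocationException extends Exception {}

/** Work to run against one region (plays the role of the endpoint call). */
interface RegionCall<R> {
    R call(Region region) throws StaleLocationException;
}

/** Hypothetical client that caches locations and refreshes them only on failure. */
class CachingLocator {
    private final FakeMeta meta;
    private final TreeMap<String, Region> cache = new TreeMap<>();

    CachingLocator(FakeMeta meta) { this.meta = meta; }

    /** Analogous to locateRegion: answer from the cache when possible. */
    Region locate(String row) {
        Map.Entry<String, Region> e = cache.floorEntry(row);
        if (e != null && e.getValue().contains(row)) return e.getValue();
        return relocate(row);
    }

    /** Analogous to relocateRegion: always ask meta and overwrite the cache. */
    Region relocate(String row) {
        Region fresh = meta.lookup(row);
        cache.put(fresh.startKey, fresh);
        return fresh;
    }

    /** Try the cached location first; on a stale location, refresh once and retry. */
    <R> R exec(String row, RegionCall<R> call) {
        boolean retried = false;
        while (true) {
            Region region = retried ? relocate(row) : locate(row);
            try {
                return call.call(region);
            } catch (StaleLocationException e) {
                if (retried) throw new IllegalStateException("still stale after refresh", e);
                retried = true;
            }
        }
    }
}
```

In this sketch, repeated exec calls for rows in the same region consult FakeMeta once, and a second lookup happens only after a call reports a stale location. With the real client, the equivalent would combine locateRegion/relocateRegion with processExecs as described in the quoted message; the exact signatures depend on the HBase version and should be checked against its javadoc.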
