HBASE-6870 targeted whole table scanning for each coprocessorService call which exhibited itself through:
HTable#coprocessorService -> getStartKeysInRange -> getStartEndKeys ->
getRegionLocations -> MetaScanner.allTableRegions(getConfiguration(),
getTableName(), false)

The cached region locations in HConnectionImplementation would be used.

Cheers

On Sat, Aug 17, 2013 at 2:21 PM, Asaf Mesika <[email protected]> wrote:

> Ted, can you elaborate a little bit why this issue boosts performance?
> I couldn't figure out from the issue comments if execCoprocessor scans
> the entire .META. table or an entire table, to understand the actual
> improvement.
>
> Thanks!
>
> On Fri, Aug 9, 2013 at 8:44 AM, Ted Yu <[email protected]> wrote:
>
> > I think you need HBASE-6870 which went into 0.94.8
> >
> > Upgrading should boost coprocessor performance.
> >
> > Cheers
> >
> > On Aug 8, 2013, at 10:21 PM, Kiru Pakkirisamy <[email protected]>
> > wrote:
> >
> > > Ted,
> > > Here is the method signature/protocol
> > > public Map<String, Double> getFoo(Map<String, Double> input,
> > > int topN) throws IOException;
> > >
> > > There are 31 regions on 4 nodes X 8 CPU.
> > > I am on 0.94.6 (from Hortonworks).
> > > I think it seems to behave like what linwukang says - it is almost a
> > > full table scan in the coprocessor.
> > > Actually, when I set more specific ColumnPrefixFilters performance
> > > went down.
> > > I want to do things on the server side because I don't want to be
> > > sending 500K column/values to the client.
> > > I cannot believe a single-threaded client which does some calculations
> > > and group-by beats the coprocessor running in 31 regions.
> > >
> > > Regards,
> > > - kiru
> > >
> > > Kiru Pakkirisamy | webcloudtech.wordpress.com
> > >
> > > ________________________________
> > > From: Ted Yu <[email protected]>
> > > To: [email protected]; Kiru Pakkirisamy <[email protected]>
> > > Sent: Thursday, August 8, 2013 8:40 PM
> > > Subject: Re: Client Get vs Coprocessor scan performance
> > >
> > > Can you give us a bit more information ?
> > >
> > > How do you deliver the 55 rowkeys to your endpoint ?
> > > How many regions do you have for this table ?
> > >
> > > What HBase version are you using ?
> > >
> > > Thanks
> > >
> > > On Thu, Aug 8, 2013 at 6:43 PM, Kiru Pakkirisamy
> > > <[email protected]> wrote:
> > >
> > >> Hi,
> > >> I am finding an odd behavior with the Coprocessor performance lagging
> > >> a client side Get.
> > >> I have a table with 500000 rows. Each has a variable # of columns in
> > >> one column family (in this case about 600000 columns in total are
> > >> processed).
> > >> When I try to get specific 55 rows, the client side completes in half
> > >> the time as the coprocessor endpoint.
> > >> I am using 55 RowFilters on the Coprocessor scan side. The rows are
> > >> processed in exactly the same way in both the cases.
> > >> Any pointers on how to debug this scenario ?
> > >>
> > >> Regards,
> > >> - kiru
> > >>
> > >> Kiru Pakkirisamy | webcloudtech.wordpress.com
