Sorry for the delay. It looks like the problem is caused by PrefixFilter... I assumed that it does a seek...
If I use startRow instead, it works fine. But is it the correct approach?

./zahoor

On Wed, Oct 17, 2012 at 3:38 AM, lars hofhansl <[email protected]> wrote:

> I reopened HBASE-6577
>
> ----- Original Message -----
> From: lars hofhansl <[email protected]>
> To: "[email protected]" <[email protected]>; lars hofhansl <[email protected]>
> Cc:
> Sent: Tuesday, October 16, 2012 2:39 PM
> Subject: Re: Slow scanning for PrefixFilter on EncodedBlocks
>
> Looks like this is exactly the scenario I was trying to optimize with
> HBASE-6577. Hmm...
> ________________________________
> From: lars hofhansl <[email protected]>
> To: "[email protected]" <[email protected]>
> Sent: Tuesday, October 16, 2012 12:21 AM
> Subject: Re: Slow scanning for PrefixFilter on EncodedBlocks
>
> PrefixFilter does not do any seeking by itself, so I doubt this is related
> to HBASE-6757.
> Does this only happen with FAST_DIFF compression?
>
> If you can create an isolated test program (that sets up the scenario and
> then runs a scan with the filter such that it is very slow), I'm happy to
> take a look.
>
> -- Lars
>
> ----- Original Message -----
> From: J Mohamed Zahoor <[email protected]>
> To: "[email protected]" <[email protected]>
> Cc:
> Sent: Monday, October 15, 2012 10:27 AM
> Subject: Re: Slow scanning for PrefixFilter on EncodedBlocks
>
> Is this related to HBASE-6757?
> I use a filter list with
> - prefix filter
> - filter list of column filters
>
> /zahoor
>
> On Monday, October 15, 2012, J Mohamed Zahoor wrote:
>
> > Hi
> >
> > My scanner performance is very slow when using a PrefixFilter on an
> > **Encoded Column** (encoded using FAST_DIFF both in memory and on disk).
> > I am using HBase 0.94.1.
> >
> > jstack shows that much of the time is spent seeking to the row.
> > Even if I give an exact row key match in the prefix filter, it takes about
> > two minutes to return a single row.
> > Running this multiple times also seems to send things to disk
> > (loadBlock).
> >
> > at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$EncodedScannerV2.loadBlockAndSeekToKey(HFileReaderV2.java:1027)
> > at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:461)
> > at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:493)
> > at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:242)
> > at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:167)
> > at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> > at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:521)
> > - locked <0x000000059584fab8> (a org.apache.hadoop.hbase.regionserver.StoreScanner)
> > at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:402)
> > - locked <0x000000059584fab8> (a org.apache.hadoop.hbase.regionserver.StoreScanner)
> > at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRow(HRegion.java:3507)
> > at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3455)
> > at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3406)
> > - locked <0x000000059589bb30> (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
> > at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3423)
> >
> > If I set the start and end row to the same row in the scan, the result
> > comes back very quickly.
> >
> > Saw this link
> > http://search-hadoop.com/m/9f0JH1Kz24U1&subj=Re+HBase+0+94+2+SNAPSHOT+Scanning+Bug
> > But it looks like things are fine in 0.94.1.
> >
> > Any pointers on why this is slow?
> >
> > Note: the row does not have many columns (5, and less than a KB) but has
> > lots of versions (1500+)
> >
> > ./zahoor
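
On the startRow question at the top of this message: Lars notes above that PrefixFilter does not do any seeking by itself, so a scan that relies on the filter alone begins at the first row of the table. Bounding the scan with a start row equal to the prefix, and a stop row just past the last key that can share that prefix, limits the scan to the matching range. What follows is only a minimal sketch of how such a stop row might be derived. The class name `PrefixScanBounds`, the method `stopRowForPrefix`, and the sample key `user123` are hypothetical and do not come from this thread or from HBase.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Sketch (hypothetical helper): derive a stop row for a prefix-bounded scan. */
final class PrefixScanBounds {

    /**
     * Returns the smallest key that sorts after every key starting with
     * {@code prefix} under unsigned byte-wise ordering. Returns an empty
     * array when no such key exists (empty prefix, or all bytes are 0xFF);
     * an empty stop row means "no upper bound" for an HBase Scan.
     */
    static byte[] stopRowForPrefix(byte[] prefix) {
        // Trailing 0xFF bytes cannot be incremented, so drop them first.
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        // Copy the remaining bytes and bump the last one by one.
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++;
        return stop;
    }

    public static void main(String[] args) {
        // Assumed example prefix; the start row is simply the prefix itself.
        byte[] startRow = "user123".getBytes(StandardCharsets.UTF_8);
        byte[] stopRow = stopRowForPrefix(startRow);
        System.out.println("startRow = " + Arrays.toString(startRow));
        System.out.println("stopRow  = " + Arrays.toString(stopRow));
        // With the HBase client on the classpath, these would be handed to
        // Scan.setStartRow(startRow) and Scan.setStopRow(stopRow).
    }
}
```

The PrefixFilter and the column filters can stay in the filter list; the start and stop rows only narrow the range that the scanner has to walk before the filters are applied.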
