Looks like this is exactly the scenario I was trying to optimize with HBASE-6577. Hmm...

________________________________
From: lars hofhansl <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Tuesday, October 16, 2012 12:21 AM
Subject: Re: Slow scanning for PrefixFilter on EncodedBlocks

PrefixFilter does not do any seeking by itself, so I doubt this is related to HBASE-6757.
Does this only happen with FAST_DIFF compression?

If you can create an isolated test program (one that sets up the scenario and then runs a scan with the filter such that it is very slow), I'm happy to take a look.

-- Lars

----- Original Message -----
From: J Mohamed Zahoor <[email protected]>
To: "[email protected]" <[email protected]>
Cc:
Sent: Monday, October 15, 2012 10:27 AM
Subject: Re: Slow scanning for PrefixFilter on EncodedBlocks

Is this related to HBASE-6757?
I use a filter list with
- prefix filter
- filter list of column filters

/zahoor

On Monday, October 15, 2012, J Mohamed Zahoor wrote:

> Hi
>
> My scanner performance is very slow when using a PrefixFilter on an
> **Encoded Column** (encoded using FAST_DIFF both in memory and on disk).
> I am using HBase 0.94.1.
>
> jstack shows that much of the time is spent seeking to the row.
> Even if I give an exact row key match in the prefix filter, it takes about
> two minutes to return a single row.
> Running this multiple times also seems to be redirecting things to disk
> (loadBlock).
>
> at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$EncodedScannerV2.loadBlockAndSeekToKey(HFileReaderV2.java:1027)
> at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:461)
> at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:493)
> at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:242)
> at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:167)
> at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:521)
> - locked <0x000000059584fab8> (a org.apache.hadoop.hbase.regionserver.StoreScanner)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:402)
> - locked <0x000000059584fab8> (a org.apache.hadoop.hbase.regionserver.StoreScanner)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRow(HRegion.java:3507)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3455)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3406)
> - locked <0x000000059589bb30> (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3423)
>
> If I set the start and end row of the scan to the same row, it comes back
> very quickly.
>
> Saw this link
> http://search-hadoop.com/m/9f0JH1Kz24U1&subj=Re+HBase+0+94+2+SNAPSHOT+Scanning+Bug
> But it looks like things are fine in 0.94.1.
>
> Any pointers on why this is slow?
>
> Note: the row does not have many columns (5, and less than a KB in total)
> but has lots of versions (1500+).
>
> ./zahoor
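
The original message notes that the scan returns quickly once start and end rows are set. One way to get the same effect for an arbitrary prefix is to bound the scan to the prefix's key range, in addition to (or instead of) the PrefixFilter. The sketch below is only an illustration of that idea using the plain JDK; the class name `PrefixRange` and the helper `stopRowForPrefix` are hypothetical and not part of HBase. It assumes row keys are compared as unsigned bytes in lexicographic order, and that an empty stop row means "scan to the end of the table".

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PrefixRange {

    /**
     * Hypothetical helper: returns the smallest key that sorts strictly after
     * every key starting with the given prefix (an exclusive stop row).
     * Returns an empty array when no such key exists (prefix is empty or all 0xFF).
     */
    public static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        // Trailing 0xFF bytes cannot be incremented, so drop them first.
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0]; // assumption: empty stop row = no upper bound
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // bump the last remaining byte by one
        return stop;
    }

    public static void main(String[] args) {
        byte[] startRow = "row-42".getBytes(StandardCharsets.UTF_8);
        byte[] stopRow = stopRowForPrefix(startRow);
        // The pair (startRow, stopRow) would then be used as the scan's
        // start row (inclusive) and stop row (exclusive).
        System.out.println(new String(stopRow, StandardCharsets.UTF_8));
    }
}
```

With the range bounded this way, the scanner only has to visit rows that share the prefix, so the filter no longer needs to reject rows one at a time. Whether this also avoids the slow reseek on FAST_DIFF-encoded blocks reported in this thread is not established here.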
