First I upgraded my cluster to 0.94.2, but even then the problem persisted. Then I moved to using startRow instead of the prefix filter.
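
For anyone finding this thread later, here is a minimal sketch of what scanning by row range instead of by filter could look like. The class and method names (`PrefixRange`, `stopRowForPrefix`) and the example key `row-42` are made up for illustration and are not part of HBase. The sketch only computes the exclusive stop row for a given prefix. The comment in `main` shows how the pair might then be passed to a `Scan`, assuming the two-argument `Scan(byte[] startRow, byte[] stopRow)` constructor of the 0.94 client.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class PrefixRange {

    /**
     * Returns the smallest row key that sorts after every key starting with
     * the given prefix (compared as unsigned bytes). An empty array means no
     * such key exists; HBase is assumed to treat an empty stop row as
     * "scan to the end of the table".
     */
    static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        // Trailing 0xFF bytes cannot be incremented, so drop them.
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0]; // empty or all-0xFF prefix: no upper bound
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // bump the last remaining byte by one
        return stop;
    }

    public static void main(String[] args) {
        byte[] prefix = "row-42".getBytes(StandardCharsets.UTF_8);
        byte[] stop = stopRowForPrefix(prefix);
        System.out.println(new String(stop, StandardCharsets.UTF_8)); // prints "row-43"

        // Hypothetical use with the HBase client (not compiled here):
        //   Scan scan = new Scan(prefix, stop);
        // The scanner can then seek straight to the first matching row
        // rather than evaluating a filter against every row before it.
    }
}
```

The start row is simply the prefix itself. A PrefixFilter could still be added on top of the range, but with both bounds set it should no longer be needed to limit the rows.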

./zahoor

On Wed, Oct 17, 2012 at 2:12 PM, J Mohamed Zahoor <[email protected]> wrote:

> Sorry for the delay.
>
> It looks like the problem is because of PrefixFilter...
> I assumed that i does a seek...
>
> If i use startRow instead.. it works fine.. But is it the correct approach?
>
> ./zahoor
>
>
> On Wed, Oct 17, 2012 at 3:38 AM, lars hofhansl <[email protected]> wrote:
>
>> I reopened HBASE-6577
>>
>>
>> ----- Original Message -----
>> From: lars hofhansl <[email protected]>
>> To: "[email protected]" <[email protected]>; lars hofhansl <[email protected]>
>> Cc:
>> Sent: Tuesday, October 16, 2012 2:39 PM
>> Subject: Re: Slow scanning for PrefixFilter on EncodedBlocks
>>
>> Looks like this is exactly the scenario I was trying to optimize with
>> HBASE-6577. Hmm...
>> ________________________________
>> From: lars hofhansl <[email protected]>
>> To: "[email protected]" <[email protected]>
>> Sent: Tuesday, October 16, 2012 12:21 AM
>> Subject: Re: Slow scanning for PrefixFilter on EncodedBlocks
>>
>> PrefixFilter does not do any seeking by itself, so I doubt this is
>> related to HBASE-6757.
>> Does this only happen with FAST_DIFF compression?
>>
>>
>> If you can create an isolated test program (that sets up the scenario and
>> then runs a scan with the filter such that it is very slow), I'm happy to
>> take a look.
>>
>> -- Lars
>>
>>
>> ----- Original Message -----
>> From: J Mohamed Zahoor <[email protected]>
>> To: "[email protected]" <[email protected]>
>> Cc:
>> Sent: Monday, October 15, 2012 10:27 AM
>> Subject: Re: Slow scanning for PrefixFilter on EncodedBlocks
>>
>> Is this related to HBASE-6757 ?
>> I use a filter list with
>> - prefix filter
>> - filter list of column filters
>>
>> /zahoor
>>
>> On Monday, October 15, 2012, J Mohamed Zahoor wrote:
>>
>> > Hi
>> >
>> > My scanner performance is very slow when using a Prefix filter on a
>> > **Encoded Column** ( encoded using FAST_DIFF on both memory and disk).
>> > I am using 94.1 hbase.
>> >
>> > jstack shows that much time is spent on seeking the row.
>> > Even if i give a exact row key match in the prefix filter it takes about
>> > two minutes to return a single row.
>> > Running this multiple times also seems to be redirecting things to disk
>> > (loadBlock).
>> >
>> >
>> > at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$EncodedScannerV2.loadBlockAndSeekToKey(HFileReaderV2.java:1027)
>> > at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:461)
>> > at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:493)
>> > at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:242)
>> > at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:167)
>> > at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
>> > at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:521)
>> > - locked <0x000000059584fab8> (a org.apache.hadoop.hbase.regionserver.StoreScanner)
>> > at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:402)
>> > - locked <0x000000059584fab8> (a org.apache.hadoop.hbase.regionserver.StoreScanner)
>> > at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRow(HRegion.java:3507)
>> > at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3455)
>> > at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3406)
>> > - locked <0x000000059589bb30> (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
>> > at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3423)
>> >
>> > If is set the start and end row as same row in scan ... it come in very
>> > quick.
>> >
>> > Saw this link
>> > http://search-hadoop.com/m/9f0JH1Kz24U1&subj=Re+HBase+0+94+2+SNAPSHOT+Scanning+Bug
>> > But it looks like things are fine in 94.1.
>> >
>> > Any pointers on why this is slow?
>> >
>> >
>> > Note: the row has not many columns(5 and less than a kb) and lots of
>> > versions (1500+)
>> >
>> > ./zahoor
