Hi guys, Thank you for the explanations.
/David

On Wed, Jan 23, 2013 at 4:44 AM, Anoop Sam John <[email protected]> wrote:

> Hi,
>
> > In a scan, when a filter's filterKeyValue method returns
> > ReturnCode.NEXT_ROW - does it actually skip to the next row or just the
> > next batch
>
> It will go to the new row.
>
> > In HBase 0.92
> > hasFilterRow has not been overridden for certain filters which effectively
> > do filter out rows (SingleColumnValueFilter for example).
>
> Yes this is an issue in old versions. It is fixed in trunk now.
>
> > I spent some time looking at HRegion.java to get to grips with how
> > filterRow works (or not) when batching is enabled.
>
> See the method RegionScannerImpl#nextInternal(int limit) [In
> HRegion.java]. You can see a do while loop. This loop takes all the KVs for
> a row (and thus can be grouped as one Result). This one only checks for the
> batch size (limit) When the filter says to go to next row, there will be a
> seek to the next row [As Ted said see the code in StoreScanner]. This will
> make the peekRow() return the next row key which is not same as the
> currentRow.. [Pls see the code].. So this batch will end there and next
> batch will be KVs from next row only.
>
> -Anoop-
> ________________________________________
> From: Ted Yu [[email protected]]
> Sent: Wednesday, January 23, 2013 6:18 AM
> To: [email protected]
> Subject: Re: ResultCode.NEXT_ROW and scans with batching enabled
>
> Take a look at StoreScanner#next():
>
>       ScanQueryMatcher.MatchCode qcode = matcher.match(kv);
>       ...
>         case SEEK_NEXT_ROW:
>           // This is just a relatively simple end of scan fix, to short-cut end
>           // us if there is an endKey in the scan.
>           if (!matcher.moreRowsMayExistAfter(kv)) {
>             return false;
>           }
>           reseek(matcher.getKeyForNextRow(kv));
>           break;
>
> Cheers
>
> On Tue, Jan 22, 2013 at 4:13 PM, David Koch <[email protected]> wrote:
>
> > Hello,
> >
> > In a scan, when a filter's filterKeyValue method returns
> > ReturnCode.NEXT_ROW - does it actually skip to the next row or just the
> > next batch, provided of course batching is enabled? Where in the HBase
> > source code can I find out about this?
> >
> > I spent some time looking at HRegion.java to get to grips with how
> > filterRow works (or not) when batching is enabled. In HBase 0.92
> > hasFilterRow has not been overridden for certain filters which effectively
> > do filter out rows (SingleColumnValueFilter for example). Thus, these
> > filters do not generate a warning when used with a batched scan which -
> > while risky - provides the needed filtering in some cases. This has been
> > fixed for subsequent versions (at least 0.96) so I need to re-implement
> > custom filters which use this "effect".
> >
> > Thanks,
> >
> > /David
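
For the archive, here is a rough sketch of the behaviour described in the thread, as I understand it. It is plain Java with no HBase classes: BatchSketch, CellFilter and scan are hypothetical names made up for illustration, and the inner skip loop only stands in for the reseek(matcher.getKeyForNextRow(kv)) call that Ted quoted. It models the idea that a batch stops at the batch limit or at a row boundary, and that NEXT_ROW drops the remaining cells of the current row. It is not the actual HRegion or StoreScanner logic.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSketch {
    // Simplified stand-in for the filter return codes discussed in the thread.
    enum ReturnCode { INCLUDE, NEXT_ROW }

    // Hypothetical filter interface, loosely modelled on filterKeyValue.
    interface CellFilter {
        ReturnCode filterKeyValue(String row, String qualifier);
    }

    // cells: {row, qualifier} pairs, already sorted by row as a scanner would see them.
    // limit: the batch size, i.e. the maximum number of cells per returned batch.
    static List<List<String>> scan(String[][] cells, int limit, CellFilter filter) {
        List<List<String>> batches = new ArrayList<>();
        int i = 0;
        while (i < cells.length) {
            String currentRow = cells[i][0];
            List<String> batch = new ArrayList<>();
            // Collect cells of the current row until the batch limit is reached
            // or the next cell belongs to a different row.
            while (i < cells.length
                    && cells[i][0].equals(currentRow)
                    && batch.size() < limit) {
                if (filter.filterKeyValue(cells[i][0], cells[i][1]) == ReturnCode.NEXT_ROW) {
                    // Stand-in for the reseek to the next row: skip the rest of this row.
                    while (i < cells.length && cells[i][0].equals(currentRow)) {
                        i++;
                    }
                    break;
                }
                batch.add(cells[i][0] + ":" + cells[i][1]);
                i++;
            }
            if (!batch.isEmpty()) {
                batches.add(batch);
            }
        }
        return batches;
    }

    public static void main(String[] args) {
        String[][] cells = {
            {"r1", "a"}, {"r1", "b"}, {"r1", "c"}, {"r1", "d"},
            {"r2", "a"}, {"r2", "b"}
        };
        // Hypothetical filter: give up on a row as soon as qualifier "c" is seen.
        CellFilter filter =
            (row, q) -> q.equals("c") ? ReturnCode.NEXT_ROW : ReturnCode.INCLUDE;
        System.out.println(scan(cells, 2, filter)); // prints [[r1:a, r1:b], [r2:a, r2:b]]
    }
}
```

With a batch size of 2, the first batch of r1 is returned before the filter ever sees qualifier "c". The second batch of r1 hits NEXT_ROW, so "c" and "d" are dropped and the following batch starts at r2. This is also why row-level filtering with batching is risky: cells from a row can already have been handed out before the filter decides to skip it.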
