Hi,

> In a scan, when a filter's filterKeyValue method returns
> ReturnCode.NEXT_ROW - does it actually skip to the next row or just the
> next batch
It will go to the next row.

> In HBase 0.92
> hasFilterRow has not been overridden for certain filters which effectively
> do filter out rows (SingleColumnValueFilter for example).

Yes, this is an issue in old versions. It is fixed in trunk now.

> I spent some time looking at HRegion.java to get to grips with how
> filterRow works (or not) when batching is enabled.

See the method RegionScannerImpl#nextInternal(int limit) in HRegion.java. You can see a do-while loop there. This loop takes all the KVs for a row (which can thus be grouped as one Result). Inside the loop, only the batch size (limit) is checked. When the filter says to go to the next row, there will be a seek to the next row (as Ted said, see the code in StoreScanner). This makes peekRow() return the next row key, which is not the same as currentRow, so the loop exits (please see the code). So this batch will end there, and the next batch will contain KVs from the next row only.

-Anoop-
________________________________________
From: Ted Yu [[email protected]]
Sent: Wednesday, January 23, 2013 6:18 AM
To: [email protected]
Subject: Re: ResultCode.NEXT_ROW and scans with batching enabled

Take a look at StoreScanner#next():

        ScanQueryMatcher.MatchCode qcode = matcher.match(kv);
...
          case SEEK_NEXT_ROW:
            // This is just a relatively simple end of scan fix, to short-cut end
            // us if there is an endKey in the scan.
            if (!matcher.moreRowsMayExistAfter(kv)) {
              return false;
            }

            reseek(matcher.getKeyForNextRow(kv));
            break;

Cheers

On Tue, Jan 22, 2013 at 4:13 PM, David Koch <[email protected]> wrote:

> Hello,
>
> In a scan, when a filter's filterKeyValue method returns
> ReturnCode.NEXT_ROW - does it actually skip to the next row or just the
> next batch, provided of course batching is enabled? Where in the HBase
> source code can I find out about this?
>
> I spent some time looking at HRegion.java to get to grips with how
> filterRow works (or not) when batching is enabled.
> In HBase 0.92
> hasFilterRow has not been overridden for certain filters which effectively
> do filter out rows (SingleColumnValueFilter for example). Thus, these
> filters do not generate a warning when used with a batched scan which -
> while risky - provides the needed filtering in some cases. This has been
> fixed for subsequent versions (at least 0.96) so I need to re-implement
> custom filters which use this "effect".
>
> Thanks,
>
> /David
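
For anyone reading this thread later, the behaviour described in the replies (a NEXT_ROW from the filter ends the current batch, and the next batch starts on the following row) can be illustrated with a small toy model. This is only a sketch and not HBase code: the class BatchScanSketch, its Cell and ReturnCode types and the scan() method are all made up for illustration, and the real logic lives in RegionScannerImpl#nextInternal and StoreScanner#next. It assumes cells arrive sorted by row and that the filter is asked once per cell.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Toy model only: none of these types are HBase classes.
class BatchScanSketch {

    // Hypothetical stand-in for the filter's per-cell decision.
    enum ReturnCode { INCLUDE, NEXT_ROW }

    // Hypothetical stand-in for a KeyValue: just a row key and a qualifier.
    static final class Cell {
        final String row;
        final String qualifier;
        Cell(String row, String qualifier) { this.row = row; this.qualifier = qualifier; }
        @Override public String toString() { return row + ":" + qualifier; }
    }

    // Returns the batches a client would see. A limit <= 0 means "no batching".
    static List<List<String>> scan(List<Cell> cells, int limit,
                                   Function<Cell, ReturnCode> filter) {
        List<List<String>> batches = new ArrayList<>();
        int pos = 0;
        while (pos < cells.size()) {
            String currentRow = cells.get(pos).row;
            List<String> batch = new ArrayList<>();
            // Collect cells while the next cell still belongs to currentRow.
            while (pos < cells.size() && cells.get(pos).row.equals(currentRow)) {
                Cell cell = cells.get(pos);
                if (filter.apply(cell) == ReturnCode.NEXT_ROW) {
                    // Models the reseek: skip every remaining cell of this row.
                    while (pos < cells.size() && cells.get(pos).row.equals(currentRow)) {
                        pos++;
                    }
                    break; // the next cell is on another row, so the batch ends
                }
                batch.add(cell.toString());
                pos++;
                if (batch.size() == limit) {
                    break; // batch is full, the rest of the row goes to the next batch
                }
            }
            if (!batch.isEmpty()) {
                batches.add(batch);
            }
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
            new Cell("r1", "a"), new Cell("r1", "b"), new Cell("r1", "c"),
            new Cell("r1", "d"), new Cell("r1", "e"),
            new Cell("r2", "a"), new Cell("r2", "b"), new Cell("r2", "c"));
        // Hypothetical filter: give up on a row as soon as qualifier "d" is seen.
        List<List<String>> batches = scan(cells, 2,
            c -> c.qualifier.equals("d") ? ReturnCode.NEXT_ROW : ReturnCode.INCLUDE);
        System.out.println(batches);
        // prints [[r1:a, r1:b], [r1:c], [r2:a, r2:b], [r2:c]]
    }
}
```

In this model the cell r1:e is never returned, and the batch that was open when NEXT_ROW came back is cut short at [r1:c] instead of being filled up with cells from r2.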
