Hi,

>In a scan, when a filter's filterKeyValue method returns
>ReturnCode.NEXT_ROW - does it actually skip to the next row or just the
>next batch

It will go to the next row.

>In HBase 0.92
> hasFilterRow has not been overridden for certain filters which effectively
> do filter out rows (SingleColumnValueFilter for example). 

Yes, this is an issue in old versions. It is fixed in trunk now.

> I spent some time looking at HRegion.java to get to grips with how
> filterRow works (or not) when batching is enabled.

See the method RegionScannerImpl#nextInternal(int limit) in HRegion.java.
It contains a do-while loop that takes all the KVs for a row (so that they
can be grouped as one Result). The loop itself only checks the batch size
(limit). When the filter says to go to the next row, there is a seek to the
next row (as Ted said, see the code in StoreScanner). After that seek,
peekRow() returns the next row key, which is not the same as currentRow.
So the batch ends there, and the next batch will contain KVs from the next
row only.
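
To make the flow above concrete, here is a rough, self-contained sketch.
It is not the HBase code: BatchedScanSketch, Cell and skipRestOfRow are
made-up names, the predicate stands in for a filter returning
ReturnCode.NEXT_ROW, and the "seek" is just an index jump over a sorted
list. It also assumes that a row which yields no cells is skipped rather
than returned as an empty batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical, simplified model of a batched scan; not the HBase classes.
class BatchedScanSketch {
    // Minimal stand-in for a KeyValue: row key plus column qualifier.
    static final class Cell {
        final String row;
        final String qualifier;
        Cell(String row, String qualifier) { this.row = row; this.qualifier = qualifier; }
        @Override public String toString() { return row + ":" + qualifier; }
    }

    private final List<Cell> cells;               // assumed sorted by row
    private final Predicate<Cell> skipRestOfRow;  // true plays the role of NEXT_ROW
    private int pos = 0;

    BatchedScanSketch(List<Cell> cells, Predicate<Cell> skipRestOfRow) {
        this.cells = cells;
        this.skipRestOfRow = skipRestOfRow;
    }

    // Returns at most `limit` cells, all from one row. An empty list means end of scan.
    List<Cell> next(int limit) {
        List<Cell> batch = new ArrayList<>();
        while (pos < cells.size() && batch.isEmpty()) {
            String currentRow = cells.get(pos).row;
            do {
                Cell c = cells.get(pos);
                if (skipRestOfRow.test(c)) {
                    // "Seek" to the first cell of the next row.
                    while (pos < cells.size() && cells.get(pos).row.equals(currentRow)) {
                        pos++;
                    }
                } else {
                    batch.add(c);
                    pos++;
                }
                // Stop on the batch limit or when the next cell belongs to another row.
            } while (batch.size() < limit
                    && pos < cells.size()
                    && cells.get(pos).row.equals(currentRow));
        }
        return batch;
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
                new Cell("r1", "a"), new Cell("r1", "b"),
                new Cell("r1", "c"), new Cell("r1", "d"),
                new Cell("r2", "a"), new Cell("r2", "b"));
        // Ask for the next row as soon as r1:c is seen.
        BatchedScanSketch scan = new BatchedScanSketch(
                cells, c -> c.row.equals("r1") && c.qualifier.equals("c"));
        List<Cell> batch;
        while (!(batch = scan.next(2)).isEmpty()) {
            System.out.println(batch);
        }
        // prints:
        // [r1:a, r1:b]
        // [r2:a, r2:b]
    }
}
```

In this toy model r1:c and r1:d are never returned, and the batch after
the skip starts at r2. Note that the first batch of r1 had already been
handed out before the skip was requested, which is the risky part of
combining row-level filtering with batching.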

-Anoop-
________________________________________
From: Ted Yu [[email protected]]
Sent: Wednesday, January 23, 2013 6:18 AM
To: [email protected]
Subject: Re: ResultCode.NEXT_ROW and scans with batching enabled

Take a look at StoreScanner#next():

        ScanQueryMatcher.MatchCode qcode = matcher.match(kv);

...

          case SEEK_NEXT_ROW:
            // This is just a relatively simple end of scan fix, to short-cut end
            // us if there is an endKey in the scan.
            if (!matcher.moreRowsMayExistAfter(kv)) {
              return false;
            }

            reseek(matcher.getKeyForNextRow(kv));
            break;

Cheers

On Tue, Jan 22, 2013 at 4:13 PM, David Koch <[email protected]> wrote:

> Hello,
>
> In a scan, when a filter's filterKeyValue method returns
> ReturnCode.NEXT_ROW - does it actually skip to the next row or just the
> next batch, provided of course batching is enabled? Where in the HBase
> source code can I find out about this?
>
> I spent some time looking at HRegion.java to get to grips with how
> filterRow works (or not) when batching is enabled. In HBase 0.92
> hasFilterRow has not been overridden for certain filters which effectively
> do filter out rows (SingleColumnValueFilter for example). Thus, these
> filters do not generate a warning when used with a batched scan which -
> while risky - provides the needed filtering in some cases. This has been
> fixed for subsequent versions (at least 0.96) so I need to re-implement
> custom filters which use this "effect".
>
> Thanks,
>
> /David
>
