Hi St.Ack:

The schema cannot be changed so that the state stays within a single row.
The API docs say: "Do not rely on filters carrying state across rows; its not 
reliable in current hbase as we have no handlers in place for when regions 
split, close or server crashes." If we manage region splitting ourselves, the 
split issue doesn't apply, and other failures can be handled at the application 
level. Does each invocation of scanner.next instantiate a new filter on the 
server side, even within the same region? In other words, does a scan over one 
region reuse the same filter instance across scanner.next calls, or does each 
call get a new one?

Best Regards,

Jerry 


On 2012-08-01, at 18:44, Stack <[email protected]> wrote:

> On Wed, Aug 1, 2012 at 10:44 PM, Jerry Lam <[email protected]> wrote:
>> Hi HBase guru:
>> 
>> From Lars George talk, he mentions that filter has no state. What if I need
>> to scan rows in which the decision to filter one row or not is based on the
>> previous row's column values? Any idea how one can implement this type of
>> logic?
> 
> You could try carrying state in the client (but if client dies state dies).
> 
> You can't have scanners carry state across rows.  It says so in API
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/package-summary.html#package_description
> (Whatever about the API, if LarsG says it, it must be so!).
> 
> Here is the issue: If row X is in region A on server 1 there is
> nothing to prevent row X+1 from being on region B on server 2.  How do
> you carry the state between such rows reliably?
> 
> Can you redo your schema such that the state you need to carry remains
> within a row?
> St.Ack
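
For reference, the client-side option suggested above ("carrying state in the client") could look roughly like the sketch below. It is only an illustration under assumptions: the class and method names are hypothetical, and a plain list of integers stands in for the rows a scanner would return (in HBase, ResultScanner is an Iterable<Result>, so the same loop shape should apply). The state lives in the client, so it survives region boundaries but is lost if the client dies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical helper, not part of HBase: filters rows on the client,
// using the previous row to decide whether to keep the current one.
public class ClientSideStatefulFilter {

    // Keeps a row only if keep.test(previousRow, currentRow) is true.
    // previousRow is null for the first row seen.
    public static <R> List<R> filterWithPrevious(Iterable<R> rows,
                                                 BiPredicate<R, R> keep) {
        List<R> kept = new ArrayList<>();
        R previous = null;
        for (R current : rows) {
            if (keep.test(previous, current)) {
                kept.add(current);
            }
            // Track the previous scanned row, whether or not it was kept.
            previous = current;
        }
        return kept;
    }

    public static void main(String[] args) {
        // Stand-in for scanned rows: keep a row only if its value is
        // greater than the value in the row immediately before it.
        List<Integer> values = Arrays.asList(5, 7, 3, 3, 9);
        List<Integer> kept = filterWithPrevious(
                values, (prev, cur) -> prev == null || cur > prev);
        System.out.println(kept); // prints [5, 7, 9]
    }
}
```

The trade-off is that every row is shipped to the client before the decision is made, so this saves no network traffic compared with a server-side filter.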
