On Wed, Aug 1, 2012 at 10:44 PM, Jerry Lam <[email protected]> wrote:
> Hi HBase guru:
>
> From Lars George talk, he mentions that filter has no state. What if I need
> to scan rows in which the decision to filter one row or not is based on the
> previous row's column values? Any idea how one can implement this type of
> logic?
You could try carrying the state in the client (but if the client dies, the state dies with it). You can't have scanners carry state across rows. The API documentation says so: http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/package-summary.html#package_description (Whatever about the API, if LarsG says it, it must be so!)

Here is the issue: if row X is in region A on server 1, there is nothing to prevent row X+1 from being in region B on server 2. How do you carry the state between such rows reliably?

Can you redo your schema such that the state you need to carry remains within a row?

St.Ack
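
For what it's worth, the "carry state in the client" idea could look roughly like the sketch below. It is only an illustration and makes no HBase calls. `Row`, `PreviousRowFilter`, and the `status` column used in the usage note are hypothetical names, not part of any HBase API. In a real client the rows would come from iterating a `ResultScanner` in row-key order, and the "previous row" variable would live in the client loop, so it is lost if the client dies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

// Sketch only: client-side filtering where the keep/drop decision for a row
// depends on the row scanned just before it. No HBase classes are used.
final class PreviousRowFilter {

    // Hypothetical stand-in for one scanned row: a row key plus column values.
    static final class Row {
        public final String key;
        public final Map<String, String> columns;

        public Row(String key, Map<String, String> columns) {
            this.key = key;
            this.columns = columns;
        }
    }

    // Walks the rows in scan order. The predicate receives (previous, current);
    // previous is null for the first row.
    static List<Row> filter(Iterable<Row> rows, BiPredicate<Row, Row> keep) {
        List<Row> kept = new ArrayList<>();
        Row previous = null; // the state lives here, in the client
        for (Row current : rows) {
            if (keep.test(previous, current)) {
                kept.add(current);
            }
            // Always advance, whether or not the current row was kept.
            previous = current;
        }
        return kept;
    }
}
```

As a usage example, a predicate such as `(prev, cur) -> prev == null || !prev.columns.get("status").equals(cur.columns.get("status"))` keeps only the rows whose `status` value differs from the row before. Because the client sees rows in key order regardless of which region served them, the region-boundary problem does not arise here, at the cost of shipping every row to the client.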
