Hi Lars:

That is useful. I appreciate it. The idea about cross-row transactions is an interesting one.

Can I have an iterator on the client side that gets rows from a coprocessor?
(I.e., filtered rows are streamed to the client application and the client
can access them via an iterator.)

Best Regards,

Jerry

On Thu, Aug 2, 2012 at 12:13 AM, lars hofhansl <[email protected]> wrote:

> The Filter is initialized per Region as part of a RegionScannerImpl.
>
> So as long as all the rows you are interested in are co-located in the same
> region you can keep that state in the Filter instance.
>
> You can use a custom RegionSplitPolicy to control (to some extent at
> least) how the rows are colocated (KeyPrefixRegionSplitPolicy is an
> example).
>
> I also blogged about this here (in the context of cross row transactions):
> http://hadoop-hbase.blogspot.com/2012/02/limited-cross-row-transactions-in-hbase.html
>
> Maybe what you really are looking for are coprocessors?
>
> -- Lars
>
> ----- Original Message -----
> From: Jerry Lam <[email protected]>
> To: "[email protected]" <[email protected]>
> Cc:
> Sent: Wednesday, August 1, 2012 7:06 PM
> Subject: Re: Filter with State
>
> Hi Lars,
>
> I understand that it is more difficult to carry state across
> regions/servers, but how about in a single region? Knowing that the rows
> in a single region have dependencies, can we have a filter with state? If
> the filter doesn't provide this ability, is there another mechanism in
> HBase that offers this kind of functionality?
>
> I think this is a good feature because it allows efficient scanning of
> dependent rows. Instead of fetching each row to the client side and
> checking whether we should fetch the next row, the filter on the server
> side handles this logic.
>
> Best Regards,
>
> Jerry
>
> Sent from my iPad (sorry for spelling mistakes)
>
> On 2012-08-01, at 21:52, lars hofhansl <[email protected]> wrote:
>
> > The issue here is that different rows can be located in different
> > regions or even different region servers, so no local state will carry
> > over all rows.
> >
> > ----- Original Message -----
> > From: Jerry Lam <[email protected]>
> > To: "[email protected]" <[email protected]>
> > Cc: "[email protected]" <[email protected]>
> > Sent: Wednesday, August 1, 2012 5:48 PM
> > Subject: Re: Filter with State
> >
> > Hi St.Ack:
> >
> > Schema cannot be changed to a single row.
> > The API describes "Do not rely on filters carrying state across rows;
> > its not reliable in current hbase as we have no handlers in place for
> > when regions split, close or server crashes." If we manage region
> > splitting ourselves, the split issue doesn't apply. Other failures can
> > be handled at the application level. Does each invocation of
> > scanner.next instantiate a new filter on the server side even on the
> > same region? (I.e., does scanning the same region use the same filter,
> > or a different filter for each scanner.next call?)
> >
> > Best Regards,
> >
> > Jerry
> >
> > Sent from my iPad (sorry for spelling mistakes)
> >
> > On 2012-08-01, at 18:44, Stack <[email protected]> wrote:
> >
> >> On Wed, Aug 1, 2012 at 10:44 PM, Jerry Lam <[email protected]> wrote:
> >>> Hi HBase guru:
> >>>
> >>> From Lars George talk, he mentions that filter has no state. What if
> >>> I need to scan rows in which the decision to filter one row or not is
> >>> based on the previous row's column values? Any idea how one can
> >>> implement this type of logic?
> >>
> >> You could try carrying state in the client (but if client dies state
> >> dies).
> >>
> >> You can't have scanners carry state across rows. It says so in API
> >> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/package-summary.html#package_description
> >> (Whatever about the API, if LarsG says it, it must be so!).
> >>
> >> Here is the issue: If row X is in region A on server 1 there is
> >> nothing to prevent row X+1 from being on region B on server 2. How do
> >> you carry the state between such rows reliably?
> >>
> >> Can you redo your schema such that the state you need to carry remains
> >> within a row?
> >> St.Ack
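
For reference, the "carry state in the client" option mentioned in this thread, together with the client-side iterator asked about at the top, could look roughly like the sketch below. It is plain Java with no HBase dependency. The class name PreviousRowFilterIterator, the drain helper, and the (previous, current) predicate are hypothetical names made up for illustration, not HBase API. The state lives only in the client, so it is lost if the client dies.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiPredicate;

// Hypothetical helper, not an HBase class: filters rows on the client and
// remembers the previously scanned row between calls.
class PreviousRowFilterIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    // Receives (previous row, current row); previous is null for the first row.
    private final BiPredicate<T, T> keep;
    private T previous;         // state carried across rows, client-side only
    private T pending;          // next row that passed the predicate
    private boolean hasPending;

    PreviousRowFilterIterator(Iterator<T> source, BiPredicate<T, T> keep) {
        this.source = source;
        this.keep = keep;
    }

    @Override
    public boolean hasNext() {
        // Pull rows from the source until one passes or the source is empty.
        while (!hasPending && source.hasNext()) {
            T current = source.next();
            if (keep.test(previous, current)) {
                pending = current;
                hasPending = true;
            }
            previous = current; // advance the state whether or not the row was kept
        }
        return hasPending;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        hasPending = false;
        T row = pending;
        pending = null;
        return row;
    }

    // Convenience for the example: collect all rows that pass into a list.
    static <T> List<T> drain(Iterator<T> source, BiPredicate<T, T> keep) {
        List<T> out = new ArrayList<>();
        Iterator<T> it = new PreviousRowFilterIterator<>(source, keep);
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }
}
```

With HBase, the source would presumably be the iterator of a ResultScanner (which is an Iterable of Result), and the predicate would compare column values of the two rows. Every row still travels to the client, so this does not give the server-side saving discussed above.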
