David,

The core challenge here is to be able to continue scans under failure
conditions. There are several places where we tear down the iterator tree
and rebuild it, including when tablet servers die, when we need to free
resources to support concurrency, and a few others. In order to continue a
scan where we left off, we need to be able to point to some place in the
stream of key/value pairs. If we want to be robust against tablet server
failure we can't just store that scan session information on the tablet
server, so we use the last key that was returned to the client for that
information.
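
To make the resume-from-last-key idea concrete, here is a minimal sketch in Python. It is a toy model, not Accumulo code: the sorted list `DATA`, and the helpers `scan_from` and `scan_with_failure`, are hypothetical names invented for illustration. The only state that survives the simulated failure is the last key the client received.

```python
import bisect

# Toy stand-in for a tablet's sorted key/value stream (hypothetical data).
DATA = [("a", 1), ("b", 2), ("c", 3), ("d", 4), ("e", 5)]
KEYS = [k for k, _ in DATA]


def scan_from(last_key=None):
    """Yield pairs strictly after last_key, or from the start if it is None."""
    start = 0 if last_key is None else bisect.bisect_right(KEYS, last_key)
    yield from DATA[start:]


def scan_with_failure(fail_after):
    """Scan everything, surviving one simulated teardown after fail_after pairs."""
    results, last_key = [], None
    for count, (key, value) in enumerate(scan_from(), start=1):
        results.append((key, value))
        last_key = key  # the only state the client keeps
        if count == fail_after:
            break  # simulated tablet server death; server-side state is lost
    # Rebuild the stream and continue just past the last key returned.
    for key, value in scan_from(last_key):
        results.append((key, value))
    return results
```

Calling `scan_with_failure(2)` returns every pair exactly once, even though the stream was rebuilt partway through.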

When you add an iterator that transforms keys, you change the meaning of
that pointer so that it points into the transformed stream instead of the
underlying stream. There are two requirements in order to do this sanely:
1. Your iterator should not change the row portion of the key, although it
can change any of the remaining parts.
2. Your iterator's seek method should perform the reverse transformation on
the range when it seeks the underlying iterator(s). This will ensure that
you don't skip ranges of underlying keys when the scanner continues the
scan.
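
The two requirements above can be sketched with another toy model in Python. This is not the Accumulo iterator API: `TransformingIterator`, `transform`, `untransform`, and the `OFFSET` transformation are all hypothetical, and keys are simplified to `(row, column)` tuples with an integer column. The transformation leaves the row alone, and `seek` maps the requested start key back into the underlying key space before positioning.

```python
import bisect

OFFSET = 100  # hypothetical transformation: shift the column by a constant


def transform(key):
    row, col = key
    return (row, col + OFFSET)  # requirement 1: the row is left unchanged


def untransform(key):
    row, col = key
    return (row, col - OFFSET)  # exact inverse of transform


class TransformingIterator:
    """Toy iterator over sorted ((row, col), value) pairs."""

    def __init__(self, source):
        self.source = sorted(source)
        self.keys = [k for k, _ in self.source]
        self.pos = 0

    def seek(self, start_key, inclusive=True):
        # Requirement 2: the caller's key lives in the transformed stream,
        # so reverse the transformation before seeking the underlying data.
        underlying = untransform(start_key)
        find = bisect.bisect_left if inclusive else bisect.bisect_right
        self.pos = find(self.keys, underlying)

    def __iter__(self):
        return self

    def __next__(self):
        if self.pos >= len(self.source):
            raise StopIteration
        key, value = self.source[self.pos]
        self.pos += 1
        return transform(key), value
```

In this model, if `seek` used the transformed key directly, a resume after `("r1", 101)` would land past the underlying key `("r1", 2)` and silently skip it. That is the kind of gap the reverse transformation is meant to prevent.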

That said, I agree that the behavior of the scanner where it ignores keys
that are returned is probably not optimal, and I'm not sure why it does
that, except maybe to prevent some infinite loops.

Adam



On Tue, Jan 22, 2013 at 11:55 AM, Slater, David M.
<david.sla...@jhuapl.edu> wrote:

> In designing some of my own custom iterators, I was noticing some
> interesting behavior. Note: my iterator does not return the original key,
> but instead returns a computed value that is not necessarily in
> lexicographic order.
>
> So far as I can tell, when the Scanner switches between tablets, it checks
> the key that is returned in the new tablet and compares it (I think it
> compares key.row()) with the last key from the previous tablet. If the new
> key is greater than the previous one, then it proceeds normally. If,
> however, the new key is less than or equal to the previous key, then the
> Scanner does not return the value. It does, however, continue to iterate
> through the tablet, continuing to compare until it finds a key greater than
> the last one. Once it finds one, however, it progresses through the rest of
> that tablet without doing a check. (It implicitly assumes that everything
> in a tablet will be correctly ordered.)
>
> Now if I were to return the original key, it would work fine (since it
> would always be in order), but that also limits the functionality of my
> custom iterator.
>
> My primary question is: why would it be designed this way? When switching
> between tablets, are there potential problems that might crop up if this
> check isn’t done?
>
> Thanks,
> David
>
