He's talking about iterators that transform keys (we don't have any built in, IIRC), like those that extend the new TransformingIterator. Scanner logic is written so that a scan resumes from the last key the client received. This is important for handling failures and splits/migrations during a scan. So, in this context, a "reversible transformation" simply means that when the client tells the tserver's iterator stack where to resume the scan, the iterator can map what the client thinks is the starting point back to what it actually was prior to transformation, and so resume from the correct place. This is necessary because the client does not know what the data looked like prior to transformation; it only sees the data returned from the iterator stack.
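To make the resume behavior concrete, here is a minimal sketch in Python. It is a toy model, not Accumulo code: the names STORED, transform, invert, and scan are hypothetical, keys are reduced to (row, column) pairs, and the transformation (prefixing the column with "x_") is just an assumed example of something reversible that leaves the row alone.

```python
import bisect

# Hypothetical underlying keys, sorted as they would be stored in a tablet.
STORED = [("r1", "a"), ("r1", "b"), ("r2", "a"), ("r2", "c")]


def transform(key):
    """What the iterator stack returns to the client (row is left untouched)."""
    row, col = key
    return (row, "x_" + col)


def invert(key):
    """Map a key the client saw back to the underlying key that produced it."""
    row, col = key
    if not col.startswith("x_"):
        raise ValueError("not a transformed key: %r" % (key,))
    return (row, col[2:])


def scan(resume_after=None, limit=None):
    """Return transformed keys, resuming strictly after the last key the client saw."""
    start = 0
    if resume_after is not None:
        # The client only knows the transformed key, so invert it before seeking.
        start = bisect.bisect_right(STORED, invert(resume_after))
    chunk = STORED[start:] if limit is None else STORED[start:start + limit]
    return [transform(k) for k in chunk]


# Simulate a failure after two keys, then resume from the last key received.
first = scan(limit=2)
rest = scan(resume_after=first[-1])
results = first + rest
```

Because the inverse recovers the underlying key exactly, the resumed scan picks up where the first one stopped, with nothing skipped or repeated.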
Now, the assumption here is that the key the client *thinks* is the starting point is in the same tablet as the real starting point. Otherwise, it doesn't matter whether the transformation is reversible, because the real starting point could be on a different tablet entirely (due to splits). To ensure this doesn't happen, make sure that the transforming iterators you implement do not transform the RowID portion of the key. If they do, they can send back a special key, understood by client code, that tells the client to query a different tablet server: the one the client needs to resume scanning from.

Yes, there should be unit tests, but the unit tests would be against iterators that actually transform keys in this way, and I don't think we provide any. That'd be user code.

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii

On Sat, May 25, 2013 at 9:36 AM, David Medinets <[email protected]> wrote:
> Is there a unit test exposing this behavior? And what does "reversible
> transformation" mean?
>
>
> On Wed, May 1, 2013 at 8:36 PM, Adam Fuchs <[email protected]> wrote:
>>
>> For all the rest of you on this thread, the big problem you'll run into
>> when returning keys out of range is that the reseeking behavior will skip a
>> bunch of underlying keys (i.e. don't try this at home). For example, say you
>> have tablets ["A","D"], ("D","M"], and ("M","ZZZZ..."]. If you do a query on
>> ["A","M"] and return "N" after seeing the underlying key "A", you may never
>> see keys from the ("D","M"] tablet. A good rule of thumb is to return keys
>> in the same row as the underlying keys that were used to generate them and
>> use a reversible transformation of columns within each row.
>>
>
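Adam's tablet example quoted above can be sketched as a toy simulation in Python. This is not Accumulo code, and several things are assumptions for illustration: the row set ROWS, the helper names tablet_for and scan, the "ZZZZ" end row standing in for "ZZZZ...", and the worst case where the scan re-seeks after every returned key (a real scan only re-seeks on failures, splits/migrations, or batch boundaries).

```python
import bisect

# Inclusive end rows of the tablets ["A","D"], ("D","M"], ("M","ZZZZ..."].
TABLET_ENDS = ["D", "M", "ZZZZ"]
# Hypothetical underlying rows spread across the three tablets.
ROWS = ["A", "B", "E", "K", "P"]


def tablet_for(row):
    """Index of the tablet whose (exclusive start, inclusive end] range holds row."""
    return bisect.bisect_left(TABLET_ENDS, row)


def scan(start, end, transform):
    """Scan [start, end], re-seeking after every key the client receives."""
    returned, tablets_visited = [], set()
    resume, inclusive = start, True
    while True:
        pending = [r for r in ROWS
                   if (r >= resume if inclusive else r > resume) and r <= end]
        if not pending:
            return returned, sorted(tablets_visited)
        underlying = pending[0]
        tablets_visited.add(tablet_for(underlying))
        key = transform(underlying)
        returned.append(key)
        # The client resumes strictly after the key it saw, not the underlying key.
        resume, inclusive = key, False


# Keys returned unchanged: every row in ["A","M"] is seen, from both tablets.
good = scan("A", "M", lambda r: r)
# Returning "N" for underlying row "A" moves the resume point past the range.
bad = scan("A", "M", lambda r: "N" if r == "A" else r)
```

In this model the well-behaved scan returns "A", "B", "E", "K" and visits the first two tablets, while the scan that returns "N" stops after one key and never reads the ("D","M"] tablet, which is the skipping behavior the rule of thumb is meant to avoid.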
