A cursor is an opaque, deletion-tolerant index into a B-tree keyed by source
userid and modification time. It brings you to a point in time in the
reverse-chronologically sorted list. Since you can't change the past, other
than by erasing it, a cursor is effectively stable. (Modifications bubble to
the top.) But you have to deal with additions at the list head and also with
block shrinkage due to deletions, so your blocks begin to overlap quite a bit as the data
ages. (If you cache cursors and read much later, you'll see the first few
rows of cursor[n+1]'s block as duplicates of the last rows of cursor[n]'s
block. The intersection cardinality is equal to the number of deletions in
cursor[n]'s block.) Still, there may be value in caching these cursors and
then heuristically rebalancing them when the overlap proportion crosses some
threshold.
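
To make the overlap concrete, here is a rough toy model in Python. It is only a
sketch, not how the real thing is built: rows are bare integer modification
times, a cursor is just "the time of the last row in the previous block", and
the names HEAD, read_block, walk and overlap are all made up for illustration.

```python
HEAD = float("inf")  # hypothetical cursor value meaning "start at the newest row"


def read_block(rows, cursor, count):
    """Return up to `count` modification times strictly older than `cursor`,
    newest first. Comparing by time is what makes the cursor tolerate
    deletion of the row it was originally derived from."""
    older = sorted((t for t in rows if t < cursor), reverse=True)
    return older[:count]


def walk(rows, count):
    """Page through `rows` once and return the cursor that opens each block."""
    cursors = [HEAD]
    while True:
        block = read_block(rows, cursors[-1], count)
        if not block:
            return cursors[:-1]  # drop the cursor that opened an empty block
        cursors.append(block[-1])


def overlap(rows, cursors, n, count):
    """Count rows that appear in both cursor[n]'s and cursor[n+1]'s block."""
    a = read_block(rows, cursors[n], count)
    b = read_block(rows, cursors[n + 1], count)
    return len(set(a) & set(b))


# Toy data: 20 rows with modification times 81..100, blocks of 5.
rows = set(range(81, 101))
cached = walk(rows, 5)                 # cursors cached on the first pass
before = overlap(rows, cached, 0, 5)   # no deletions yet, so no overlap

rows -= {99, 97}                       # two rows in cursor[0]'s block are erased
after = overlap(rows, cached, 0, 5)    # block 0 now reaches into block 1

# Assumed heuristic: rebalance once a quarter of a block is duplicated.
needs_rebalance = after / 5 >= 0.25
```

In this model, rereading cursor[0] after the two deletions stretches its block
two rows further into the past, so those two rows show up again at the start of
cursor[1]'s block. Rebalancing would just mean walking again from the head and
replacing the cached cursors.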


-John Kalucki
http://twitter.com/jkalucki
Infrastructure, Twitter Inc.


On Sat, Jan 16, 2010 at 10:40 PM, Marc Mims <marc.m...@gmail.com> wrote:

> * John Kalucki <j...@twitter.com> [091209 09:28]:
> > A cursor should be valid forever, but as it ages and rows are removed, you
> > might see some minor data loss and probably more duplicates.
>
> Out of curiosity, what is a cursor?  From our (the users') perspective,
> it's just an opaque number.  But I'm curious.  How is it generated?
> What does it represent internally?
>
>        -Marc
>
