[twitter-dev] Re: Twitter, Please Explain How Cursors Work

John Kalucki Tue, 06 Oct 2009 16:12:39 -0700

I described, in some detail, the reasons for cursors here:
http://groups.google.com/group/twitter-development-talk/msg/badfb7b6074aab10


If the details are uninteresting, the high-level summary is this: The
paged API was designed in a previous era. Paging is simply too
expensive and totally impractical to provide with the current
following counts. Also the QoS had deteriorated to the point where
some doubted that anyone was seriously using the methods. Paging is
going away and paging is not coming back.

The cursored approach allows us to continue to provide access to the
social graph via the REST API. As a benefit, QoS has been dramatically
improved and data quality is now pretty close to perfect.

If the implementation details and invariants described are confusing,
then stick to the well-worn part of the path: Request the first block
with a cursor of -1. Keep requesting forward, passing along the cursor
you were just given, until you get a cursor of 0.
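
A minimal sketch of that loop in Python, for illustration only. It
assumes a hypothetical fetch_page(cursor) callable that wraps whatever
HTTP client you use and returns a pair (ids, next_cursor); neither that
name nor make_fake_fetch below is part of the Twitter API. The fake
fetcher only stands in for the service so the loop can be run locally.

```python
from typing import Callable, Iterator, List, Tuple

# Hypothetical shape of one API call: cursor in, (ids, next_cursor) out.
FetchPage = Callable[[int], Tuple[List[int], int]]


def iter_follower_ids(fetch_page: FetchPage, max_calls: int = 10000) -> Iterator[int]:
    """Walk forward from cursor -1 until the service hands back cursor 0."""
    cursor = -1  # -1 asks for the first (most recent) block
    for _ in range(max_calls):
        ids, next_cursor = fetch_page(cursor)
        yield from ids
        if next_cursor == 0:  # 0 means there are no further blocks
            return
        cursor = next_cursor
    # Safety net for this sketch only, so a misbehaving fetcher cannot loop forever.
    raise RuntimeError("gave up: cursor never reached 0")


def make_fake_fetch(blocks: List[List[int]]) -> FetchPage:
    """Stand-in for the real service: cursor -1 maps to block 0, cursor k to block k."""
    def fetch(cursor: int) -> Tuple[List[int], int]:
        index = 0 if cursor == -1 else cursor
        next_cursor = index + 1 if index + 1 < len(blocks) else 0
        return blocks[index], next_cursor
    return fetch


if __name__ == "__main__":
    fake = make_fake_fetch([[11, 12], [13], [14, 15]])
    print(list(iter_follower_ids(fake)))
```

A block may come back with fewer ids than expected, so the loop keys
only on the returned cursor and never on the block size.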

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.

On Oct 6, 11:06 am, Jesse Stay <jesses...@gmail.com> wrote:
> I said the same thing in the last thread about this - still no clue what
> Twitter is doing with cursors and how it is any different than the previous
> paging methods.
> Jesse
>
> On Tue, Oct 6, 2009 at 10:22 AM, Dewald Pretorius <dpr...@gmail.com> wrote:
>
> > Thanks John. However, I will be the first to put up my hand and say
> > that I have no clue what you said.
>
> > Can someone please translate John's answer into easy to understand
> > language, with specific relation to the questions I asked?
>
> > Dewald
>
> > On Oct 5, 1:17 am, John Kalucki <jkalu...@gmail.com> wrote:
> > > I haven't looked at all the parts of the system, so there's some
> > > chance that I'm missing something.
>
> > > The method returns the followers in the reverse chronological order of
> > > edge creation. Cursor A will have the most recent 5,000 edges, by
> > > creation time, B the next most recent 5,000, etc. The last cursor will
> > > have the oldest edges.
>
> > > Each cursor points to some arbitrary edge. If you go back and retrieve
> > > cursor B, you should receive N edges created just before the edge-
> > > pointed-to-by-B was created. I don't recall if N is always 5000,
> > > generally 5000 or if it's at most 5000. This detail shouldn't matter,
> > > other than, on occasion, you'll make an extra API call.
>
> > > > In any case, retrieving cursor B will never return edges created after
> > > > the edge-pointed-to-by-B was created. All edges returned by cursor B
> > > > will be no-newer-than, and generally older than, the
> > > > edge-pointed-to-by-B.
>
> > > So, all future sets returned by cursor B are always disjoint from the
> > > set originally returned by cursor A. In your example, if you refetched
> > > both A and B, the result sets wouldn't be disjoint as there are no
> > > longer 5,000 edges between cursor A and cursor B.
>
> > > > I think this, in part, answers your question?
>
> > > > -John Kalucki
> > > > http://twitter.com/jkalucki
> > > Services, Twitter Inc.
>
> > > On Oct 4, 6:10 pm, Dewald Pretorius <dpr...@gmail.com> wrote:
>
> > > > For discussion purposes, let's assume I am cursoring through a very
> > > > volatile followers list of @veryvolatile. We have the following
> > > > cursors:
>
> > > > A = 5,000
> > > > B = 5,000
> > > > C = 5,000
>
> > > > I retrieve Cursor A and process it. Next I retrieve Cursor B and
> > > > process it. Then I retrieve Cursor C and process it.
>
> > > > While I am processing Cursor C, 200 of the people who were in Cursor A
> > > > unfollow @veryvolatile, and 400 of the people who were in Cursor B
> > > > unfollow @veryvolatile.
>
> > > > What do I get when I go back from C to B? Do I now get 4,600 ids in
> > > > the list?
>
> > > > Or, do I get 5,000 in B, which now includes a subset of 400 ids that
> > > > were previously in Cursor A?
>
> > > > Dewald
