[twitter-dev] Re: Twitter, Please Explain How Cursors Work

2009-10-07 Thread Jeffrey Greenberg
John, please clarify this scenario. If one makes a complete set of calls
starting from cursor -1 through to the end at one moment, and then another set of
the same calls later, is there any invariance? If so, what?

From the statements above I understand:
- 5,000 followers are always returned per call (if the user has more than
5,000; the last call will return fewer)
- the order is the same: it's the time order that users followed this
account

And thus:
- there is no correlation in the API between a particular cursor and a set
of returned values (followers)

Is that it?


On Tue, Oct 6, 2009 at 4:12 PM, John Kalucki jkalu...@gmail.com wrote:


 I described, in some detail, the reasons for cursors here:

 http://groups.google.com/group/twitter-development-talk/msg/badfb7b6074aab10

 If the details are uninteresting, the high-level summary is this: The
 paged API was designed in a previous era. Paging is simply too
 expensive and totally impractical to provide with the current
 following counts. Also the QoS had deteriorated to the point where
 some doubted that anyone was seriously using the methods. Paging is
 going away and paging is not coming back.

 The cursored approach allows us to continue to provide access to the
 social graph via the REST API. As a benefit, QoS has been dramatically
 improved and data quality is now pretty close to perfect.

 If the implementation details and invariants described are confusing,
 then stick to the well worn part of the path: Request the first block
 with a cursor of -1. Keep requesting forward until you get a cursor of
 0.

 -John Kalucki
 http://twitter.com/jkalucki
 Services, Twitter Inc.




[twitter-dev] Re: Twitter, Please Explain How Cursors Work

2009-10-07 Thread John Kalucki

First you have to assume no changes to the set. Users with any
significant following will see constant churn. Factoring out natural
churn then:

Ideally, the results are the same. Practically, the results are the
same. In a very few corner cases they are not. For the next several
weeks, for edges that were created more than ~2 weeks ago, there will be,
very rarely, issues with cursor jitter. In theory and in practice
there will be some over-delivery: the last userid or so in a block
may be duplicated in the first rows of a subsequent block. In theory
there might be similar under-delivery, but we haven't found an actual
case of under-delivery yet. You may need to deduplicate your results
if your app is very sensitive to duplication. In any case, new edges
no longer suffer from this jitter, and we're going to repair the whole
graph in a few weeks. I think this will require several megawatt-hours
of computation.
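
The deduplication mentioned above could look roughly like the following. This is only a sketch: the function name and the sample blocks are made up for illustration, and it assumes the ids from all blocks have already been collected in the order they were returned.

```python
from typing import Iterable, List


def dedupe_preserving_order(ids: Iterable[int]) -> List[int]:
    """Drop repeated user ids while keeping the first occurrence of each."""
    seen = set()
    unique = []
    for user_id in ids:
        if user_id not in seen:
            seen.add(user_id)
            unique.append(user_id)
    return unique


# Hypothetical blocks: the last id of the first block (7) is repeated
# at the start of the second block, as in the over-delivery case.
blocks = [[9, 8, 7], [7, 6, 5], [4]]
merged = dedupe_preserving_order(uid for block in blocks for uid in block)
print(merged)
```

Keeping the first occurrence preserves the order in which the blocks were delivered.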

Your first two statements are correct. I don't understand your third
statement. But I think it is a false assertion. Could you briefly
restate?

An aside: There may be some signal in the cursors, especially in the
most significant bytes. They're references into the edge-creation-time
index, after all. I don't know how much obfuscation there is,
especially in the LSBs, but the cursors ideally should be treated as
opaque tokens. While unlikely, we may change their format at some time
in the future, and then various acts of derring-do could break.

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.

On Oct 7, 6:57 am, Jeffrey Greenberg jeffreygreenb...@gmail.com
wrote:
 John,Please clarify this scenario. If one makes a complete set of calls
 starting from cursor -1 unto the end at one moment, and then another set of
 the same calls later is there any invariance?  If so what?

 From the statements above I understand:
 - always 5000 followers are returned (if the user has more than 5000, and
 the last call will have less)
 - the order is the same: it's the time order that users followed this
 account

 And thus:
 - there is no correlation in the API between a particular cursor and a set
 of returned values (followers)

 Is that it?

 On Tue, Oct 6, 2009 at 4:12 PM, John Kalucki jkalu...@gmail.com wrote:

  I described, in some detail, the reasons for cursors here:

 http://groups.google.com/group/twitter-development-talk/msg/badfb7b60...

  If the details are uninteresting, the high-level summary is this: The
  paged API was designed in a previous era. Paging is simply too
  expensive and totally impractical to provide with the current
  following counts. Also the QoS had deteriorated to the point where
  some doubted that anyone was seriously using the methods. Paging is
  going away and paging is not coming back.

  The cursored approach allows us to continue to provide access to the
  social graph via the REST API. As a benefit, QoS has been dramatically
  improved and data quality is now pretty close to perfect.

  If the implementation details and invariants described are confusing,
  then stick to the well worn part of the path: Request the first block
  with a cursor of -1. Keep requesting forward until you get a cursor of
  0.

  -John Kalucki
 http://twitter.com/jkalucki
  Services, Twitter Inc.


[twitter-dev] Re: Twitter, Please Explain How Cursors Work

2009-10-06 Thread Dewald Pretorius

Thanks John. However, I will be the first to put up my hand and say
that I have no clue what you said.

Can someone please translate John's answer into easy to understand
language, with specific relation to the questions I asked?

Dewald

On Oct 5, 1:17 am, John Kalucki jkalu...@gmail.com wrote:
 I haven't looked at all the parts of the system, so there's some
 chance that I'm missing something.

 The method returns the followers in the reverse chronological order of
 edge creation. Cursor A will have the most recent 5,000 edges, by
 creation time, B the next most recent 5,000, etc. The last cursor will
 have the oldest edges.

 Each cursor points to some arbitrary edge. If you go back and retrieve
 cursor B, you should receive N edges created just before the edge-
 pointed-to-by-B was created. I don't recall if N is always 5000,
 generally 5000 or if it's at most 5000. This detail shouldn't matter,
 other than, on occasion, you'll make an extra API call.

 In any case, retrieving cursor B will never return edges created after
 the edge-pointed-to-by-B was created. All edges returned by cursor B
 will be no-newer-than, and generally older than, than the edge-pointed-
 to-by-B.

 So, all future sets returned by cursor B are always disjoint from the
 set originally returned by cursor A. In your example, if you refetched
 both A and B, the result sets wouldn't be disjoint as there are no
 longer 5,000 edges between cursor A and cursor B.

 I think this, in part answers your question. ?

 -John Kalucki
 http://twitter.com/jkalucki
 Services, Twitter Inc.

 On Oct 4, 6:10 pm, Dewald Pretorius dpr...@gmail.com wrote:

  For discussion purposes, let's assume I am cursoring through a very
  volatile followers list of @veryvolatile. We have the following
  cursors:

  A = 5,000
  B = 5,000
  C = 5,000

  I retrieve Cursor A and process it. Next I retrieve Cursor B and
  process it. Then I retrieve Cursor C and process it.

  While I am processing Cursor C, 200 of the people who were in Cursor A
  unfollow @veryvolatile, and 400 of the people who were in Cursor B
  unfollow @veryvolatile.

  What do I get when I go back from C to B? Do I now get 4,600 ids in
  the list?

  Or, do I get 5,000 in B, which now includes a subset of 400 ids that
  were previously in Cursor A?

  Dewald


[twitter-dev] Re: Twitter, Please Explain How Cursors Work

2009-10-06 Thread Jesse Stay
I said the same thing in the last thread about this - still no clue what
Twitter is doing with cursors and how it is any different than the previous
paging methods.
Jesse

On Tue, Oct 6, 2009 at 10:22 AM, Dewald Pretorius dpr...@gmail.com wrote:


 Thanks John. However, I will be the first to put up my hand and say
 that I have no clue what you said.

 Can someone please translate John's answer into easy to understand
 language, with specific relation to the questions I asked?

 Dewald

 On Oct 5, 1:17 am, John Kalucki jkalu...@gmail.com wrote:
  I haven't looked at all the parts of the system, so there's some
  chance that I'm missing something.
 
  The method returns the followers in the reverse chronological order of
  edge creation. Cursor A will have the most recent 5,000 edges, by
  creation time, B the next most recent 5,000, etc. The last cursor will
  have the oldest edges.
 
  Each cursor points to some arbitrary edge. If you go back and retrieve
  cursor B, you should receive N edges created just before the edge-
  pointed-to-by-B was created. I don't recall if N is always 5000,
  generally 5000 or if it's at most 5000. This detail shouldn't matter,
  other than, on occasion, you'll make an extra API call.
 
  In any case, retrieving cursor B will never return edges created after
  the edge-pointed-to-by-B was created. All edges returned by cursor B
  will be no-newer-than, and generally older than, than the edge-pointed-
  to-by-B.
 
  So, all future sets returned by cursor B are always disjoint from the
  set originally returned by cursor A. In your example, if you refetched
  both A and B, the result sets wouldn't be disjoint as there are no
  longer 5,000 edges between cursor A and cursor B.
 
  I think this, in part answers your question. ?
 
  -John Kalucki
  http://twitter.com/jkalucki
  Services, Twitter Inc.
 
  On Oct 4, 6:10 pm, Dewald Pretorius dpr...@gmail.com wrote:
 
   For discussion purposes, let's assume I am cursoring through a very
   volatile followers list of @veryvolatile. We have the following
   cursors:
 
   A = 5,000
   B = 5,000
   C = 5,000
 
   I retrieve Cursor A and process it. Next I retrieve Cursor B and
   process it. Then I retrieve Cursor C and process it.
 
   While I am processing Cursor C, 200 of the people who were in Cursor A
   unfollow @veryvolatile, and 400 of the people who were in Cursor B
   unfollow @veryvolatile.
 
   What do I get when I go back from C to B? Do I now get 4,600 ids in
   the list?
 
   Or, do I get 5,000 in B, which now includes a subset of 400 ids that
   were previously in Cursor A?
 
   Dewald



[twitter-dev] Re: Twitter, Please Explain How Cursors Work

2009-10-06 Thread Brian Smith

John,

Based on your description, it looks like you are on the verge of being able
to offer a very useful capability: the ability to query the follows AND
unfollows since the last time you checked. That would be a great addition to
the API.

For example, I'd really like to be able to page through A, B, C, etc. And
then, after that, say OK, what's changed since then?

Regards,
Brian


 -Original Message-
 From: twitter-development-talk@googlegroups.com [mailto:twitter-
 development-t...@googlegroups.com] On Behalf Of John Kalucki
 Sent: Sunday, October 04, 2009 11:17 PM
 To: Twitter Development Talk
 Subject: [twitter-dev] Re: Twitter, Please Explain How Cursors Work
 
 
 I haven't looked at all the parts of the system, so there's some
 chance that I'm missing something.
 
 The method returns the followers in the reverse chronological order of
 edge creation. Cursor A will have the most recent 5,000 edges, by
 creation time, B the next most recent 5,000, etc. The last cursor will
 have the oldest edges.
 
 Each cursor points to some arbitrary edge. If you go back and retrieve
 cursor B, you should receive N edges created just before the edge-
 pointed-to-by-B was created. I don't recall if N is always 5000,
 generally 5000 or if it's at most 5000. This detail shouldn't matter,
 other than, on occasion, you'll make an extra API call.
 
 In any case, retrieving cursor B will never return edges created after
 the edge-pointed-to-by-B was created. All edges returned by cursor B
 will be no-newer-than, and generally older than, than the edge-pointed-
 to-by-B.
 
 So, all future sets returned by cursor B are always disjoint from the
 set originally returned by cursor A. In your example, if you refetched
 both A and B, the result sets wouldn't be disjoint as there are no
 longer 5,000 edges between cursor A and cursor B.
 
 I think this, in part answers your question. ?
 
 -John Kalucki
 http://twitter.com/jkalucki
 Services, Twitter Inc.
 
 On Oct 4, 6:10 pm, Dewald Pretorius dpr...@gmail.com wrote:
  For discussion purposes, let's assume I am cursoring through a very
  volatile followers list of @veryvolatile. We have the following
  cursors:
 
  A = 5,000
  B = 5,000
  C = 5,000
 
  I retrieve Cursor A and process it. Next I retrieve Cursor B and
  process it. Then I retrieve Cursor C and process it.
 
  While I am processing Cursor C, 200 of the people who were in Cursor
 A
  unfollow @veryvolatile, and 400 of the people who were in Cursor B
  unfollow @veryvolatile.
 
  What do I get when I go back from C to B? Do I now get 4,600 ids in
  the list?
 
  Or, do I get 5,000 in B, which now includes a subset of 400 ids that
  were previously in Cursor A?
 
  Dewald



[twitter-dev] Re: Twitter, Please Explain How Cursors Work

2009-10-06 Thread jmathai

On Oct 6, 11:06 am, Jesse Stay jesses...@gmail.com wrote:
 I said the same thing in the last thread about this - still no clue what
 Twitter is doing with cursors and how it is any different than the previous
 paging methods.
 Jesse

Is the main advantage that the new method takes a snapshot of the
followers list and lets you page through them?

I'd be willing to sacrifice some accuracy for speed since I'm not
doing anything like auto-unfollow.  From a sample set of 150k calls to
the api the average latency I have (from the west coast) is .85
seconds.  Grabbing a follower list serially, 100 at a time is
painful.  I much preferred what I was doing before (total # / 100 -
fire off that many calls in parallel).  If I dropped a few followers
in the process, that was ok because it's so much faster and I don't
need my copy of the social graph to be 100% accurate.


[twitter-dev] Re: Twitter, Please Explain How Cursors Work

2009-10-06 Thread Tim Haines
On Wed, Oct 7, 2009 at 7:58 AM, jmathai jmat...@gmail.com wrote:


 I'd be willing to sacrifice some accuracy for speed since I'm not
 doing anything like auto-unfollow.  From a sample set of 150k calls to
 the api the average latency I have (from the west coast) is .85
 seconds.  Grabbing a follower list serially, 100 at a time is
 painful.  I much preferred what I was doing before (total # / 100 -
 fire off that many calls in parallel).  If I dropped a few followers
 in the process, that was ok because it's so much faster and I don't
 need my copy of the social graph to be 100% accurate.



I'm in the same boat - and filed this recently:
http://code.google.com/p/twitter-api/issues/detail?id=1078&colspec=ID%20Stars%20Type%20Status%20Priority%20Owner%20Summary%20Opened%20Modified%20Component


[twitter-dev] Re: Twitter, Please Explain How Cursors Work

2009-10-06 Thread John Kalucki

There is no snapshotting. 5,000 edges are returned on each call. Few
users have more than 5,000 followers or more than 5,000 followings.

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.

On Oct 6, 11:58 am, jmathai jmat...@gmail.com wrote:
 On Oct 6, 11:06 am, Jesse Stay jesses...@gmail.com wrote:

  I said the same thing in the last thread about this - still no clue what
  Twitter is doing with cursors and how it is any different than the previous
  paging methods.
  Jesse

 Is the main advantage that the new method takes a snapshot of the
 followers list and let's you page through them?

 I'd be willing to sacrifice some accuracy for speed since I'm not
 doing anything like auto-unfollow.  From a sample set of 150k calls to
 the api the average latency I have (from the west coast) is .85
 seconds.  Grabbing a follower list serially, 100 at a time is
 painful.  I much preferred what I was doing before (total # / 100 -
 fire off that many calls in parallel).  If I dropped a few followers
 in the process, that was ok because it's so much faster and I don't
 need my copy of the social graph to be 100% accurate.


[twitter-dev] Re: Twitter, Please Explain How Cursors Work

2009-10-06 Thread John Kalucki

No. If we are to offer real-time social graph changes, they'll be via
the Streaming API. In the meantime, there is no low-latency,
high-throughput way to determine changes to the social graph. Attempts to
simulate this at large scale via repeated polling are likely to be
frustrating.
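
For reference, the polling approach being discouraged amounts to comparing two complete snapshots of the follower ids. A minimal sketch, with a hypothetical function name and made-up ids; note that each snapshot requires walking every cursor block, which is where the cost comes from at scale:

```python
from typing import Iterable, Set, Tuple


def diff_snapshots(
    previous: Iterable[int], current: Iterable[int]
) -> Tuple[Set[int], Set[int]]:
    """Return (new_followers, unfollowers) between two full id snapshots."""
    prev_set, curr_set = set(previous), set(current)
    return curr_set - prev_set, prev_set - curr_set


# Made-up snapshots taken at two different times.
new_followers, unfollowers = diff_snapshots([1, 2, 3, 4], [2, 3, 4, 5, 6])
print(sorted(new_followers), sorted(unfollowers))
```

Anyone who followed and unfollowed between the two polls is invisible to this comparison.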

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.

On Oct 6, 11:12 am, Brian Smith br...@briansmith.org wrote:
 John,

 Based on your description, it looks like you are on the verge of being able
 to offer a very useful capability: the ability to query the follows AND
 unfollows since the last time you checked. That would be a great addition to
 the API.

 For example, I'd really like to be able to page through A, B, C, etc. And
 then, after that, say OK, what's changed since then?

 Regards,
 Brian

  -Original Message-
  From: twitter-development-talk@googlegroups.com [mailto:twitter-
  development-t...@googlegroups.com] On Behalf Of John Kalucki
  Sent: Sunday, October 04, 2009 11:17 PM
  To: Twitter Development Talk
  Subject: [twitter-dev] Re: Twitter, Please Explain How Cursors Work

  I haven't looked at all the parts of the system, so there's some
  chance that I'm missing something.

  The method returns the followers in the reverse chronological order of
  edge creation. Cursor A will have the most recent 5,000 edges, by
  creation time, B the next most recent 5,000, etc. The last cursor will
  have the oldest edges.

  Each cursor points to some arbitrary edge. If you go back and retrieve
  cursor B, you should receive N edges created just before the edge-
  pointed-to-by-B was created. I don't recall if N is always 5000,
  generally 5000 or if it's at most 5000. This detail shouldn't matter,
  other than, on occasion, you'll make an extra API call.

  In any case, retrieving cursor B will never return edges created after
  the edge-pointed-to-by-B was created. All edges returned by cursor B
  will be no-newer-than, and generally older than, than the edge-pointed-
  to-by-B.

  So, all future sets returned by cursor B are always disjoint from the
  set originally returned by cursor A. In your example, if you refetched
  both A and B, the result sets wouldn't be disjoint as there are no
  longer 5,000 edges between cursor A and cursor B.

  I think this, in part answers your question. ?

  -John Kalucki
 http://twitter.com/jkalucki
  Services, Twitter Inc.

  On Oct 4, 6:10 pm, Dewald Pretorius dpr...@gmail.com wrote:
   For discussion purposes, let's assume I am cursoring through a very
   volatile followers list of @veryvolatile. We have the following
   cursors:

   A = 5,000
   B = 5,000
   C = 5,000

   I retrieve Cursor A and process it. Next I retrieve Cursor B and
   process it. Then I retrieve Cursor C and process it.

   While I am processing Cursor C, 200 of the people who were in Cursor
  A
   unfollow @veryvolatile, and 400 of the people who were in Cursor B
   unfollow @veryvolatile.

   What do I get when I go back from C to B? Do I now get 4,600 ids in
   the list?

   Or, do I get 5,000 in B, which now includes a subset of 400 ids that
   were previously in Cursor A?

   Dewald


[twitter-dev] Re: Twitter, Please Explain How Cursors Work

2009-10-06 Thread John Kalucki

I described, in some detail, the reasons for cursors here:
http://groups.google.com/group/twitter-development-talk/msg/badfb7b6074aab10

If the details are uninteresting, the high-level summary is this: The
paged API was designed in a previous era. Paging is simply too
expensive and totally impractical to provide with the current
following counts. Also the QoS had deteriorated to the point where
some doubted that anyone was seriously using the methods. Paging is
going away and paging is not coming back.

The cursored approach allows us to continue to provide access to the
social graph via the REST API. As a benefit, QoS has been dramatically
improved and data quality is now pretty close to perfect.

If the implementation details and invariants described are confusing,
then stick to the well worn part of the path: Request the first block
with a cursor of -1. Keep requesting forward until you get a cursor of
0.
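
A minimal sketch of that loop, under stated assumptions: fetch_page is a hypothetical callable standing in for the HTTP request, and the response field names "ids" and "next_cursor" are assumptions, not confirmed anywhere in this thread. The fake API below exists only so the sketch runs; its cursors are list positions, whereas real cursors should be treated as opaque.

```python
from typing import Callable, Dict, Iterator, List


def iter_follower_ids(fetch_page: Callable[[int], Dict]) -> Iterator[int]:
    """Start at cursor -1 and keep requesting forward until the cursor is 0."""
    cursor = -1
    while cursor != 0:
        page = fetch_page(cursor)      # one API call per block
        yield from page["ids"]         # assumed field name
        cursor = page["next_cursor"]   # assumed field name


def make_fake_api(ids: List[int], block: int = 3) -> Callable[[int], Dict]:
    """Stand-in for the real API, for demonstration only."""
    def fetch_page(cursor: int) -> Dict:
        start = 0 if cursor == -1 else cursor
        end = start + block
        next_cursor = end if end < len(ids) else 0
        return {"ids": ids[start:end], "next_cursor": next_cursor}
    return fetch_page


fake = make_fake_api([107, 106, 105, 104, 103, 102, 101])
all_ids = list(iter_follower_ids(fake))
print(all_ids)
```

The loop never inspects the cursor value beyond comparing it with 0, so it keeps working if the token format changes.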

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.

On Oct 6, 11:06 am, Jesse Stay jesses...@gmail.com wrote:
 I said the same thing in the last thread about this - still no clue what
 Twitter is doing with cursors and how it is any different than the previous
 paging methods.
 Jesse

 On Tue, Oct 6, 2009 at 10:22 AM, Dewald Pretorius dpr...@gmail.com wrote:

  Thanks John. However, I will be the first to put up my hand and say
  that I have no clue what you said.

  Can someone please translate John's answer into easy to understand
  language, with specific relation to the questions I asked?

  Dewald

  On Oct 5, 1:17 am, John Kalucki jkalu...@gmail.com wrote:
   I haven't looked at all the parts of the system, so there's some
   chance that I'm missing something.

   The method returns the followers in the reverse chronological order of
   edge creation. Cursor A will have the most recent 5,000 edges, by
   creation time, B the next most recent 5,000, etc. The last cursor will
   have the oldest edges.

   Each cursor points to some arbitrary edge. If you go back and retrieve
   cursor B, you should receive N edges created just before the edge-
   pointed-to-by-B was created. I don't recall if N is always 5000,
   generally 5000 or if it's at most 5000. This detail shouldn't matter,
   other than, on occasion, you'll make an extra API call.

   In any case, retrieving cursor B will never return edges created after
   the edge-pointed-to-by-B was created. All edges returned by cursor B
   will be no-newer-than, and generally older than, than the edge-pointed-
   to-by-B.

   So, all future sets returned by cursor B are always disjoint from the
   set originally returned by cursor A. In your example, if you refetched
   both A and B, the result sets wouldn't be disjoint as there are no
   longer 5,000 edges between cursor A and cursor B.

   I think this, in part answers your question. ?

   -John Kalucki
   http://twitter.com/jkalucki
   Services, Twitter Inc.

   On Oct 4, 6:10 pm, Dewald Pretorius dpr...@gmail.com wrote:

For discussion purposes, let's assume I am cursoring through a very
volatile followers list of @veryvolatile. We have the following
cursors:

A = 5,000
B = 5,000
C = 5,000

I retrieve Cursor A and process it. Next I retrieve Cursor B and
process it. Then I retrieve Cursor C and process it.

While I am processing Cursor C, 200 of the people who were in Cursor A
unfollow @veryvolatile, and 400 of the people who were in Cursor B
unfollow @veryvolatile.

What do I get when I go back from C to B? Do I now get 4,600 ids in
the list?

Or, do I get 5,000 in B, which now includes a subset of 400 ids that
were previously in Cursor A?

Dewald


[twitter-dev] Re: Twitter, Please Explain How Cursors Work

2009-10-06 Thread Brian Smith

John Kalucki wrote:
 No. If we are to offer real-time social graph changes, they'll be via
 the Streaming API. In the mean time, there is no low-latency high-
 throughput way to determine changes to the social graph. Attempts to
 simulate this at large scale via repeated polling are likely to be
 frustrating.

Never mind. I was requesting this because previously statuses/followers was
documented to return followers in the order they joined Twitter. However,
on Sept. 25th, Alex updated the documentation to say it returns followers
in the order they followed the user which is what I wanted.

http://apiwiki.twitter.com/sdiff.php?first=Twitter%2BREST%2BAPI%2BMethod%253A%2Bstatuses%25C2%25A0followers&second=Twitter%2BREST%2BAPI%2BMethod%253A%2Bstatuses%25C2%25A0followers.2009-09-25-16-55-57

I did not notice this change because it did not show up in the changelog.

Regards,
Brian



[twitter-dev] Re: Twitter, Please Explain How Cursors Work

2009-10-04 Thread John Kalucki

I haven't looked at all the parts of the system, so there's some
chance that I'm missing something.

The method returns the followers in the reverse chronological order of
edge creation. Cursor A will have the most recent 5,000 edges, by
creation time, B the next most recent 5,000, etc. The last cursor will
have the oldest edges.

Each cursor points to some arbitrary edge. If you go back and retrieve
cursor B, you should receive N edges created just before the edge-
pointed-to-by-B was created. I don't recall if N is always 5000,
generally 5000 or if it's at most 5000. This detail shouldn't matter,
other than, on occasion, you'll make an extra API call.

In any case, retrieving cursor B will never return edges created after
the edge-pointed-to-by-B was created. All edges returned by cursor B
will be no newer than, and generally older than, the
edge-pointed-to-by-B.

So, all future sets returned by cursor B are always disjoint from the
set originally returned by cursor A. In your example, if you refetched
both A and B, the result sets wouldn't be disjoint as there are no
longer 5,000 edges between cursor A and cursor B.
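
The behaviour described above can be illustrated with a toy model. This is one reading of the description, not the actual implementation: it assumes a cursor is equivalent to a creation time, that a block holds the N newest surviving edges no newer than that time, and it scales N down from 5,000 to 5. All names are hypothetical.

```python
from typing import List, Tuple

Edge = Tuple[int, int]  # (created_at, follower_id)


def fetch_block(edges: List[Edge], cursor_time: int, n: int) -> List[int]:
    """Return up to n follower ids whose edges are no newer than cursor_time,
    newest first."""
    eligible = sorted((e for e in edges if e[0] <= cursor_time), reverse=True)
    return [follower_id for _, follower_id in eligible[:n]]


N = 5
# 15 edges; created_at equals the follower id to keep the output readable.
edges = [(t, t) for t in range(1, 16)]

block_a = fetch_block(edges, 15, N)  # newest block
block_b = fetch_block(edges, 10, N)  # next block

# One follower from A and two from B unfollow.
gone = {14, 9, 7}
edges = [e for e in edges if e[1] not in gone]

refetched_a = fetch_block(edges, 15, N)
refetched_b = fetch_block(edges, 10, N)
print(block_a, block_b, refetched_a, refetched_b)
```

In this model the refetched B is back-filled with older edges and never contains ids from the original A, while the refetched A reaches into what used to be B.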

I think this, in part, answers your question?

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.

On Oct 4, 6:10 pm, Dewald Pretorius dpr...@gmail.com wrote:
 For discussion purposes, let's assume I am cursoring through a very
 volatile followers list of @veryvolatile. We have the following
 cursors:

 A = 5,000
 B = 5,000
 C = 5,000

 I retrieve Cursor A and process it. Next I retrieve Cursor B and
 process it. Then I retrieve Cursor C and process it.

 While I am processing Cursor C, 200 of the people who were in Cursor A
 unfollow @veryvolatile, and 400 of the people who were in Cursor B
 unfollow @veryvolatile.

 What do I get when I go back from C to B? Do I now get 4,600 ids in
 the list?

 Or, do I get 5,000 in B, which now includes a subset of 400 ids that
 were previously in Cursor A?

 Dewald