The page-based approach does not scale with large sets. We can no
longer support this kind of API without throwing a painful number of
503s.

Working with row counts forces the data store to recount rows in an
O(n^2) manner. Cursors avoid this issue by allowing practically
constant-time access to the next block. The cost becomes
O(n/block_size) which, yes, is O(n), but a graceful one given n < 10^7
and a block_size of 5000. The cursor approach also provides a more
complete and consistent result set.
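
For illustration, a minimal sketch of a cursor walk in Python follows.
It assumes a followers/ids-style endpoint that returns an "ids" list
and a "next_cursor" field, starts from cursor=-1, and treats
next_cursor=0 as the end of the set; authentication, error handling,
and rate limits are omitted.

import requests

def fetch_all_follower_ids(screen_name):
    # Sketch only: assumes the endpoint returns
    # {"ids": [...], "next_cursor": <int>}, with cursor=-1 requesting
    # the first block and next_cursor=0 marking the end. Auth omitted.
    ids = []
    cursor = -1
    while cursor != 0:
        resp = requests.get(
            "http://twitter.com/followers/ids.json",
            params={"screen_name": screen_name, "cursor": cursor},
        )
        resp.raise_for_status()
        block = resp.json()
        ids.extend(block["ids"])       # up to block_size (5000) ids per call
        cursor = block["next_cursor"]  # constant-time hop to the next block
    return ids

Each iteration costs one block fetch regardless of how deep into the
set you are, which is where the O(n/block_size) figure comes from.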

Proportionally, very few users require multiple page fetches with a
page size of 5,000.

Also, repeatedly scraping the social graph at high speed could often
be considered a low-value, borderline abusive use of the social graph
API.

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.




On Sep 18, 1:09 am, alan_b <ala...@gmail.com> wrote:
> when dealing with retrieving a large followers list from the API, what i
> did was estimate the no. of pages i need (total / 5000) from the
> follower count of the user's profile, and then send concurrent API
> requests to improve the speed.
>
> now with the new cursor-based pagination, this becomes impossible (it
> still works, but i guess page-based pagination will be obsoleted
> someday?), because I don't know the next_cursor until I finish
> downloading a whole page. so i guess page-based pagination should be
> preserved (and improved) rather than made obsolete?
