The other solution would be to send it to us in batch results, attaching a
timestamp to the request telling us "this is what the user's social graph
looked like at time x". I personally would start with the compressed format
though, since that makes it possible to retrieve everything in a single request.
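
To make the compression idea concrete, here is a minimal sketch of what the
client side could look like. It assumes the ids arrive as a gzip-compressed
JSON array; the payload format and the helper names (compress_ids,
decompress_ids) are made up for illustration and are not part of any Twitter
API. The ids are random, so the ratio it prints only hints at what real
follower data would give.

```python
import gzip
import json
import random


def compress_ids(ids):
    """Serialize a list of numeric ids as JSON, then gzip the bytes."""
    raw = json.dumps(ids).encode("utf-8")
    return raw, gzip.compress(raw)


def decompress_ids(blob):
    """Reverse of compress_ids: gunzip, then parse the JSON array."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


# Hypothetical data: 100,000 random ids standing in for a follower list.
random.seed(0)
follower_ids = random.sample(range(1, 100_000_000), 100_000)

raw, packed = compress_ids(follower_ids)
print("raw bytes:    ", len(raw))
print("gzipped bytes:", len(packed))

# The client gets back exactly the list that was sent.
restored = decompress_ids(packed)
```

The decompression cost lands on the client, which is the trade-off being
proposed: less bandwidth on the wire in exchange for a little CPU on our end.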
On Sun, Sep 6, 2009 at 10:33 PM, Jesse Stay <jesses...@gmail.com> wrote:
> Agreed. Is there a chance Twitter can return the full results in compressed
> (gzip or similar) format to reduce load, leaving the burden of decompressing
> on our end and reducing bandwidth? I'm sure there are other areas this
> could apply to as well. I think you'll find compressing the full social graph
> of a user significantly reduces the size of the data you have to pass
> through the pipe. My tests have shown a huge difference, and
> you'll have to get way past the tens of millions of ids before things slow
> down at all after that.
> On Sun, Sep 6, 2009 at 8:27 PM, Dewald Pretorius <dpr...@gmail.com> wrote:
>> There is no way that paging through a large and volatile data set can
>> ever return results that are 100% accurate.
>> Let's say one wants to page through @aplusk's followers list. That's
>> going to take between 3 and 5 minutes just to collect the follower ids
>> with &page (or the new cursors).
>> It is likely that some of the followers whose ids you have already
>> paged past and collected will have unfollowed @aplusk while you are still
>> collecting the rest. I assume that the Twitter system does paging by
>> using a standard SQL LIMIT clause. If you do LIMIT 1000000, 20 and
>> some of the ids that you have already paged past have been deleted,
>> the result set is going to "shift to the left" and you are going to
>> miss the ones that were above offset 1000000 but have subsequently
>> moved left to below 1000000.
>> There really are only two solutions to this problem:
>> a) we need to have the capability to reliably retrieve the entire
>> result set in one API call, or
>> b) everyone has to accept that the result set cannot be guaranteed to
>> be 100% accurate.