Anyone else still confused about how this works?  I'm still confused about
how this is any different from the way it was before with paging (other
than one less API call).

Jesse
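
For illustration (a hypothetical sketch, not official client code -- fetch_page and fetch_block below are made-up stand-ins that simulate the API against a local list), the difference comes down to two loop shapes: page-based retrieval asks for "page N", whose contents shift whenever the underlying set changes, while cursor-based retrieval asks for "the block after cursor C" and iterates until next_cursor comes back 0.

```python
# Simulated follower store; fetch_page / fetch_block are hypothetical
# stand-ins for the API calls, not real endpoints.
FOLLOWERS = list(range(1, 11))  # pretend the user has 10 followers
PAGE_SIZE = 4

def fetch_page(page):
    """Page-based: 'give me page N' -- positions shift if the set changes."""
    start = (page - 1) * PAGE_SIZE
    return FOLLOWERS[start:start + PAGE_SIZE]

def fetch_block(cursor):
    """Cursor-based: 'give me the block after cursor C'.
    -1 means start of set; a returned cursor of 0 means end of set."""
    start = 0 if cursor == -1 else cursor
    ids = FOLLOWERS[start:start + PAGE_SIZE]
    nxt = start + PAGE_SIZE
    return ids, (nxt if nxt < len(FOLLOWERS) else 0)

# Page loop: ask for successive pages until an empty page comes back.
paged, page = [], 1
while True:
    ids = fetch_page(page)
    if not ids:
        break
    paged.extend(ids)
    page += 1

# Cursor loop: "iterate until" next_cursor == 0.
cursored, cursor = [], -1
while cursor != 0:
    ids, cursor = fetch_block(cursor)
    cursored.extend(ids)
```

Both loops return the same ids against a static set; they only diverge when the set changes mid-retrieval, which is the case John's reply below addresses.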

On Sun, Oct 4, 2009 at 10:57 PM, John Kalucki <[email protected]> wrote:

>
> If an API is untrusted, it must be treated as entirely untrusted. You
> should be adding defensive heuristics between the untrusted API
> results and your application. If a given fetch seems bad, then queue
> the results and don't act on them until otherwise corroborated,
> perhaps by some quorum of subsequent results. You should also be
> carefully checking HTTP result codes and performing exhaustive
> field-existence checking.
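
A minimal sketch of the defensive layer described above, under stated assumptions (the field names, status handling, and quorum threshold are all illustrative, not part of any real API): check the HTTP result code, verify that every expected field exists, and queue a suspicious result until some quorum of subsequent fetches corroborates it.

```python
REQUIRED_FIELDS = ("ids", "next_cursor")  # hypothetical expected fields

def looks_valid(status_code, payload):
    """Exhaustive checks: accept only HTTP 200 with all expected fields."""
    if status_code != 200:
        return False
    return all(field in payload for field in REQUIRED_FIELDS)

class QuorumQueue:
    """Hold a suspicious result; release it only after `quorum` consecutive
    fetches agree on the same value -- a crude form of corroboration."""

    def __init__(self, quorum=3):
        self.quorum = quorum
        self.pending = None
        self.votes = 0

    def offer(self, result):
        if result == self.pending:
            self.votes += 1
        else:
            self.pending, self.votes = result, 1
        # Don't act on the result until it has been seen `quorum` times.
        return self.pending if self.votes >= self.quorum else None
```

With something like this in place, a fetch that suddenly reports zero friends would be offered to the queue rather than acted on, so a single bad response can't trigger a mass unfollow.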
>
> In the end, if some results are untrusted, you cannot trust the
> suggested improvements, as the improvements will, by necessity, be
> served from the same data store.
>
> Finally, the suggested improvements take resources away from
> stabilizing and otherwise improving the API.
>
> The purpose of the cursored resource is to make retrieval of high-
> velocity high-cardinality sets possible via a RESTful API. This scheme
> does not provide a snapshot view.
>
> The cursor scheme does, however, offer several useful properties. One
> such property is that if an edge exists at the beginning of a traversal
> and remains unmodified throughout the traversal, the edge will always(**)
> be returned in the result set, regardless of all other possible
> operations performed on all other edges in the set. Additions and
> modifications made after the first block is returned will tend not to
> be represented (and may never be present). Deletions made after the
> first block is returned may or may not be represented. This is a very
> strong and very useful form of consistency.
>
> ** = There remains an issue with cursor jitter that can, very rarely,
> result in minor loss and minor overdelivery. I don't know when this
> issue will be fully addressed. This jitter issue should only affect
> high-velocity sets, and rarely, if ever, affect ordinary users.
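
The traversal property above can be modeled with a toy simulation (an illustration of the idea only, not Twitter's implementation): order the edges newest-first and let the cursor name the last edge already returned, so the walk resumes after that edge no matter how many newer edges are inserted in front of it.

```python
BLOCK = 3
edges = [60, 50, 40, 30, 20, 10]  # edge ids, newest first, at traversal start
original = list(edges)

def fetch(cursor):
    """Return (block, next_cursor). The cursor is the id of the last edge
    returned, not a positional offset, so insertions ahead of it don't
    shift the remainder of the walk. -1 = start of set, 0 = exhausted."""
    start = 0 if cursor == -1 else edges.index(cursor) + 1
    block = edges[start:start + BLOCK]
    next_cursor = block[-1] if start + BLOCK < len(edges) else 0
    return block, next_cursor

seen, cursor = [], -1
while True:
    block, cursor = fetch(cursor)
    seen.extend(block)
    if cursor == 0:
        break
    edges.insert(0, 70 + cursor)  # a new follower arrives mid-traversal
```

Every edge that existed, unmodified, at the start of the traversal comes back; the edge added mid-traversal lands ahead of the cursor and is never returned.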
>
> -John Kalucki
> http://twitter.com/jkalucki
> Services, Twitter Inc.
>
>
> On Oct 4, 10:45 am, Jesse Stay <[email protected]> wrote:
> > John, no offense, but frankly I don't trust the Twitter API. I've
> > been burned too many times by things that were "supposed to work",
> > code pushed into production that wasn't tested properly, etc., so I
> > know to do all I can to account for Twitter's mistakes. There's no
> > telling if at some point next_cursor returns nothing when in reality
> > it was supposed to return something, and my users accidentally
> > unfollow all their friends because of it when they weren't intending
> > to do so.
> > Having that number in there ensures, without a doubt (unless the
> > number itself is wrong, which I can't do anything about), that I know
> > whether Twitter is right or not when I retrieve that next_cursor
> > value. I hope that makes sense - it's nothing against Twitter, I've
> > just seen enough to know that I need backup error checking in place
> > to be sure Twitter's return data is correct.
> >
> > Regarding the user being removed before iteration finishes, I thought
> > the whole purpose of these cursors was to provide a snapshot of a
> > social graph at a given point in time, so unfollowed users don't show
> > up until after the list is retrieved - is that not the case? Also, my
> > experience has been that pulling the user's friend and follower count
> > ahead of time gives a number that does not match the number of
> > followers/friends I actually pull from the API. Having you guys do a
> > count on the set ahead of time would help ensure the number is
> > correct.
> >
> > Thanks,
> >
> > Jesse
> >
> > On Sun, Oct 4, 2009 at 8:24 AM, John Kalucki <[email protected]> wrote:
> >
> > > Curious -- why isn't the end-of-list indicator a reliable enough
> > > indication?  "Iterate until" seems simple and reliable.
> >
> > > Can you request the denormalized count via the API before you begin?
> > > (Not familiar enough with the API, but the back-end store offers this
> > > for all sorts of purposes.) You'd have to apply some heuristic to
> > > allow for high-velocity sets.
> >
> > > The last user in the list could be removed before iteration completes,
> > > setting up a race-condition that you'd have to allow for as well.
> >
> > > -John Kalucki
> > > http://twitter.com/jkalucki
> > > Services, Twitter Inc.
> >
> > > On Oct 4, 1:29 am, Jesse Stay <[email protected]> wrote:
> > > > I was wondering if it might be possible to include, at least in
> > > > the first page (or, if it's easier, on all pages), either a total
> > > > expected number of followers/friends or a total expected number
> > > > of returned pages when the cursor parameter is provided for
> > > > friends/ids and followers/ids? I'm assuming that since you're
> > > > moving to the cursor-based approach you ought to be able to
> > > > accurately count this now, since it's a snapshot of the data at
> > > > that time.
> > > > The reason I think that would be useful is that occasionally
> > > > Twitter goes down or introduces code that could break this. This
> > > > would enable us to be absolutely sure we've hit the end of the
> > > > entire set. I guess another approach could also be to just list
> > > > the last expected cursor ID in the set so we can look for that.
> >
> > > > Thanks,
> >
> > > > Jesse
>