Re: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-11 Thread John Kalucki
If you are writing a general purpose display app, I think, (but I am not at
all certain), that you can ignore this issue. Reasonable polling frequency
on modest velocity timelines will sometimes, but very rarely, miss a tweet.
Also, over time, we're doing things to make this better for everyone. Many
of our projects have the side-effect of reducing K, decreasing the already
low since_id failure odds even further. Some tweet pipeline changes went
live in the last few weeks that dramatically reduce the K distribution for
various user types.

Since I don't know how the Last-Modified time exactly works, I'm going to
restate your response slightly:

Assuming synchronized clocks (or solely the Twitter Clock, if exposed
properly via Last-Modified), given a poll at time t, if the newest status is
at least n seconds old (created at or before t - n) for a sufficiently large
n, then even a naive since_id algorithm will be effectively Exactly Once,
assuming that Twitter is running normally. For a given poll, when the delta
between the poll time and the last update time drops below this n second
period, there's a non-zero loss risk.

Just what is n? It is K expressed as time rather than as a discrete count.
For some timeline types, with some classes of users, K is as much as
perhaps 180 seconds. For others, K is less than 1 second. There's some
variability here that we should characterize more carefully internally and
then discuss publicly. I suspect there's a lot to be learned from this
exercise.
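
As a rough illustration of the restated rule (not Twitter's implementation),
the sketch below only advances since_id to a status that is at least n
seconds old at poll time. The dict shape with "id" and "created_at", the
function name safe_since_id, and the default n of 180 seconds (the largest K
mentioned above) are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def safe_since_id(tweets, poll_time, n_seconds=180):
    """Return the largest id among statuses at least n_seconds old, or None.

    tweets: iterable of dicts with "id" (int) and "created_at" (aware datetime).
    poll_time: the time of the poll, ideally taken from the server's clock.
    """
    cutoff = poll_time - timedelta(seconds=n_seconds)
    settled_ids = [t["id"] for t in tweets if t["created_at"] <= cutoff]
    return max(settled_ids) if settled_ids else None


poll_time = datetime(2010, 4, 11, 12, 0, 0, tzinfo=timezone.utc)
page = [
    {"id": 100, "created_at": poll_time - timedelta(seconds=300)},
    {"id": 101, "created_at": poll_time - timedelta(seconds=10)},
]
print(safe_since_id(page, poll_time))  # 100
```

When the function returns None the caller would keep its previous since_id.
Statuses newer than the cutoff (101 here) come back again on the next poll,
so the client has to deduplicate; that is the At Least Once trade-off.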

Since_id really runs into trouble when any of the following are too great:
the polling frequency, the updating frequency, the roughly-sorted K value.
If you are polling often to reduce display latency, use the Streaming API.
If the timeline moves too fast to capture it all exactly, you should
reconsider your requirements or get a Commercial Data License for the
Streaming API. Does the user really need to see every Bieber at 3 Biebers
Per Second? How would they ever know if they missed 10^-5 of them in a blur?
If you need them all for analysis, consider calculating the confidence
interval given a sample proportion of 1 - 10^-6 (six 9s) or so vs. a total
enumeration. Indistinguishable. If you need them for some other purpose, say
CRM, the Streaming API may be the answer.

-John Kalucki
http://twitter.com/jkalucki
Infrastructure, Twitter Inc.



Re: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-11 Thread Josh Bleecher Snyder
Hi John (et al.),

These emails from you are great -- they are exactly the sort of
thoughtful, detailed, specific, technical emails that I would
personally love to see accompany future announcements. I think they
would prevent a fair amount of FUD. Thank you.

I have one stupid question, if you don't mind, though. You refer in
every email to K. What, precisely, does K refer to? What are its
units? (I think I know what you mean by it, but I'd be interested
to hear precisely.)

Thanks,
Josh




Re: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-11 Thread John Kalucki
.
 
 
 

Re: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-11 Thread Nick Arnett
On Sun, Apr 11, 2010 at 5:14 PM, John Kalucki j...@twitter.com wrote:


 This is useful stuff for dealing with infinite sequences of events -- like,
 picking a random example, the insertion of new tweets into a materialized
 timeline (a cache of the timeline vector).


The Twitter stream is an infinite sequence of events... now that's serious
optimism about how long Twitter will exist!

Sorry, just had to say it.

Of course, some infinities are bigger than others.

Nick




Re: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-09 Thread Dave Sherohman
On Thu, Apr 08, 2010 at 05:03:29PM -0700, Naveen wrote:
 However, I wanted to be clear and feel it should be made obvious that
 with this change, there is a possibility that a tweet may not be
 delivered to client if the implementation of how since_id is currently
 used is not updated to cover the case.  I still envision the situation
 as more likely than you seem to believe and figure as tweet velocity
 increases, the likelihood will also increase; but I am assuming you have
 better data to support your viewpoint than I do and shall defer.

Maybe I'm just missing something here, but it seems trivial to fix on
Twitter's side (enough so that I assume it's what they've been planning
from the start to do):  Only return tweets from closed buckets.

We are guaranteed that the buckets will be properly ordered.  The order
will only be randomized within a bucket.  Therefore, by only returning
tweets from buckets which are no longer receiving new tweets, since_id
works and will never miss a tweet.

And, yes, this does mean a slight delay in getting the tweets out
because they have to wait a few milliseconds for their bucket to close
before being exposed to calls which can use since_id, plus maybe a
little longer for the contents of that bucket to be distributed to
multiple servers.  That's still going to only take time comparable to
round-trip times for an HTTP request to fetch the data for display to a
user and be far, far less than the average refresh delay required by
those clients which fall under the API rate limit.  I submit, therefore,
that any such delay caused by waiting for buckets to close will be
inconsequential.
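
A minimal sketch of the closed-bucket idea described above, under stated
assumptions: each tweet carries a millisecond timestamp ("ts_ms"), a bucket
is a fixed-width time window, and a bucket counts as closed once the clock
has moved past it. The 10 ms width, the field names and the function name
are hypothetical; nothing in the thread confirms that Twitter does this.

```python
def closed_bucket_tweets(tweets, now_ms, bucket_ms=10):
    """Return only the tweets whose time bucket can no longer receive tweets."""
    current_bucket = now_ms // bucket_ms  # the bucket that is still open
    return [t for t in tweets if t["ts_ms"] // bucket_ms < current_bucket]


tweets = [
    {"id": 7, "ts_ms": 95},   # bucket 9, ids out of order inside the bucket
    {"id": 5, "ts_ms": 99},   # bucket 9
    {"id": 9, "ts_ms": 100},  # bucket 10, still open at now_ms=107
    {"id": 8, "ts_ms": 105},  # bucket 10
]
visible = closed_bucket_tweets(tweets, now_ms=107)
print([t["id"] for t in visible])  # [7, 5]
```

The sketch ignores the extra time needed to distribute a closed bucket to
multiple servers, which the message above also mentions.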

-- 
Dave Sherohman


RE: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-09 Thread Brian Smith
John,

 

Thank you. That was one of the most informative emails on the Twitter API I
have seen on the list.

 

Basically, even now, an application should not use an ID of a tweet for
since_id if the tweet is less than 10 seconds old, ignoring service
abnormalities. Probably a larger threshold (30 seconds or even a minute)
would be better, especially when you take into consideration the likelihood
of clock skew between the servers that generate the timestamps.

 

I think this is information that would be useful to have added to the API
documentation, as I know many applications are taking a much more naive
approach to pagination.

 

Thanks again,

Brian

 

From: twitter-development-talk@googlegroups.com On Behalf Of John Kalucki
Sent: Friday, April 09, 2010 1:20 PM
To: twitter-development-talk@googlegroups.com
Subject: Re: [twitter-dev] Re: Upcoming changes to the way status IDs are
sequenced

 

Folks are making a lot of incorrect assumptions about the Twitter
architecture, especially around how we materialize and present timeline
vectors and just what QoS we're really offering. This new scheme does not
significantly, or perhaps even observably, make the existing issues around
since_id any better or any worse. And I'm being very precise here. The
since_id situation is such that the few milliseconds skew possible in
Snowflake are practically irrelevant and lost in the noise of a 4 to 6
orders-of-magnitude misconception. (That's a very big misconception.)

If you do not know the rough ordering of our event stream as it is applied to
the materialized timeline vectors and also the expected rate of change on
the timeline in question, you cannot make good choices about making since_id
perfect. But neither should you try to make it perfect, nor should you
have to worry about this.

If you insist upon worrying about this, here's my slight salting of Mark's
advice: In the existing continuously increasing id generation scheme on the
Twitter.com API, I'd subtract about 5000 ids from since_id to ensure
sufficient overlap in nearly all cases, but even this could be lossy in the
face of severe operational issues -- issues of a type that we haven't seen
in many many months. The search API has a different K in its rough ordering,
so you might need more like 10,000 ids. In the new Snowflake scheme, I'd
overlap by about 5000 milliseconds for twitter.com APIs and 10,000 ms for
search APIs.

Despite all this, things still could go wrong. An engineer here is known for
pointing out that even things that almost never ever happen, happen all the
time on the Twitter system. Now, just because they are happening, to
someone, all the time, doesn't mean that they'll ever ever happen to you or
your users in a thousand years -- but someone's getting hit with it, somewhere,
a few times a day.

The above schemes no longer treat the id as an opaque unique ordered
identifier. And woe lies in wait for you as changes are made to these ids.
Woe. You also need to deduplicate. Be very careful and understand fully what
you summon by breaking this semantic contract.
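
The overlap-and-deduplicate advice above can be sketched as follows. This
assumes the Snowflake layout in which the millisecond timestamp sits above
the low 22 bits of the id; that layout is not stated in this thread, and, as
warned above, relying on it stops treating the id as opaque. The function
names and the dict shape are hypothetical.

```python
SNOWFLAKE_TIMESTAMP_SHIFT = 22  # assumption: timestamp (ms) above the low 22 bits


def overlapped_since_id(last_id, overlap_ms=5000):
    """Move since_id back by overlap_ms worth of Snowflake id space."""
    return max(0, last_id - (overlap_ms << SNOWFLAKE_TIMESTAMP_SHIFT))


def merge_new(seen_ids, fetched):
    """Drop statuses already seen; remember the ids of the fresh ones."""
    fresh = [t for t in fetched if t["id"] not in seen_ids]
    seen_ids.update(t["id"] for t in fresh)
    return fresh


last_id = (1_000_000 << SNOWFLAKE_TIMESTAMP_SHIFT) | 123
since_id = overlapped_since_id(last_id)  # 5000 ms earlier in id space

seen = set()
first = merge_new(seen, [{"id": 1}, {"id": 2}])
second = merge_new(seen, [{"id": 2}, {"id": 3}])  # id 2 is re-fetched overlap
print([t["id"] for t in second])  # [3]
```

For the search API the message suggests a 10,000 ms overlap, which would be
overlap_ms=10000 here. The seen set grows without bound in this sketch; a
real client would trim it.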

In the end, since_id issues go away on the Streaming API, and other than
around various start-up discontinuities, you can ignore this issue. I'll be
talking about Rough Ordering, among other things Streaming, at the Chirp
conference. Come geek out. 

-John Kalucki
http://twitter.com/jkalucki
Infrastructure, Twitter Inc.




Re: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-09 Thread John Kalucki
Your second paragraph doesn't quite make sense. The period between your next
poll and the timestamp of the last status is irrelevant. The issue is solely
the magnitude of K on the roughly sorted stream of events that are applied
to the materialized timeline vector. As K varies, so do the odds, however
infinitesimally small, that you will miss a tweet using the last status id
returned. The period between your polls of the API does not affect this K.

My recommendation is to ignore this issue in nearly every use case. If you
are, however, polling high velocity timelines (including search queries) and
attempting to approximate an Exactly Once QoS, you should, basically, stop
doing that. You are probably wasting resources and you'll probably never get
Exactly Once behavior anyway. Use the Streaming API instead.
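
To make the role of K concrete, here is a small simulation. It is only one
reading of "roughly sorted": K is taken to be the largest number of
positions any event sits away from its place in fully sorted order. This
thread does not define K at this point, so that definition, the function
names and the toy ids are all assumptions.

```python
def max_displacement(arrival_ids):
    """Largest distance between an id's arrival position and sorted position."""
    sorted_pos = {v: i for i, v in enumerate(sorted(arrival_ids))}
    return max(
        (abs(i - sorted_pos[v]) for i, v in enumerate(arrival_ids)),
        default=0,
    )


def naive_poll(arrival_ids, poll_after):
    """Simulate naive since_id polling.

    arrival_ids: ids in the order they are applied to the timeline.
    poll_after: for each poll, how many arrivals are visible at that moment.
    Returns the ids the client ends up with.
    """
    since_id = 0
    delivered = []
    for visible_count in poll_after:
        new = sorted(i for i in arrival_ids[:visible_count] if i > since_id)
        delivered.extend(new)
        if new:
            since_id = new[-1]
    return delivered


arrivals = [1, 3, 2, 4]  # id 2 is applied after id 3
print(max_displacement(arrivals))            # 1
print(naive_poll(arrivals, poll_after=[2, 4]))  # [1, 3, 4]
```

Id 2 is lost because it became visible after since_id had already advanced
to 3. With a perfectly sorted stream (displacement 0) the same polls lose
nothing, however they are spaced.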

-John Kalucki
http://twitter.com/jkalucki
Infrastructure, Twitter Inc.


RE: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-09 Thread Brian Smith
John,

 

I am not polling. I am simply trying to implement a basic refresh feature
like every desktop/mobile Twitter app has. Basically, I just want to let
users scroll through their timelines, and be reasonably sure that I am
presenting them with an accurate and complete view of the timeline, while
using as little bandwidth as possible.

 

When I said "10 seconds old"/"30 seconds old"/etc. I was referring to the age
at the time the page of tweets was generated. So, basically, if the gap
between the tweet's timestamp and the response's Last-Modified time is more
than 10,000 ms (from what you said below), you are almost definitely getting
At Least Once behavior if Twitter is operating normally, and you can use
that information to get At Least Once behavior that emulates Exactly Once
behavior with little (usually no) overhead. Is that a correct interpretation
of what you were saying?

 

Thanks,

Brian

 

 

From: twitter-development-talk@googlegroups.com
[mailto:twitter-development-t...@googlegroups.com] On Behalf Of John Kalucki
Sent: Friday, April 09, 2010 3:31 PM
To: twitter-development-talk@googlegroups.com
Subject: Re: [twitter-dev] Re: Upcoming changes to the way status IDs are
sequenced

 

Your second paragraph doesn't quite make sense. The period between your next
poll and the timestamp of the last status is irrelevant. The issue is solely
the magnitude of K on the roughly sorted stream of events that are applied
to the materialized timeline vector. As K varies, so do the odds, however
infinitesimally small, that you will miss a tweet using the last status id
returned. The period between your polls of the API does not affect this K.

My recommendation is to ignore this issue in nearly every use case. If you
are, however, polling high velocity timelines (including search queries) and
attempting to approximate an Exactly Once QoS, you should, basically, stop
doing that. You are probably wasting resources and you'll probably never get
Exactly Once behavior anyway. Use the Streaming API instead.

-John Kalucki
http://twitter.com/jkalucki
Infrastructure, Twitter Inc.

On Fri, Apr 9, 2010 at 12:20 PM, Brian Smith br...@briansmith.org wrote:

John,

 

Thank you. That was one of the most informative emails on the Twitter API I
have seen on the list.

 

Basically, even now, an application should not use an ID of a tweet for
since_id if the tweet is less than 10 seconds old, ignoring service
abnormalities. Probably a larger threshold (30 seconds or even a minute)
would be better, especially when you take into consideration the likelihood
of clock skew between the servers that generate the timestamps.

 

I think this is information that would be useful to have added to the API
documentation, as I know many applications are taking a much more naive
approach to pagination.

 

Thanks again,

Brian

 

From: twitter-development-talk@googlegroups.com On Behalf Of John Kalucki
Sent: Friday, April 09, 2010 1:20 PM


To: twitter-development-talk@googlegroups.com
Subject: Re: [twitter-dev] Re: Upcoming changes to the way status IDs are
sequenced

 

Folks are making a lot of incorrect assumptions about the Twitter
architecture, especially around how we materialize and present timeline
vectors and just what QoS we're really offering. This new scheme does not
significantly, or perhaps even observably, make the existing issues around
since_id any better or any worse. And I'm being very precise here. The
since_id situation is such that the few milliseconds skew possible in
Snowflake are practically irrelevant and lost in the noise of a 4 to 6
orders-of-magnitude misconception. (That's a very big misconception.)



If you do not know the rough ordering of our event stream as it applied to
the materialized timeline vectors and also the expected rate of change on
the timeline in question, you cannot make good choices about making since_id
perfect. But, neither you should you try to make it perfect, nor should you
have to worry about this.

If you insist upon worrying about this, here's my slight salting of Mark's
advice: In the existing continuously increasing id generation scheme on the
Twitter.com API, I'd subtract about 5000 ids from since_id to ensure
sufficient overlap in nearly all cases, but even this could be lossy in the
face of severe operational issues -- issues of a type that we haven't seen
in many many months. The search API has a different K in its rough ordering,
so you might need more like 10,000 ids. In the new Snowflake scheme, I'd
overlap by about 5000 milliseconds for twitter.com APIs and 10,000 ms for
search APIs.
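
The overlap advice above can be sketched roughly as follows. This is a minimal illustration, not Twitter client code: `fetch` is a hypothetical stand-in for a timeline request that returns tweets with an id greater than `since_id`, the dictionary shape of a tweet is assumed, and the 5000 figure is simply the rough number quoted in the message.

```python
from typing import Callable, Dict, List, Set, Tuple

Tweet = Dict[str, int]  # assumed shape for illustration: {"id": ...}

OVERLAP_IDS = 5000  # rough figure from the message; about 10,000 for search


def poll_with_overlap(
    fetch: Callable[[int], List[Tweet]],
    last_id: int,
    seen: Set[int],
    overlap: int = OVERLAP_IDS,
) -> Tuple[List[Tweet], int]:
    """Poll with since_id backed off by `overlap`, dropping duplicates.

    `fetch(since_id)` is a hypothetical callable, not a real API binding.
    """
    since_id = max(last_id - overlap, 0)
    # Re-request the overlap window and keep only tweets not seen before.
    fresh = [t for t in fetch(since_id) if t["id"] not in seen]
    seen.update(t["id"] for t in fresh)
    new_last = max([last_id] + [t["id"] for t in fresh])
    return fresh, new_last


# Toy usage: one tweet shows up in the timeline late and out of order.
timeline = [{"id": 10_000}, {"id": 10_003}]
seen_ids: Set[int] = set()
first, last = poll_with_overlap(lambda s: [t for t in timeline if t["id"] > s], 0, seen_ids)
timeline.append({"id": 10_001})
second, last = poll_with_overlap(lambda s: [t for t in timeline if t["id"] > s], last, seen_ids)
```

The cost is that every poll re-downloads the overlap window, which is the trade-off discussed elsewhere in this thread.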

Despite all this, things still could go wrong. An engineer here is known for
pointing out that even things that almost never ever happen, happen all the
time on the Twitter system. Now, just because they are happening, to
someone, all the time, doesn't mean that they'll ever ever happen to you or
your users in a thousand years -- but someone's getting hit with it, somewhere,
a few times a day

Re: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-08 Thread Mark McBride
Thank you for the feedback.  It's great to hear about the variety of use
cases people have for the API, and in particular all the different ways
people are using IDs. To alleviate some of the concerns raised in this
thread we thought it would be useful to give more details about how we plan
to generate IDs

1) IDs are still 64-bit integers.  This should minimize any migration pains.
2) You can still sort on ID.  Within a few milliseconds you may get out of
order results, but for most use cases this shouldn't be an issue.
3) since_id will still work (within the caveats given above).
4) We will provide a way to backfill from the streaming API.
5) You cannot use the generated ID to reverse engineer tweet velocity.  Note
that you can still use the streaming API to determine the rate of public
statuses.

Additional items of interest
1) At some point we will likely start using this as an ID for direct
messages too
2) We will almost certainly open source the ID generation code, probably
before we actually cut over to using it.
3) We STRONGLY suggest that you treat IDs as roughly sorted (roughly being
within a few ms buckets), opaque 64-bit integers.  We may need to change the
scheme again at some point in the future, and want to minimize migration
pains should we need to do this.

Hopefully this puts you more at ease with the changes we're making.  If it
raises new concerns, please let us know!

  ---Mark

http://twitter.com/mccv

On Mon, Apr 5, 2010 at 4:18 PM, M. Edward (Ed) Borasky zn...@comcast.net wrote:

 On 04/05/2010 12:55 AM, Tim Haines wrote:
  This made me laugh.  Hard.
 
  On Fri, Apr 2, 2010 at 6:47 AM, Dewald Pretorius dpr...@gmail.com
 wrote:
 
  Mark,
 
  It's extremely important where you have two bots that reply to each
  others' tweets. With incorrectly sorted tweets, you get conversations
  that look completely unnatural.
 
  On Apr 1, 1:39 pm, Mark McBride mmcbr...@twitter.com wrote:
  Just out of curiosity, what applications are you building that require
  sub-second sorting resolution for tweets?

 Yeah - my bot laughed too ;-)
 --
 M. Edward (Ed) Borasky
 borasky-research.net/m-edward-ed-borasky

 A mathematician is a device for turning coffee into theorems. ~ Paul
 Erdős


 --
 To unsubscribe, reply using remove me as the subject.



Re: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-08 Thread Nick Arnett
On Thu, Apr 1, 2010 at 10:47 AM, Dewald Pretorius dpr...@gmail.com wrote:

 Mark,

 It's extremely important where you have two bots that reply to each
 others' tweets. With incorrectly sorted tweets, you get conversations
 that look completely unnatural.


I'd love to see an example of two bots replying to each other and looking
entirely natural!

We all knew this sort of thing was going on, removing the pesky humans from
the loop, but I always thought it was unintentional.

There's a science fiction story in there somewhere.

Nick


-- 
Subscription settings: 
http://groups.google.com/group/twitter-development-talk/subscribe?hl=en


Re: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-08 Thread Lil Peck
On Thu, Apr 8, 2010 at 5:39 PM, Nick Arnett nick.arn...@gmail.com wrote:

 I'd love to see an example of two bots replying to each other and looking
 entirely natural!

 We all knew this sort of thing was going on, removing the pesky humans from
 the loop, but I always thought it was unintentional.

 There's a science fiction story in there somewhere.



Do Twitterbots dream of electric sheep?




RE: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-08 Thread Brian Smith
What does “within the caveats given above” mean? Either since_id will work or 
it won’t. It seems to me that if IDs are only in a “rough” order, since_id 
won’t work—in particular, there is a possibility that paging through tweets 
using since_id will completely skip over some tweets. 

 

My concern is that, since tweets will not be serialized at the time they are 
written, there will be a race condition between me making a request and users 
posting new statuses. That is, I could get a response with the largest id in 
the response being X that gets evaluated just before a tweet (X-1) has been 
saved in the database; If so, when I issue a request with since_id=X, my 
program will never see the newer tweet (X-1).

 

Are you going to change the implementation of the timeline methods so that they 
never return a tweet with ID X until all nodes in the cluster guarantee that 
they won’t create a new tweet with an ID less than X?

 

I implement the following logic:

 

1.  Let LATEST start out as the earliest tweet available in the user’s 
timeline.

2.  Make a request with since_id={LATEST}, which returns a set of tweets T.

3.  If T is empty then stop.

4.  Let LATEST= max({ id(t), for all t in T}).

5.  Goto 2.

 

Will I be guaranteed not to skip over any tweets in the timeline using this 
logic? If not, what do I need to do to ensure I get them all?
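
The loop above, and the race it is exposed to, can be sketched as follows. Everything here is a toy model under stated assumptions: `page_forward` follows steps 1 to 5, while `RacyTimeline` is a hypothetical in-memory timeline in which one tweet with a lower id becomes visible only after the first response has been built. It says nothing about how likely this is on the real service.

```python
from typing import Callable, Iterable, List


def page_forward(fetch: Callable[[int], List[int]], start_id: int) -> List[int]:
    """Follow since_id until a request comes back empty (steps 1-5)."""
    latest = start_id
    collected: List[int] = []
    while True:
        batch = fetch(latest)      # step 2: ids greater than `latest`
        if not batch:              # step 3: nothing new, stop
            return collected
        collected.extend(batch)
        latest = max(batch)        # step 4: advance to the largest id seen


class RacyTimeline:
    """Hypothetical timeline where some ids are saved after the first read."""

    def __init__(self, visible: Iterable[int], late: Iterable[int]) -> None:
        self.visible = sorted(visible)
        self.late = list(late)

    def fetch(self, since_id: int) -> List[int]:
        result = [i for i in self.visible if i > since_id]
        # The late tweets land only after this response was computed.
        self.visible = sorted(self.visible + self.late)
        self.late = []
        return result


timeline = RacyTimeline(visible=[100, 101, 103], late=[102])
got = page_forward(timeline.fetch, 99)
missed = [i for i in timeline.visible if i not in got]
```

In this model `missed` ends up containing the late id, because the second request already uses the larger id as since_id.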

 

Thanks,

Brian

 

 

From: twitter-development-talk@googlegroups.com 
[mailto:twitter-development-t...@googlegroups.com] On Behalf Of Mark McBride
Sent: Thursday, April 08, 2010 5:10 PM
To: twitter-development-talk@googlegroups.com
Subject: Re: [twitter-dev] Re: Upcoming changes to the way status IDs are 
sequenced

 

Thank you for the feedback.  It's great to hear about the variety of use cases 
people have for the API, and in particular all the different ways people are 
using IDs. To alleviate some of the concerns raised in this thread we thought 
it would be useful to give more details about how we plan to generate IDs

 

1) IDs are still 64-bit integers.  This should minimize any migration pains.

2) You can still sort on ID.  Within a few milliseconds you may get out of order 
results, but for most use cases this shouldn't be an issue.  

3) since_id will still work (within the caveats given above).  

4) We will provide a way to backfill from the streaming API.

5) You cannot use the generated ID to reverse engineer tweet velocity.  Note 
that you can still use the streaming API to determine the rate of public 
statuses.

 

Additional items of interest

1) At some point we will likely start using this as an ID for direct messages 
too

2) We will almost certainly open source the ID generation code, probably before 
we actually cut over to using it.

3) We STRONGLY suggest that you treat IDs as roughly sorted (roughly being 
within a few ms buckets), opaque 64-bit integers.  We may need to change the 
scheme again at some point in the future, and want to minimize migration pains 
should we need to do this.

 

Hopefully this puts you more at ease with the changes we're making.  If it 
raises new concerns, please let us know!

 

  ---Mark

http://twitter.com/mccv

 

On Mon, Apr 5, 2010 at 4:18 PM, M. Edward (Ed) Borasky zn...@comcast.net 
wrote:

On 04/05/2010 12:55 AM, Tim Haines wrote:
 This made me laugh.  Hard.

 On Fri, Apr 2, 2010 at 6:47 AM, Dewald Pretorius dpr...@gmail.com wrote:

 Mark,

 It's extremely important where you have two bots that reply to each
 others' tweets. With incorrectly sorted tweets, you get conversations
 that look completely unnatural.

 On Apr 1, 1:39 pm, Mark McBride mmcbr...@twitter.com wrote:
 Just out of curiosity, what applications are you building that require
 sub-second sorting resolution for tweets?

Yeah - my bot laughed too ;-)

--
M. Edward (Ed) Borasky
borasky-research.net/m-edward-ed-borasky

A mathematician is a device for turning coffee into theorems. ~ Paul Erdős




 



[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-08 Thread Naveen
This was my initial concern with the randomly generated ids that I
brought up, though I think Brian described it better than I.

It simply seems very likely that when using since_id to populate newer
tweets for the user, that some tweets will never be seen, because the
since_id of the last message received will be larger than one
generated 1ms later.

With the random generation of ids, I can see two ways to guarantee
delivery of all tweets in a user's timeline:
1. Page forwards and backwards to ensure that no tweet generated at or
near the same time as the newest one received a lower id. This
will be very expensive for a mobile client, not to mention that it
complicates any refresh algorithm significantly.
2. Given that we know how IDs are generated (i.e. which bits represent
the time), we can simply over-request by decrementing the since_id time
bits by a second or two and filtering out duplicates. (Again, not really
ideal for mobile clients where battery life is an issue, plus it then
makes the implementation very dependent on Twitter's ID format
remaining stable.)
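
A rough sketch of option 2, for illustration only. The thread does not state which bits hold the time, so `TIMESTAMP_SHIFT = 22` and the millisecond resolution are assumptions (they match the layout Twitter later published for Snowflake, but the advice in this thread is to treat IDs as opaque). Both function names are hypothetical.

```python
from typing import Iterable, List, Set

TIMESTAMP_SHIFT = 22  # assumption: the low 22 bits are not part of the time


def back_off_since_id(since_id: int, back_off_ms: int,
                      shift: int = TIMESTAMP_SHIFT) -> int:
    """Return a since_id whose assumed timestamp is `back_off_ms` earlier.

    The low bits are zeroed, so the result sorts below every id generated
    in the resulting millisecond.
    """
    timestamp_ms = since_id >> shift
    return max(timestamp_ms - back_off_ms, 0) << shift


def drop_duplicates(tweet_ids: Iterable[int], seen: Set[int]) -> List[int]:
    """Keep only ids that have not been seen before, and remember them."""
    fresh = [i for i in tweet_ids if i not in seen]
    seen.update(fresh)
    return fresh
```

As noted above, this ties the client to an ID layout that may change, so it is best kept behind one small function.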

Please anyone explain if Brian and I are misinterpreting this as a
very real possibility of never displaying some tweets in a time line,
without changing how we request data from twitter (i.e. since_id
doesn't break)

--Naveen Ayyagari
@knight9
@SocialScope


On Apr 8, 7:01 pm, Brian Smith br...@briansmith.org wrote:
 What does “within the caveats given above” mean? Either since_id will work or 
 it won’t. It seems to me that if IDs are only in a “rough” order, since_id 
 won’t work—in particular, there is a possibility that paging through tweets 
 using since_id will completely skip over some tweets.

 My concern is that, since tweets will not be serialized at the time they are 
 written, there will be a race condition between me making a request and users 
 posting new statuses. That is, I could get a response with the largest id in 
 the response being X that gets evaluated just before a tweet (X-1) has been 
 saved in the database; If so, when I issue a request with since_id=X, my 
 program will never see the newer tweet (X-1).

 Are you going to change the implementation of the timeline methods so that 
 they never return a tweet with ID X until all nodes in the cluster guarantee 
 that they won’t create a new tweet with an ID less than X?

 I implement the following logic:

 1.      Let LATEST start out as the earliest tweet available in the user’s 
 timeline.

 2.      Make a request with since_id={LATEST}, which returns a set of tweets 
 T.

 3.      If T is empty then stop.

 4.      Let LATEST= max({ id(t), for all t in T}).

 5.      Goto 2.

 Will I be guaranteed not to skip over any tweets in the timeline using this 
 logic? If not, what do I need to do to ensure I get them all?

 Thanks,

 Brian

 From: twitter-development-talk@googlegroups.com 
 [mailto:twitter-development-t...@googlegroups.com] On Behalf Of Mark McBride
 Sent: Thursday, April 08, 2010 5:10 PM
 To: twitter-development-talk@googlegroups.com
 Subject: Re: [twitter-dev] Re: Upcoming changes to the way status IDs are 
 sequenced

 Thank you for the feedback.  It's great to hear about the variety of use 
 cases people have for the API, and in particular all the different ways 
 people are using IDs. To alleviate some of the concerns raised in this thread 
 we thought it would be useful to give more details about how we plan to 
 generate IDs

 1) IDs are still 64-bit integers.  This should minimize any migration pains.

 2) You can still sort on ID.  Within a few milliseconds you may get out of 
 order results, but for most use cases this shouldn't be an issue.  

 3) since_id will still work (within the caveats given above).  

 4) We will provide a way to backfill from the streaming API.

 5) You cannot use the generated ID to reverse engineer tweet velocity.  Note 
 that you can still use the streaming API to determine the rate of public 
 statuses.

 Additional items of interest

 1) At some point we will likely start using this as an ID for direct messages 
 too

 2) We will almost certainly open source the ID generation code, probably 
 before we actually cut over to using it.

 3) We STRONGLY suggest that you treat IDs as roughly sorted (roughly being 
 within a few ms buckets), opaque 64-bit integers.  We may need to change the 
 scheme again at some point in the future, and want to minimize migration 
 pains should we need to do this.

 Hopefully this puts you more at ease with the changes we're making.  If it 
 raises new concerns, please let us know!

   ---Mark

  http://twitter.com/mccv

 On Mon, Apr 5, 2010 at 4:18 PM, M. Edward (Ed) Borasky zn...@comcast.net 
 wrote:

 On 04/05/2010 12:55 AM, Tim Haines wrote:

  This made me laugh.  Hard.

  On Fri, Apr 2, 2010 at 6:47 AM, Dewald Pretorius dpr...@gmail.com wrote:

  Mark,

  It's extremely important where you have two bots that reply to each
  others' tweets. With incorrectly sorted tweets, you

Re: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-08 Thread Mark McBride
It's a possibility, but by no means a probability.  Note that you can
mitigate this by using the newest tweet that is outside your danger zone.
 For example in a sequence of tweets t1, t2 ... ti ... tn with creation
times c1, c2 ... ci ... cn and a comfort threshold e you could use since_id
from the latest ti such that c1 - ci > e.
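
A minimal sketch of this mitigation. It reads the condition as c1 - ci > e, with t1 the newest tweet, and assumes tweets arrive newest first as (id, creation time in seconds) pairs; the function name and the tuple shape are hypothetical.

```python
from typing import List, Optional, Tuple


def safe_since_id(tweets: List[Tuple[int, float]], e: float) -> Optional[int]:
    """Pick the since_id from the newest tweet outside the danger zone.

    `tweets` is assumed to be ordered newest first (t1 .. tn), each entry
    being (id, created_at_seconds). Returns None when every tweet is still
    within `e` seconds of the newest one.
    """
    if not tweets:
        return None
    c1 = tweets[0][1]
    for tweet_id, ci in tweets:
        if c1 - ci > e:
            return tweet_id
    return None


batch = [(50, 100.0), (49, 99.5), (47, 95.0), (40, 60.0)]
since_id = safe_since_id(batch, e=2.0)  # 47: the newest tweet older than c1 - e
```

Tweets newer than the chosen one will be returned again on the next request, so the caller still has to filter duplicates.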

  ---Mark

http://twitter.com/mccv


On Thu, Apr 8, 2010 at 4:27 PM, Naveen knig...@gmail.com wrote:

 This was my initial concern with the randomly generated ids that I
 brought up, though I think Brian described it better than I.

 It simply seems very likely that when using since_id to populate newer
 tweets for the user, that some tweets will never be seen, because the
 since_id of the last message received will be larger than one
 generated 1ms later.

 With the random generation of ids, I can see two way guarantee
 delivery of all tweets in a users timeline
 1. Page forwards and backwards to ensure no tweets generated at or
 near the same time as the newest one did not receive a lower id. This
 will be very expensive for a mobile client not to mention complicate
 any refresh algorithms significantly.
 2. Given that we know how IDs are generated (i.e. which bits represent
 the time) we can simply over request by decrementing the since_id time
 bits, by a second or two and filter out duplicates. (again, not really
 ideal for mobile clients where battery life is an issue, plus it then
 makes the implementation very dependent on twitters id format
 remaining stable)

 Please anyone explain if Brian and I are misinterpreting this as a
 very real possibility of never displaying some tweets in a time line,
 without changing how we request data from twitter (i.e. since_id
 doesn't break)

 --Naveen Ayyagari
 @knight9
 @SocialScope


 On Apr 8, 7:01 pm, Brian Smith br...@briansmith.org wrote:
  What does “within the caveats given above” mean? Either since_id will
 work or it won’t. It seems to me that if IDs are only in a “rough” order,
 since_id won’t work—in particular, there is a possibility that paging
 through tweets using since_id will completely skip over some tweets.
 
  My concern is that, since tweets will not be serialized at the time they
 are written, there will be a race condition between me making a request and
 users posting new statuses. That is, I could get a response with the largest
 id in the response being X that gets evaluated just before a tweet (X-1) has
 been saved in the database; If so, when I issue a request with since_id=X,
 my program will never see the newer tweet (X-1).
 
  Are you going to change the implementation of the timeline methods so
 that they never return a tweet with ID X until all nodes in the cluster
 guarantee that they won’t create a new tweet with an ID less than X?
 
  I implement the following logic:
 
  1.  Let LATEST start out as the earliest tweet available in the
 user’s timeline.
 
  2.  Make a request with since_id={LATEST}, which returns a set of
 tweets T.
 
  3.  If T is empty then stop.
 
  4.  Let LATEST= max({ id(t), for all t in T}).
 
  5.  Goto 2.
 
  Will I be guaranteed not to skip over any tweets in the timeline using
 this logic? If not, what do I need to do to ensure I get them all?
 
  Thanks,
 
  Brian
 
  From: twitter-development-talk@googlegroups.com [mailto:
 twitter-development-t...@googlegroups.com] On Behalf Of Mark McBride
  Sent: Thursday, April 08, 2010 5:10 PM
  To: twitter-development-talk@googlegroups.com
  Subject: Re: [twitter-dev] Re: Upcoming changes to the way status IDs are
 sequenced
 
  Thank you for the feedback.  It's great to hear about the variety of use
 cases people have for the API, and in particular all the different ways
 people are using IDs. To alleviate some of the concerns raised in this
 thread we thought it would be useful to give more details about how we plan
 to generate IDs
 
  1) IDs are still 64-bit integers.  This should minimize any migration
 pains.
 
  2) You can still sort on ID.  Within a few milliseconds you may get out of
 order results, but for most use cases this shouldn't be an issue.
 
  3) since_id will still work (within the caveats given above).
 
  4) We will provide a way to backfill from the streaming API.
 
  5) You cannot use the generated ID to reverse engineer tweet velocity.
  Note that you can still use the streaming API to determine the rate of
 public statuses.
 
  Additional items of interest
 
  1) At some point we will likely start using this as an ID for direct
 messages too
 
  2) We will almost certainly open source the ID generation code, probably
 before we actually cut over to using it.
 
  3) We STRONGLY suggest that you treat IDs as roughly sorted (roughly
 being within a few ms buckets), opaque 64-bit integers.  We may need to
 change the scheme again at some point in the future, and want to minimize
 migration pains should we need to do this.
 
  Hopefully this puts you more at ease with the changes we're

[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-08 Thread Naveen
Ahh, yes, your workaround is a little better than mine, but it is
still a workaround and requires changes to how since_id is currently
used by what I assume is most applications. I understand the need
for change and am willing to work around it; I can imagine the
scalability issues of trying to use a synchronized id for all tweets.

However, I wanted to be clear and feel it should be made obvious that
with this change, there is a possibility that a tweet may not be
delivered to a client if the implementation of how since_id is currently
used is not updated to cover the case.  I still envision the situation
as more likely than you seem to believe and figure that as tweet velocity
increases, the likelihood will also increase. But I assume you have
better data to support your viewpoint than I do, and shall defer.

--Naveen Ayyagari
@knight9
@SocialScope

On Apr 8, 7:37 pm, Mark McBride mmcbr...@twitter.com wrote:
 It's a possibility, but by no means a probability.  Note that you can
 mitigate this by using the newest tweet that is outside your danger zone.
  For example in a sequence of tweets t1, t2 ... ti ... tn with creation
 times c1, c2 ... ci ... cn and a comfort threshold e you could use since_id
 from the latest ti such that c1 - ci > e.

   ---Mark

 http://twitter.com/mccv

 On Thu, Apr 8, 2010 at 4:27 PM, Naveen knig...@gmail.com wrote:
  This was my initial concern with the randomly generated ids that I
  brought up, though I think Brian described it better than I.

  It simply seems very likely that when using since_id to populate newer
  tweets for the user, that some tweets will never be seen, because the
  since_id of the last message received will be larger than one
  generated 1ms later.

  With the random generation of ids, I can see two way guarantee
  delivery of all tweets in a users timeline
  1. Page forwards and backwards to ensure no tweets generated at or
  near the same time as the newest one did not receive a lower id. This
  will be very expensive for a mobile client not to mention complicate
  any refresh algorithms significantly.
  2. Given that we know how IDs are generated (i.e. which bits represent
  the time) we can simply over request by decrementing the since_id time
  bits, by a second or two and filter out duplicates. (again, not really
  ideal for mobile clients where battery life is an issue, plus it then
  makes the implementation very dependent on twitters id format
  remaining stable)

  Please anyone explain if Brian and I are misinterpreting this as a
  very real possibility of never displaying some tweets in a time line,
  without changing how we request data from twitter (i.e. since_id
  doesn't break)

  --Naveen Ayyagari
  @knight9
  @SocialScope

  On Apr 8, 7:01 pm, Brian Smith br...@briansmith.org wrote:
   What does “within the caveats given above” mean? Either since_id will
  work or it won’t. It seems to me that if IDs are only in a “rough” order,
  since_id won’t work—in particular, there is a possibility that paging
  through tweets using since_id will completely skip over some tweets.

   My concern is that, since tweets will not be serialized at the time they
  are written, there will be a race condition between me making a request and
  users posting new statuses. That is, I could get a response with the largest
  id in the response being X that gets evaluated just before a tweet (X-1) has
  been saved in the database; If so, when I issue a request with since_id=X,
  my program will never see the newer tweet (X-1).

   Are you going to change the implementation of the timeline methods so
  that they never return a tweet with ID X until all nodes in the cluster
  guarantee that they won’t create a new tweet with an ID less than X?

   I implement the following logic:

   1.      Let LATEST start out as the earliest tweet available in the
  user’s timeline.

   2.      Make a request with since_id={LATEST}, which returns a set of
  tweets T.

   3.      If T is empty then stop.

   4.      Let LATEST= max({ id(t), for all t in T}).

   5.      Goto 2.

   Will I be guaranteed not to skip over any tweets in the timeline using
  this logic? If not, what do I need to do to ensure I get them all?

   Thanks,

   Brian

   From: twitter-development-talk@googlegroups.com [mailto:
  twitter-development-t...@googlegroups.com] On Behalf Of Mark McBride
   Sent: Thursday, April 08, 2010 5:10 PM
   To: twitter-development-talk@googlegroups.com
   Subject: Re: [twitter-dev] Re: Upcoming changes to the way status IDs are
  sequenced

   Thank you for the feedback.  It's great to hear about the variety of use
  cases people have for the API, and in particular all the different ways
  people are using IDs. To alleviate some of the concerns raised in this
  thread we thought it would be useful to give more details about how we plan
  to generate IDs

   1) IDs are still 64-bit integers.  This should minimize any migration
  pains.

   2) You can still sort on ID

RE: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-08 Thread Brian Smith
Mark, thank you for taking the time to respond. 

 

What is the smallest “comfort threshold” that will guarantee that we will see 
all the tweets, with none skipped over and the fewest tweets returned multiple 
times?

 

Let’s say the comfort threshold was 2 seconds. It seems to me like there could 
realistically be dozens or hundreds of tweets within those two seconds in a 
single timeline, and a request that used the logic you mentioned would return 
an entire page (200 tweets) consisting of tweets that the application already 
has; the application would be making a relatively large download, receiving 
nothing useful for it, and not be able to make any progress because its 
since_id would get “stuck”. This is at odds with many (most?) applications’ goal 
in using since_id, which is to transfer as little data as possible.

 

It seems like a better alternative would be a new parameter that says “don’t give 
me any tweets that are less than X seconds old,” where X seconds is the 
comfort threshold. That way, the application may lag behind by a few 
seconds, but at least it would be able to confidently page through the timeline 
without excessive data transfer. Without such a mechanism, it looks like this 
change will be a significant degradation of service that results in 
applications’ “refresh” features becoming either unreliable or very wasteful.

 

But, is it realistic for applications to expect the Twitter cluster to be in 
sync within 2 seconds? 10 seconds? 30 seconds? That is the part that is unclear 
to me. 

 

Thanks again,

Brian

 

 

From: twitter-development-talk@googlegroups.com 
[mailto:twitter-development-t...@googlegroups.com] On Behalf Of Mark McBride
Sent: Thursday, April 08, 2010 6:38 PM
To: twitter-development-talk@googlegroups.com
Subject: Re: [twitter-dev] Re: Upcoming changes to the way status IDs are 
sequenced

 

It's a possibility, but by no means a probability.  Note that you can mitigate 
this by using the newest tweet that is outside your danger zone.  For example 
in a sequence of tweets t1, t2 ... ti ... tn with creation times c1, c2 ... ci 
... cn and a comfort threshold e you could use since_id from the latest ti such 
that c1 - ci > e.


  ---Mark

http://twitter.com/mccv



On Thu, Apr 8, 2010 at 4:27 PM, Naveen knig...@gmail.com wrote:

This was my initial concern with the randomly generated ids that I
brought up, though I think Brian described it better than I.

It simply seems very likely that when using since_id to populate newer
tweets for the user, that some tweets will never be seen, because the
since_id of the last message received will be larger than one
generated 1ms later.

With the random generation of ids, I can see two way guarantee
delivery of all tweets in a users timeline
1. Page forwards and backwards to ensure no tweets generated at or
near the same time as the newest one did not receive a lower id. This
will be very expensive for a mobile client not to mention complicate
any refresh algorithms significantly.
2. Given that we know how IDs are generated (i.e. which bits represent
the time) we can simply over request by decrementing the since_id time
bits, by a second or two and filter out duplicates. (again, not really
ideal for mobile clients where battery life is an issue, plus it then
makes the implementation very dependent on twitters id format
remaining stable)

Please anyone explain if Brian and I are misinterpreting this as a
very real possibility of never displaying some tweets in a time line,
without changing how we request data from twitter (i.e. since_id
doesn't break)

--Naveen Ayyagari
@knight9
@SocialScope



On Apr 8, 7:01 pm, Brian Smith br...@briansmith.org wrote:
 What does “within the caveats given above” mean? Either since_id will work or 
 it won’t. It seems to me that if IDs are only in a “rough” order, since_id 
 won’t work—in particular, there is a possibility that paging through tweets 
 using since_id will completely skip over some tweets.

 My concern is that, since tweets will not be serialized at the time they are 
 written, there will be a race condition between me making a request and users 
 posting new statuses. That is, I could get a response with the largest id in 
 the response being X that gets evaluated just before a tweet (X-1) has been 
 saved in the database; If so, when I issue a request with since_id=X, my 
 program will never see the newer tweet (X-1).

 Are you going to change the implementation of the timeline methods so that 
 they never return a tweet with ID X until all nodes in the cluster guarantee 
 that they won’t create a new tweet with an ID less than X?

 I implement the following logic:

 1.  Let LATEST start out as the earliest tweet available in the user’s 
 timeline.

 2.  Make a request with since_id={LATEST}, which returns a set of tweets 
 T.

 3.  If T is empty then stop.

 4.  Let LATEST= max({ id(t), for all t in T}).

 5.  Goto 2.

 Will I be guaranteed

Re: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-05 Thread M. Edward (Ed) Borasky
On 04/05/2010 12:55 AM, Tim Haines wrote:
 This made me laugh.  Hard.
 
 On Fri, Apr 2, 2010 at 6:47 AM, Dewald Pretorius dpr...@gmail.com wrote:
 
 Mark,

 It's extremely important where you have two bots that reply to each
 others' tweets. With incorrectly sorted tweets, you get conversations
 that look completely unnatural.

 On Apr 1, 1:39 pm, Mark McBride mmcbr...@twitter.com wrote:
 Just out of curiosity, what applications are you building that require
 sub-second sorting resolution for tweets?

Yeah - my bot laughed too ;-)
-- 
M. Edward (Ed) Borasky
borasky-research.net/m-edward-ed-borasky

A mathematician is a device for turning coffee into theorems. ~ Paul Erdős




[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-01 Thread Aki
It actually makes sense to use tweet ID to sort tweets, because
timestamp is not a valid source of information for accurate sorting.
It is a very common case to have multiple tweets posted at the exact
same second, and it is not possible to reproduce the correct ordering
of tweets on the client side. This can be improved by having better
precision for timestamp (maybe milliseconds), but it is still possible
to get tweets posted at the exact same millisecond (although it is
very rare).
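
To illustrate the point, here is a small sketch of sorting on a second-resolution timestamp with the ID as tie-breaker. The dictionary shape (`created_at` as epoch seconds, `id` as an integer) is an assumption for the example, and the tie-break is only as good as the ordering of the IDs themselves.

```python
from typing import Dict, List


def sort_tweets(tweets: List[Dict[str, int]]) -> List[Dict[str, int]]:
    """Order by timestamp, breaking same-second ties with the ID."""
    return sorted(tweets, key=lambda t: (t["created_at"], t["id"]))


same_second = 1270101660  # arbitrary epoch second shared by two tweets
ordered = sort_tweets([
    {"id": 3, "created_at": same_second},
    {"id": 1, "created_at": same_second},
    {"id": 2, "created_at": same_second - 1},
])
```

Without the ID in the key, the two tweets from the same second would keep whatever order they arrived in.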

If Twitter really needs to change the tweet ID scheme, I think a better
solution for sorting needs to be provided through the API.

On Mar 27, 7:41 am, Taylor Singletary taylorsinglet...@twitter.com
wrote:
 Hi Developers,

 It's no secret that Twitter is growing exponentially. The tweets keep coming
 with ever increasing velocity, thanks in large part to your great
 applications.

 Twitter has adapted to the increasing number of tweets in ways that have
 affected you in the past: We moved from 32 bit unsigned integers to 64-bit
 unsigned integers for status IDs some time ago. You all weathered that storm
 with ease. The tweetapoclypse was averted, and the tweets kept flowing.

 Now we're reaching the scalability limit of our current tweet ID generation
 scheme. Unlike the previous tweet ID migrations, the solution to the current
 issue is significantly different. However, in most cases the new approach we
 will take will not result in any noticeable differences to you the developer
 or your users.

 We are planning to replace our current sequential tweet ID generation
 routine with a simple, more scalable solution. IDs will still be 64-bit
 unsigned integers. However, this new solution is no longer guaranteed to
 generate sequential IDs.  Instead IDs will be derived based on time: the
 most significant bits being sourced from a timestamp and the least
 significant bits will be effectively random.

 Please don't depend on the exact format of the ID. As our infrastructure
 needs evolve, we might need to tweak the generation algorithm again.

 If you've been trying to divine meaning from status IDs aside from their
 role as a primary key, you won't be able to anymore. Likewise for usage of
 IDs in mathematical operations -- for instance, subtracting two status IDs
 to determine the number of tweets in between will no longer be possible.

 For the majority of applications we think this scheme switch will be a
 non-event. Before implementing these changes, we'd like to know if your
 applications currently depend on the sequential nature of IDs. Do you depend
 on the density of the tweet sequence being constant?  Are you trying to
 analyze the IDs as anything other than opaque, ordered identifiers? Aside
 from guaranteed sequential tweet ID ordering, what APIs can we provide you to
 accomplish your goals?

 Taylor Singletary
 Developer Advocate, Twitter http://twitter.com/episod




Re: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-01 Thread Mark McBride
Just out of curiosity, what applications are you building that require
sub-second sorting resolution for tweets?

  ---Mark

http://twitter.com/mccv


On Wed, Mar 31, 2010 at 11:01 PM, Aki yoru.fuku...@gmail.com wrote:

 It actually makes sense to use tweet ID to sort tweets, because
 timestamp is not a valid source of information for accurate sorting.
 It is a very common case to have multiple tweets posted at the exact
 same second, and it is not possible to reproduce the correct ordering
 of tweets on the client side. This can be improved by having better
 precision for timestamp (maybe milliseconds), but it is still possible
 to get tweets posted at the exact same milliseconds (although it is
 very rare).

 If Twitter really needs to change the tweet ID scheme, I think better
 solution for sorting is required to be provided through API.




[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-01 Thread Dewald Pretorius
Mark,

It's extremely important where you have two bots that reply to each
others' tweets. With incorrectly sorted tweets, you get conversations
that look completely unnatural.

On Apr 1, 1:39 pm, Mark McBride mmcbr...@twitter.com wrote:
 Just out of curiosity, what applications are you building that require
 sub-second sorting resolution for tweets?

   ---Mark

 http://twitter.com/mccv





[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-01 Thread M. Edward (Ed) Borasky
On Apr 1, 10:47 am, Dewald Pretorius dpr...@gmail.com wrote:
 Mark,

 It's extremely important where you have two bots that reply to each
 others' tweets. With incorrectly sorted tweets, you get conversations
 that look completely unnatural.

Uh ... bots talking to each other on Twitter? Is this something I can
watch today, or something that someone would build if the technology
existed in the API to support it? ;-)



[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-01 Thread M. Edward (Ed) Borasky


On Apr 1, 9:39 am, Mark McBride mmcbr...@twitter.com wrote:
 Just out of curiosity, what applications are you building that require
 sub-second sorting resolution for tweets?

   ---Mark

Twitter's capacity planning? ;-)





[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-01 Thread Dewald Pretorius
Ed,

I dunno. Maybe sub-second sorting resolution for tweets is also
important for kids who grew up with cell phones and texting, and can
type *really* fast on an iPhone.

On Apr 1, 4:41 pm, M. Edward (Ed) Borasky zzn...@gmail.com wrote:
 On Apr 1, 10:47 am, Dewald Pretorius dpr...@gmail.com wrote:

  Mark,

  It's extremely important where you have two bots that reply to each
  others' tweets. With incorrectly sorted tweets, you get conversations
  that look completely unnatural.

 Uh ... bots talking to each other on Twitter? Is this something I can
 watch today, or something that someone would build if the technology
 existed in the API to support it? ;-)




[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-01 Thread Aki
I'm developing a desktop Twitter client. I think accurate sorting is
needed, because the order of tweets may look different on every
application without accurate sorting. It's not that it would totally
kill my Twitter client, but I take accurate presentation of tweets
seriously, and I think it would be better to have consistent tweet
ordering across all applications.

If this scheme change is really needed (e.g. required to process
new tweets simultaneously across multiple servers without
synchronising tweet IDs), I would suggest adding the time in milliseconds
to the tweet information, which would have much better accuracy.
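
A minimal sketch of the client-side sorting problem described above, in Python. The record layout (`id`, `created_at`, `text`) and all values are made up for illustration and are not real API output. Sorting on a second-resolution timestamp alone leaves ties in whatever order they arrived; adding the ID as a secondary key at least gives every client the same order, although that order is only the true posting order while IDs remain sequential.

```python
from datetime import datetime

# Hypothetical tweet records. Two of them share the same one-second timestamp.
tweets = [
    {"id": 1003, "created_at": datetime(2010, 4, 1, 12, 0, 1), "text": "third"},
    {"id": 1001, "created_at": datetime(2010, 4, 1, 12, 0, 0), "text": "first"},
    {"id": 1002, "created_at": datetime(2010, 4, 1, 12, 0, 0), "text": "second"},
]


def display_order(items):
    """Sort by timestamp, then by ID to break ties within the same second."""
    return sorted(items, key=lambda t: (t["created_at"], t["id"]))


ordered_texts = [t["text"] for t in display_order(tweets)]
```

With millisecond timestamps the tie-breaker would be needed less often, but it would still be needed.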

On Apr 2, 3:39 am, Mark McBride mmcbr...@twitter.com wrote:
 Just out of curiosity, what applications are you building that require
 sub-second sorting resolution for tweets?

   ---Mark

 http://twitter.com/mccv





[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-04-01 Thread M. Edward (Ed) Borasky
On Apr 1, 4:34 pm, Aki yoru.fuku...@gmail.com wrote:
 I'm developing desktop Twitter client. I think accurate sorting is
 needed, because the order of tweets may look different on every
 application without accurate sorting. It's not that it would totally
 kill my Twitter client, but I take accurate presentation of tweets
 seriously, and I think it would be better to have consistent tweet
 ordering across all applications.

 If this scheme change is really needed (e.g. required to processing
 new tweets simultaneously across multiple servers without
 synchronising tweet ID), I would suggest adding time in milliseconds
 to tweet information, which would have much better accuracy.

No matter what the timestamp resolution is, you're still going to have a
non-zero probability of multiple tweets per timestamp. And if you have
an event somewhere, like an earthquake or an orca killing his trainer
in a show, you're going to see bursts of tweets from the scene, assuming
the infrastructure survived the event. The probability of multiple
tweets per timestamp will increase dramatically in such a circumstance.
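
To put rough numbers on that, here is a small Python sketch. It assumes tweets arrive as a Poisson process, which is a modelling assumption and not something Twitter has stated, and the rate of 600 tweets per second is an assumed figure chosen only for illustration. Under that model the number of tweets in one timestamp tick has mean rate x resolution, and the chance of two or more in the same tick is 1 - e^(-mean) * (1 + mean).

```python
import math


def prob_shared_timestamp(tweets_per_second, resolution_seconds):
    """Probability that two or more tweets land in one timestamp tick,
    assuming Poisson arrivals (an assumption made for this sketch)."""
    mean = tweets_per_second * resolution_seconds
    return 1.0 - math.exp(-mean) * (1.0 + mean)


RATE = 600.0  # assumed average tweets per second, illustrative only

p_second = prob_shared_timestamp(RATE, 1.0)               # 1 s resolution
p_milli = prob_shared_timestamp(RATE, 0.001)              # 1 ms resolution
p_milli_burst = prob_shared_timestamp(RATE * 10, 0.001)   # 1 ms, 10x burst
```

Finer resolution lowers the probability but never takes it to zero, and a burst pushes it back up.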

But - I personally don't see how it would hurt Twitter to publish
average tweet inter-arrival times or average tweets per second on a web
page for all the world to see. In fact, I'd love to be able to pull up a
map of the world and see tweets-per-second mapped in (near) real time -
say, refreshing every minute or so. Why make the world work to pull this
out of the APIs? ;-)

How hard can it be?

http://earthquake.usgs.gov/earthquakes/recenteqsanim/world/




[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-31 Thread eugene.man...@gmail.com
Second that. Our app continuously retrieves feeds of individual users
and lists. Monotonically increasing IDs are required to be able to do that
(using since_id).

Please provide an alternative for this use case in case you change
your id generation scheme.

Thanks!

On Mar 26, 1:57 pm, Naveen knig...@gmail.com wrote:
 We do not require that ids be sequential, but if the ids are not
 monotonically increasing it causes some issues with how we manage
 since_ids.

 i.e. if a message is posted by userA 1 ns after userB, we would assume
 userB has a higher id than userA. While it may seem like nitpicking,
 wouldn't there be a chance userB's message won't get delivered if its id
 is lower than userA's message and I happen to query the API just before
 userB but right after userA posted?

 --Naveen
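
A minimal Python sketch of the failure mode raised here. The `poll` function is a hypothetical stand-in for a since_id request, and the IDs and user names are invented. It only shows why a client that remembers the highest ID it has seen can skip a tweet that appears later with a lower ID.

```python
def poll(timeline, since_id):
    """Return tweets with id > since_id, as a naive since_id client would ask."""
    return [t for t in timeline if t["id"] > since_id]


# userA's tweet is visible first and happens to carry the higher ID.
visible = [{"id": 205, "user": "userA"}]
first = poll(visible, since_id=0)
since_id = max(t["id"] for t in first)  # the client now remembers 205

# userB's tweet becomes visible afterwards, but with a lower ID.
visible.append({"id": 203, "user": "userB"})
second = poll(visible, since_id=since_id)

seen_users = {t["user"] for t in first + second}
```

The second poll comes back empty, so userB's tweet is never delivered to this client.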



Re: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-31 Thread Adam Fields
On Wed, Mar 31, 2010 at 07:30:00AM -0700, eugene.man...@gmail.com wrote:
 Second that. Our app continuously retrieves feeds of individual users
 and lists. Monotonically increasing are required to be able to do that
 (using since_id).
[...]

Since the most significant bits are generated from a timestamp, later
tweets will always have a higher number than earlier ones (except in
the case of the black hole explorer probe tweeting its progress from
within the event horizon).

To illustrate this with decimal numbers from 0-9:

If two users post three tweets each in the space of three seconds,
they may space like this (the first digit is the timestamp, the second
is the random digit):

User 1: 05
User 2: 06
User 1: 17
User 2: 12
User 1: 27
User 2: 29

Tweets 12 and 17 are out of order, but they have no real order to
begin with, since they were posted at the same time (to the
precision of the timestamp) by different users. User 1's tweets (05,
17, 27) and User 2's tweets (06, 12, 29) will always be ordered
properly by time within each user even though the second digit is
random.
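
The same two-digit example can be checked with a few lines of Python. This is only a sketch of the illustration above, not of Twitter's actual generator; `make_id` is a hypothetical helper that puts the timestamp digit in the high position and the random digit in the low one.

```python
# (user, timestamp digit, random digit), copied from the example above.
posts = [(1, 0, 5), (2, 0, 6), (1, 1, 7), (2, 1, 2), (1, 2, 7), (2, 2, 9)]


def make_id(timestamp, rand):
    """Timestamp in the most significant digit, random digit below it."""
    return timestamp * 10 + rand


ids_by_user = {}
for user, ts, rnd in posts:
    ids_by_user.setdefault(user, []).append(make_id(ts, rnd))

# Each user's own IDs increase with time...
per_user_sorted = all(ids == sorted(ids) for ids in ids_by_user.values())

# ...but the combined sequence, in posting order, does not.
all_ids = [make_id(ts, rnd) for _, ts, rnd in posts]
globally_sorted = all_ids == sorted(all_ids)
```

Note that the per-user guarantee in this toy model relies on each user posting at most once per timestamp tick.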

-- 
- Adam


[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-31 Thread Mr Blog
The ability to create apps like http://www.tweespeed.com/ as a result
of a few quick APIs to get the difference between two status IDs is
really nice.

Perhaps, even if status IDs are not sequential, there could be some kind
of API method like tweetCount(firstID, secondID) that, given two
status IDs, returns the number of tweets between them?
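
For context, the calculation that dense sequential IDs make possible looks roughly like the Python sketch below. The function name and the sample IDs and times are invented for illustration. It is valid only while every new tweet takes the next integer ID, which is exactly the property the announcement says will go away.

```python
def tweets_per_second(first_id, first_time, second_id, second_time):
    """Estimate the global tweet rate from two (status ID, Unix time) samples.

    Assumes IDs are dense and sequential, so the ID difference equals the
    number of tweets posted in between.
    """
    elapsed = second_time - first_time
    if elapsed <= 0:
        raise ValueError("second sample must be later than the first")
    return (second_id - first_id) / elapsed


# Two made-up samples taken 60 seconds apart.
rate = tweets_per_second(11_000_000_000, 1_270_000_000,
                         11_000_036_000, 1_270_000_060)
```

With time-based IDs the difference between two IDs says nothing about how many tweets lie between them, so a count would have to come from the server.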





[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-31 Thread Dewald Pretorius
It's really bad application design practice to assign any significance
to the primary key of an entity, except for a means to uniquely
identify each member of the entity.




[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-30 Thread Jacopo Gio
+1 ;-)

On 26 mar, 21:57, bob.hitching b...@hitching.net wrote:
 +1 on the need to maintain support for since_id in the Search API


To unsubscribe from this group, send email to 
twitter-development-talk+unsubscribegooglegroups.com or reply to this email 
with the words REMOVE ME as the subject.


[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-30 Thread Aaron Rankin
Taylor,

We too rely heavily on the sequential, increasing nature of standard
tweet and DM IDs. Are both in scope here?

While it is straightforward to change code to sort and compare on
dates, this will be a major undertaking for our application. I suspect
many other applications are in the same situation. This will be a
major impact to the community of existing applications.

I hope the Twitter technical team recognizes the pain this will cause
and has weighed all possible scaling alternatives.



Aaron

--
Sprout Social



[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-29 Thread Zaudio
We're relying on the ID being sequential for a number of purposes:

1) Counting elapsed tweets to estimate tweet rates to feed back into
the count parameter to backtrack when restarting streaming API/Shadow -
how will we be able to do that without sequential IDs?

2) Indexing and sorting pages of tweets to be displayed by our
application - moving away from sequential IDs would break our sorting
algorithms and require recoding to sort by date alone.

3) Polling for new mentions to merge with streamed tweets - we use
the ID as a last placeholder - again, changes there would break our app
unless recoded.

Zaudio
Developer BullsOnWallStreet.com



[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-28 Thread janole
I'm relying on increasing status_id's for indexing/sorting tweets and
on using since_id (for saving traffic.) Any change to this will break
my mobile Twitter client.

As long as the status_id's are increasing for sequential tweets, I'm
fine - however, I don't really understand the need of random gaps in
the status_id numbering!? =)

Ole @ mobileways.de / Gravity
http://twitter.com/janole



[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-27 Thread Karthik
This change will affect the SQL queries in my applications. Right now, to
display tweets in reverse chronological order and with pagination, my
query has something like this:

SELECT * FROM
tweets
INNER JOIN mytable on tweets.id = mytable.tweet_id
GROUP BY tweets.id
ORDER BY tweets.id DESC
LIMIT 100, 20

Having the GROUP BY and ORDER BY clauses on the same field (tweets.id)
improves performance. After this change, I would be forced to use the
timestamp in the ORDER BY clause.
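
For illustration, here is a minimal sketch of what paging on a timestamp could look like, written against Python's built-in sqlite3 module so it runs without a server. The table layout, the created_at column, the index name and the sample rows are all assumptions; the real schema is not shown in this thread. The id is kept as a secondary sort key so that tweets sharing a timestamp still page in a stable order.

```python
import sqlite3

def page_of_tweets(conn, offset, limit):
    """Return one page of tweet ids, newest first."""
    rows = conn.execute(
        "SELECT id FROM tweets "
        "ORDER BY created_at DESC, id DESC "
        "LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()
    return [row[0] for row in rows]

# Hypothetical schema: only the columns needed for the ordering are modelled.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (id INTEGER PRIMARY KEY, created_at INTEGER NOT NULL)")
conn.execute("CREATE INDEX idx_tweets_created ON tweets (created_at, id)")
conn.executemany(
    "INSERT INTO tweets (id, created_at) VALUES (?, ?)",
    [(905, 1000), (17, 1001), (412, 1001), (300, 1002)],
)

first_page = page_of_tweets(conn, 0, 3)
second_page = page_of_tweets(conn, 3, 3)
```

Whether the composite index recovers the performance of the single-column version depends on the database engine and the join involved, so it would need measuring on the real data.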


Re: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-27 Thread Josh Bleecher Snyder
 So I think we need to allow Twitter some leeway here.

I apologize if my tone came off badly; it was not intended. I've just
had bumpy rides using timestamps for coordination in distributed
systems (less cool ones than space flight), so this worried me a
little. In the end, whatever Twitter decides to do, I'll work with.


 As far as occasional glitches are concerned, we have those now. Every
 so often, we still get Fail Whales, 5xx errors, DDoS attacks, etc.

The difference is that those errors are straightforwardly detectable
on the client side and can be handled more or less gracefully. Minor,
intermittent data issues (like the odd missing tweet) are less
straightforward to detect, but still trigger support emails. :)

-josh


Re: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-27 Thread Chad Etzel
So, I guess for the since_id issue, it boils down to this question:

Regarding the since_id parameter, when you (Twitter) flip the switch
on the new ID format, will I (as a developer) have to change any of my
code in order for it to function the way it does now? This question
applies equally for both the Twitter API and the Search API.

Check One:

[ ] YES
[ ] NO

Taylor's previous response alluded to "no" (a good thing), but I
wasn't 100% assured.

-Chad


[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-26 Thread Rich
I second this request. As a mobile developer, since_id is essential for
caching old tweets and only retrieving new tweets.

since_id is invaluable. You say "in most cases the new approach we
will take will not result in any noticeable differences". Does that
mean you will still handle a since_id being passed and, if not, how will
we now use the API?

On Mar 26, 8:48 pm, Brian Smith br...@briansmith.org wrote:
 Any app that pages through timelines using since_id or max_id depends on
 responses being ordered by tweet ID. What will be the replacement for
 since_id and max_id?

 Taylor Singletary wrote:

 We are planning to replace our current sequential tweet ID generation
 routine with a simple, more scalable solution. IDs will still be 64-bit
 unsigned integers. However, this new solution is no longer guaranteed to
 generate sequential IDs.  Instead IDs will be derived based on time: the
 most significant bits being sourced from a timestamp and the least
 significant bits will be effectively random.

 For the majority of applications we think this scheme switch will be a
 non-event. Before implementing these changes, we'd like to know if your
 applications currently depend on the sequential nature of IDs. Do you depend
 on the density of the tweet sequence being constant?  Are you trying to
 analyze the IDs as anything other than opaque, ordered identifiers? Aside
 from guaranteed sequential tweet ID ordering, what APIs can we provide you to
 accomplish your goals?


[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-26 Thread bob.hitching
+1 on the need to maintain support for since_id in the Search API

On Mar 27, 7:41 am, Taylor Singletary taylorsinglet...@twitter.com
wrote:
 Hi Developers,

 It's no secret that Twitter is growing exponentially. The tweets keep coming
 with ever increasing velocity, thanks in large part to your great
 applications.

 Twitter has adapted to the increasing number of tweets in ways that have
 affected you in the past: We moved from 32-bit unsigned integers to 64-bit
 unsigned integers for status IDs some time ago. You all weathered that storm
 with ease. The tweetapoclypse was averted, and the tweets kept flowing.

 Now we're reaching the scalability limit of our current tweet ID generation
 scheme. Unlike the previous tweet ID migrations, the solution to the current
 issue is significantly different. However, in most cases the new approach we
 will take will not result in any noticeable differences to you the developer
 or your users.

 We are planning to replace our current sequential tweet ID generation
 routine with a simple, more scalable solution. IDs will still be 64-bit
 unsigned integers. However, this new solution is no longer guaranteed to
 generate sequential IDs.  Instead IDs will be derived based on time: the
 most significant bits being sourced from a timestamp and the least
 significant bits will be effectively random.

 Please don't depend on the exact format of the ID. As our infrastructure
 needs evolve, we might need to tweak the generation algorithm again.

 If you've been trying to divine meaning from status IDs aside from their
 role as a primary key, you won't be able to anymore. Likewise for usage of
 IDs in mathematical operations -- for instance, subtracting two status IDs
 to determine the number of tweets in between will no longer be possible.

 For the majority of applications we think this scheme switch will be a
 non-event. Before implementing these changes, we'd like to know if your
 applications currently depend on the sequential nature of IDs. Do you depend
 on the density of the tweet sequence being constant?  Are you trying to
 analyze the IDs as anything other than opaque, ordered identifiers? Aside
 from guaranteed sequential tweet ID ordering, what APIs can we provide you to
 accomplish your goals?

 Taylor Singletary
 Developer Advocate, Twitter
 http://twitter.com/episod


[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-26 Thread Naveen
We do not require that ids be sequential, but if the ids are not
monotonically increasing it causes some issues with how we manage
since_ids.

i.e. if userB posts a message 1 ns after userA, we would assume
userB's message has a higher id than userA's. While it may seem like
nitpicking, wouldn't there be a chance userB's message won't get delivered
if its id is lower than userA's and I happen to query the API just before
userB but right after userA posted?

--Naveen
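
The scenario can be reproduced with a toy model. This is a sketch of the concern only, not of how Twitter serves timelines: the Timeline class, the naive_poll helper and the ids are all made up for illustration.

```python
class Timeline:
    """Toy timeline that returns every stored id greater than since_id."""

    def __init__(self):
        self.ids = []

    def post(self, tweet_id):
        self.ids.append(tweet_id)

    def fetch(self, since_id):
        return sorted(i for i in self.ids if i > since_id)


def naive_poll(timeline, since_id):
    """Fetch new ids and advance since_id to the largest one seen."""
    new_ids = timeline.fetch(since_id)
    if new_ids:
        since_id = new_ids[-1]
    return new_ids, since_id


timeline = Timeline()
timeline.post(1002)                                 # userA posts and gets the higher id
first, since_id = naive_poll(timeline, since_id=0)  # the poll lands between the two posts
timeline.post(1001)                                 # userB posts later but gets a lower id
second, since_id = naive_poll(timeline, since_id)   # 1001 is never returned
```

In this model the second poll comes back empty, so userB's message is lost to a client that only ever asks for ids above the largest one it has seen.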


[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-26 Thread Craig Hockenberry
Hi Taylor!

Please comment on how this change will affect this bug:

http://code.google.com/p/twitter-api/issues/detail?id=1529

Hopefully, the timestamp portion of the ID will allow since_id to work
correctly when load increases.

-ch



[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-26 Thread Michael Bleigh
To those voicing concerns about since_id I believe the key word is
that they will no longer be *sequential*, something entirely different
from them no longer being *increasing*. Since ID is a core part of the
Twitter API that I very much doubt will be in jeopardy from this
change. Twitter devs feel free to back me up or refute me. :)
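
To make the distinction concrete, here is a minimal sketch of an id with a timestamp in the high bits and random low bits, which is all the announcement says about the layout. The 22-bit random field, the millisecond resolution and the make_id helper are hypothetical; Twitter has not published the format and asks that nobody depend on it.

```python
import random

RANDOM_BITS = 22  # hypothetical width of the random low bits

def make_id(timestamp_ms, rng):
    """Pack a millisecond timestamp into the high bits and random noise into the low bits."""
    return (timestamp_ms << RANDOM_BITS) | rng.getrandbits(RANDOM_BITS)

rng = random.Random(0)
ids = [make_id(ts, rng) for ts in (1269636060000, 1269636060001, 1269636060005)]

# A later timestamp always gives a larger id, so "id > since_id" still works.
increasing = ids == sorted(ids)

# The gaps between ids are mostly noise, so they no longer count tweets.
gaps = [b - a for a, b in zip(ids, ids[1:])]
```

Under this sketch, two ids minted within the same millisecond could come out in either order, which is the edge case raised elsewhere in the thread.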


[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-26 Thread Steve Streza
Especially on mobile devices, it's significantly faster to sort tweets
by comparing the long long representation of an ID rather than by the
date. It's also more accurate, as two tweets that come in at the exact
same second will still be sorted in the correct order.

Steve
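
A rough Python illustration of the two sort keys. The tweet dicts are made up, and the created_at format string is an assumption about the API payload rather than something stated in this thread.

```python
from datetime import datetime

CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"  # assumed layout of created_at

tweets = [
    {"id": 11111111203, "created_at": "Fri Mar 26 20:41:07 +0000 2010"},
    {"id": 11111111101, "created_at": "Fri Mar 26 20:41:07 +0000 2010"},
    {"id": 11111110950, "created_at": "Fri Mar 26 20:41:06 +0000 2010"},
]

# Integer comparison: nothing to parse, and ids break ties within a second.
by_id = sorted(tweets, key=lambda t: t["id"])

# Date comparison: every key has to be parsed, and same-second tweets tie,
# so their relative order is just whatever order they arrived in.
by_date = sorted(
    tweets,
    key=lambda t: datetime.strptime(t["created_at"], CREATED_AT_FORMAT),
)
```

Here the id sort puts the two 20:41:07 tweets in id order, while the date sort leaves them in input order because the one-second timestamps cannot tell them apart.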


[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-26 Thread Psychs
Can we assume a status ID will be unique or not?
It's unclear here.

If not, it would be a big problem for most apps.

- Satoshi


On Mar 27, 5:41 am, Taylor Singletary taylorsinglet...@twitter.com
wrote:
 If you've been trying to divine meaning from status IDs aside from their
 role as a primary key, you won't be able to anymore.


[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-26 Thread Ray Krueger
I would think that this would make no difference for since_id. The
purpose of since_id is for us to tell the API "give me the data I need
that's happened since this id". Don't assume it's implemented as
"select * from tweets where id > since_id". :)


On Mar 26, 4:01 pm, Michael Bleigh mble...@gmail.com wrote:
 To those voicing concerns about since_id I believe the key word is
 that they will no longer be *sequential*, something entirely different
 from them no longer being *increasing*. Since ID is a core part of the
 Twitter API that I very much doubt will be in jeopardy from this
 change. Twitter devs feel free to back me up or refute me. :)


Re: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-26 Thread Nigel Legg
I hope you're right, but my app design depends on since_id, and before I
proceed further I want to be sure that I will not have to rebuild when this
new format comes in.

On 26 March 2010 21:09, Ray Krueger raykrue...@gmail.com wrote:

 I would think that this would make no difference for since_id. The
 purpose of since_id is for us to tell the API "give me the data I need
 that's happened since this id". Don't assume it's implemented as
 "select * from tweets where id > since_id". :)



[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-26 Thread Taylor Singletary
A quick clarification for you all since there seems to be the most concern
around using since_id as a parameter:

since_id will work as well as it does today as a result of this change.

Also, a reminder that the actual integer format of the tweet IDs will not be
changing. They'll still be unsigned 64bit integers as they are today.

Taylor Singletary
Developer Advocate, Twitter
http://twitter.com/episod



[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-26 Thread mikawhite
I am using since_id in my app to know when to stop paging on both the
API and the Search API. My code expects the ids to be sequential.
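
A stop condition for paging only needs ids to be ordered, not consecutive. A minimal sketch, assuming pages arrive newest first; collect_new and fake_fetch are hypothetical stand-ins for the real paging code and API call, and the ids are invented.

```python
def collect_new(fetch_page, since_id):
    """Walk pages newest-first and stop at the first id that is not newer than since_id."""
    collected = []
    page = 1
    while True:
        ids = fetch_page(page)
        if not ids:
            return collected
        for tweet_id in ids:
            if tweet_id <= since_id:
                return collected
            collected.append(tweet_id)
        page += 1


# Fake two-page timeline with large, uneven gaps between ids.
PAGES = {1: [9000, 7400, 7399], 2: [5100, 2048, 17]}

def fake_fetch(page):
    return PAGES.get(page, [])

new_ids = collect_new(fake_fetch, since_id=2048)
```

Code that stops by comparing ids like this would keep working with gaps of any size; code that expects the next id to be exactly one more than the last would not.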

RT @jkalucki: Primary-Key-Density-Change-Pocalypse. Of total doom.


[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-26 Thread Alberty Pascal
 So it would be cool if some way were provided for me to gauge tweet
 volumes at regular intervals (currently every 2 minutes).

Take a look at Tweespeed http://www.tweespeed.com

But with the announced change, this site is dead in the long term ...

pas...@tweespeed

On Mar 26, 10:01 pm, jerememonteau m...@jmoe.com wrote:
 Whoops, accidentally just replied to author the first time...but...

 I built this little site about 9 months ago, depending on the
 monotonically increasing nature of tweet IDs :

 http://www.tweelocity.com

 This is a fun graph :

 http://tweelocity.com/chart/60/300/

 So it would be cool if some way were provided for me to gauge tweet
 volumes at regular intervals (currently every 2 minutes).

 I also think it's super cool that the twitter team is even giving a
 heads up like this.

 On Mar 26, 1:41 pm, Taylor Singletary taylorsinglet...@twitter.com
 wrote:

  Hi Developers,

  It's no secret that Twitter is growing exponentially. The tweets keep coming
  with ever increasing velocity, thanks in large part to your great
  applications.

  Twitter has adapted to the increasing number of tweets in ways that have
  affected you in the past: We moved from 32 bit unsigned integers to 64-bit
  unsigned integers for status IDs some time ago. You all weathered that storm
  with ease. The tweetapoclypse was averted, and the tweets kept flowing.

  Now we're reaching the scalability limit of our current tweet ID generation
  scheme. Unlike the previous tweet ID migrations, the solution to the current
  issue is significantly different. However, in most cases the new approach we
  will take will not result in any noticeable differences to you the developer
  or your users.

  We are planning to replace our current sequential tweet ID generation
  routine with a simple, more scalable solution. IDs will still be 64-bit
  unsigned integers. However, this new solution is no longer guaranteed to
  generate sequential IDs.  Instead IDs will be derived based on time: the
  most significant bits being sourced from a timestamp and the least
  significant bits will be effectively random.

  Please don't depend on the exact format of the ID. As our infrastructure
  needs evolve, we might need to tweak the generation algorithm again.

  If you've been trying to divine meaning from status IDs aside from their
  role as a primary key, you won't be able to anymore. Likewise for usage of
  IDs in mathematical operations -- for instance, subtracting two status IDs
  to determine the number of tweets in between will no longer be possible.

  For the majority of applications we think this scheme switch will be a
  non-event. Before implementing these changes, we'd like to know if your
  applications currently depend on the sequential nature of IDs. Do you depend
  on the density of the tweet sequence being constant? Are you trying to
  analyze the IDs as anything other than opaque, ordered identifiers? Aside
  from guaranteed sequential tweet ID ordering, what APIs can we provide you to
  accomplish your goals?

  Taylor Singletary
  Developer Advocate, Twitter
  http://twitter.com/episod

To unsubscribe from this group, send email to
twitter-development-talk+unsubscribe@googlegroups.com or reply to this email
with the words REMOVE ME as the subject.


Re: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-26 Thread Naveen Ayyagari
I am still a little unclear on whether we will be able to determine the
correct since_id to pass to the API by always looking for the largest tweet
ID we have seen.

It seems that if two messages are posted at very nearly the same time, their
IDs may not be in posting order, since the bottom bits will be randomly
generated, and I will not be able to safely use the largest ID I have seen
as the since_id.
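
To make the concern concrete, here is a small Python sketch of the "largest ID seen" cursor and of the case where it skips a tweet. The functions next_since_id and poll, and the ID values, are hypothetical stand-ins invented for this example and are not part of any Twitter API.

```python
from typing import Iterable, List, Optional


def next_since_id(current: Optional[int], batch_ids: Iterable[int]) -> Optional[int]:
    """Advance the cursor to the largest ID seen; keep it if the batch is empty."""
    largest = max(batch_ids, default=None)
    if largest is None:
        return current
    return largest if current is None else max(current, largest)


def poll(visible_ids: List[int], since_id: Optional[int]) -> List[int]:
    """Stand-in for a timeline request: return visible IDs greater than since_id."""
    return sorted(i for i in visible_ids if since_id is None or i > since_id)


# Hypothetical scenario: ID 101 becomes visible only after the first poll.
cursor = next_since_id(None, poll([100, 103], None))   # cursor is now 103
missed = poll([100, 101, 103], cursor)                  # 101 < 103, so it is skipped
assert cursor == 103 and missed == []
```

The cursor logic itself is sound; the loss happens only when an ID smaller than the cursor shows up after the cursor has already moved past it.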

Correct me if I am confusing myself please. 



On Mar 26, 2010, at 5:33 PM, Taylor Singletary wrote:

 A quick clarification for you all since there seems to be the most concern 
 around using since_id as a parameter:
 
 since_id will work as well as it does today as a result of this change. 
 
 Also, a reminder that the actual integer format of the tweet IDs will not be 
 changing. They'll still be unsigned 64-bit integers as they are today.
 
 Taylor Singletary
 Developer Advocate, Twitter
 http://twitter.com/episod
 
 



[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-26 Thread bjhess
+1 on IDs being increasing. Sequential doesn't matter to me. I don't
actually trust passing since_id to Twitter and having them handle the
limiting of my result list. I've gotten into trouble when that feature
suddenly quit being recognized and my code wasn't defensive enough to
double-check since_id. With that fear in mind, increasing IDs are a
must.
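
The defensive double-check described above could look roughly like the following Python sketch. It assumes each status has already been parsed into a dict with an integer "id" key; the function name drop_already_seen and the sample data are made up for illustration.

```python
from typing import Dict, List


def drop_already_seen(statuses: List[Dict], since_id: int) -> List[Dict]:
    """Keep only statuses newer than since_id, de-duplicated, in ascending ID order."""
    fresh = {s["id"]: s for s in statuses if s["id"] > since_id}
    return [fresh[i] for i in sorted(fresh)]


# Made-up response in which the server ignored since_id=200.
response = [
    {"id": 199, "text": "old"},
    {"id": 205, "text": "new"},
    {"id": 201, "text": "newer than cursor"},
    {"id": 205, "text": "new"},
]
print([s["id"] for s in drop_already_seen(response, 200)])  # prints [201, 205]
```

This check only works if newer statuses have larger IDs, which is the reason for the +1 on increasing IDs.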

I'm assuming the direct message ID algorithm will remain unchanged?

Thanks,

~Barry
http://bjhess.com
http://getHarvest.com



[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-26 Thread Benedek
Hi,

From a practical development point of view, having increasing IDs is
very helpful.

It greatly simplifies many common database operations for developers.
(Most applications with local storage or a cache need one key fewer, and
complex queries need fewer values in a temporary table.) This leads to
simpler code and faster applications.

I believe that your approach, with the most significant bits sourced
from a timestamp, solves most problems and requires no code changes
for most developers IF newer status messages always have bigger
IDs.

But I think one-second precision should be enough.

There is one other thing I worry about, even with one-second
precision: a user posting two status updates within the same second. In
this case the second update should always have a bigger ID. This may
seem a bit theoretical, but:

 - updates can come from a buffer, e.g. in an offline Twitter client
 - updates longer than 140 characters can be split into two (or more)
   updates by an application.

So newer updates from the same user should always have bigger IDs.

If these two things are guaranteed, no application (except a few Twitter
statistics sites) should have any problem with a change like this.

Benedek Toth



[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-26 Thread isaiah
So will they be monotonically increasing?  That seems to be the key
question.  If they're not necessarily monotonic with respect to their
date, then it seems like it would be a pretty painful change.

Isaiah



Re: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-26 Thread Josh Bleecher Snyder
Hi Taylor (et al.),

There are two reasons to think that, with the scheme you propose,
tweet ids will not necessarily be monotonically increasing.

Naveen hit the first:

 It seems if two messages are posted at very close to same time, they may not
 be sequential since the bottom bits will be randomly generated

There is another: Time synchronization is hard to always get right
(Einstein jokes aside). Clock skew happens for any number of reasons
-- sometimes ntpd sends time backwards when network i/o gets really
ugly, machine clocks wander, colos get out of sync, humans err, etc.
These are rare events, but they do happen, and they can cause
misalignment of clocks big enough for the odd tweet or two to fall
through.
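
One common way to guard a single generator against its own clock stepping backwards is to clamp the timestamp to the last value used and to fill the low bits with a per-tick sequence counter. The Python sketch below shows that idea; it swaps the announced random low bits for a counter, the class name and the 22-bit width are assumptions, and it is not a description of what Twitter does. It also does nothing about skew between different machines.

```python
class MonotonicIdGenerator:
    """Single-process generator that never emits an ID smaller than its last one."""

    def __init__(self, clock, sequence_bits=22):
        self._clock = clock                  # callable returning integer milliseconds
        self._sequence_bits = sequence_bits  # assumed width of the low bits
        self._last_ts = -1
        self._sequence = 0

    def next_id(self):
        # Clamp: if the clock stepped backwards, keep using the last timestamp.
        ts = max(self._clock(), self._last_ts)
        if ts == self._last_ts:
            self._sequence += 1
            if self._sequence >= 1 << self._sequence_bits:
                # Sequence exhausted within this tick: borrow the next tick.
                ts += 1
                self._sequence = 0
        else:
            self._sequence = 0
        self._last_ts = ts
        return (ts << self._sequence_bits) | self._sequence


# Simulated clock that jumps backwards from 1000 to 990 on the third reading.
readings = iter([1000, 1000, 990, 1001])
generator = MonotonicIdGenerator(lambda: next(readings))
ids = [generator.next_id() for _ in range(4)]
assert ids == sorted(ids) and len(set(ids)) == 4
```

With several generators running on machines whose clocks disagree, IDs can still interleave out of time order, which is the case raised above.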

Does missing the odd tweet or two matter? As for the tweets themselves:
probably not. But if it gets noticed, it causes users / developers to
lose some amount of trust in their app / platform... and that matters a
lot and can also generate a lot of annoying support emails.


You wrote:

 since_id will work as well as it does today as a result of this change.

Is that assuming monotonically increasing tweet ids? If not, would you
mind elaborating?


Having a universal counter is untenable, but having occasional,
undiagnosable, unreproducible glitches also sucks. :) Thinking out
loud, perhaps there is some middle ground -- a way to have generally
monotonically increasing ids globally, and guaranteed monotonically
increasing ids along some useful dimension, such as per user (this
doesn't play nicely e.g. w/ Cassandra, but it is still reasonably
scalable by other means). Not sure whether that would help folks or
not...

-josh



[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-26 Thread dcreemer
Hi-

Thanks for the heads-up. I have a couple of questions: Most
importantly: when will this change happen?

I understand that we should not depend on the format of the ID, but
since we currently do, can we get more exact information on the new
format? Is there going to be a very large discontinuous jump at the
switchover time? Which bits will be used for what? We currently depend
on the ID for a variety of caching and storage schemes -- I'm OK with
changing, but I need to plan, and to understand the exact post-change
ID format to figure out how much work I need to do.

Thanks,
-- David

On Mar 26, 1:41 pm, Taylor Singletary taylorsinglet...@twitter.com
wrote:
 Hi Developers,

...

 Please don't depend on the exact format of the ID. As our infrastructure
 needs evolve, we might need to tweak the generation algorithm again.




[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-26 Thread M. Edward (Ed) Borasky
On Mar 26, 4:01 pm, Josh Bleecher Snyder joshar...@gmail.com wrote:
 Having a universal counter is untenable, but having occasional,
 undiagnosable, unreproducible glitches also sucks. :) Thinking out
 loud, perhaps there is some middle ground -- a way to have generally
 monotonically increasing ids globally, and guaranteed monotonically
 increasing ids along some useful dimension, such as per user (this
 doesn't play nicely e.g. w/ Cassandra, but it is still reasonably
 scalable by other means). Not sure whether that would help folks or
 not...

I used to work at Goddard Space Flight Center. As you can well
imagine, accurate timekeeping was a hard requirement for many of the
projects and tasks there, though not all of them. The issue is cost.
Truly accurate timekeeping is achievable, but the cost to Twitter must
be passed on to its customers, and the last time I looked, social
media was an extremely competitive business. So I think we need to
allow Twitter some leeway here.

Right now, tweets carry a timestamp good to the nearest second. I
haven't looked recently, but the last published figure from Twitter
was that about 600 of them would have that timestamp on average. If
you truly need time resolution finer than that, make a business case,
apply for Firehose access, establish a business relationship with
Twitter, invest in the infrastructure on your end for the high-
precision timekeeping hardware and software, etc.

As far as occasional glitches are concerned, we have those now. Every
so often, we still get Fail Whales, 5xx errors, DDoS attacks, etc. My
broadband sometimes doesn't work. Sometimes, we have a windstorm or an
ice storm and I lose electricity for a couple of hours. GMail goes
down sometimes. Amazon goes down sometimes. Water mains break. And
every so often, the astronomers add leap seconds to correct for
hitches in the Earth's gitalong. I think we can live with an
occasional clock error, or gap in the tweet IDs. And if you're
interested, I can point you at the fairly simple math needed to
correct for these glitches.



[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-26 Thread Martin Dudek
Good morning,

I hope all are well.

Like TweeSpeed, and presumably also

http://popacular.com/gigatweet/

I have two little apps that derive the volume of tweets on Twitter from
the ID:

http://twopular.com/labs/tweetMania

http://twopular.com/labs/countingTweets

With the announced change, visualizations of the Twitter hype like these
would no longer be possible.
I think it would be great if you could provide some other means to
track the volume of tweets before the change to the status_id takes
place.
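
For readers unfamiliar with the technique, this is roughly how volume can be estimated from IDs while they are still sequential. It is a Python sketch with a hypothetical function name and made-up sample values, not the code behind any of the sites above.

```python
from datetime import datetime


def tweets_per_second(id_a: int, time_a: datetime, id_b: int, time_b: datetime) -> float:
    """Estimate tweet volume from two sampled statuses, assuming sequential IDs."""
    elapsed = (time_b - time_a).total_seconds()
    if elapsed <= 0:
        raise ValueError("the second sample must be later than the first")
    # With sequential IDs, the ID difference is the number of tweets in between.
    return (id_b - id_a) / elapsed


# Made-up samples taken ten seconds apart.
rate = tweets_per_second(
    11_000_000_000, datetime(2010, 3, 26, 12, 0, 0),
    11_000_006_000, datetime(2010, 3, 26, 12, 0, 10),
)
print(rate)  # prints 600.0
```

Once the high bits come from a timestamp, the same subtraction mostly measures elapsed time, so the estimate stops being meaningful.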

When is this change supposed to happen?

Thx

martin



[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

2010-03-26 Thread Ivo
My desktop client also depends on since_id right now in order to
show the user all new tweets since they logged out. Also, without
since_id and max_id it's not really possible to implement a "more"
link at the bottom.

Personally, for my needs it would be enough if since_id and max_id
would be replaced with since_date and max_date, as that wouldn't
affect my application since I'm not relying on the tweet ids but only
need to retrieve tweets in a specific timespan.
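
Until something like that exists, a timespan filter can be approximated on the client side. The Python sketch below assumes each status is a dict whose "created_at" string uses the format shown in the sample data; the function name in_timespan and the sample statuses are made up, and since_date / max_date remain proposed parameters, not existing ones.

```python
from datetime import datetime, timezone
from typing import Dict, List

# Assumed created_at layout, e.g. "Fri Mar 26 20:41:00 +0000 2010".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def in_timespan(statuses: List[Dict], since: datetime, until: datetime) -> List[Dict]:
    """Keep statuses created after `since` and no later than `until`."""
    kept = []
    for status in statuses:
        created = datetime.strptime(status["created_at"], CREATED_AT_FORMAT)
        if since < created <= until:
            kept.append(status)
    return kept


sample = [
    {"id": 1, "created_at": "Fri Mar 26 20:41:00 +0000 2010"},
    {"id": 2, "created_at": "Fri Mar 26 21:15:30 +0000 2010"},
    {"id": 3, "created_at": "Sat Mar 27 00:05:00 +0000 2010"},
]
window_start = datetime(2010, 3, 26, 21, 0, 0, tzinfo=timezone.utc)
window_end = datetime(2010, 3, 27, 0, 0, 0, tzinfo=timezone.utc)
print([s["id"] for s in in_timespan(sample, window_start, window_end)])  # prints [2]
```

Filtering this way still requires fetching the statuses first, so it does not replace a server-side parameter for paging.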

But one thing is for sure: there needs to be a replacement.
