Hello. A couple of questions:

1.) Are you planning on eventually eliminating the integer representation and only using strings for IDs?
2.) If an application doesn't use Javascript to parse JSON (for example, YAJL-OBJC and NSNumbers in Obj-C), is it necessary to make any changes at all?

Thanks.

On Oct 19, 3:52 pm, Matt Harris <[email protected]> wrote:
> Hey everyone,
>
> Thank you to all of you for your questions, patience and contributions to
> this thread. Hearing your views and knowing how you use the API helps us
> provide more information where there wasn't enough, and clarify details
> where there was ambiguity.
>
> I've collated the questions I've received from you directly, over Twitter to
> @twitterapi and through this list. I hope the comments below provide enough
> information to answer those questions and explain the reasoning behind our
> decisions.
>
> Thanks for your support and patience,
> @themattharris
>
> 1) Will search.twitter.com also include id_str and
> in_reply_to_status_id_str?
> Yes, Search will include the String representations of those IDs.
>
> 2) Which fields are affected by this change?
> All IDs which are transmitted as Integers will have a String representation
> in the API response. Only Tweet IDs (which includes mentions and retweets)
> will be moving to new Snowflake IDs. Messages (DMs), Saved Searches and
> Users may change to a Snowflake ID scheme in the future but this isn’t
> planned for this year.
>
> We are adding String representations of the Integer IDs now so that
> developers can update all of their code to use the String representations
> for every ID field, and be prepared should any other IDs break the 53bit
> boundary.
>
> 3) Which fields will have String representations?
> The fields which will have String representations are:
>
> id (DM, Saved Search, User, List)
> in_reply_to_status_id
> in_reply_to_user_id
> new_id (Streaming only.
>   Will be removed when Snowflake is enabled)
> current_user_retweet_id (When include_my_retweet=1 is passed)
>
> 4) Can you provide a complete Tweet example with Snowflake ID to test?
>
> [{"coordinates":null,"truncated":false,"created_at":"Thu Oct 14 22:20:15 +0000 2010","favorited":false,"entities":{"urls":[],"hashtags":[],"user_mentions":[{"name":"Matt Harris","id":777925,"id_str":"777925","indices":[0,14],"screen_name":"themattharris"}]},"text":"@themattharris hey how are things?","annotations":null,"contributors":[{"id":819797,"id_str":"819797","screen_name":"episod"}],"id":10765432100123456789,"id_str":"10765432100123456789","retweet_count":0,"geo":null,"retweeted":false,"in_reply_to_user_id":777925,"in_reply_to_user_id_str":"777925","in_reply_to_screen_name":"themattharris","user":{"id":6253282,"id_str":"6253282"},"source":"web","place":null,"in_reply_to_status_id":10586268426842688951,"in_reply_to_status_id_str":"10586268426842688951"}]
>
> 5) What is happening with new_id in the Streaming API?
> new_id and new_id_str will be switched off when, or soon after, Snowflake is
> enabled on November 4th.
>
> 6) Why not restrict IDs to 53bits?
>
> A Snowflake ID is composed of:
> * 41bits for millisecond precision time (69 years)
> * 10bits for a configured machine identity (1024 machines)
> * 12bits for a sequence number (4096 per machine)
>
> The factor influencing the length of the ID is the time. For a 53bit ID this
> would mean only 31bits are available for the time. 31bits is only enough for
> 24 days (2147483648/(1000*60*60*24)) of time.
>
> Reducing the resolution of the timestamp would prevent a K-sorted resolution
> of 1 second or less.
>
> Reducing the configured machine identity or sequence number by 1bit would
> mean we couldn’t scale Twitter, or operate our infrastructure in an
> uncoordinated, highly available way.
>
> 7) When will the 53bit Integer overflow happen?
> 24 days after Snowflake starts counting.
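The arithmetic in answers 6 and 7 can be checked with a short sketch. The bit widths are the ones quoted above; the field order (time in the high bits, then machine, then sequence), the function names, and the idea that the timestamp counts milliseconds from some start point are assumptions made for illustration, not a description of Twitter's actual implementation.

```python
from typing import Tuple

# Bit widths quoted in answer 6. The packing order below is an assumption.
TIME_BITS = 41
MACHINE_BITS = 10
SEQUENCE_BITS = 12

MS_PER_DAY = 1000 * 60 * 60 * 24


def days_covered(time_bits: int) -> float:
    """Days until a millisecond counter with `time_bits` bits wraps around."""
    return 2 ** time_bits / MS_PER_DAY


def compose(timestamp_ms: int, machine: int, sequence: int) -> int:
    """Pack the three fields into one integer (hypothetical layout)."""
    assert 0 <= timestamp_ms < 2 ** TIME_BITS
    assert 0 <= machine < 2 ** MACHINE_BITS
    assert 0 <= sequence < 2 ** SEQUENCE_BITS
    return (
        (timestamp_ms << (MACHINE_BITS + SEQUENCE_BITS))
        | (machine << SEQUENCE_BITS)
        | sequence
    )


def decompose(snowflake_id: int) -> Tuple[int, int, int]:
    """Split an ID back into (timestamp_ms, machine, sequence)."""
    sequence = snowflake_id & (2 ** SEQUENCE_BITS - 1)
    machine = (snowflake_id >> SEQUENCE_BITS) & (2 ** MACHINE_BITS - 1)
    timestamp_ms = snowflake_id >> (MACHINE_BITS + SEQUENCE_BITS)
    return timestamp_ms, machine, sequence


# 53 - 10 - 12 leaves 31 bits for time, which lasts about 24 days.
print(int(days_covered(53 - MACHINE_BITS - SEQUENCE_BITS)))  # 24
# 41 bits of milliseconds last about 69 years.
print(int(days_covered(TIME_BITS) / 365))  # 69
# Under this layout, the first ID that needs more than 53 bits appears
# once the timestamp reaches 2**31 ms.
print(compose(2 ** 31, 0, 0) == 2 ** 53)  # True
```

The three widths add up to 63 bits, which is consistent with answer 11: the value fits in a signed 64bit integer without using the sign bit.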
>
> 8) Is it safe to parse and store IDs as signed 64bit Integers?
> Yes.
>
> 9) Why offer both the String and Integer versions of the ID?
> The String representation is needed to ensure languages which cannot convert
> the >53bit Integer can still use the ID in other API requests.
>
> The Integer value is being retained for languages which can handle
> numbers >53bit and to prevent applications which have not converted from
> being cut off from Twitter.
>
> 10) When ID is null what will the _str representation be?
> The _str representation will also be null.
>
> 11) Did you really mean ‘unsigned’ 64bit Integer?
> Strictly speaking the Snowflake is a signed 64bit long under the hood. That
> being said, we will never use the negative bit and won’t require the full
> 64bits for positive numbers for about 69 years:
>
> http://www.google.com/search?q=%282**41%29+%2F+%2860*60*24*1000%29+%2...
>
> 12) Why not make the strings opt-in?
> We did consider this as an option but decided against it for a number of
> reasons. The first reason is that the ID is fundamental to being able to
> work with the data from the API so receiving the correct ID shouldn’t be
> something you have to opt into. The second, more influential reason, is that
> making the _str representations opt-in would create significant,
> performance-affecting issues for the API.
>
> On Mon, Oct 18, 2010 at 5:34 PM, themattharris <[email protected]> wrote:
>
> > Thanks to @gotwalt for spotting the missing commas.
> >
> > Fixed JSON sample ...
> >
> > [
> >   {
> >     "coordinates": null,
> >     "truncated": false,
> >     "created_at": "Thu Oct 14 22:20:15 +0000 2010",
> >     "favorited": false,
> >     "entities": {
> >       "urls": [],
> >       "hashtags": [],
> >       "user_mentions": [
> >         {
> >           "name": "Matt Harris",
> >           "id": 777925,
> >           "id_str": "777925",
> >           "indices": [0, 14],
> >           "screen_name": "themattharris"
> >         }
> >       ]
> >     },
> >     "text": "@themattharris hey how are things?",
> >     "annotations": null,
> >     "contributors": [
> >       {
> >         "id": 819797,
> >         "id_str": "819797",
> >         "screen_name": "episod"
> >       }
> >     ],
> >     "id": 12738165059,
> >     "id_str": "12738165059",
> >     "retweet_count": 0,
> >     "geo": null,
> >     "retweeted": false,
> >     "in_reply_to_user_id": 777925,
> >     "in_reply_to_user_id_str": "777925",
> >     "in_reply_to_screen_name": "themattharris",
> >     "user": {
> >       "id": 6253282,
> >       "id_str": "6253282"
> >     },
> >     "source": "web",
> >     "place": null,
> >     "in_reply_to_status_id": 12738040524,
> >     "in_reply_to_status_id_str": "12738040524"
> >   }
> > ]
> >
> > Best,
> > @themattharris
> >
> > On Oct 18, 5:19 pm, Matt Harris <[email protected]> wrote:
> > > Last week you may remember Twitter planned to enable the new Status ID
> > > generator - 'Snowflake' but didn't. The purpose of this email is to explain
> > > the reason why this didn't happen, what we are doing about it, and what the
> > > new release plan is.
> > >
> > > So what is Snowflake?
> > > ------------------------------
> > > Snowflake is a service we will be using to generate unique Tweet IDs. These
> > > Tweet IDs are unique 64bit unsigned integers, which, instead of being
> > > sequential like the current IDs, are based on time. The full ID is composed
> > > of a timestamp, a worker number, and a sequence number.
> > >
> > > The problem
> > > -----------------
> > > Before launch it came to our attention that some programming languages such
> > > as Javascript cannot support numbers with >53bits.
> > > This can be easily examined by running a command similar to:
> > > (90071992547409921).toString() in your browser's console or by running the
> > > following JSON snippet through your JSON parser.
> > >
> > > {"id": 10765432100123456789, "id_str": "10765432100123456789"}
> > >
> > > In affected JSON parsers the ID will not be converted successfully and will
> > > lose accuracy. In some parsers there may even be an exception.
> > >
> > > The solution
> > > ----------------
> > > To allow JavaScript and JSON parsers to read the IDs we need to include a
> > > string version of any ID when responding in the JSON format. What this means
> > > is Status, User, Direct Message and Saved Search IDs in the Twitter API will
> > > now be returned as an integer and a string in JSON responses. This will
> > > apply to the main Twitter API, the Streaming API and the Search API.
> > >
> > > For example, a status object will now contain an id and an id_str. The
> > > following JSON representation of a status object shows the two versions of
> > > the ID fields for each data point.
> > >
> > > [
> > >   {
> > >     "coordinates": null,
> > >     "truncated": false,
> > >     "created_at": "Thu Oct 14 22:20:15 +0000 2010",
> > >     "favorited": false,
> > >     "entities": {
> > >       "urls": [],
> > >       "hashtags": [],
> > >       "user_mentions": [
> > >         {
> > >           "name": "Matt Harris",
> > >           "id": 777925,
> > >           "id_str": "777925",
> > >           "indices": [0, 14],
> > >           "screen_name": "themattharris"
> > >         }
> > >       ]
> > >     },
> > >     "text": "@themattharris hey how are things?",
> > >     "annotations": null,
> > >     "contributors": [
> > >       {
> > >         "id": 819797,
> > >         "id_str": "819797",
> > >         "screen_name": "episod"
> > >       }
> > >     ],
> > >     "id": 12738165059,
> > >     "id_str": "12738165059",
> > >     "retweet_count": 0,
> > >     "geo": null,
> > >     "retweeted": false,
> > >     "in_reply_to_user_id": 777925,
> > >     "in_reply_to_user_id_str": "777925",
> > >     "in_reply_to_screen_name": "themattharris",
> > >     "user": {
> > >       "id": 6253282
> > >       "id_str": "6253282"
> > >     },
> > >     "source": "web",
> > >     "place": null,
> > >     "in_reply_to_status_id": 12738040524
> > >     "in_reply_to_status_id_str": "12738040524"
> > >   }
> > > ]
> > >
> > > What should you do - RIGHT NOW
> > > ----------------------------------------------
> > > The first thing you should do is attempt to decode the JSON snippet above
> > > using your production code parser. Observe the output to confirm the ID has
> > > not lost accuracy.
> > >
> > > What you do next depends on what happens:
> > >
> > > * If your code converts the ID successfully without losing accuracy you are
> > > OK but should consider converting to the _str versions of IDs as soon as
> > > possible.
> > > * If your code has lost accuracy, convert your code to using the _str
> > > version immediately. If you do not do this your...
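The check described under "What should you do - RIGHT NOW" might look roughly like the sketch below. It decodes the two-field snippet from the original announcement and compares the numeric id with its string twin. Python's own json module keeps large integers exact, so the sketch also forces every integer through a 64-bit float to imitate an affected parser; that second step, and the helper name id_is_intact, are assumptions for illustration rather than anything Twitter supplied.

```python
import json

# The test snippet from the announcement.
SNIPPET = '{"id": 10765432100123456789, "id_str": "10765432100123456789"}'


def id_is_intact(decoded: dict) -> bool:
    """True when the numeric id still matches id_str digit for digit."""
    return str(int(decoded["id"])) == decoded["id_str"]


# Python's json module decodes integers with arbitrary precision.
exact = json.loads(SNIPPET)

# Stand-in for an affected parser: push every JSON integer through a
# 64-bit float, which is how JavaScript stores all of its numbers.
lossy = json.loads(SNIPPET, parse_int=float)

print(id_is_intact(exact))  # True
print(id_is_intact(lossy))  # False
```

If the equivalent check in your own parser fails, as the second one does here, that is the case where the announcement says to switch to the _str fields.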
--
Twitter developer documentation and resources: http://dev.twitter.com/doc
API updates via Twitter: http://twitter.com/twitterapi
Issues/Enhancements Tracker: http://code.google.com/p/twitter-api/issues/list
Change your membership to this group: http://groups.google.com/group/twitter-development-talk
