1) Please don't. I don't want to have to convert everything back to integers within my code. I consider the string representation a hack around issues with certain programming languages, not an optimal solution, and I wouldn't want it to become the default option.
2) No

Tom

On 11/11/10 6:34 AM, SM wrote:
> Hello. Couple questions:
>
> 1.) Are you planning on eventually eliminating the integer
> representation and only using strings for id's?
>
> 2.) If an application doesn't use Javascript to parse JSON (for
> example, YAJL-OBJC and NSNumbers in Obj-C), is it necessary to make
> any changes at all?
>
> Thanks.
>
> On Oct 19, 3:52 pm, Matt Harris <thematthar...@twitter.com> wrote:
>> Hey everyone,
>>
>> Thank you to all of you for your questions, patience and contributions to
>> this thread. Hearing your views and knowing how you use the API helps us
>> provide more information where there wasn't enough, and clarify details
>> where there was ambiguity.
>>
>> I've collated the questions I've received from you directly, over Twitter to
>> @twitterapi and through this list. I hope the comments below provide enough
>> information to answer those questions and explain the reasoning behind our
>> decisions.
>>
>> Thanks for your support and patience,
>> @themattharris
>>
>> 1) Will search.twitter.com also include id_str and
>> in_reply_to_status_id_str?
>> Yes, Search will include the String representations of those IDs.
>>
>> 2) Which fields are affected by this change?
>> All IDs which are transmitted as Integers will have a String representation
>> in the API response. Only Tweet IDs (which includes mentions and retweets)
>> will be moving to new Snowflake IDs. Messages (DMs), Saved Searches and
>> Users may change to a Snowflake ID scheme in the future but this isn’t
>> planned for this year.
>>
>> We are adding String representations of the Integer IDs now so you can
>> update all of your code to use the String representations throughout, and
>> be prepared should any other IDs break the 53bit boundary.
>>
>> 3) Which fields will have String representations?
>> The fields which will have String representations are:
>>
>> id (DM, Saved Search, User, List)
>> in_reply_to_status_id
>> in_reply_to_user_id
>> new_id (Streaming only. Will be removed when Snowflake is enabled)
>> current_user_retweet_id (When include_my_retweet=1 is passed)
>>
>> 4) Can you provide a complete Tweet example with Snowflake ID to test?
>>
>> [{"coordinates":null,"truncated":false,
>> "created_at":"Thu Oct 14 22:20:15 +0000 2010","favorited":false,
>> "entities":{"urls":[],"hashtags":[],
>> "user_mentions":[{"name":"Matt Harris","id":777925,"id_str":"777925",
>> "indices":[0,14],"screen_name":"themattharris"}]},
>> "text":"@themattharris hey how are things?","annotations":null,
>> "contributors":[{"id":819797,"id_str":"819797","screen_name":"episod"}],
>> "id":10765432100123456789,"id_str":"10765432100123456789",
>> "retweet_count":0,"geo":null,"retweeted":false,
>> "in_reply_to_user_id":777925,"in_reply_to_user_id_str":"777925",
>> "in_reply_to_screen_name":"themattharris",
>> "user":{"id":6253282,"id_str":"6253282"},"source":"web","place":null,
>> "in_reply_to_status_id":10586268426842688951,
>> "in_reply_to_status_id_str":"10586268426842688951"}]
>>
>> 5) What is happening with new_id in the Streaming API?
>> new_id and new_id_str will be switched off when or soon after Snowflake is
>> enabled on November 4th.
>>
>> 6) Why not restrict IDs to 53bits?
>>
>> A Snowflake ID is composed of:
>> * 41bits for millisecond precision time (69 years)
>> * 10bits for a configured machine identity (1024 machines)
>> * 12bits for a sequence number (4096 per machine)
>>
>> The factor influencing the length of the ID is the time. For a 53bit ID this
>> would mean only 31bits are available for the time. 31bits is only enough for
>> 24 days (2147483648/(1000*60*60*24)) of time.
>>
>> Reducing the resolution of the timestamp would prevent a K-sorted resolution
>> of 1 second or less.
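
The arithmetic in the quoted answer can be checked with a few lines of Python. This is only a sketch: the constant and function names are made up for illustration, and the only inputs are the bit widths listed in the quoted message.

```python
MS_PER_DAY = 1000 * 60 * 60 * 24

# Bit widths as listed in the quoted answer.
TIME_BITS, MACHINE_BITS, SEQUENCE_BITS = 41, 10, 12


def span_in_days(time_bits):
    """Days covered by a millisecond counter of the given bit width."""
    return 2 ** time_bits / MS_PER_DAY


# Squeezing the whole ID into 53 bits while keeping the machine and
# sequence fields leaves this many bits for the timestamp.
squeezed_time_bits = 53 - MACHINE_BITS - SEQUENCE_BITS

print(squeezed_time_bits)                     # 31
print(int(span_in_days(squeezed_time_bits)))  # 24 (days)
print(int(span_in_days(TIME_BITS) / 365.25))  # 69 (years)
```

So a 31-bit millisecond counter runs out in under a month, while the 41-bit one lasts roughly 69 years.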
>>
>> Reducing the configured machine identity or sequence number by 1bit would
>> mean we couldn’t scale Twitter, or operate our infrastructure in an
>> uncoordinated, highly available way.
>>
>> 7) When will the 53bit Integer overflow happen?
>> 24 days after Snowflake starts counting.
>>
>> 8) Is it safe to parse and store IDs as signed 64bit Integers?
>> Yes.
>>
>> 9) Why offer both the String and Integer versions of the ID?
>> The String representation is needed to ensure languages which cannot convert
>> the >53bit Integer can still use the ID in other API requests.
>>
>> The Integer value is being retained for languages which can handle
>> numbers >53bit and to prevent applications which have not converted from
>> being cut-off from Twitter.
>>
>> 10) When ID is null what will the _str representation be?
>> The _str representation will also be null.
>>
>> 11) Did you really mean ‘unsigned’ 64bit Integer?
>> Strictly speaking the Snowflake is a signed 64bit long under the hood. That
>> being said, we will never use the negative bit and won’t require the full
>> 64bits for positive numbers for about 69 years:
>>
>> http://www.google.com/search?q=%282**41%29+%2F+%2860*60*24*1000%29+%2...
>>
>> 12) Why not make the strings opt-in?
>> We did consider this as an option but decided against it for a number of
>> reasons. The first reason is that the ID is fundamental to being able to
>> work with the data from the API so receiving the correct ID shouldn’t be
>> something you have to opt into. The second, more influential reason, is that
>> making the _str representations opt-in would create significant,
>> performance-affecting issues for the API.
>>
>> On Mon, Oct 18, 2010 at 5:34 PM, themattharris
>> <thematthar...@twitter.com> wrote:
>>
>>> Thanks to @gotwalt for spotting the missing commas.
>>
>>> Fixed JSON sample ...
>>
>>> [
>>>   {
>>>     "coordinates": null,
>>>     "truncated": false,
>>>     "created_at": "Thu Oct 14 22:20:15 +0000 2010",
>>>     "favorited": false,
>>>     "entities": {
>>>       "urls": [
>>>       ],
>>>       "hashtags": [
>>>       ],
>>>       "user_mentions": [
>>>         {
>>>           "name": "Matt Harris",
>>>           "id": 777925,
>>>           "id_str": "777925",
>>>           "indices": [
>>>             0,
>>>             14
>>>           ],
>>>           "screen_name": "themattharris"
>>>         }
>>>       ]
>>>     },
>>>     "text": "@themattharris hey how are things?",
>>>     "annotations": null,
>>>     "contributors": [
>>>       {
>>>         "id": 819797,
>>>         "id_str": "819797",
>>>         "screen_name": "episod"
>>>       }
>>>     ],
>>>     "id": 12738165059,
>>>     "id_str": "12738165059",
>>>     "retweet_count": 0,
>>>     "geo": null,
>>>     "retweeted": false,
>>>     "in_reply_to_user_id": 777925,
>>>     "in_reply_to_user_id_str": "777925",
>>>     "in_reply_to_screen_name": "themattharris",
>>>     "user": {
>>>       "id": 6253282,
>>>       "id_str": "6253282"
>>>     },
>>>     "source": "web",
>>>     "place": null,
>>>     "in_reply_to_status_id": 12738040524,
>>>     "in_reply_to_status_id_str": "12738040524"
>>>   }
>>> ]
>>
>>> Best,
>>> @themattharris
>>
>>> On Oct 18, 5:19 pm, Matt Harris <thematthar...@twitter.com> wrote:
>>>> Last week you may remember Twitter planned to enable the new Status ID
>>>> generator - 'Snowflake' but didn't. The purpose of this email is to explain
>>>> the reason why this didn't happen, what we are doing about it, and what the
>>>> new release plan is.
>>
>>>> So what is Snowflake?
>>>> ------------------------------
>>>> Snowflake is a service we will be using to generate unique Tweet IDs. These
>>>> Tweet IDs are unique 64bit unsigned integers, which, instead of being
>>>> sequential like the current IDs, are based on time. The full ID is composed
>>>> of a timestamp, a worker number, and a sequence number.
>>
>>>> The problem
>>>> -----------------
>>>> Before launch it came to our attention that some programming languages such
>>>> as Javascript cannot support numbers with >53bits.
>>>> This can be easily examined by running a command similar to:
>>>> (90071992547409921).toString() in your browser's console or by running the
>>>> following JSON snippet through your JSON parser.
>>
>>>> {"id": 10765432100123456789, "id_str": "10765432100123456789"}
>>
>>>> In affected JSON parsers the ID will not be converted successfully and will
>>>> lose accuracy. In some parsers there may even be an exception.
>>
>>>> The solution
>>>> ----------------
>>>> To allow javascript and JSON parsers to read the IDs we need to include a
>>>> string version of any ID when responding in the JSON format. What this means
>>>> is Status, User, Direct Message and Saved Search IDs in the Twitter API will
>>>> now be returned as an integer and a string in JSON responses. This will
>>>> apply to the main Twitter API, the Streaming API and the Search API.
>>
>>>> For example, a status object will now contain an id and an id_str. The
>>>> following JSON representation of a status object shows the two versions of
>>>> the ID fields for each data point.
>>
>>>> [
>>>>   {
>>>>     "coordinates": null,
>>>>     "truncated": false,
>>>>     "created_at": "Thu Oct 14 22:20:15 +0000 2010",
>>>>     "favorited": false,
>>>>     "entities": {
>>>>       "urls": [
>>>>       ],
>>>>       "hashtags": [
>>>>       ],
>>>>       "user_mentions": [
>>>>         {
>>>>           "name": "Matt Harris",
>>>>           "id": 777925,
>>>>           "id_str": "777925",
>>>>           "indices": [
>>>>             0,
>>>>             14
>>>>           ],
>>>>           "screen_name": "themattharris"
>>>>         }
>>>>       ]
>>>>     },
>>>>     "text": "@themattharris hey how are things?",
>>>>     "annotations": null,
>>>>     "contributors": [
>>>>       {
>>>>         "id": 819797,
>>>>         "id_str": "819797",
>>>>         "screen_name": "episod"
>>>>       }
>>>>     ],
>>>>     "id": 12738165059,
>>>>     "id_str": "12738165059",
>>>>     "retweet_count": 0,
>>>>     "geo": null,
>>>>     "retweeted": false,
>>>>     "in_reply_to_user_id": 777925,
>>>>     "in_reply_to_user_id_str": "777925",
>>>>     "in_reply_to_screen_name": "themattharris",
>>>>     "user": {
>>>>       "id": 6253282
>>>>       "id_str": "6253282"
>>>>     },
>>>>     "source": "web",
>>>>     "place": null,
>>>>     "in_reply_to_status_id": 12738040524
>>>>     "in_reply_to_status_id_str": "12738040524"
>>>>   }
>>>> ]
>>
>>>> What should you do - RIGHT NOW
>>>> ----------------------------------------------
>>>> The first thing you should do is attempt to decode the JSON snippet above
>>>> using your production code parser. Observe the output to confirm the ID has
>>>> not lost accuracy.
>>
>>>> What you do next depends on what happens:
>>
>>>> * If your code converts the ID successfully without losing accuracy you are
>>>> OK but should consider converting to the _str versions of IDs as soon as
>>>> possible.
>>>> * If your code has lost accuracy, convert your code to using the _str
>>>> version immediately. If you do not do this your...
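
For anyone who wants to run the check described in the quoted announcement outside a browser, here is a minimal Python sketch. Python integers are arbitrary precision, so the 53-bit limit has to be emulated: passing parse_int=float to json.loads is an assumption standing in for a parser that stores every number as a 64-bit double, as Javascript does. The helper name id_survives is hypothetical.

```python
import json

# Test snippet from the quoted announcement.
SNIPPET = '{"id": 10765432100123456789, "id_str": "10765432100123456789"}'


def id_survives(parse):
    """Return True if the parsed numeric id still matches id_str.

    `parse` is any callable that turns a JSON string into a dict; the
    name and signature are made up for this sketch.
    """
    doc = parse(SNIPPET)
    return int(doc["id"]) == int(doc["id_str"])


# Python's default parser keeps the integer exactly.
print(id_survives(json.loads))  # True

# Assumption: parse_int=float imitates a double-backed parser, which
# cannot represent this id exactly.
print(id_survives(lambda s: json.loads(s, parse_int=float)))  # False
```

If the second kind of result shows up with your production parser, that is the case where the announcement says to switch to the _str fields.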
-- 
Twitter developer documentation and resources: http://dev.twitter.com/doc
API updates via Twitter: http://twitter.com/twitterapi
Issues/Enhancements Tracker: http://code.google.com/p/twitter-api/issues/list
Change your membership to this group: http://groups.google.com/group/twitter-development-talk