This still gives you some API legacy that you need to maintain, but
it's a much cleaner approach than what is currently being proposed.

-ch

On Oct 19, 10:39 am, Tom van der Woerdt <i...@tvdw.eu> wrote:
> I wouldn't blame this on JSON, because it's not JSON that has the
> problems, but JavaScript. All of my Objective-C apps that communicate
> use JSON as well, and they don't have the limitation. The issue does not
> apply to XML either - there's no type specification in XML.
>
> As far as I know, this issue will only cause trouble for a few
> applications that work with JavaScript and depend on the IDs a lot.
>
> My suggestion to solve this issue would be to introduce an additional
> parameter (just like include_rts, just with a different name) that turns
> all IDs into strings. No extra fields, just an additional optional
> parameter. Won't cause trouble for the applications that can't parse it
> and requires minimal implementation effort for developers.
>
> I hope I'm not too late with my suggestion :-)
>
> Tom
>
> On 10/19/10 7:10 PM, Craig Hockenberry wrote:
>
>
>
> > This approach feels wrong to me. The red flag is the duplication of
> > data within the payload: in 30+ years of professional development,
> > I've never seen that work out well.
>
> > The root of the problem is that you've chosen to deliver data in a
> > format (JSON) that can't support integers with a value greater than
> > 2^53. And some of your data uses 64 bits.
>
> > The result is that you're working around the problem in a language by
> > using a string. Avoiding the root problem will encumber you with
> > legacy that you'll regret later.
>
> > Look at your proposed solution from a different point-of-view: say you
> > have a language that can't handle Unicode well (e.g. BASIC or Ruby.)
> > Would you solve this problem by adding another field called
> > "text_ascii"?
>
> > "text": "@themattharris hey how are things in København?".
> > "text_ascii": "@themattharris hey how are things in Kobenhavn?".
>
> > Seems silly, yet that is exactly what you're doing for Javascript and
> > long integers.
>
> > A part of this legacy in your payload is future confusion for
> > developers. Someone new to the Twitter API is going to be confused as
> > to why your ID values have both numeric and string representations.
> > And smart developers are going to lean towards the numeric
> > representation:
>
> > * 8 bytes of storage for 10765432100123456789 instead of 20 bytes.
> > * Faster sorting (less data to compare.)
> > * Correct sorting: "011" and "10" have different order depending on
> > whether you're sorting the string or numeric representation.
>
> > They'll eventually pay the price for choosing incorrectly.
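The sorting difference in the list above can be sketched in JavaScript. This is a minimal illustration, not something from the thread: `compareIdStr` is a hypothetical helper name, and it relies on `BigInt`, which postdates this discussion. It assumes every ID is a non-negative decimal string.

```javascript
// Compare two decimal ID strings by numeric value without ever converting
// them to Number, so no precision is lost above 2^53.
// Assumption: inputs contain only decimal digits.
function compareIdStr(a, b) {
  const x = BigInt(a);
  const y = BigInt(b);
  return x < y ? -1 : x > y ? 1 : 0;
}

const ids = ["10", "011", "9", "10765432100123456789"];

// Default sort compares the strings character by character.
const lexical = [...ids].sort();

// Numeric sort compares the values the strings represent.
const numeric = [...ids].sort(compareIdStr);

console.log(lexical); // [ '011', '10', '10765432100123456789', '9' ]
console.log(numeric); // [ '9', '10', '011', '10765432100123456789' ]
```

The two orders disagree on "011" versus "10", which is the case called out above.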
>
> > Every ID in the API is going to need documentation as a result. For
> > example, are place IDs affected by this change? And what about the IDs
> > returned by the Search API? (there's no mention of "since_id_str" and
> > "max_id_str" above.)
>
> > Losing consistency with the XML format is also a problem. Unless
> > you're planning on adding _str elements to the XML payload, you're
> > presenting developers with a one-way street. A consumer of JSON
> > "id_str" can't easily change the format of data they want to consume.
>
> > In my mind, you really only have two good choices at this point:
>
> > 1) Limit Snowflake's ID space to 53 bits. Easier for developers,
> > harder for Twitter.
>
> > 2) Make all Twitter IDs into strings. Easier for Twitter, harder for
> > developers.
>
> > The second choice is obviously more disruptive, but if you really need
> > the ID space, it's the right one. Even if it means I need to make
> > major changes to my code.
>
> > On Oct 18, 5:19 pm, Matt Harris <thematthar...@twitter.com> wrote:
> >> Last week you may remember Twitter planned to enable the new Status ID
> >> generator - 'Snowflake' but didn't. The purpose of this email is to explain
> >> the reason why this didn't happen, what we are doing about it, and what the
> >> new release plan is.
>
> >> So what is Snowflake?
> >> ------------------------------
> >> Snowflake is a service we will be using to generate unique Tweet IDs. These
> >> Tweet IDs are unique 64bit unsigned integers, which, instead of being
> >> sequential like the current IDs, are based on time. The full ID is composed
> >> of a timestamp, a worker number, and a sequence number.
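As a rough illustration of how a timestamp, a worker number, and a sequence number can be packed into one 64-bit integer, here is a sketch in JavaScript. The field widths (10 bits for the worker, 12 for the sequence) and the names `composeId` and `decomposeId` are assumptions made for the example; the announcement does not specify the layout.

```javascript
// ASSUMED field widths, for illustration only. The announcement says the ID
// holds a timestamp, a worker number and a sequence number, but gives no sizes.
const WORKER_BITS = 10n;
const SEQUENCE_BITS = 12n;

// Pack the three fields into one integer, timestamp in the high bits.
function composeId(timestampMs, worker, sequence) {
  return (BigInt(timestampMs) << (WORKER_BITS + SEQUENCE_BITS))
    | (BigInt(worker) << SEQUENCE_BITS)
    | BigInt(sequence);
}

// Split an ID (BigInt or decimal string) back into its three fields.
function decomposeId(id) {
  const value = BigInt(id);
  return {
    timestampMs: value >> (WORKER_BITS + SEQUENCE_BITS),
    worker: (value >> SEQUENCE_BITS) & ((1n << WORKER_BITS) - 1n),
    sequence: value & ((1n << SEQUENCE_BITS) - 1n),
  };
}

const id = composeId(1000, 3, 7);
console.log(id.toString());
console.log(decomposeId(id));
```

Because the timestamp occupies the high bits, an ID generated at a later millisecond is always larger than one generated earlier, whatever the worker and sequence values are.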
>
> >> The problem
> >> -----------------
> >> Before launch it came to our attention that some programming languages such
> >> as Javascript cannot support numbers with >53bits. This can be easily
> >> examined by running a command similar to: (90071992547409921).toString() in
> >> your browsers console or by running the following JSON snippet through your
> >> JSON parser.
>
> >>     {"id": 10765432100123456789, "id_str": "10765432100123456789"}
>
> >> In affected JSON parsers the ID will not be converted successfully and will
> >> lose accuracy. In some parsers there may even be an exception.
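The precision loss described above can be checked with a few lines of JavaScript. This is a minimal sketch using only the sample payload from the announcement and standard built-ins (`JSON.parse`, `Number.isSafeInteger`); the variable names are illustrative.

```javascript
// The sample payload from the announcement, kept as raw text.
const raw = '{"id": 10765432100123456789, "id_str": "10765432100123456789"}';

const parsed = JSON.parse(raw);

// JavaScript numbers are IEEE 754 doubles, so integers are exact only up to
// Number.MAX_SAFE_INTEGER (2^53 - 1).
const idIsExact = Number.isSafeInteger(parsed.id);

// If the numeric id had survived parsing, printing it would give back the
// same digits as id_str.
const roundTrips = String(parsed.id) === parsed.id_str;

console.log(Number.MAX_SAFE_INTEGER); // 9007199254740991
console.log(idIsExact);               // false
console.log(roundTrips);              // false
console.log(parsed.id_str);           // 10765432100123456789
```

The string field comes through unchanged, while the numeric field is rounded to the nearest representable double.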
>
> >> The solution
> >> ----------------
> >> To allow javascript and JSON parsers to read the IDs we need to include a
> >> string version of any ID when responding in the JSON format. What this means
> >> is Status, User, Direct Message and Saved Search IDs in the Twitter API will
> >> now be returned as an integer and a string in JSON responses. This will
> >> apply to the main Twitter API, the Streaming API and the Search API.
>
> >> For example, a status object will now contain an id and an id_str. The
> >> following JSON representation of a status object shows the two versions of
> >> the ID fields for each data point.
>
> >> [
> >>   {
> >>     "coordinates": null,
> >>     "truncated": false,
> >>     "created_at": "Thu Oct 14 22:20:15 +0000 2010",
> >>     "favorited": false,
> >>     "entities": {
> >>       "urls": [
> >>       ],
> >>       "hashtags": [
> >>       ],
> >>       "user_mentions": [
> >>         {
> >>           "name": "Matt Harris",
> >>           "id": 777925,
> >>           "id_str": "777925",
> >>           "indices": [
> >>             0,
> >>             14
> >>           ],
> >>           "screen_name": "themattharris"
> >>         }
> >>       ]
> >>     },
> >>     "text": "@themattharris hey how are things?",
> >>     "annotations": null,
> >>     "contributors": [
> >>       {
> >>         "id": 819797,
> >>         "id_str": "819797",
> >>         "screen_name": "episod"
> >>       }
> >>     ],
> >>     "id": 12738165059,
> >>     "id_str": "12738165059",
> >>     "retweet_count": 0,
> >>     "geo": null,
> >>     "retweeted": false,
> >>     "in_reply_to_user_id": 777925,
> >>     "in_reply_to_user_id_str": "777925",
> >>     "in_reply_to_screen_name": "themattharris",
> >>     "user": {
> >>       "id": 6253282,
> >>       "id_str": "6253282"
> >>     },
> >>     "source": "web",
> >>     "place": null,
> >>     "in_reply_to_status_id": 12738040524,
> >>     "in_reply_to_status_id_str": "12738040524"
> >>   }
> >> ]
>
> >> What should you do - RIGHT NOW
> >> ----------------------------------------------
> >> The first thing you should do is attempt to decode the JSON snippet above
> >> using your production code parser. Observe the output to confirm the ID has
> >> not lost accuracy.
>
> >> What you do next depends on what happens:
>
> >> * If your code converts the ID successfully without losing accuracy you are
> >> OK but should consider converting to the _str versions of IDs as soon as
> >> possible.
> >> * If your code has lost accuracy, convert your code to using the _str
> >> version immediately. If you do not do this your code will be unable to
> >> interact with the Twitter API reliably.
> >> * In some language parsers, the JSON may throw an exception when reading the
> >> ID value. If this happens in your parser you will need to pre-parse the
> >> data, removing or replacing ID parameters with their _str versions.
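One way to read the pre-parse step above is to rewrite the raw response text so that numeric ID values become strings before the parser sees them. The sketch below is an assumption about how that could be done, not Twitter's own snippet: `quoteIds` is a hypothetical helper, and it treats any key named `id` or ending in `_id` as an ID field.

```javascript
// Wrap numeric ID values in quotes before parsing, so the parser never
// builds a Number from them.
// Assumption: ID keys are exactly "id" or end in "_id"; keys such as
// "id_str" and "retweet_count" are left untouched.
// Limitation: this is a text-level rewrite, so it is only a sketch and does
// not understand JSON structure.
function quoteIds(rawJson) {
  return rawJson.replace(/"((?:[a-z_]*_)?id)"\s*:\s*(\d+)/g, '"$1": "$2"');
}

const response =
  '{"id": 10765432100123456789, "in_reply_to_status_id": 12738040524, ' +
  '"in_reply_to_user_id": null, "retweet_count": 0}';

const status = JSON.parse(quoteIds(response));

console.log(status.id);                    // 10765432100123456789
console.log(typeof status.id);             // string
console.log(status.in_reply_to_status_id); // 12738040524
console.log(status.retweet_count);         // 0
```

Where the API already supplies the `_str` fields, reading those directly is simpler than rewriting the payload.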
>
> >> Summary
> >> -------------
> >> 1) If you develop in Javascript, know that you will have to update your code
> >> to read the string version instead of the integer version.
>
> >> 2) If you use a JSON decoder, validate that the example JSON, above, decodes
> >> without throwing exceptions. If exceptions are thrown, you will need to
> >> pre-parse the data. Please let us know the name, version, and language of
> >> the parser which throws the exception so we can investigate.
>
> >> Timeline
> >> -----------
> >> by 22nd October 2010 (Friday): String versions of ID numbers will start
> >> appearing in the API responses
> >> 4th November 2010 (Thursday) : Snowflake will be turned on but at ~41bit
> >> length
> >> 26th November 2010 (Friday) : Status IDs will break 53bits in length and
> >> cease being usable as Integers in Javascript based languages
>
> >> We understand this isn't as seamless a transition as we had planned and
> >> appreciate for some of you this change requires an update to your code.
> >> We've tried to give as much time as possible for you to make the migration
> >> and update your code to use the new string representations.
>
> >> Our own products and tools are affected by the change and we will be making
> >> available any pre-parsing snippets we have created to ensure code continues
> >> to work with the new IDs.
>
> >> Thanks for your support and understanding.
>
> >> ---
> >> @themattharris
> >> Developer Advocate, Twitter
> >> http://twitter.com/themattharris

-- 
Twitter developer documentation and resources: http://dev.twitter.com/doc
API updates via Twitter: http://twitter.com/twitterapi
Issues/Enhancements Tracker: http://code.google.com/p/twitter-api/issues/list
Change your membership to this group: 
http://groups.google.com/group/twitter-development-talk
