On Fri, Jul 17, 2009 at 04:15:52AM -0700, Bjoern wrote:
>
> probably it is too late to change it now, but someone has to say it: I
> think it is the wrong approach to do HTML escaping in the API on the
> Twitter side.

What data are you referring to that is being HTML-escaped?

From what I can tell, the text of status messages, at least, is not
escaped by the API. For example, look at:

  http://twitter.com/statuses/show/2688630329.json
or
  http://twitter.com/statuses/show/2688630329.xml

In the JSON format, non-ASCII characters are properly escaped as
Unicode escapes in the JavaScript strings; in the XML format,
non-ASCII characters are encoded as XML numeric character references.
Either way, once you've (properly) decoded the message, you should
have plain old Unicode.

If one (incorrectly) posts (already encoded) HTML entities in a status
update, the twitter.com web page is lenient and does not double-encode
them. In other words, if you post a status update of "A &amp; B", the
twitter.com web interface will display this as "A & B", even though
the API (correctly) will report the status text to be "A &amp; B".

E.g. compare status 2688630329 (links above) to:

  http://twitter.com/statuses/show/2688620445.json
  http://twitter.com/statuses/show/2688620445.xml

... Or were you talking about something else altogether?

Jeff
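
P.S. To make the decoding point concrete, here is a minimal Python
sketch. It does not call the Twitter API; it only simulates the two
wire formats locally, so the sample strings and variable names are
made up for illustration. Modelling the web page's leniency with
html.unescape is also just my assumption about what the page
effectively does.

```python
import json
import xml.etree.ElementTree as ET
from html import escape, unescape

# Hypothetical status text containing non-ASCII characters.
original = "caf\u00e9 \u2603"

# JSON wire form: json.dumps escapes non-ASCII as \uXXXX by default.
json_wire = json.dumps({"text": original})

# XML wire form: markup characters are escaped, then non-ASCII
# characters become numeric character references such as &#233;.
xml_body = escape(original, quote=False)
xml_body = xml_body.encode("ascii", "xmlcharrefreplace").decode("ascii")
xml_wire = "<text>" + xml_body + "</text>"

# Proper decoding of either format gives back the same Unicode string.
from_json = json.loads(json_wire)["text"]
from_xml = ET.fromstring(xml_wire).text

# A client that (incorrectly) escapes before posting stores the
# entity text itself as the status.
pre_escaped = escape("A & B", quote=False)

# The XML layer escapes that stored text once more for transport ...
xml_wire2 = "<text>" + escape(pre_escaped, quote=False) + "</text>"

# ... and an XML parser undoes exactly that one layer, so the
# reported text is the stored text, entity included.
reported = ET.fromstring(xml_wire2).text

# Assumed model of the lenient web page: one extra round of unescaping.
displayed = unescape(reported)
```

Both wire forms are pure ASCII, yet from_json and from_xml compare
equal to the original string, while reported and displayed differ by
exactly one layer of entity encoding.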