On Fri, Jul 17, 2009 at 04:15:52AM -0700, Bjoern wrote:
> 
> probably it is too late to change it now, but someone has to say it: I
> think it is the wrong approach to do HTML escaping in the API on the
> Twitter side.

What data are you referring to that is being HTML-escaped?

From what I can tell, the text of status messages, at least, is not
escaped by the API.  For example, look at:

   http://twitter.com/statuses/show/2688630329.json

 or

   http://twitter.com/statuses/show/2688630329.xml

In the JSON format, non-ASCII characters are properly escaped as \uXXXX
Unicode escapes in the JavaScript strings; in the XML format, non-ASCII
characters are encoded as XML numeric character references.  Either way,
once you've (properly) decoded the message, you should have plain old
Unicode.
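
To illustrate, here is a minimal Python sketch of decoding both formats.
The two payloads are made-up samples that only mimic the escaping styles
described above (they are not real API output), and the function names
are hypothetical:

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample payloads, written by hand for illustration only.
# JSON carries non-ASCII characters as \uXXXX escapes ...
json_payload = '{"text": "caf\\u00e9 \\u2603"}'
# ... while XML carries them as numeric character references.
xml_payload = '<status><text>caf&#233; &#9731;</text></status>'


def text_from_json(payload):
    """Parse the JSON document; the parser resolves \\uXXXX escapes."""
    return json.loads(payload)["text"]


def text_from_xml(payload):
    """Parse the XML document; the parser resolves &#NNNN; references."""
    return ET.fromstring(payload).findtext("text")


json_text = text_from_json(json_payload)
xml_text = text_from_xml(xml_payload)

# Both decoders should yield the same plain Unicode string.
print(json_text == xml_text)
```

The point is that a real JSON or XML parser does the unescaping for you;
no HTML-entity handling is involved at any step.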

If one (incorrectly) posts (already encoded) HTML entities in a status
update, the twitter.com web page is lenient about not double-encoding
them.  In other words if you post a status update of "A &amp; B", the
twitter.com web interface will display this as "A & B", even though the
API (correctly) will report the status text to be "A &amp; B".

E.g. compare status 2688630329 (links above) to:

   http://twitter.com/statuses/show/2688620445.json
   http://twitter.com/statuses/show/2688620445.xml
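
For what it's worth, the difference between strict and lenient escaping
can be sketched in Python as below.  This is only a guess at the kind of
behaviour the web page shows, not how twitter.com actually implements it;
the function names are hypothetical:

```python
import html


def escape_strict(text):
    """Escape the text exactly as given, so entities get double-encoded."""
    return html.escape(text, quote=False)


def escape_lenient(text):
    """Decode any existing entities first, then escape once."""
    return html.escape(html.unescape(text), quote=False)


# Status text as the API reports it: a literal, already encoded entity.
api_text = "A &amp; B"

strict_markup = escape_strict(api_text)    # browser would render: A &amp; B
lenient_markup = escape_lenient(api_text)  # browser would render: A & B

print(strict_markup)
print(lenient_markup)
```

A client that wants to show exactly what the user typed would use the
strict variant on the decoded API text.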


... Or were you talking about something else altogether?

Jeff
