I'm trying to figure out how to properly process Tweet data from the
search API.  I'm retrieving data in the JSON format, and each Tweet
has a "text" property.  I'd assumed the content should be treated as
plain text.  However, I'm seeing multiple instances of entities such as
&quot; and &amp; in the JSON.  So now I'm unsure how to proceed.

Is there a bug in the search API where characters in JSON strings are
being encoded as XML?  Is there a bug in some popular Twitter clients
where content is sent improperly encoded?  Or am I supposed to expect
those entities?  If I'm supposed to expect those entities, are they
HTML or XML?  What else should I expect (aside from Twitter's own
markup conventions, such as @username mentions)?  Am I also supposed to
decode angle brackets (&lt; and &gt;), since those are being encoded
too?

Obviously, I don't want to just blindly insert any Tweet text as HTML.
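
For reference, here is a minimal sketch of what I would do if the
entities turn out to be ordinary HTML-style escapes.  That is only my
assumption; nothing I've found confirms it.  The sample payload and the
function names (decode_tweet_text, to_safe_html) are made up for
illustration.  The only library calls are html.unescape, html.escape
and json.loads from the Python standard library.

```python
import html
import json


def decode_tweet_text(raw_json: str) -> str:
    """Parse one Tweet object and return its text as plain text.

    Assumption: the "text" field carries HTML-style entities
    (&quot;, &amp;, &lt;, &gt;) that need to be decoded once.
    """
    tweet = json.loads(raw_json)
    return html.unescape(tweet["text"])


def to_safe_html(plain_text: str) -> str:
    """Escape plain text so it can be placed inside an HTML page."""
    return html.escape(plain_text, quote=True)


# Hypothetical payload, shaped like what I am seeing in search results.
sample = '{"text": "Tom &amp; Jerry said &quot;hi&quot; to &lt;everyone&gt;"}'

plain = decode_tweet_text(sample)
safe = to_safe_html(plain)
```

The idea is to decode exactly once to get plain text, then escape again
only at the point where the text is written into HTML, so that nothing
in a Tweet can ever be interpreted as markup.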

-- 
Twitter developer documentation and resources: http://dev.twitter.com/doc
API updates via Twitter: http://twitter.com/twitterapi
Issues/Enhancements Tracker: http://code.google.com/p/twitter-api/issues/list
Change your membership to this group: 
http://groups.google.com/group/twitter-development-talk