I came across this issue when handling tweets and deciding which format
to store them in, in my database.

The API docs state: "Please note that angle brackets ("<" and ">")
are entity-encoded"
(http://apiwiki.twitter.com/REST-API-Documentation#Encoding)

So, we must assume that the messages we receive are first 'escaped',
and then HTML-entity-encoded. Thus, < is escaped as &lt;, which is then
encoded as &amp;lt;.
The result, if echoed directly to the browser, would display as &lt;
(so it has been escaped, as intended).
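
To make the double encoding concrete, here is a minimal sketch that
reproduces it locally. It assumes the API behaves as described above;
the sample text is made up and is not a real API response.

```php
<?php
// Hypothetical tweet text, chosen only for illustration.
$original = 'a < b';

// First pass: '<' becomes '&lt;'.
$escaped = htmlspecialchars($original, ENT_NOQUOTES, 'UTF-8');

// Second pass: the '&' of '&lt;' becomes '&amp;', giving '&amp;lt;'.
$doubleEncoded = htmlspecialchars($escaped, ENT_NOQUOTES, 'UTF-8');

echo $escaped, "\n";       // a &lt; b
echo $doubleEncoded, "\n"; // a &amp;lt; b
```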

Non-ASCII characters (for example, Arabic characters) are also
HTML-entity-encoded (with XML responses, at least).
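
As a rough illustration, and assuming those characters arrive as decimal
numeric character references (I have not checked the exact form the API
uses), html_entity_decode() turns them back into UTF-8 bytes. The input
string below is invented for the example.

```php
<?php
// Assumed form: decimal numeric references for four Arabic letters
// (U+0633, U+0644, U+0627, U+0645).
$encoded = '&#1587;&#1604;&#1575;&#1605;';

// Decode the references into a UTF-8 string.
$decoded = html_entity_decode($encoded, ENT_NOQUOTES, 'UTF-8');

// Each of these letters takes two bytes in UTF-8, so the length is 8.
echo strlen($decoded), "\n"; // 8
```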

I think I use something similar to the following for storing responses
in my database, in a format which I believe is most similar to
Twitter.com's storage. This results in a true UTF-8 string:

htmlspecialchars_decode(html_entity_decode($string, ENT_NOQUOTES, 'UTF-8'));

That may not be entirely correct, as I don't have my client library
to hand. With this method, the text remains searchable as a
computer-readable string.
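
Here is a small sketch of that two-step decode wrapped in a function.
The function name decodeTweetText and the sample input are my own
inventions for illustration, not part of any Twitter client library,
and the input assumes the double encoding described earlier.

```php
<?php
// Hypothetical helper: undo the assumed double encoding of API text.
function decodeTweetText(string $text): string
{
    // Step 1: decode entities once ('&amp;lt;' becomes '&lt;',
    // numeric references become UTF-8 characters).
    $once = html_entity_decode($text, ENT_NOQUOTES, 'UTF-8');

    // Step 2: decode the remaining special characters ('&lt;' becomes '<').
    return htmlspecialchars_decode($once);
}

// Invented sample in the double-encoded form.
$fromApi = '1 &amp;lt; 2 &amp;amp; 3 &amp;gt; 2';

echo decodeTweetText($fromApi), "\n"; // 1 < 2 & 3 > 2
```

Remember that the stored string is raw text, so it needs escaping again
(for example with htmlspecialchars()) before being echoed into a page.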

I hope that helps,

CaMason

On Jan 17, 9:42 am, nattu <[email protected]> wrote:
> Did you try url encoding the character?
>
> On Jan 16, 8:43 pm, "twonvo.com" <[email protected]> wrote:
>
> > i have a similar problem... when submitting from my tool the user
> > cannot use &!
>
> > any ideas? - tried converting to &amp; with no luck
>
> > On Jan 16, 2:52 pm, nattu <[email protected]> wrote:
>
> > > If the status message for a user contains the characters '<' or '>',
> > > while
> > > fetching through the API, the returned status contains the double html
> > > encoded form of < and >, i.e., '&amp;&lt;' and '&amp;&gt;'. This is
> > > causing
> > > issues when used in our application.
>
> > >http://twitter.com/statuses/update.xml
> > > We are requesting the status update for which the new status is
> > > returned.
