PHP basically treats strings as C strings (byte arrays). It won't do
anything special automatically, so it is up to you to handle your strings
safely. Make sure your source files are saved as UTF-8 and that your
responses declare UTF-8 in their Content-Type. Use UTF-8 wherever you can in
your databases; where you can't, convert with utf8_encode/utf8_decode (which
only translate between ISO-8859-1 and UTF-8) and use the mb_* functions in
place of the byte-oriented string functions. If you are making HTTP requests,
mark the encoding correctly in the request (with cURL, set the charset to
UTF-8 in your request headers).
In Java, all strings are high-level sequences of chars (internally UTF-16
code units, but you rarely need to worry about that). You just need to make
sure you decode/encode properly and mark your charsets in your requests and
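
To make the "decode/encode properly" point concrete, here is a minimal Java
sketch that reproduces the "nÃ£o" symptom from the quoted messages and then
undoes it. The class and method names (CharsetDemo, misdecodeAsLatin1, repair)
are made up for illustration; only the standard String and StandardCharsets
APIs are assumed. It is not taken from any Twitter client library.

```java
import java.nio.charset.StandardCharsets;

public class CharsetDemo {
    // Hypothetical helper: encode text as UTF-8 bytes, then decode those
    // bytes with the WRONG charset (ISO-8859-1). This is the mistake that
    // turns each multi-byte UTF-8 character into several Latin-1 characters.
    static String misdecodeAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Hypothetical helper: undo the mistake. ISO-8859-1 maps every byte to
    // exactly one char, so re-encoding recovers the original bytes, which
    // are then decoded with the charset they were really written in.
    static String repair(String garbled) {
        byte[] original = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u00e3" is a-tilde; the escape keeps this source file ASCII-only.
        String garbled = misdecodeAsLatin1("n\u00e3o");
        System.out.println(garbled);          // 4 chars: n, U+00C3, U+00A3, o
        System.out.println(repair(garbled));  // back to 3 chars: n, U+00E3, o
    }
}
```

The repair step only works for a single UTF-8-read-as-ISO-8859-1 mix-up. The
real fix is to name the charset explicitly everywhere bytes become strings,
rather than relying on the platform default.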
On May 13, 2010, at 10:51 AM, Matt Sanford <m...@twitter.com> wrote:
> Hi giustin,
> I don't think it's the same issue since yours is more PHP specific.
> My guess is that the PHP library in question or the code you're using
> to process the results is incorrectly converting between UTF-8 and
> ISO-8859-1 . Maybe someone on the list with some more PHP knowledge
> can suggest a fix.
> — Matt Sanford / @mzsanford
> The UTF-8 encoding of ã is two bytes. When those same two bytes are
> interpreted as ISO-8859-1 (a.k.a ISO-Latin-1) they are interpreted as
> two characters, like so (fixed width font required):
>   UTF-8    Bytes    Same bytes in ISO-8859-1
>   n        0x6E     n
>   ã        0xC3     Ã
>            0xA3     £
>   o        0x6F     o
> On May 12, 7:19 pm, giustin <tgiu...@gmail.com> wrote:
>> I have similar problems.
>> When I try to search using the tag "não" the result is "nÃ£o". The
>> API that I used was the Twitter Search API from Ryan Faerman (http://
>> On 12 maio, 21:47, Matt Sanford <m...@twitter.com> wrote:
>>> Hi there,
>>> All characters in Tweets are UTF-8. I'm assuming you're looking
>>> for something specific like accents or ASCII-art punctuation. Can you
>>> describe your problem in a little more detail? I might be able to help
>>> once I know what you're trying to prevent.
>>> — Matt Sanford / @mzsanford
>>> On May 12, 4:21 pm, adamjamesdrew <theikl...@gmail.com> wrote:
>>>> any ideas?