I faced a similar issue recently with 0.6.x that was fixed in 0.8.x,
since 0.8.x includes a newer version of V8 that addressed some
Unicode handling issues, such as:

https://github.com/joyent/node/issues/2686

The linked V8 issue has more details.
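
For anyone checking this on their own install, here is a minimal sketch
of the comparison from the original post. It assumes a current Node.js
release, so it uses Buffer.from() and the 'latin1' encoding name in place
of the `new Buffer(..., 'binary')` calls quoted below. My understanding
(an assumption, not verified against the V8 source) is that the emoji is
a 4-byte UTF-8 sequence for a code point outside the BMP, which is the
case the older decoder mishandled.

```javascript
// Sketch only: assumes a current Node.js release where Buffer.from() exists.
// Each character of `binary` holds one byte of the UTF-8 encoded emoji.
const binary = '\u00f0\u009f\u008d\u0094';
const urlencoded = '%F0%9F%8D%94';

// Decode the same four bytes through both paths.
const fromBinary = Buffer.from(binary, 'latin1').toString('utf8');
const fromUri = decodeURIComponent(urlencoded);

// U+1F354 lies outside the BMP, so a JavaScript string stores it as a
// surrogate pair (two UTF-16 code units).
const codePoint = fromBinary.codePointAt(0);
const sameResult = fromBinary === fromUri;

console.log(sameResult, fromBinary.length, codePoint.toString(16));
// prints: true 2 1f354
```

If the first value printed is false, or the string contains \ufffd, the
runtime is probably still hitting the old decoding behavior.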

-- Daniel R. <[email protected]> [http://danielr.neophi.com/]


On Fri, Jul 20, 2012 at 6:54 PM, Taylor Hughes <[email protected]> wrote:
> v0.6.18 — we haven't bumped our project up to 0.8 yet.
>
> Just tried with 0.8.3 and you're right — looks good in 0.8. Didn't think an
> upgrade would change this particular behavior so I didn't try it out. :)
>
> Amazing. Thanks!
>
> -t
>
>
>
> On Fri, Jul 20, 2012 at 3:48 PM, Marcel Laverdet <[email protected]>
> wrote:
>>
>> What version of node? This is what I get:
>>
>> > var moji1 = (new Buffer('\xf0\x9f\x8d\x94', 'binary')).toString('utf-8');
>> > var moji2 = (new Buffer('\u00f0\u009f\u008d\u0094', 'binary')).toString('utf-8');
>> > var moji3 = decodeURIComponent('%F0%9F%8D%94');
>> > moji1 == moji2
>> true
>> > moji2 == moji3
>> true
>>
>>
>> On Fri, Jul 20, 2012 at 1:48 PM, Taylor Hughes <[email protected]> wrote:
>>>
>>> Hi nodejs group!
>>>
>>> I was just wrestling with a bug in our app, concerning an iPhone emoji
>>> => multipart POST to a node.js backend (decoding with the formidable
>>> library), and came across the following Interesting Case™.
>>>
>>> The bug was: emoji chars POSTed from an iPhone, as part of a multipart
>>> request, were being converted into \ufffd (the Unicode replacement
>>> character), whereas with form-encoded POSTs they were not.
>>>
>>> From this behavior I isolated the following interesting snippet:
>>>
>>> // This is an emoji character POSTed by an iPhone:
>>> var binary = '\u00f0\u009f\u008d\u0094';
>>> // The same binary string, urlencoded byte for byte (what you get with a
>>> // form-encoded POST of the same thing):
>>> var urlencoded = '%F0%9F%8D%94';
>>>
>>> // Convert from the binary string
>>> var utf8 = new Buffer(binary, 'binary').toString('utf-8');
>>>
>>> // Convert from the urlencoded version of the same thing
>>> var utf8uri = decodeURIComponent(urlencoded);
>>>
>>> // Results are not the same:
>>> utf8 == utf8uri // false
>>>
>>> // utf8    => "\ufffd" (Unicode replacement character)
>>> // utf8uri => "\ud83c\udf54" (characters the iPhone can understand as the
>>> // original emoji)
>>>
>>>
>>> (Note that normal multibyte UTF-8 characters go through both the same
>>> way, and seem to come out fine in both cases.)
>>>
>>> I'm mostly curious about why this happens — namely why
>>> decodeURIComponent() is seemingly more permissive with UTF-8 decoding than
>>> other mechanisms like StringDecoder() and Buffer.toString() — and if there's
>>> a way to preserve strange UTF-8 characters using those mechanisms too.
>>>
>>> Thanks!
>>> Taylor
>>>
>>> --
>>> Job Board: http://jobs.nodejs.org/
>>> Posting guidelines:
>>> https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines
>>> You received this message because you are subscribed to the Google
>>> Groups "nodejs" group.
>>> To post to this group, send email to [email protected]
>>> To unsubscribe from this group, send email to
>>> [email protected]
>>> For more options, visit this group at
>>> http://groups.google.com/group/nodejs?hl=en?hl=en
>>
>>
>
>

