v0.6.18 — we haven't bumped our project up to 0.8 yet. Just tried with 0.8.3 and you're right — looks good in 0.8. Didn't think an upgrade would change this particular behavior so I didn't try it out. :)
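For anyone who finds this thread later, here is a minimal sketch of the same comparison for a current Node.js. It assumes `Buffer.from` and the `'latin1'` encoding name are available; neither appears in the original messages, which used `new Buffer(..., 'binary')` on 0.6 and 0.8. Treat it as a quick sanity check, not a statement about how any particular release decodes UTF-8.

```javascript
// Sketch only: Buffer.from and 'latin1' are assumptions about the runtime;
// the thread itself used `new Buffer(binary, 'binary')`.

// The four UTF-8 bytes of the emoji (F0 9F 8D 94), one byte per character.
var binary = '\u00f0\u009f\u008d\u0094';
// The same four bytes, percent-encoded.
var urlencoded = '%F0%9F%8D%94';

// Path 1: turn each character back into one byte, then decode as UTF-8.
var fromBuffer = Buffer.from(binary, 'latin1').toString('utf8');
// Path 2: let decodeURIComponent do the UTF-8 decoding.
var fromUri = decodeURIComponent(urlencoded);

// Both paths should yield the surrogate pair quoted in the thread.
console.log(fromBuffer === fromUri);
console.log(fromUri === '\ud83c\udf54');
```

If the first comparison prints `false`, the Buffer path is most likely substituting \ufffd for the four-byte sequence, which is the 0.6 behavior described in this thread.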
Amazing. Thanks!

-t

On Fri, Jul 20, 2012 at 3:48 PM, Marcel Laverdet <[email protected]> wrote:

> What version of node? This is what I get:
>
> > var moji1 = (new Buffer('\xf0\x9f\x8d\x94', 'binary')).toString('utf-8');
> > var moji2 = (new Buffer('\u00f0\u009f\u008d\u0094', 'binary')).toString('utf-8');
> > var moji3 = decodeURIComponent('%F0%9F%8D%94');
> > moji1 == moji2
> true
> > moji2 == moji3
> true
>
> On Fri, Jul 20, 2012 at 1:48 PM, Taylor Hughes <[email protected]> wrote:
>
>> Hi nodejs group!
>>
>> I was just wrestling with a bug in our app, concerning an iPhone emoji
>> => multipart POST to a node.js backend (decoding with the formidable
>> library), and came across the following Interesting Case™.
>>
>> The bug was: emoji chars POSTed from an iPhone, as part of a multipart
>> request, were being converted into \ufffd (Unicode replacement) chars,
>> whereas with form-encoded POSTs they were not.
>>
>> From this behavior I isolated the following interesting snippet:
>>
>> // This is an emoji character POSTed by an iPhone:
>> var binary = '\u00f0\u009f\u008d\u0094';
>> // The same binary string, urlencoded byte for byte (what you get with a
>> // form-encoded POST of the same thing):
>> var urlencoded = '%F0%9F%8D%94';
>>
>> // Convert from the binary string
>> var utf8 = new Buffer(binary, 'binary').toString('utf-8');
>>
>> // Convert from the urlencoded version of the same thing
>> var utf8uri = decodeURIComponent(urlencoded);
>>
>> // Results are not the same:
>> utf8 == utf8uri // false
>>
>> // utf8 => "\ufffd" (Unicode replacement character)
>> // utf8uri => "\ud83c\udf54" (characters the iPhone can understand as the
>> // original emoji)
>>
>> (Note that normal multibyte UTF-8 characters go through both the same
>> way, and seem to come out fine in both cases.)
>>
>> I'm mostly curious about why this happens, namely why
>> decodeURIComponent() is seemingly more permissive with UTF-8 decoding than
>> other mechanisms like StringDecoder() and Buffer.toString(), and whether
>> there's a way to preserve strange UTF-8 characters using those mechanisms
>> too.
>>
>> Thanks!
>> Taylor
>>
>> --
>> Job Board: http://jobs.nodejs.org/
>> Posting guidelines:
>> https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines
>> You received this message because you are subscribed to the Google
>> Groups "nodejs" group.
>> To post to this group, send email to [email protected]
>> To unsubscribe from this group, send email to
>> [email protected]
>> For more options, visit this group at
>> http://groups.google.com/group/nodejs?hl=en?hl=en
