My understanding is that it reconstructs multi-byte UTF-8 characters whose bytes were split across multiple buffers. A buffer's toString() only sees its own bytes, so it cannot complete a character that continues in the next buffer.
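To make the difference concrete, here is a small sketch using the same test string as the example quoted in this thread. It is only an illustration under a couple of assumptions: the names chunk1, chunk2, naive and decoded are mine, and Buffer.from is the spelling newer Node versions provide in place of `new Buffer`.

```javascript
// Sketch only: compares per-chunk toString() with a shared StringDecoder.
const { StringDecoder } = require('string_decoder');

const buf = Buffer.from('Hälöö!', 'utf8'); // 9 bytes: ä and ö take 2 bytes each
const chunk1 = buf.slice(0, 5);            // ends with the first byte of the first ö
const chunk2 = buf.slice(5);               // starts with the second byte of that ö

// Each toString() call decodes its chunk in isolation, so the split ö
// turns into a U+FFFD replacement character on both sides of the cut.
const naive = chunk1.toString('utf8') + chunk2.toString('utf8');

// The decoder holds back the incomplete trailing byte and prepends it
// to the next chunk; end() flushes anything still buffered.
const decoder = new StringDecoder('utf8');
const decoded = decoder.write(chunk1) + decoder.write(chunk2) + decoder.end();

console.log(naive === 'Hälöö!');   // false
console.log(decoded === 'Hälöö!'); // true
```

The important part is that one decoder instance is reused for every chunk of the same stream, since the held-back bytes live in that instance.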
On Fri, May 18, 2012 at 12:18 AM, Dean Mao <[email protected]> wrote:
> So what is the purpose of StringDecoder exactly? The current docs for it
> appear to be empty. http://nodejs.org/api/string_decoder.html
>
> How is it different than doing buffer.toString('utf8')?
>
>
> On Thu, May 17, 2012 at 3:36 PM, Ben Noordhuis <[email protected]> wrote:
>
>> On Thu, May 17, 2012 at 9:56 PM, Mattias Ernelli <[email protected]>
>> wrote:
>> > How should node buffers/streams be handled if parsing/conversion of utf8
>> > encoded text data will be done?
>> >
>> > This simple test shows that naive concatenation or processing of buffers
>> > will fail:
>> >
>> > var str = "Hälöö!";
>> >
>> > var b = new Buffer(str);
>> >
>> > var b1 = b.slice(0, 5);
>> > var b2 = b.slice(5);
>> >
>> > console.log("b: " + b.toString());
>> > console.log("b1: " + b1.toString());
>> > console.log("b2: " + b2.toString());
>> >
>> > var str2 = b1.toString() + b2.toString();
>> >
>> > console.log("str2: " + str2);
>> >
>> >
>> > So assume that a http response containing utf8 encoded text will be
>> > processed before forwarding it, to convert it to a string it must be
>> > concatenated first. Or is text manipulation better carried out directly on
>> > the buffer chunks? Which of course can be pretty hard compared to simply
>> > applying regex'es/substring manipulation on complete strings.
>> >
>> > A quick fix is of course to filter all chunks through some decoder that
>> > keeps track of trailing utf8 sequences that is incomplete, maybe that's what
>> > the undocumented string decoder does?
>> > http://nodejs.org/api/string_decoder.html
>>
>> Yes, string_decoder will do what you want.
>> Example usage:
>>
>>   var StringDecoder = require('string_decoder').StringDecoder;
>>   var sd = new StringDecoder('utf8');
>>
>>   var buf = new Buffer('Hälöö!');
>>   var buf1 = buf.slice(0, 5);
>>   var buf2 = buf.slice(5);
>>
>>   var str1 = sd.write(buf1);
>>   var str2 = sd.write(buf2);
>>   var str3 = str1 + str2;
>>
>> Alternatively, you can use node-buffertools (`npm install
>> buffertools`) to concatenate the buffers efficiently.
>>
>> --
>> Job Board: http://jobs.nodejs.org/
>> Posting guidelines:
>> https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines
>> You received this message because you are subscribed to the Google
>> Groups "nodejs" group.
>> To post to this group, send email to [email protected]
>> To unsubscribe from this group, send email to
>> [email protected]
>> For more options, visit this group at
>> http://groups.google.com/group/nodejs?hl=en?hl=en
