So is data concatenated via buffertools (a SlowBuffer) exempt from this issue? In other words, calling myslowbuffer.toString() on the already-joined buffer won't have the split-byte problem?
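
To make the question concrete, here is a minimal sketch of the three approaches discussed in this thread. It is only an illustration under stated assumptions: Node's core Buffer.concat stands in for buffertools (whose API I have not checked), Buffer.from replaces the older new Buffer(), and the variable names are made up for the example.

```javascript
// Sketch only. Buffer.concat and Buffer.from are Node core calls used here
// as a stand-in for buffertools; they are assumptions, not what the thread used.
var StringDecoder = require('string_decoder').StringDecoder;

var text = 'H\u00e4l\u00f6\u00f6!'; // "Hälöö!" written with escapes
var buf = Buffer.from(text, 'utf8');

// UTF-8 layout: H (1 byte), ä (2), l (1), ö (2), ö (2), ! (1) = 9 bytes.
// Cutting at offset 5 splits the first "ö" across the two chunks.
var chunks = [buf.slice(0, 5), buf.slice(5)];

// 1. Naive: decode each chunk on its own. Each toString() call sees only
//    half of the split character, so the result is damaged.
var naive = chunks.map(function (c) { return c.toString('utf8'); }).join('');

// 2. Concatenate first, decode once. toString() sees every byte.
var joined = Buffer.concat(chunks).toString('utf8');

// 3. StringDecoder: holds back the incomplete trailing bytes of one chunk
//    and completes the character when the next chunk arrives.
var sd = new StringDecoder('utf8');
var decoded = sd.write(chunks[0]) + sd.write(chunks[1]) + sd.end();

console.log('naive matches:   ' + (naive === text));
console.log('joined matches:  ' + (joined === text));
console.log('decoded matches: ' + (decoded === text));
```

If this sketch is right, toString() on the concatenated buffer is safe simply because it sees all the bytes at once, and whether the result happens to be a SlowBuffer should not matter. The decoder is only needed when the chunks are converted to strings one at a time.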

On Thu, May 17, 2012 at 11:22 PM, Tim Caswell <[email protected]> wrote:

> My understanding is it takes care of reconstructing unicode characters
> that had their bytes split across multiple buffers. The toString()
> function on a buffer can't see other buffers.
>
>
> On Fri, May 18, 2012 at 12:18 AM, Dean Mao <[email protected]> wrote:
>
>> So what is the purpose of StringDecoder exactly? The current docs for it
>> appear to be empty. http://nodejs.org/api/string_decoder.html
>>
>> How is it different than doing buffer.toString('utf8')?
>>
>>
>> On Thu, May 17, 2012 at 3:36 PM, Ben Noordhuis <[email protected]> wrote:
>>
>>> On Thu, May 17, 2012 at 9:56 PM, Mattias Ernelli <[email protected]>
>>> wrote:
>>> > How should node buffers/streams be handled if parsing/conversion of
>>> > utf8 encoded text data will be done?
>>> >
>>> > This simple test shows that naive concatenation or processing of
>>> > buffers will fail:
>>> >
>>> > var str = "Hälöö!";
>>> >
>>> > var b = new Buffer(str);
>>> >
>>> > var b1 = b.slice(0, 5);
>>> > var b2 = b.slice(5);
>>> >
>>> > console.log("b: " + b.toString());
>>> > console.log("b1: " + b1.toString());
>>> > console.log("b2: " + b2.toString());
>>> >
>>> > var str2 = b1.toString() + b2.toString();
>>> >
>>> > console.log("str2: " + str2);
>>> >
>>> >
>>> > So assume that a http response containing utf8 encoded text will be
>>> > processed before forwarding it, to convert it to a string it must be
>>> > concatenated first. Or is text manipulation better carried out
>>> > directly on the buffer chunks? Which of course can be pretty hard
>>> > compared to simply applying regex'es/substring manipulation on
>>> > complete strings.
>>> >
>>> > A quick fix is of course to filter all chunks through some decoder that
>>> > keeps track of trailing utf8 sequences that is incomplete, maybe
>>> > that's what the undocumented string decoder does?
>>> > http://nodejs.org/api/string_decoder.html
>>>
>>> Yes, string_decoder will do what you want. Example usage:
>>>
>>>   var StringDecoder = require('string_decoder').StringDecoder;
>>>   var sd = new StringDecoder('utf8');
>>>
>>>   var buf = new Buffer('Hälöö!');
>>>   var buf1 = buf.slice(0, 5);
>>>   var buf2 = buf.slice(5);
>>>
>>>   var str1 = sd.write(buf1);
>>>   var str2 = sd.write(buf2);
>>>   var str3 = str1 + str2;
>>>
>>> Alternatively, you can use node-buffertools (`npm install
>>> buffertools`) to concatenate the buffers efficiently.
>>>
>>> --
>>> Job Board: http://jobs.nodejs.org/
>>> Posting guidelines:
>>> https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines
>>> You received this message because you are subscribed to the Google
>>> Groups "nodejs" group.
>>> To post to this group, send email to [email protected]
>>> To unsubscribe from this group, send email to
>>> [email protected]
>>> For more options, visit this group at
>>> http://groups.google.com/group/nodejs?hl=en?hl=en
