As I understand it, StringDecoder reassembles multi-byte Unicode characters
whose bytes were split across multiple buffers.  Calling toString() on a
single buffer can't do that, because it has no view of the other buffers.
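
A minimal sketch of the difference, assuming a current Node.js where
Buffer.from(), Buffer.concat() and buf.subarray() are available (none of
these appear in this thread, and the variable names are only for
illustration). It decodes the same two chunks three ways: per-chunk
toString(), StringDecoder, and concatenating before decoding.

```javascript
const { StringDecoder } = require('string_decoder');

// "Hälöö!" written with escapes so the byte layout does not depend on how
// this file is saved. In UTF-8 it is 9 bytes: H, ä (2), l, ö (2), ö (2), !
const text = 'H\u00e4l\u00f6\u00f6!';
const whole = Buffer.from(text, 'utf8');

// Cutting at byte 5 splits the first "ö" (0xC3 | 0xB6) across two chunks.
const chunks = [whole.subarray(0, 5), whole.subarray(5)];

// 1. Naive: decode each chunk on its own. toString() cannot see the other
//    chunk, so each half of the split character becomes U+FFFD.
const naive = chunks.map((c) => c.toString('utf8')).join('');

// 2. StringDecoder: write() holds back an incomplete trailing sequence and
//    completes it with the bytes from the next write(). end() flushes
//    whatever is still buffered once the input is finished.
const decoder = new StringDecoder('utf8');
const decoded = chunks.map((c) => decoder.write(c)).join('') + decoder.end();

// 3. Concatenate first, then decode once. Simple, but the whole body has
//    to be held in memory before any text is available.
const joined = Buffer.concat(chunks).toString('utf8');

console.log('naive:   ' + naive);
console.log('decoded: ' + decoded);
console.log('joined:  ' + joined);
```

Options 2 and 3 both give back the original string. The decoder is
probably the better fit when chunks should be processed as they arrive;
concatenating is fine when the whole body is needed anyway.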

On Fri, May 18, 2012 at 12:18 AM, Dean Mao <[email protected]> wrote:

> So what is the purpose of StringDecoder exactly?  The current docs for it
> appear to be empty.  http://nodejs.org/api/string_decoder.html
>
> How is it different than doing buffer.toString('utf8')?
>
>
> On Thu, May 17, 2012 at 3:36 PM, Ben Noordhuis <[email protected]> wrote:
>
>> On Thu, May 17, 2012 at 9:56 PM, Mattias Ernelli <[email protected]>
>> wrote:
>> > How should node buffers/streams be handled when utf8-encoded text data
>> > needs to be parsed or converted?
>> >
>> > This simple test shows that naive concatenation or processing of buffers
>> > will fail:
>> >
>> > var str = "Hälöö!";
>> >
>> > var b = new Buffer(str);
>> >
>> > var b1 = b.slice(0, 5);
>> > var b2 = b.slice(5);
>> >
>> > console.log("b: " + b.toString());
>> > console.log("b1: " + b1.toString());
>> > console.log("b2: " + b2.toString());
>> >
>> > var str2 = b1.toString() + b2.toString();
>> >
>> > console.log("str2: " + str2);
>> >
>> >
>> > So assume that an HTTP response containing utf8-encoded text is to be
>> > processed before it is forwarded. To convert it to a string, it must be
>> > concatenated first. Or is text manipulation better carried out directly
>> > on the buffer chunks? That can of course be pretty hard compared to
>> > simply applying regexes/substring manipulation to complete strings.
>> >
>> > A quick fix is of course to filter all chunks through some decoder that
>> > keeps track of trailing utf8 sequences that are incomplete; maybe that's
>> > what the undocumented string decoder does?
>> > http://nodejs.org/api/string_decoder.html
>>
>> Yes, string_decoder will do what you want. Example usage:
>>
>>  var StringDecoder = require('string_decoder').StringDecoder;
>>  var sd = new StringDecoder('utf8');
>>
>>  var buf = new Buffer('Hälöö!');
>>  var buf1 = buf.slice(0, 5);
>>  var buf2 = buf.slice(5);
>>
>>  var str1 = sd.write(buf1);
>>  var str2 = sd.write(buf2);
>>  var str3 = str1 + str2;
>>
>> Alternatively, you can use node-buffertools (`npm install
>> buffertools`) to concatenate the buffers efficiently.
>>
>> --
>> Job Board: http://jobs.nodejs.org/
>> Posting guidelines:
>> https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines
>> You received this message because you are subscribed to the Google
>> Groups "nodejs" group.
>> To post to this group, send email to [email protected]
>> To unsubscribe from this group, send email to
>> [email protected]
>> For more options, visit this group at
>> http://groups.google.com/group/nodejs?hl=en?hl=en
>>
>
