So is data concatenated via buffertools (a SlowBuffer) exempt from this issue?
That is, myslowbuffer.toString() won't have the split-byte issue?
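
To make the question concrete, here is a minimal sketch of what I mean. It uses Buffer.concat from newer versions of Node core as a stand-in for buffertools' concat (an assumption on my part, I have not verified that buffertools behaves identically), and Buffer.from in place of new Buffer. The variable names are just for illustration.

```javascript
// Sketch only: Buffer.concat stands in for buffertools here (assumption).
var buf = Buffer.from('Hälöö!', 'utf8'); // 9 bytes: ä and ö take 2 bytes each
var chunk1 = buf.slice(0, 5);            // ends in the middle of the first "ö"
var chunk2 = buf.slice(5);

// Decoding each chunk on its own corrupts the character that was split.
var naive = chunk1.toString('utf8') + chunk2.toString('utf8');

// Joining the bytes first means toString() sees the complete sequence.
var joined = Buffer.concat([chunk1, chunk2]).toString('utf8');

console.log(naive === 'Hälöö!');  // false
console.log(joined === 'Hälöö!'); // true
```

If that holds for buffertools as well, the only cost is keeping all chunks in memory until the last one arrives.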


On Thu, May 17, 2012 at 11:22 PM, Tim Caswell <[email protected]> wrote:

> My understanding is it takes care of reconstructing unicode characters
> that had their bytes split across multiple buffers.  The toString()
> function on a buffer can't see other buffers.
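
If I read this correctly, the decoder holds back an incomplete trailing sequence until the next write() supplies the rest. A minimal sketch of that behavior, using the same string as the original example (variable names are hypothetical, and Buffer.from is used in place of new Buffer):

```javascript
var StringDecoder = require('string_decoder').StringDecoder;

var decoder = new StringDecoder('utf8');
var whole = Buffer.from('Hälöö!', 'utf8');
var parts = [whole.slice(0, 5), whole.slice(5)]; // split inside the first "ö"

// Feed the chunks in order; write() returns only complete characters.
var pieces = parts.map(function (part) {
  return decoder.write(part);
});

// end() flushes anything still buffered (nothing here, the input is complete).
var text = pieces.join('') + decoder.end();

console.log(pieces[0]); // "Häl"
console.log(text);      // "Hälöö!"
```

Unlike concatenating first, this lets each chunk be processed as a string as soon as it arrives.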
>
>
> On Fri, May 18, 2012 at 12:18 AM, Dean Mao <[email protected]> wrote:
>
>> So what is the purpose of StringDecoder exactly?  The current docs for it
>> appear to be empty.  http://nodejs.org/api/string_decoder.html
>>
>> How is it different than doing buffer.toString('utf8')?
>>
>>
>> On Thu, May 17, 2012 at 3:36 PM, Ben Noordhuis <[email protected]> wrote:
>>
>>> On Thu, May 17, 2012 at 9:56 PM, Mattias Ernelli <[email protected]> wrote:
>>> > How should node buffers/streams be handled if parsing/conversion of
>>> > utf8 encoded text data will be done?
>>> >
>>> > This simple test shows that naive concatenation or processing of
>>> > buffers will fail:
>>> >
>>> > var str = "Hälöö!";
>>> >
>>> > var b = new Buffer(str);
>>> >
>>> > var b1 = b.slice(0, 5);
>>> > var b2 = b.slice(5);
>>> >
>>> > console.log("b: " + b.toString());
>>> > console.log("b1: " + b1.toString());
>>> > console.log("b2: " + b2.toString());
>>> >
>>> > var str2 = b1.toString() + b2.toString();
>>> >
>>> > console.log("str2: " + str2);
>>> >
>>> >
>>> > So assume that an http response containing utf8 encoded text will be
>>> > processed before forwarding it; to convert it to a string it must be
>>> > concatenated first. Or is text manipulation better carried out directly
>>> > on the buffer chunks? That can of course be pretty hard compared to
>>> > simply applying regexes/substring manipulation on complete strings.
>>> >
>>> > A quick fix is of course to filter all chunks through some decoder that
>>> > keeps track of incomplete trailing utf8 sequences; maybe that's what
>>> > the undocumented string decoder does?
>>> > http://nodejs.org/api/string_decoder.html
>>>
>>> Yes, string_decoder will do what you want. Example usage:
>>>
>>>  var StringDecoder = require('string_decoder').StringDecoder;
>>>  var sd = new StringDecoder('utf8');
>>>
>>>  var buf = new Buffer('Hälöö!');
>>>  var buf1 = buf.slice(0, 5);
>>>  var buf2 = buf.slice(5);
>>>
>>>  var str1 = sd.write(buf1);
>>>  var str2 = sd.write(buf2);
>>>  var str3 = str1 + str2;
>>>
>>> Alternatively, you can use node-buffertools (`npm install
>>> buffertools`) to concatenate the buffers efficiently.
>>>
>>> --
>>> Job Board: http://jobs.nodejs.org/
>>> Posting guidelines:
>>> https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines
>>> You received this message because you are subscribed to the Google
>>> Groups "nodejs" group.
>>> To post to this group, send email to [email protected]
>>> To unsubscribe from this group, send email to
>>> [email protected]
>>> For more options, visit this group at
>>> http://groups.google.com/group/nodejs?hl=en?hl=en
>>>
>>
>

