Re: [nodejs] buffer toString with partial utf8 character?

Jimb Esser Mon, 08 Sep 2014 08:35:30 -0700

\uFFFD is a valid character, so you'll always have a valid string.  If you 
do as you suggest, you will have both a valid string and an actual prefix of 
your file, which is probably what you want, representing the first 508 
- 512 bytes (with some chance of chopping off a character that was actually 
in the file, if the last UTF-8 character really was \uFFFD).
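
One way to avoid that last caveat is to let Node hold back the incomplete 
bytes instead of decoding them.  The following is only a minimal sketch, 
assuming Node's built-in string_decoder module and a Node version that has 
Buffer.from; the helper name decodeCompletePrefix is made up for this 
example and is not something proposed in the thread.

```javascript
const { StringDecoder } = require('string_decoder');

// Hypothetical helper (not from the thread): decode only the complete
// UTF-8 characters in buf, leaving any partial trailing bytes undecoded.
function decodeCompletePrefix(buf) {
  const decoder = new StringDecoder('utf8');
  // write() returns the decoded text and holds back an incomplete
  // multi-byte sequence at the end instead of emitting U+FFFD for it.
  return decoder.write(buf);
}

// "hi" followed by the first byte of a two-byte character (c3 a9 is "é").
const truncated = Buffer.from([0x68, 0x69, 0xc3]);
console.log(JSON.stringify(decodeCompletePrefix(truncated)));

// A buffer that genuinely ends in U+FFFD (ef bf bd) keeps that character.
const genuine = Buffer.from([0x68, 0xef, 0xbf, 0xbd]);
console.log(JSON.stringify(decodeCompletePrefix(genuine)));
```

Because the decoder never emits a replacement character for the cut-off 
tail, a \uFFFD at the end of the result can only come from the bytes that 
were really in the buffer.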


On Friday, September 5, 2014 1:08:53 AM UTC-7, Mark Hahn wrote:
>
> So if I find \uFFFD as the last character of a valid but truncated utf8
> buffer and I strip it, I should always end up with a valid string, right?
>
> That was an awkward sentence.  Let me try in code.  If buf is the first
> 512 bytes of a long utf8 file, will the following always produce a valid
> string?
>
>     str = buf.toString();
>     if (str[str.length-1] === '\uFFFD') str = str.slice(0, -1);
>
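
For reference, the quoted snippet can be tried out as a small standalone 
script.  This is only a sketch: the function name decodeTruncated and the 
sample byte values are invented for illustration, and Buffer.from assumes 
a Node version newer than the one current when this thread was written.

```javascript
// Hypothetical wrapper around the quoted snippet: decode the buffer as
// UTF-8 and drop a single trailing U+FFFD replacement character, if any.
function decodeTruncated(buf) {
  var str = buf.toString('utf8');
  if (str.charAt(str.length - 1) === '\uFFFD') {
    str = str.slice(0, -1);
  }
  return str;
}

// "hi" followed by the first byte of a two-byte character (c3 a9 is "é").
// toString() turns the dangling byte into U+FFFD, which is then stripped.
var truncated = Buffer.from([0x68, 0x69, 0xc3]);
console.log(JSON.stringify(decodeTruncated(truncated)));

// The caveat from the reply: a buffer that genuinely ends in U+FFFD
// (bytes ef bf bd) loses that real character as well.
var genuine = Buffer.from([0x68, 0xef, 0xbf, 0xbd]);
console.log(JSON.stringify(decodeTruncated(genuine)));
```

Both calls return a valid string; the second one shows the case where a 
character that was actually in the file gets chopped off.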
