The position can *not* be in the middle of a character (the docs say
"however, the will not write only partially encoded characters."*). That
does mean the write may not completely fill the buffer: it will leave the
last bytes unused rather than write a partial character.
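
To make that concrete, here is a small sketch (the buffer size and sample string are made up for illustration) in which the last character would need 3 bytes but only 2 are left, so the write should stop early:

```javascript
// Sketch: Buffer#write stops before a character that would not fit whole.
var buf = Buffer.alloc(4);           // room for 4 bytes, zero-filled
var str = 'ab\u20AC';                // 'a' (1 byte) + 'b' (1 byte) + euro sign (3 bytes)

var written = buf.write(str);        // UTF-8 is the default encoding
var needed = Buffer.byteLength(str); // bytes the whole string would take

// written < needed means part of the string was left out.
console.log(written, needed, written < needed);
```

Comparing the return value against Buffer.byteLength(str) is the check Matt suggests in the quoted message below; the unused tail of the buffer stays untouched.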
When we needed to do this same thing ourselves in the past, I believe we
found the most efficient solution was to just scan the string and calculate
how many characters of it a given number of UTF-8 bytes covers. That was
much more efficient than allocating a new Buffer every time (the whole point
of writing into an existing buffer was to avoid allocating buffers ;).
There's probably a module somewhere, but I think we used the snippet below.
I think you should be able to do:

var remaining = str.slice(bytesToOffs(str, buf.write(str)));

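function bytesToOffs(str, num_bytes) {
  var idx = 0;
  while (num_bytes > 0 && idx < str.length) {
    var c = str.charCodeAt(idx);
    var units = 1; // UTF-16 code units this character occupies in str
    if (c <= 0x7F) {
      num_bytes -= 1;
    } else if (c <= 0x07FF) {
      num_bytes -= 2;
    } else if (c >= 0xD800 && c <= 0xDBFF && idx + 1 < str.length &&
               (str.charCodeAt(idx + 1) & 0xFC00) === 0xDC00) {
      // Surrogate pair: one code point above U+FFFF, 4 bytes in UTF-8.
      // (charCodeAt never returns more than 0xFFFF, so pairs need this check.)
      num_bytes -= 4;
      units = 2;
    } else {
      // Rest of the BMP; a lone surrogate also takes 3 bytes.
      num_bytes -= 3;
    }
    // Only step past the character if all of its bytes fit.
    if (num_bytes >= 0) {
      idx += units;
    }
  }
  return idx;
}
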
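
As a rough usage sketch, the same scan can drive a loop that pushes a long string through a small fixed buffer one chunk at a time. The names utf8BytesToIndex and writeInChunks, the 4-byte buffer, and the sample text are all hypothetical, and the helper is my own compact variant using codePointAt rather than the exact function from this message:

```javascript
// Hypothetical helper: index in `str` reached after `numBytes` bytes of UTF-8.
function utf8BytesToIndex(str, numBytes) {
  var idx = 0;
  while (numBytes > 0 && idx < str.length) {
    var cp = str.codePointAt(idx); // full code point, even for surrogate pairs
    var size = cp <= 0x7F ? 1 : cp <= 0x7FF ? 2 : cp <= 0xFFFF ? 3 : 4;
    numBytes -= size;
    if (numBytes >= 0) {
      idx += cp > 0xFFFF ? 2 : 1;  // astral characters take two UTF-16 units
    }
  }
  return idx;
}

// Write `str` through the reused buffer `buf`, keeping a copy of each chunk.
function writeInChunks(str, buf) {
  var chunks = [];
  var rest = str;
  while (rest.length > 0) {
    var written = buf.write(rest);
    if (written === 0) {
      // Avoid looping forever when not even one character fits.
      throw new Error('buffer too small for the next character');
    }
    chunks.push(Buffer.from(buf.subarray(0, written))); // copy the bytes out
    rest = rest.slice(utf8BytesToIndex(rest, written));
  }
  return chunks;
}

var text = 'h\u00E9llo \u20AC \uD83D\uDE00!';
var chunks = writeInChunks(text, Buffer.alloc(4));
var roundTrip = Buffer.concat(chunks).toString('utf8');
console.log(chunks.length, roundTrip === text);
```

The copies in chunks are only there so the result can be checked; in real use you would hand each filled buffer to whatever consumes it instead of copying.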
Note: I last looked at this closely on Node 0.6 or so, so there might be
better approaches now. Also, if you have a reasonable upper bound on your
string size, it might be more efficient to keep one large, reused buffer,
write your string into that first, and use it for the other operations
(including copying bytes directly from that temp buffer to the other buffer,
if you don't mind partial characters being written). Depending on what
exactly you need the character index for, though, that might not help.
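
On the "better approaches now" point, one candidate is TextEncoder's encodeInto, which reports both the bytes written and the string units consumed. This is only a sketch and assumes a Node release new enough to provide TextEncoder as a global with encodeInto, which was not the case for the versions discussed in this thread; the variable names and sizes are made up:

```javascript
// Sketch: TextEncoder#encodeInto reports both sides of the write.
var encoder = new TextEncoder();  // assumed to be available as a global
var target = new Uint8Array(4);   // a Buffer should work too, as it is a Uint8Array
var input = 'ab\u20AC';           // 1 + 1 + 3 bytes in UTF-8

// `read` counts UTF-16 code units consumed from the string,
// `written` counts bytes stored in the target.
var result = encoder.encodeInto(input, target);
var remaining = input.slice(result.read);

console.log(result.read, result.written, remaining);
```

Because read is already a string offset, no separate scan is needed to find where to resume.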
Jimb
* Quoted bad grammar in the current live docs, PR
<https://github.com/nodejs/node/pull/4863> sent.
On Sunday, January 24, 2016 at 11:48:08 PM UTC-8, Matt Sergeant wrote:
>
> Because the string could contain non-ascii data, you'd have to create a
> buffer from the string to get that position. Be aware that the position
> could be in the middle of a character, so you'll have to deal with that.
>
> You can use Buffer.byteLength(thestring) == buf.write(thestring) to find
> out if the write didn't copy the entire string.
>
> On Sat, Jan 23, 2016 at 2:38 PM, Chethiya Abeysinghe <[email protected]> wrote:
>
>> This question is regarding this :
>> https://nodejs.org/api/buffer.html#buffer_buf_write_string_offset_length_encoding
>>
>> In case the Buffer does not have enough space to fit all the bytes, this
>> function returns the number of bytes it could write to the buffer. But is
>> there an efficient way to find the last index of the string written to that
>> buffer so that the rest of the string can be processed separately?
>>
>> P.S. I get that creating another string from the buffer and getting its
>> length solves this. But my understanding is that it's not that efficient,
>> as it allocates more memory for the new String?
>>
>> Thanks
>>
>> --
>> Job board: http://jobs.nodejs.org/
>> New group rules:
>> https://gist.github.com/othiym23/9886289#file-moderation-policy-md
>> Old group rules:
>> https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "nodejs" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/nodejs/54b20633-8f0c-4460-98d6-4f9f403d9c0f%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>