On Mon, Mar 9, 2009 at 8:41 PM, Erik Corry <[email protected]> wrote:
> 2009/3/9 Stephan Beal <[email protected]>:
> Here's the text from v8.h:
>
>   * Allocates a new string from either utf-8 encoded or ascii data.
>   * The second parameter 'length' gives the buffer length.
>   * If the data is utf-8 encoded, the caller must
>   * be careful to supply the length parameter.
>   * If it is not given, the function calls
>   * 'strlen' to determine the buffer length, it might be
>   * wrong if 'data' contains a null character.
>   */
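As an aside on the 'strlen' caveat in that comment: the following is a minimal standalone sketch (no V8 involved) of why a strlen-based default undercounts a buffer with an embedded null. The names LengthSeenByStrlen, kBuffer and kBufferLength are made up for illustration.

```cpp
#include <cstddef>
#include <cstring>

// What a strlen-based default length would report: strlen stops at the
// first '\0', no matter how much data follows it.
std::size_t LengthSeenByStrlen(const char* data) {
    return std::strlen(data);
}

// Hypothetical 7-byte payload with a null in the middle: "abc\0def".
static const char kBuffer[] = {'a', 'b', 'c', '\0', 'd', 'e', 'f'};

// The real payload size, which the caller has to pass along explicitly.
static const std::size_t kBufferLength = sizeof(kBuffer);
```

Here strlen sees only the bytes before the null, so anything read from a file or socket should carry its byte count along with it instead of relying on the default.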
Aha, okay i wasn't clear on the automatic assumption to utf8. Fair enough.

> So it will assume that it is UTF-8 if it is not ASCII.  Not all binary
> sequences are valid UTF-8 so you can't use this for binary data.
> Internally, V8 does not use UTF-8 so this data will be converted to
> UC16.

Doh, and here all along i assumed utf8 was what WAS used, as the API has Utf8Value but no Utf16Value.

>  /** Allocates a new string from utf16 data.*/
>  static Local<String> New(const uint16_t* data, int length = -1);
>
> This one takes 16 bit characters and can represent binary data with no
> corruption, but the length is in characters, so you can't use it for
> an odd number of bytes.

What's the byte order?

>> In my case i'm working on an i/o library which of course treats the
>> data as opaque (void*). If i understand you correctly, if it happens
...
> Giving binary data to the above New method will result in undefined behaviour.

Fair enough.

> The external strings must have their data either in ASCII or in UC16.
> There's no Latin1 and undefined stuff will result if you try.  In the
> case of an external string the actual string data is not on the V8
> heap.  It is assumed to be immutable too of course since all JS
> strings are immutable.

That wouldn't solve my case, which is effectively latin1. i'll need to think about that (but don't mind living with the limitation of ascii read/write).

>> That's an idea. Didn't think of that. It'd mean (in my case) buffering
>> arbitrarily large read buffers, and since v8 doesn't guarantee GC will
>> ever be called, i don't want to risk it causing an arbitrarily-sized
>> leak.
>
> If the data is on the V8 heap then it won't be collected without a GC either.
>
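On the latin1 point above, one possible workaround is to widen each byte to a 16-bit code unit before handing the data over, since byte values 0x00..0xFF coincide with code points U+0000..U+00FF. The sketch below is standalone C++ with no V8 calls; WidenLatin1 and NarrowToLatin1 are hypothetical helper names, and whether the result behaves well when passed to the quoted New(const uint16_t*, int) overload is an assumption that has not been tested here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Widens each 8-bit byte to one 16-bit code unit. An odd number of input
// bytes is fine, because every byte becomes its own unit.
std::vector<std::uint16_t> WidenLatin1(const unsigned char* data,
                                       std::size_t length) {
    std::vector<std::uint16_t> out;
    out.reserve(length);
    for (std::size_t i = 0; i < length; ++i) {
        out.push_back(static_cast<std::uint16_t>(data[i]));
    }
    return out;
}

// Reverse direction for the write path. Returns false as soon as a unit
// above 0xFF is found, since that cannot be stored in a single byte.
bool NarrowToLatin1(const std::vector<std::uint16_t>& units,
                    std::vector<unsigned char>* out) {
    out->clear();
    out->reserve(units.size());
    for (std::uint16_t unit : units) {
        if (unit > 0xFF) {
            return false;
        }
        out->push_back(static_cast<unsigned char>(unit));
    }
    return true;
}
```

The cost is a doubled buffer and a copy in each direction. The sketch works on uint16_t values rather than raw byte pairs, so it does not by itself settle the byte-order question.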
:) But even if i registered it for gc via a weak pointer callback, it's not guaranteed to be freed, so i'm forced to add external gc to it in *any* case and have the client call the cleanup routine when their context dies (this is currently handled via a sentry object in the client app which cleans up when it goes out of scope).

--
----- stephan beal
http://wanderinghorse.net/home/stephan/
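For readers unfamiliar with the sentry approach mentioned above, here is a minimal sketch of the general idea, assuming C++11. It is not the actual code from the client app; CleanupSentry and Register are hypothetical names, and the cleanup routines stand in for whatever frees the native buffers.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sentry: collects cleanup routines and runs them when it
// goes out of scope, independent of whether the GC ever runs.
class CleanupSentry {
public:
    CleanupSentry() = default;

    // Non-copyable, so each cleanup routine runs exactly once.
    CleanupSentry(const CleanupSentry&) = delete;
    CleanupSentry& operator=(const CleanupSentry&) = delete;

    // Registers a routine to be called at destruction time.
    void Register(std::function<void()> cleanup) {
        cleanups_.push_back(std::move(cleanup));
    }

    // Runs the routines in reverse registration order, mirroring how
    // C++ destroys local objects.
    ~CleanupSentry() {
        for (auto it = cleanups_.rbegin(); it != cleanups_.rend(); ++it) {
            (*it)();
        }
    }

private:
    std::vector<std::function<void()>> cleanups_;
};
```

A client would declare the sentry in the same scope as its context and register one routine per native resource. A weak pointer callback can still free things early when the GC does run, provided the routine tolerates being asked to free something that is already gone.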
