Re: [webkit-dev] SerializedScriptValue: signed vs unsigned char
On Wed, Feb 6, 2013 at 5:35 PM, Alec Flett alecfl...@chromium.org wrote: Personally outside of WebKit I tend to see more char* as the common denominator for raw bytes. I've been coding in C since around 1972, and I admit that in the early days, char was used as a synonym for byte, however, that convention has long been dropped in favor of unsigned char. See [1] for a nice discussion, especially the second answer. [1] http://stackoverflow.com/questions/653336/should-a-buffer-of-bytes-be-signed-or-unsigned-char-buffer So far Benjamin objected, and then seems to have rescinded. Glenn, do you depend on SerializedScriptValue's current method signatures? I personally alway use unsigned char/uint8_t for raw byte strings. However, I don't have any dependency on SerializedScriptValue. Given Maciej and Adam's comments, you had best stay with unsigned char/uint_8 and convert any non-conforming usage to this pattern when needed. ___ webkit-dev mailing list webkit-dev@lists.webkit.org https://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] SerializedScriptValue: signed vs unsigned char
ok, so something else has come up: SharedBuffer. SharedBuffer has an adoptVector method that allows you to adopt Vectorchar... some of the stuff I'm using that interacts with LevelDB is also dealing with SharedBuffer, hence I've had to do some nasty casting from Vectoruint8_t to Vectorchar to allow me to call SharedBuffer.adoptVector() And again, we could tweak SharedBuffer to accept Vectorunsigned char but now we have two subsystems (LevelDB, and SharedBuffer) that seem to prefer char as raw data. Personally outside of WebKit I tend to see more char* as the common denominator for raw bytes. Further, there are no subsystems that actually depend on SerializedScriptValue using uint8_t - it was just what we decided to use when (ironically) we were hooking up IndexedDB to JSC, just a month or so ago. So far Benjamin objected, and then seems to have rescinded. Glenn, do you depend on SerializedScriptValue's current method signatures? Alec On Mon, Feb 4, 2013 at 5:14 PM, Benjamin Poulain benja...@webkit.org wrote: On Mon, Feb 4, 2013 at 4:54 PM, Alec Flett alecfl...@chromium.org wrote: Well, nobody is explicitly using LChar with SerializedScriptValue (maybe it should, maybe that's another issue) but I guess this is why I'm asking - I'm happy to just deal with this in IDB with some ugly reinterpret_casts here and there (ok maybe not happy, but satisfied enough) if folks prefer that. I don't personally find uint8_t to be any more intuitive than char, but it sounds like some do. Nevermind... Well, since you never use character types and only raw data, just ignore my comment. As far as I know, it is already common to use signed char for raw data (in the network stack for example). Benjamin On Wed, Feb 6, 2013 at 4:06 PM, Alec Flett alecfl...@google.com wrote: ok, so something else has come up: SharedBuffer. SharedBuffer has an adoptVector method that allows you to adopt Vectorchar... some of the stuff I'm using that interacts with LevelDB is also dealing with SharedBuffer, hence I've had to do some nasty casting from Vectoruint8_t to Vectorchar to allow me to call SharedBuffer.adoptVector() And again, we could tweak SharedBuffer to accept Vectorunsigned char but now we have two subsystems (LevelDB, and SharedBuffer) that seem to prefer char as raw data. Personally outside of WebKit I tend to see more char* as the common denominator for raw bytes. Further, there are no subsystems that actually depend on SerializedScriptValue using uint8_t - it was just what we decided to use when (ironically) we were hooking up IndexedDB to JSC, just a month or so ago. So far Benjamin objected, and then seems to have rescinded. Glenn, do you depend on SerializedScriptValue's current method signatures? Alec On Mon, Feb 4, 2013 at 5:14 PM, Benjamin Poulain benja...@webkit.orgwrote: On Mon, Feb 4, 2013 at 4:54 PM, Alec Flett alecfl...@chromium.orgwrote: Well, nobody is explicitly using LChar with SerializedScriptValue (maybe it should, maybe that's another issue) but I guess this is why I'm asking - I'm happy to just deal with this in IDB with some ugly reinterpret_casts here and there (ok maybe not happy, but satisfied enough) if folks prefer that. I don't personally find uint8_t to be any more intuitive than char, but it sounds like some do. Nevermind... Well, since you never use character types and only raw data, just ignore my comment. As far as I know, it is already common to use signed char for raw data (in the network stack for example). Benjamin ___ webkit-dev mailing list webkit-dev@lists.webkit.org https://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] SerializedScriptValue: signed vs unsigned char
I think we should continue to use uint8_t instead of char as the primary way to represent a raw byte in WebKit. First, it's good to distinguish raw data from C strings at the type system level, and second, the unpredictable signedness of char is actively bad for byte-oriented processing. Another library making a different choice doesn't overcome these reasons. If there's particular libraries we want to use which have different conventions, the adaptation should be done at the level of interfacing the library. Changing WebKit's conventiones because of one optional dependency does not make sense to me. Regards, Maciej On Feb 6, 2013, at 4:35 PM, Alec Flett alecfl...@chromium.org wrote: ok, so something else has come up: SharedBuffer. SharedBuffer has an adoptVector method that allows you to adopt Vectorchar... some of the stuff I'm using that interacts with LevelDB is also dealing with SharedBuffer, hence I've had to do some nasty casting from Vectoruint8_t to Vectorchar to allow me to call SharedBuffer.adoptVector() And again, we could tweak SharedBuffer to accept Vectorunsigned char but now we have two subsystems (LevelDB, and SharedBuffer) that seem to prefer char as raw data. Personally outside of WebKit I tend to see more char* as the common denominator for raw bytes. Further, there are no subsystems that actually depend on SerializedScriptValue using uint8_t - it was just what we decided to use when (ironically) we were hooking up IndexedDB to JSC, just a month or so ago. So far Benjamin objected, and then seems to have rescinded. Glenn, do you depend on SerializedScriptValue's current method signatures? Alec On Mon, Feb 4, 2013 at 5:14 PM, Benjamin Poulain benja...@webkit.org wrote: On Mon, Feb 4, 2013 at 4:54 PM, Alec Flett alecfl...@chromium.org wrote: Well, nobody is explicitly using LChar with SerializedScriptValue (maybe it should, maybe that's another issue) but I guess this is why I'm asking - I'm happy to just deal with this in IDB with some ugly reinterpret_casts here and there (ok maybe not happy, but satisfied enough) if folks prefer that. I don't personally find uint8_t to be any more intuitive than char, but it sounds like some do. Nevermind... Well, since you never use character types and only raw data, just ignore my comment. As far as I know, it is already common to use signed char for raw data (in the network stack for example). Benjamin On Wed, Feb 6, 2013 at 4:06 PM, Alec Flett alecfl...@google.com wrote: ok, so something else has come up: SharedBuffer. SharedBuffer has an adoptVector method that allows you to adopt Vectorchar... some of the stuff I'm using that interacts with LevelDB is also dealing with SharedBuffer, hence I've had to do some nasty casting from Vectoruint8_t to Vectorchar to allow me to call SharedBuffer.adoptVector() And again, we could tweak SharedBuffer to accept Vectorunsigned char but now we have two subsystems (LevelDB, and SharedBuffer) that seem to prefer char as raw data. Personally outside of WebKit I tend to see more char* as the common denominator for raw bytes. Further, there are no subsystems that actually depend on SerializedScriptValue using uint8_t - it was just what we decided to use when (ironically) we were hooking up IndexedDB to JSC, just a month or so ago. So far Benjamin objected, and then seems to have rescinded. Glenn, do you depend on SerializedScriptValue's current method signatures? Alec On Mon, Feb 4, 2013 at 5:14 PM, Benjamin Poulain benja...@webkit.org wrote: On Mon, Feb 4, 2013 at 4:54 PM, Alec Flett alecfl...@chromium.org wrote: Well, nobody is explicitly using LChar with SerializedScriptValue (maybe it should, maybe that's another issue) but I guess this is why I'm asking - I'm happy to just deal with this in IDB with some ugly reinterpret_casts here and there (ok maybe not happy, but satisfied enough) if folks prefer that. I don't personally find uint8_t to be any more intuitive than char, but it sounds like some do. Nevermind... Well, since you never use character types and only raw data, just ignore my comment. As far as I know, it is already common to use signed char for raw data (in the network stack for example). Benjamin ___ webkit-dev mailing list webkit-dev@lists.webkit.org https://lists.webkit.org/mailman/listinfo/webkit-dev ___ webkit-dev mailing list webkit-dev@lists.webkit.org https://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] SerializedScriptValue: signed vs unsigned char
On Wed, Feb 6, 2013 at 4:59 PM, Alec Flett alecfl...@chromium.org wrote: On Wed, Feb 6, 2013 at 4:48 PM, Maciej Stachowiak m...@apple.com wrote: I think we should continue to use uint8_t instead of char as the primary way to represent a raw byte in WebKit. First, it's good to distinguish raw data from C strings at the type system level, and second, the unpredictable signedness of char is actively bad for byte-oriented processing. Another library making a different choice doesn't overcome these reasons. To be fair, there hasn't been a convention in WebKit at all. uint8_t was chosen for SerializedScriptValue roughly two months ago, with specific IndexedDB support in mind: https://bugs.webkit.org/show_bug.cgi?id=104354 - this usage is not widespread, and in fact the only consumer of this type is IndexedDB. If there's particular libraries we want to use which have different conventions, the adaptation should be done at the level of interfacing the library. Changing WebKit's conventiones because of one optional dependency does not make sense to me. Maybe more simply: Vectoruint8_t was chosen very recently to replace String, in support of cleaning up IndexedDB code. IndexedDB would like to use Vectorchar now for further cleanup. Would you feel the same if we were switching from String to Vectorchar? Yeah, We use char all over WebCore to represent a byte, including in SharedBuffer. We should use for SerializedScriptValue for consistency. If we think that uint8_t is better than char to represent bytes, then we should make that change globally in WebCore separately. Adam On Wed, Feb 6, 2013 at 4:48 PM, Maciej Stachowiak m...@apple.com wrote: I think we should continue to use uint8_t instead of char as the primary way to represent a raw byte in WebKit. First, it's good to distinguish raw data from C strings at the type system level, and second, the unpredictable signedness of char is actively bad for byte-oriented processing. Another library making a different choice doesn't overcome these reasons. If there's particular libraries we want to use which have different conventions, the adaptation should be done at the level of interfacing the library. Changing WebKit's conventiones because of one optional dependency does not make sense to me. Regards, Maciej On Feb 6, 2013, at 4:35 PM, Alec Flett alecfl...@chromium.org wrote: ok, so something else has come up: SharedBuffer. SharedBuffer has an adoptVector method that allows you to adopt Vectorchar... some of the stuff I'm using that interacts with LevelDB is also dealing with SharedBuffer, hence I've had to do some nasty casting from Vectoruint8_t to Vectorchar to allow me to call SharedBuffer.adoptVector() And again, we could tweak SharedBuffer to accept Vectorunsigned char but now we have two subsystems (LevelDB, and SharedBuffer) that seem to prefer char as raw data. Personally outside of WebKit I tend to see more char* as the common denominator for raw bytes. Further, there are no subsystems that actually depend on SerializedScriptValue using uint8_t - it was just what we decided to use when (ironically) we were hooking up IndexedDB to JSC, just a month or so ago. So far Benjamin objected, and then seems to have rescinded. Glenn, do you depend on SerializedScriptValue's current method signatures? Alec On Mon, Feb 4, 2013 at 5:14 PM, Benjamin Poulain benja...@webkit.org wrote: On Mon, Feb 4, 2013 at 4:54 PM, Alec Flett alecfl...@chromium.org wrote: Well, nobody is explicitly using LChar with SerializedScriptValue (maybe it should, maybe that's another issue) but I guess this is why I'm asking - I'm happy to just deal with this in IDB with some ugly reinterpret_casts here and there (ok maybe not happy, but satisfied enough) if folks prefer that. I don't personally find uint8_t to be any more intuitive than char, but it sounds like some do. Nevermind... Well, since you never use character types and only raw data, just ignore my comment. As far as I know, it is already common to use signed char for raw data (in the network stack for example). Benjamin On Wed, Feb 6, 2013 at 4:06 PM, Alec Flett alecfl...@google.com wrote: ok, so something else has come up: SharedBuffer. SharedBuffer has an adoptVector method that allows you to adopt Vectorchar... some of the stuff I'm using that interacts with LevelDB is also dealing with SharedBuffer, hence I've had to do some nasty casting from Vectoruint8_t to Vectorchar to allow me to call SharedBuffer.adoptVector() And again, we could tweak SharedBuffer to accept Vectorunsigned char but now we have two subsystems (LevelDB, and SharedBuffer) that seem to prefer char as raw data. Personally outside of WebKit I tend to see more char* as the common denominator for raw bytes. Further, there are no subsystems that actually depend on SerializedScriptValue using uint8_t - it was just what we decided to use when
Re: [webkit-dev] SerializedScriptValue: signed vs unsigned char
On Feb 6, 2013, at 4:59 PM, Alec Flett alecfl...@chromium.org wrote: On Wed, Feb 6, 2013 at 4:48 PM, Maciej Stachowiak m...@apple.com wrote: I think we should continue to use uint8_t instead of char as the primary way to represent a raw byte in WebKit. First, it's good to distinguish raw data from C strings at the type system level, and second, the unpredictable signedness of char is actively bad for byte-oriented processing. Another library making a different choice doesn't overcome these reasons. To be fair, there hasn't been a convention in WebKit at all. uint8_t was chosen for SerializedScriptValue roughly two months ago, with specific IndexedDB support in mind: https://bugs.webkit.org/show_bug.cgi?id=104354 - this usage is not widespread, and in fact the only consumer of this type is IndexedDB. I don't know where you get this idea from -- SerializedScriptValue has always used uint8 buffers. That IndexedDB used Strings and that _it_ changed to uint8 buffers is entirely irrelevant here. --Oliver ___ webkit-dev mailing list webkit-dev@lists.webkit.org https://lists.webkit.org/mailman/listinfo/webkit-dev
[webkit-dev] SerializedScriptValue: signed vs unsigned char
At the moment, SerializedScriptValue uses Vectoruint8_t (aka Vectorunsigned char) for both it's API (createFromWireBytes, toWireBytes) as well as its internal representation. (for both v8 and jsc implementations) The two largest consumers of this aspect of SerializedScriptValue seems to be IndexedDB and postMessage(). I'm jumping through some small hoops (i.e. reinterpret_cast and whatnot) in IndexedDB to convert between Vectoruint8_t and Vectorchar and a 'const char*' buffer in order to write out to LevelDB, who likes 'char' as opposed to unsigned char. postMessage() seems to be pretty agnostic to char vs unsigned char, since it uses SerializedScriptValue to both produce and consume the buffers it sends between windows. Before I did a code cleanup and just fixed up all the implementations to use Vectorchar I wanted to see if anyone had any objections here, on both the V8 and the JSC sides. The ultimate compiled code is going to be identical, but I'll be able to avoid all sorts of reinterpret_cast's at various points in the code. objections? Alec ___ webkit-dev mailing list webkit-dev@lists.webkit.org https://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] SerializedScriptValue: signed vs unsigned char
On Mon, Feb 4, 2013 at 3:51 PM, Alec Flett alecfl...@chromium.org wrote: At the moment, SerializedScriptValue uses Vectoruint8_t (aka Vectorunsigned char) for both it's API (createFromWireBytes, toWireBytes) as well as its internal representation. (for both v8 and jsc implementations) The two largest consumers of this aspect of SerializedScriptValue seems to be IndexedDB and postMessage(). I'm jumping through some small hoops (i.e. reinterpret_cast and whatnot) in IndexedDB to convert between Vectoruint8_t and Vectorchar and a 'const char*' buffer in order to write out to LevelDB, who likes 'char' as opposed to unsigned char. postMessage() seems to be pretty agnostic to char vs unsigned char, since it uses SerializedScriptValue to both produce and consume the buffers it sends between windows. Before I did a code cleanup and just fixed up all the implementations to use Vectorchar I wanted to see if anyone had any objections here, on both the V8 and the JSC sides. The ultimate compiled code is going to be identical, but I'll be able to avoid all sorts of reinterpret_cast's at various points in the code. objections? WTF/WebCore 8bits characters (LChar) are unsigned char. Wouldn't you just push the reinterpret_cast to the other side of the IndexedDB module? I find it a little odd to modify WebCore based on one particular dependency (LevelDB) but maybe there is a good reason. Benjamin ___ webkit-dev mailing list webkit-dev@lists.webkit.org https://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] SerializedScriptValue: signed vs unsigned char
On Mon, Feb 4, 2013 at 4:51 PM, Alec Flett alecfl...@chromium.org wrote: At the moment, SerializedScriptValue uses Vectoruint8_t (aka Vectorunsigned char) for both it's API (createFromWireBytes, toWireBytes) as well as its internal representation. (for both v8 and jsc implementations) The two largest consumers of this aspect of SerializedScriptValue seems to be IndexedDB and postMessage(). I'm jumping through some small hoops (i.e. reinterpret_cast and whatnot) in IndexedDB to convert between Vectoruint8_t and Vectorchar and a 'const char*' buffer in order to write out to LevelDB, who likes 'char' as opposed to unsigned char. postMessage() seems to be pretty agnostic to char vs unsigned char, since it uses SerializedScriptValue to both produce and consume the buffers it sends between windows. Before I did a code cleanup and just fixed up all the implementations to use Vectorchar I wanted to see if anyone had any objections here, on both the V8 and the JSC sides. The ultimate compiled code is going to be identical, but I'll be able to avoid all sorts of reinterpret_cast's at various points in the code. If you are suggesting to change IndexedDB to use Vectorchar in order to accommodate the LevelDB interfaces, then I would suggest you not do that. Vectoruint8_t is more clear than Vectorchar (or Vectorunsigned char) at expressing the intended usage (binary versus character data). ___ webkit-dev mailing list webkit-dev@lists.webkit.org https://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] SerializedScriptValue: signed vs unsigned char
Well, nobody is explicitly using LChar with SerializedScriptValue (maybe it should, maybe that's another issue) but I guess this is why I'm asking - I'm happy to just deal with this in IDB with some ugly reinterpret_casts here and there (ok maybe not happy, but satisfied enough) if folks prefer that. I don't personally find uint8_t to be any more intuitive than char, but it sounds like some do. Nevermind... (In my own experience I've always found explicit unsigned to be more of a hassle then a tool for good type safely, and I suspect that's why LevelDB uses it as its lowest-common-denominator, but we don't have to have that debate here :)) Alec On Mon, Feb 4, 2013 at 4:43 PM, Benjamin Poulain benja...@webkit.orgwrote: On Mon, Feb 4, 2013 at 3:51 PM, Alec Flett alecfl...@chromium.org wrote: At the moment, SerializedScriptValue uses Vectoruint8_t (aka Vectorunsigned char) for both it's API (createFromWireBytes, toWireBytes) as well as its internal representation. (for both v8 and jsc implementations) The two largest consumers of this aspect of SerializedScriptValue seems to be IndexedDB and postMessage(). I'm jumping through some small hoops (i.e. reinterpret_cast and whatnot) in IndexedDB to convert between Vectoruint8_t and Vectorchar and a 'const char*' buffer in order to write out to LevelDB, who likes 'char' as opposed to unsigned char. postMessage() seems to be pretty agnostic to char vs unsigned char, since it uses SerializedScriptValue to both produce and consume the buffers it sends between windows. Before I did a code cleanup and just fixed up all the implementations to use Vectorchar I wanted to see if anyone had any objections here, on both the V8 and the JSC sides. The ultimate compiled code is going to be identical, but I'll be able to avoid all sorts of reinterpret_cast's at various points in the code. objections? WTF/WebCore 8bits characters (LChar) are unsigned char. Wouldn't you just push the reinterpret_cast to the other side of the IndexedDB module? I find it a little odd to modify WebCore based on one particular dependency (LevelDB) but maybe there is a good reason. Benjamin ___ webkit-dev mailing list webkit-dev@lists.webkit.org https://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] SerializedScriptValue: signed vs unsigned char
On Mon, Feb 4, 2013 at 4:54 PM, Alec Flett alecfl...@chromium.org wrote: Well, nobody is explicitly using LChar with SerializedScriptValue (maybe it should, maybe that's another issue) but I guess this is why I'm asking - I'm happy to just deal with this in IDB with some ugly reinterpret_casts here and there (ok maybe not happy, but satisfied enough) if folks prefer that. I don't personally find uint8_t to be any more intuitive than char, but it sounds like some do. Nevermind... Well, since you never use character types and only raw data, just ignore my comment. As far as I know, it is already common to use signed char for raw data (in the network stack for example). Benjamin ___ webkit-dev mailing list webkit-dev@lists.webkit.org https://lists.webkit.org/mailman/listinfo/webkit-dev