Re: [webkit-dev] SerializedScriptValue: signed vs unsigned char

2013-02-07 Thread Glenn Adams
On Wed, Feb 6, 2013 at 5:35 PM, Alec Flett alecfl...@chromium.org wrote:


 Personally outside of WebKit I tend to see more char* as the common
 denominator for raw bytes.


I've been coding in C since around 1972, and I admit that in the early
days, char was used as a synonym for byte, however, that convention has
long been dropped in favor of unsigned char. See [1] for a nice discussion,
especially the second answer.

[1]
http://stackoverflow.com/questions/653336/should-a-buffer-of-bytes-be-signed-or-unsigned-char-buffer



 So far Benjamin objected, and then seems to have rescinded. Glenn, do you
 depend on SerializedScriptValue's current method signatures?


I personally alway use unsigned char/uint8_t for raw byte strings. However,
I don't have any dependency on SerializedScriptValue.

Given Maciej and Adam's comments, you had best stay with unsigned
char/uint_8 and convert any non-conforming usage to this pattern when
needed.
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
https://lists.webkit.org/mailman/listinfo/webkit-dev


Re: [webkit-dev] SerializedScriptValue: signed vs unsigned char

2013-02-06 Thread Alec Flett
ok, so something else has come up: SharedBuffer. SharedBuffer has an
adoptVector method that allows you to adopt Vectorchar... some of the
stuff I'm using that interacts with LevelDB is also dealing with
SharedBuffer, hence I've had to do some nasty casting from Vectoruint8_t
to Vectorchar to allow me to call SharedBuffer.adoptVector()

And again, we could tweak SharedBuffer to accept Vectorunsigned char but
now we have two subsystems (LevelDB, and SharedBuffer) that seem to prefer
char as raw data.

Personally outside of WebKit I tend to see more char* as the common
denominator for raw bytes.

Further, there are no subsystems that actually depend on
SerializedScriptValue using uint8_t - it was just what we decided to use
when (ironically) we were hooking up IndexedDB to JSC, just a month or so
ago.

So far Benjamin objected, and then seems to have rescinded. Glenn, do you
depend on SerializedScriptValue's current method signatures?

Alec


On Mon, Feb 4, 2013 at 5:14 PM, Benjamin Poulain benja...@webkit.org
 wrote:

 On Mon, Feb 4, 2013 at 4:54 PM, Alec Flett alecfl...@chromium.org wrote:

 Well, nobody is explicitly using LChar with SerializedScriptValue (maybe
 it should, maybe that's another issue)  but I guess this is why I'm asking
 - I'm happy to just deal with this in IDB with some ugly reinterpret_casts
 here and there (ok maybe not happy, but satisfied enough) if folks prefer
 that. I don't personally find uint8_t to be any more intuitive than char,
 but it sounds like some do. Nevermind...


 Well, since you never use character types and only raw data, just ignore
 my comment.

 As far as I know, it is already common to use signed char for raw data (in
 the network stack for example).

 Benjamin



On Wed, Feb 6, 2013 at 4:06 PM, Alec Flett alecfl...@google.com wrote:

 ok, so something else has come up: SharedBuffer. SharedBuffer has an
 adoptVector method that allows you to adopt Vectorchar... some of the
 stuff I'm using that interacts with LevelDB is also dealing with
 SharedBuffer, hence I've had to do some nasty casting from Vectoruint8_t
 to Vectorchar to allow me to call SharedBuffer.adoptVector()

 And again, we could tweak SharedBuffer to accept Vectorunsigned char but
 now we have two subsystems (LevelDB, and SharedBuffer) that seem to prefer
 char as raw data.

 Personally outside of WebKit I tend to see more char* as the common
 denominator for raw bytes.

 Further, there are no subsystems that actually depend on
 SerializedScriptValue using uint8_t - it was just what we decided to use
 when (ironically) we were hooking up IndexedDB to JSC, just a month or so
 ago.

 So far Benjamin objected, and then seems to have rescinded. Glenn, do you
 depend on SerializedScriptValue's current method signatures?

 Alec


 On Mon, Feb 4, 2013 at 5:14 PM, Benjamin Poulain benja...@webkit.orgwrote:

 On Mon, Feb 4, 2013 at 4:54 PM, Alec Flett alecfl...@chromium.orgwrote:

 Well, nobody is explicitly using LChar with SerializedScriptValue (maybe
 it should, maybe that's another issue)  but I guess this is why I'm asking
 - I'm happy to just deal with this in IDB with some ugly reinterpret_casts
 here and there (ok maybe not happy, but satisfied enough) if folks prefer
 that. I don't personally find uint8_t to be any more intuitive than char,
 but it sounds like some do. Nevermind...


 Well, since you never use character types and only raw data, just ignore
 my comment.

 As far as I know, it is already common to use signed char for raw data
 (in the network stack for example).

 Benjamin



___
webkit-dev mailing list
webkit-dev@lists.webkit.org
https://lists.webkit.org/mailman/listinfo/webkit-dev


Re: [webkit-dev] SerializedScriptValue: signed vs unsigned char

2013-02-06 Thread Maciej Stachowiak

I think we should continue to use uint8_t instead of char as the primary way to 
represent a raw byte in WebKit. First, it's good to distinguish raw data from C 
strings at the type system level, and second, the unpredictable signedness of 
char is actively bad for byte-oriented processing. Another library making a 
different choice doesn't overcome these reasons.

If there's particular libraries we want to use which have different 
conventions, the adaptation should be done at the level of interfacing the 
library. Changing WebKit's conventiones because of one optional dependency does 
not make sense to me.

Regards,
Maciej

On Feb 6, 2013, at 4:35 PM, Alec Flett alecfl...@chromium.org wrote:

 ok, so something else has come up: SharedBuffer. SharedBuffer has an 
 adoptVector method that allows you to adopt Vectorchar... some of the stuff 
 I'm using that interacts with LevelDB is also dealing with SharedBuffer, 
 hence I've had to do some nasty casting from Vectoruint8_t to Vectorchar 
 to allow me to call SharedBuffer.adoptVector()
 
 And again, we could tweak SharedBuffer to accept Vectorunsigned char but 
 now we have two subsystems (LevelDB, and SharedBuffer) that seem to prefer 
 char as raw data.
 
 Personally outside of WebKit I tend to see more char* as the common 
 denominator for raw bytes.
 
 Further, there are no subsystems that actually depend on 
 SerializedScriptValue using uint8_t - it was just what we decided to use when 
 (ironically) we were hooking up IndexedDB to JSC, just a month or so ago.
 
 So far Benjamin objected, and then seems to have rescinded. Glenn, do you 
 depend on SerializedScriptValue's current method signatures?
 
 Alec
 
 
 On Mon, Feb 4, 2013 at 5:14 PM, Benjamin Poulain benja...@webkit.org wrote:
 On Mon, Feb 4, 2013 at 4:54 PM, Alec Flett alecfl...@chromium.org wrote:
 Well, nobody is explicitly using LChar with SerializedScriptValue (maybe it 
 should, maybe that's another issue)  but I guess this is why I'm asking - I'm 
 happy to just deal with this in IDB with some ugly reinterpret_casts here and 
 there (ok maybe not happy, but satisfied enough) if folks prefer that. I 
 don't personally find uint8_t to be any more intuitive than char, but it 
 sounds like some do. Nevermind...
 
 Well, since you never use character types and only raw data, just ignore my 
 comment.
 
 As far as I know, it is already common to use signed char for raw data (in 
 the network stack for example).
 
 Benjamin 
 
 
 On Wed, Feb 6, 2013 at 4:06 PM, Alec Flett alecfl...@google.com wrote:
 ok, so something else has come up: SharedBuffer. SharedBuffer has an 
 adoptVector method that allows you to adopt Vectorchar... some of the stuff 
 I'm using that interacts with LevelDB is also dealing with SharedBuffer, 
 hence I've had to do some nasty casting from Vectoruint8_t to Vectorchar 
 to allow me to call SharedBuffer.adoptVector()
 
 And again, we could tweak SharedBuffer to accept Vectorunsigned char but 
 now we have two subsystems (LevelDB, and SharedBuffer) that seem to prefer 
 char as raw data.
 
 Personally outside of WebKit I tend to see more char* as the common 
 denominator for raw bytes.
 
 Further, there are no subsystems that actually depend on 
 SerializedScriptValue using uint8_t - it was just what we decided to use when 
 (ironically) we were hooking up IndexedDB to JSC, just a month or so ago.
 
 So far Benjamin objected, and then seems to have rescinded. Glenn, do you 
 depend on SerializedScriptValue's current method signatures?
 
 Alec
 
 
 On Mon, Feb 4, 2013 at 5:14 PM, Benjamin Poulain benja...@webkit.org wrote:
 On Mon, Feb 4, 2013 at 4:54 PM, Alec Flett alecfl...@chromium.org wrote:
 Well, nobody is explicitly using LChar with SerializedScriptValue (maybe it 
 should, maybe that's another issue)  but I guess this is why I'm asking - I'm 
 happy to just deal with this in IDB with some ugly reinterpret_casts here and 
 there (ok maybe not happy, but satisfied enough) if folks prefer that. I 
 don't personally find uint8_t to be any more intuitive than char, but it 
 sounds like some do. Nevermind...
 
 Well, since you never use character types and only raw data, just ignore my 
 comment.
 
 As far as I know, it is already common to use signed char for raw data (in 
 the network stack for example).
 
 Benjamin 
 
 
 ___
 webkit-dev mailing list
 webkit-dev@lists.webkit.org
 https://lists.webkit.org/mailman/listinfo/webkit-dev

___
webkit-dev mailing list
webkit-dev@lists.webkit.org
https://lists.webkit.org/mailman/listinfo/webkit-dev


Re: [webkit-dev] SerializedScriptValue: signed vs unsigned char

2013-02-06 Thread Adam Barth
On Wed, Feb 6, 2013 at 4:59 PM, Alec Flett alecfl...@chromium.org wrote:

 On Wed, Feb 6, 2013 at 4:48 PM, Maciej Stachowiak m...@apple.com wrote:

 I think we should continue to use uint8_t instead of char as the primary
 way to represent a raw byte in WebKit. First, it's good to distinguish raw
 data from C strings at the type system level, and second, the unpredictable
 signedness of char is actively bad for byte-oriented processing. Another
 library making a different choice doesn't overcome these reasons.

 To be fair, there hasn't been a convention in WebKit at all.  uint8_t was
 chosen for SerializedScriptValue roughly two months ago, with specific
 IndexedDB support in mind: https://bugs.webkit.org/show_bug.cgi?id=104354 -
 this usage is not widespread, and in fact the only consumer of this type is
 IndexedDB.


 If there's particular libraries we want to use which have different
 conventions, the adaptation should be done at the level of interfacing the
 library. Changing WebKit's conventiones because of one optional dependency
 does not make sense to me.


 Maybe more simply: Vectoruint8_t was chosen very recently to replace
 String, in support of cleaning up IndexedDB code. IndexedDB would like to
 use Vectorchar now for further cleanup. Would you feel the same if we
 were switching from String to Vectorchar?


Yeah, We use char all over WebCore to represent a byte, including in
SharedBuffer.  We should use for SerializedScriptValue for consistency.

If we think that uint8_t is better than char to represent bytes, then we
should make that change globally in WebCore separately.

Adam



 On Wed, Feb 6, 2013 at 4:48 PM, Maciej Stachowiak m...@apple.com wrote:


 I think we should continue to use uint8_t instead of char as the primary
 way to represent a raw byte in WebKit. First, it's good to distinguish raw
 data from C strings at the type system level, and second, the unpredictable
 signedness of char is actively bad for byte-oriented processing. Another
 library making a different choice doesn't overcome these reasons.

 If there's particular libraries we want to use which have different
 conventions, the adaptation should be done at the level of interfacing the
 library. Changing WebKit's conventiones because of one optional dependency
 does not make sense to me.

 Regards,
 Maciej

 On Feb 6, 2013, at 4:35 PM, Alec Flett alecfl...@chromium.org wrote:

 ok, so something else has come up: SharedBuffer. SharedBuffer has an
 adoptVector method that allows you to adopt Vectorchar... some of the
 stuff I'm using that interacts with LevelDB is also dealing with
 SharedBuffer, hence I've had to do some nasty casting from Vectoruint8_t
 to Vectorchar to allow me to call SharedBuffer.adoptVector()

 And again, we could tweak SharedBuffer to accept Vectorunsigned char
 but now we have two subsystems (LevelDB, and SharedBuffer) that seem to
 prefer char as raw data.

 Personally outside of WebKit I tend to see more char* as the common
 denominator for raw bytes.

 Further, there are no subsystems that actually depend on
 SerializedScriptValue using uint8_t - it was just what we decided to use
 when (ironically) we were hooking up IndexedDB to JSC, just a month or so
 ago.

 So far Benjamin objected, and then seems to have rescinded. Glenn, do you
 depend on SerializedScriptValue's current method signatures?

 Alec


 On Mon, Feb 4, 2013 at 5:14 PM, Benjamin Poulain benja...@webkit.org
  wrote:

 On Mon, Feb 4, 2013 at 4:54 PM, Alec Flett alecfl...@chromium.org
  wrote:

 Well, nobody is explicitly using LChar with SerializedScriptValue
 (maybe it should, maybe that's another issue)  but I guess this is why I'm
 asking - I'm happy to just deal with this in IDB with some ugly
 reinterpret_casts here and there (ok maybe not happy, but satisfied enough)
 if folks prefer that. I don't personally find uint8_t to be any more
 intuitive than char, but it sounds like some do. Nevermind...


 Well, since you never use character types and only raw data, just ignore
 my comment.

 As far as I know, it is already common to use signed char for raw data
 (in the network stack for example).

 Benjamin



 On Wed, Feb 6, 2013 at 4:06 PM, Alec Flett alecfl...@google.com wrote:

 ok, so something else has come up: SharedBuffer. SharedBuffer has an
 adoptVector method that allows you to adopt Vectorchar... some of the
 stuff I'm using that interacts with LevelDB is also dealing with
 SharedBuffer, hence I've had to do some nasty casting from Vectoruint8_t
 to Vectorchar to allow me to call SharedBuffer.adoptVector()

 And again, we could tweak SharedBuffer to accept Vectorunsigned char
 but now we have two subsystems (LevelDB, and SharedBuffer) that seem to
 prefer char as raw data.

 Personally outside of WebKit I tend to see more char* as the common
 denominator for raw bytes.

 Further, there are no subsystems that actually depend on
 SerializedScriptValue using uint8_t - it was just what we decided to use
 when 

Re: [webkit-dev] SerializedScriptValue: signed vs unsigned char

2013-02-06 Thread Oliver Hunt

On Feb 6, 2013, at 4:59 PM, Alec Flett alecfl...@chromium.org wrote:

 On Wed, Feb 6, 2013 at 4:48 PM, Maciej Stachowiak m...@apple.com wrote:
 
 I think we should continue to use uint8_t instead of char as the primary way 
 to represent a raw byte in WebKit. First, it's good to distinguish raw data 
 from C strings at the type system level, and second, the unpredictable 
 signedness of char is actively bad for byte-oriented processing. Another 
 library making a different choice doesn't overcome these reasons.
 
 To be fair, there hasn't been a convention in WebKit at all.  uint8_t was 
 chosen for SerializedScriptValue roughly two months ago, with specific 
 IndexedDB support in mind: https://bugs.webkit.org/show_bug.cgi?id=104354 - 
 this usage is not widespread, and in fact the only consumer of this type is 
 IndexedDB.

I don't know where you get this idea from -- SerializedScriptValue has always 
used uint8 buffers.  That IndexedDB used Strings and that _it_ changed to uint8 
buffers is entirely irrelevant here.

--Oliver
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
https://lists.webkit.org/mailman/listinfo/webkit-dev


[webkit-dev] SerializedScriptValue: signed vs unsigned char

2013-02-04 Thread Alec Flett
At the moment, SerializedScriptValue uses Vectoruint8_t (aka
Vectorunsigned char) for both it's API (createFromWireBytes, toWireBytes)
as well as its internal representation. (for both v8 and jsc
implementations)

The two largest consumers of this aspect of SerializedScriptValue seems to
be IndexedDB and postMessage().

I'm jumping through some small hoops (i.e. reinterpret_cast and whatnot) in
IndexedDB to convert between Vectoruint8_t and Vectorchar and a 'const
char*' buffer in order to write out to LevelDB, who likes 'char' as opposed
to unsigned char.

postMessage() seems to be pretty agnostic to char vs unsigned char, since
it uses SerializedScriptValue to both produce and consume the buffers it
sends between windows.

Before I did a code cleanup and just fixed up all the implementations to
use Vectorchar I wanted to see if anyone had any objections here, on both
the V8 and the JSC sides. The ultimate compiled code is going to be
identical, but I'll be able to avoid all sorts of reinterpret_cast's at
various points in the code.

objections?

Alec
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
https://lists.webkit.org/mailman/listinfo/webkit-dev


Re: [webkit-dev] SerializedScriptValue: signed vs unsigned char

2013-02-04 Thread Benjamin Poulain
On Mon, Feb 4, 2013 at 3:51 PM, Alec Flett alecfl...@chromium.org wrote:

 At the moment, SerializedScriptValue uses Vectoruint8_t (aka
 Vectorunsigned char) for both it's API (createFromWireBytes, toWireBytes)
 as well as its internal representation. (for both v8 and jsc
 implementations)

 The two largest consumers of this aspect of SerializedScriptValue seems to
 be IndexedDB and postMessage().

 I'm jumping through some small hoops (i.e. reinterpret_cast and whatnot)
 in IndexedDB to convert between Vectoruint8_t and Vectorchar and a
 'const char*' buffer in order to write out to LevelDB, who likes 'char' as
 opposed to unsigned char.

 postMessage() seems to be pretty agnostic to char vs unsigned char, since
 it uses SerializedScriptValue to both produce and consume the buffers it
 sends between windows.

 Before I did a code cleanup and just fixed up all the implementations to
 use Vectorchar I wanted to see if anyone had any objections here, on both
 the V8 and the JSC sides. The ultimate compiled code is going to be
 identical, but I'll be able to avoid all sorts of reinterpret_cast's at
 various points in the code.

 objections?


WTF/WebCore 8bits characters (LChar) are unsigned char.
Wouldn't you just push the reinterpret_cast to the other side of
the IndexedDB module? I find it a little odd to modify WebCore based on one
particular dependency (LevelDB) but maybe there is a good reason.

Benjamin
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
https://lists.webkit.org/mailman/listinfo/webkit-dev


Re: [webkit-dev] SerializedScriptValue: signed vs unsigned char

2013-02-04 Thread Glenn Adams
On Mon, Feb 4, 2013 at 4:51 PM, Alec Flett alecfl...@chromium.org wrote:

 At the moment, SerializedScriptValue uses Vectoruint8_t (aka
 Vectorunsigned char) for both it's API (createFromWireBytes, toWireBytes)
 as well as its internal representation. (for both v8 and jsc
 implementations)

 The two largest consumers of this aspect of SerializedScriptValue seems to
 be IndexedDB and postMessage().

 I'm jumping through some small hoops (i.e. reinterpret_cast and whatnot)
 in IndexedDB to convert between Vectoruint8_t and Vectorchar and a
 'const char*' buffer in order to write out to LevelDB, who likes 'char' as
 opposed to unsigned char.

 postMessage() seems to be pretty agnostic to char vs unsigned char, since
 it uses SerializedScriptValue to both produce and consume the buffers it
 sends between windows.

 Before I did a code cleanup and just fixed up all the implementations to
 use Vectorchar I wanted to see if anyone had any objections here, on both
 the V8 and the JSC sides. The ultimate compiled code is going to be
 identical, but I'll be able to avoid all sorts of reinterpret_cast's at
 various points in the code.


If you are suggesting to change IndexedDB to use Vectorchar in order to
accommodate the LevelDB interfaces, then I would suggest you not do that.
Vectoruint8_t is more clear than Vectorchar (or Vectorunsigned char)
at expressing the intended usage (binary versus character data).
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
https://lists.webkit.org/mailman/listinfo/webkit-dev


Re: [webkit-dev] SerializedScriptValue: signed vs unsigned char

2013-02-04 Thread Alec Flett
Well, nobody is explicitly using LChar with SerializedScriptValue (maybe it
should, maybe that's another issue)  but I guess this is why I'm asking -
I'm happy to just deal with this in IDB with some ugly reinterpret_casts
here and there (ok maybe not happy, but satisfied enough) if folks prefer
that. I don't personally find uint8_t to be any more intuitive than char,
but it sounds like some do. Nevermind...

(In my own experience I've always found explicit unsigned to be more of a
hassle then a tool for good type safely, and I suspect that's why LevelDB
uses it as its lowest-common-denominator, but we don't have to have that
debate here :))

Alec


On Mon, Feb 4, 2013 at 4:43 PM, Benjamin Poulain benja...@webkit.orgwrote:

 On Mon, Feb 4, 2013 at 3:51 PM, Alec Flett alecfl...@chromium.org wrote:

 At the moment, SerializedScriptValue uses Vectoruint8_t (aka
 Vectorunsigned char) for both it's API (createFromWireBytes, toWireBytes)
 as well as its internal representation. (for both v8 and jsc
 implementations)

 The two largest consumers of this aspect of SerializedScriptValue seems
 to be IndexedDB and postMessage().

 I'm jumping through some small hoops (i.e. reinterpret_cast and whatnot)
 in IndexedDB to convert between Vectoruint8_t and Vectorchar and a
 'const char*' buffer in order to write out to LevelDB, who likes 'char' as
 opposed to unsigned char.

 postMessage() seems to be pretty agnostic to char vs unsigned char, since
 it uses SerializedScriptValue to both produce and consume the buffers it
 sends between windows.

 Before I did a code cleanup and just fixed up all the implementations to
 use Vectorchar I wanted to see if anyone had any objections here, on both
 the V8 and the JSC sides. The ultimate compiled code is going to be
 identical, but I'll be able to avoid all sorts of reinterpret_cast's at
 various points in the code.

 objections?


 WTF/WebCore 8bits characters (LChar) are unsigned char.
 Wouldn't you just push the reinterpret_cast to the other side of
 the IndexedDB module? I find it a little odd to modify WebCore based on one
 particular dependency (LevelDB) but maybe there is a good reason.

 Benjamin

___
webkit-dev mailing list
webkit-dev@lists.webkit.org
https://lists.webkit.org/mailman/listinfo/webkit-dev


Re: [webkit-dev] SerializedScriptValue: signed vs unsigned char

2013-02-04 Thread Benjamin Poulain
On Mon, Feb 4, 2013 at 4:54 PM, Alec Flett alecfl...@chromium.org wrote:

 Well, nobody is explicitly using LChar with SerializedScriptValue (maybe
 it should, maybe that's another issue)  but I guess this is why I'm asking
 - I'm happy to just deal with this in IDB with some ugly reinterpret_casts
 here and there (ok maybe not happy, but satisfied enough) if folks prefer
 that. I don't personally find uint8_t to be any more intuitive than char,
 but it sounds like some do. Nevermind...


Well, since you never use character types and only raw data, just ignore my
comment.

As far as I know, it is already common to use signed char for raw data (in
the network stack for example).

Benjamin
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
https://lists.webkit.org/mailman/listinfo/webkit-dev