On Jul 22, 12:40 pm, bob mcgee <[email protected]> wrote:
> On Jul 22, 10:23 am, kensystem <[email protected]> wrote:
> > 1) Should BYTEA columns accept ASCII input if quoted? [...]
> > Can it be changed to accept this on presumption that the string is an
> > ASCII one? I believe this is accepted by PG and some others.
>
> Probably not, for several good reasons.
>
> One: Adding ASCII input opens things up for ambiguity. Does 'FACE'
> mean the bytes 0xFACE (2 bytes) or the ASCII bytes (4 bytes)? What
> about NULL & unprintable characters? What about foreign languages --
> some keyboards may not even have the standard US-ASCII keys on them?
I agree: since H2 already treats quoted input as hex, it is too late to
change that. It does mean H2 cannot be compatible with PG syntax here;
the SQL in a PG backup has to be modified before H2 will accept it.
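
A minimal Java sketch of the 'FACE' ambiguity discussed above. It is not H2's parser; the class and method names (ByteaAmbiguity, fromHex, fromAscii) are made up for illustration, and it only shows that the same four-character literal yields different byte arrays under the two readings.

```java
import java.nio.charset.StandardCharsets;

public class ByteaAmbiguity {
    // Reading 1: the literal is hex digits, two characters per byte.
    static byte[] fromHex(String s) {
        if (s.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(s.charAt(2 * i), 16);
            int lo = Character.digit(s.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit");
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    // Reading 2: the literal is ASCII text, one character per byte.
    static byte[] fromAscii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(fromHex("FACE").length);   // 2 bytes: 0xFA 0xCE
        System.out.println(fromAscii("FACE").length); // 4 bytes: 0x46 0x41 0x43 0x45
    }
}
```

Any literal made up only of hex digits with an even length is ambiguous in this way, so a parser that accepted both forms would have to guess.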
>
> Two: Redundancy. The functions UTF8TOSTRING(bytes) and
> STRINGTOUTF8(string) already exist to convert between text and raw
> bytes. See: http://www.h2database.com/html/functions.html#stringtoutf8
> This should be the preferred way to store text as bytes.
I'm not sure here; while a Java String can technically hold all 256
octets (including NULL), UTF-8 codecs will mangle arbitrary binary data.
An ISO-8859-1 encoding/decoding would not, though (plain 7-bit ASCII is
only safe for octets below 0x80). As I recall, PG uses a binary-safe
encoding.
I believe SQL treats only the apostrophe as a special character inside
a quoted literal; any other octet, including \0, is just a byte and is
legal (for binary data) inside apostrophe quoting. (I know you know
this, but others may benefit from my mentioning it here.) Even so, not
all DBMSs accept this: their connectors or wire protocols are confused
by \0 and some other octets.
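
To make the round-trip concern about UTF-8 concrete, here is a small Java sketch. It exercises the JDK's own charset codecs, not H2's STRINGTOUTF8/UTF8TOSTRING functions (that those behave the same way is an assumption), and the class and method names are invented for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RoundTrip {
    // Decode raw bytes into a String with the given charset, encode the
    // String back, and report whether the original bytes survived.
    static boolean survives(byte[] raw, Charset cs) {
        byte[] back = new String(raw, cs).getBytes(cs);
        return Arrays.equals(raw, back);
    }

    // Every possible octet value, 0x00 through 0xFF.
    static byte[] allOctets() {
        byte[] raw = new byte[256];
        for (int i = 0; i < raw.length; i++) {
            raw[i] = (byte) i;
        }
        return raw;
    }

    public static void main(String[] args) {
        byte[] raw = allOctets();
        System.out.println("ISO-8859-1: " + survives(raw, StandardCharsets.ISO_8859_1)); // true
        System.out.println("UTF-8:      " + survives(raw, StandardCharsets.UTF_8));      // false
        System.out.println("US-ASCII:   " + survives(raw, StandardCharsets.US_ASCII));   // false
    }
}
```

ISO-8859-1 maps each octet to exactly one character and back, which is why it survives. The UTF-8 and US-ASCII decoders replace octets they consider malformed, so the original bytes are lost.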
>
> Thomas could add a function to convert bytes to/from arbitrary text
> encodings, but it is very easy to do this in java or roll your own
> user-defined function to do it.
>
> Three: Performance. Parsing text as hex characters is very, very
> fast
For binary data, I might differ in opinion: hex encoding doubles the
input size, so I'd expect roughly half the speed, plus some additional
cost to decode (vs. a plain char->byte cast).
> [...], and is safe against the complexities of character sets &
> encodings. US-ASCII isn't suitable for international use (as noted
> above).
But in the context of bytea, where the input should already be
'binary', this is not an issue.
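
On the size point raised a few lines up, a short Java sketch of the 2x overhead of hex text. The helper name toHex is hypothetical, and the sketch only measures length; it makes no claim about actual parse speed in H2.

```java
public class HexOverhead {
    // Encode each byte as two lowercase hex digits.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] raw = new byte[1024];
        String hex = toHex(raw);
        // prints "1024 raw bytes -> 2048 hex characters"
        System.out.println(raw.length + " raw bytes -> " + hex.length() + " hex characters");
    }
}
```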
>
> Cheers,
> Bob McGee
Thank you,
-Ken