Re: BYTEA type compatibility with SQL inputs

kensystem Wed, 22 Jul 2009 20:48:23 -0700

On Jul 22, 12:40 pm, bob mcgee <[email protected]> wrote:
> On Jul 22, 10:23 am, kensystem <[email protected]> wrote:
> > 1) Should BYTEA columns accept ASCII input if quoted? [...]
> > Can it be changed to accept this on presumption that the string is an
> > ASCII one? I believe this is accepted by PG and some others.
>
> Probably not, for several good reasons.
>
> One:  Adding ASCII input opens things up for ambiguity.  Does 'FACE'
> mean the bytes 0xFACE (2 bytes) or the ASCII bytes (4 bytes)?  What
> about NULL & unprintable characters?  What about foreign languages --
> some keyboards may not even have the standard US-ASCII keys on them?

I agree; since H2 already treats quoted BYTEA input as hex, it is too
late to change it. That means H2 cannot be compatible with the PG
syntax unless the PG backup SQL is modified first.
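
To make the 'FACE' ambiguity concrete, here is a minimal Java sketch
of the two readings. The class and method names (HexVsAscii, fromHex,
fromAscii) are made up for this illustration and are not part of H2.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only (not an H2 API).
public class HexVsAscii {

    // Reading 1: the literal is a string of hex digits, two per byte.
    static byte[] fromHex(String s) {
        if (s.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(s.charAt(2 * i), 16);
            int lo = Character.digit(s.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit");
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    // Reading 2: the literal is text, one ASCII character per byte.
    static byte[] fromAscii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(fromHex("FACE").length);   // 2 bytes: 0xFA 0xCE
        System.out.println(fromAscii("FACE").length); // 4 bytes: 0x46 0x41 0x43 0x45
    }
}
```

The same four characters give different byte arrays, so a parser that
accepted both forms would have to guess.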

>
> Two: Redundancy. The functions UTF8TOSTRING(bytes) and STRINGTOUTF8
> (string) already exist to convert text to raw bytes. 
> See:http://www.h2database.com/html/functions.html#stringtoutf8 This
> should be the preferred way to store text as bytes.

I'm not sure here; while a Java String can technically hold all 256
octet values (including NUL), a UTF-8 codec will mangle them. A
single-byte encoding such as ISO-8859-1 would not, though (US-ASCII
only covers the values 0-127). As I recall, PG uses a binary-safe
encoding.
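
A quick way to check which charsets are binary safe is to push all
256 byte values through a decode/encode round trip. This is a minimal
sketch; the class and method names (RoundTrip, allOctets, survives)
are hypothetical. US-ASCII is included in the check because it only
defines the values 0-127, so Java replaces the upper half when
decoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, for illustration only.
public class RoundTrip {

    // Every possible byte value, 0x00 through 0xFF, in order.
    static byte[] allOctets() {
        byte[] b = new byte[256];
        for (int i = 0; i < 256; i++) {
            b[i] = (byte) i;
        }
        return b;
    }

    // True if bytes -> String -> bytes gives back the original data.
    static boolean survives(byte[] data, Charset cs) {
        String s = new String(data, cs);
        return Arrays.equals(data, s.getBytes(cs));
    }

    public static void main(String[] args) {
        byte[] data = allOctets();
        System.out.println(survives(data, StandardCharsets.UTF_8));      // false
        System.out.println(survives(data, StandardCharsets.US_ASCII));   // false
        System.out.println(survives(data, StandardCharsets.ISO_8859_1)); // true
    }
}
```

Only ISO-8859-1 maps each of the 256 values to exactly one char and
back, which is why it can carry raw bytes through a String.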

I believe SQL treats only the apostrophe as a special character, and
any other octet, including \0 etc., is just a byte and legal (for
binary data) inside apostrophe quoting (I know you know this, but
others may benefit from my mentioning it here). Even so, not all DBMSs
accept this (their connectors, or wire protocols, are confused by \0
and others).

>
> Thomas could add a function to convert bytes to/from arbitrary text
> encodings, but it is very easy to do this in java or roll your own
> user-defined function to do it.
>
> Three: Performance.  Parsing text as hex characters is very, very
> fast

For binary data, I might differ in opinion: hex encoding doubles the
size (2 characters per byte), so it runs at roughly half speed, plus
some additional cost to decode (vs. a char->byte cast).
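
The size difference is easy to see in a small Java sketch. It only
compares lengths and makes no timing measurement; the class and method
names (HexSize, toHex, toLatin1) are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
public class HexSize {

    // Hex text form: two characters for every input byte.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }

    // One-char-per-byte form, using ISO-8859-1 as the byte<->char mapping.
    static String toLatin1(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] data = new byte[1000];
        System.out.println(toHex(data).length());    // 2000
        System.out.println(toLatin1(data).length()); // 1000
    }
}
```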

>, and is safe against the complexities of character sets &
> encodings.  US-ASCII isn't suitable for international use (as noted
> above).

But in the context of bytea, where the input should already be
'binary' this is not an issue.

>
> Cheers,
> Bob McGee

Thank you,
-Ken