On Sun, Oct 27, 2002 at 02:33:31PM -0800, David Wheeler wrote:
> On Sunday, October 27, 2002, at 10:27 AM, Tim Bunce wrote:
>
> >A goal of the DBI is to make things simple.
> >
> >If some form of escaping works for binary and non-binary data then it
> >would seem to make most sense to use that.
>
> I agree, but unfortunately, I haven't been able to find a single
> escaping that works for both. The issue is this:
>
> * For string data, escapes need to be:
>
> '\'' => '\\\'' # or '\\047'
> '\\' => '\\\\', # or '\\134'
>
> * For binary data, escapes need to be:
>
> '\'' => '\\\'' # or '\\047'
> '\\' => '\\\\\\\\', # or '\\\\134'
> "\0" => '\\\\000' # or '\\\\000'
>
> The reason for the doubling of backslashes is that the SQL statement is
> parsed twice, first by the PostgreSQL parser, and then by the so-called
> "byteaian" parser. The byteaian parser doesn't care about single
> quotes, so they're only parsed once, by the PostgreSQL parser. The
> backslash and null characters, however, are parsed twice, so the
> backslashes have to be doubled.
That seems like a rather poor design.
> * The binary escape for the null character can be used with strings
> because the null character is considered an end-of-string terminator,
> anyway, so you wouldn't expect to see it in a string.
>
> * The binary escape for the single quote can be used with strings
> because it's exactly the same escape sequence.
>
> * The binary escape for the backslash *cannot* be used for strings,
> because strings are parsed only once! So the doubling up of the
> backslashes would leave a literal double-slash (or '\\134') in the
> string.
>
> The upshot is that, because of the difference in the behavior of
> escaping the backslash, the same escaping *cannot* be done for both
> binary and string data. (Bruce, please tell me if I'm missing something
> here). This also means that the current implementation of the quote()
> method does *not* work for binary data (sorry Crist�v�o). This is fine,
> according to the DBI documentation, but then the quoting of the null
> character in quote() is superfluous.
>
> So unfortunately, unless we decide to look up the data type of each
> column for which data is bound in dbd_st_execute (expensive, I think),
And probably not possible in all cases.
> binary data still has to be bound to a placeholder explicitly in order
> to be properly escaped. I'd certainly be interested in alternate
> approaches, though, as my knowledge of this stuff is limited.
>
> As for quote(), if folks agree it makes sense, I'll update it to do the
> right thing with binary data when it is passed a second argument
> ($data_type) indicating binary data (DBI::SQL_BINARY).
Beware that your semantics for SQL_BINARY must match those of ODBC.
I think ODBC requires SQL_BINARY values to be formated as pairs of
hex chars.
Best to use bind_param($col, $val, { pg_type => NNN })
where NNN is the value of the PostgreSQL bytea type.
Tim.