On Sun, Oct 27, 2002 at 02:33:31PM -0800, David Wheeler wrote:
> On Sunday, October 27, 2002, at 10:27  AM, Tim Bunce wrote:
> 
> >A goal of the DBI is to make things simple.
> >
> >If some form of escaping works for binary and non-binary data then it
> >would seem to make most sense to use that.
> 
> I agree, but unfortunately, I haven't been able to find a single 
> escaping that works for both. The issue is this:
> 
> * For string data, escapes need to be:
> 
>     '\'' => '\\\''  # or '\\047'
>     '\\' => '\\\\', # or '\\134'
> 
> * For binary data, escapes need to be:
> 
>     '\'' => '\\\''      # or '\\047'
>     '\\' => '\\\\\\\\', # or '\\\\134'
>     "\0" => '\\\\000'   # or '\\\\000'
> 
> The reason for the doubling of backslashes is that the SQL statement is 
> parsed twice, first by the PostgreSQL parser, and then by the so-called 
> "byteaian" parser. The byteaian parser doesn't care about single 
> quotes, so they're only parsed once, by the PostgreSQL parser. The 
> backslash and null characters, however, are parsed twice, so the 
> backslashes have to be doubled.

That seems like a rather poor design.

> * The binary escape for the null character can be used with strings 
> because the null character is considered an end-of-string terminator, 
> anyway, so you wouldn't expect to see it in a string.
> 
> * The binary escape for the single quote can be used with strings 
> because it's exactly the same escape sequence.
> 
> * The binary escape for the backslash *cannot* be used for strings, 
> because strings are parsed only once! So the doubling up of the 
> backslashes would leave a literal double-slash (or '\\134') in the 
> string.
> 
> The upshot is that, because of the difference in the behavior of 
> escaping the backslash, the same escaping *cannot* be done for both 
> binary and string data. (Bruce, please tell me if I'm missing something 
> here). This also means that the current implementation of the quote() 
> method does *not* work for binary data (sorry Crist�v�o). This is fine, 
> according to the DBI documentation, but then the quoting of the null 
> character in quote() is superfluous.
> 
> So unfortunately, unless we decide to look up the data type of each 
> column for which data is bound in dbd_st_execute (expensive, I think), 

And probably not possible in all cases.

> binary data still has to be bound to a placeholder explicitly in order 
> to be properly escaped. I'd certainly be interested in alternate 
> approaches, though, as my knowledge of this stuff is limited.
> 
> As for quote(), if folks agree it makes sense, I'll update it to do the 
> right thing with binary data when it is passed a second argument 
> ($data_type) indicating binary data (DBI::SQL_BINARY).

Beware that your semantics for SQL_BINARY must match those of ODBC.
I think ODBC requires SQL_BINARY values to be formated as pairs of
hex chars.

Best to use bind_param($col, $val, { pg_type => NNN })
where NNN is the value of the PostgreSQL bytea type.

Tim.

Reply via email to