On Sunday, October 27, 2002, at 10:27  AM, Tim Bunce wrote:

A goal of the DBI is to make things simple.

If some form of escaping works for binary and non-binary data then it
would seem to make most sense to use that.
I agree, but unfortunately, I haven't been able to find a single escaping that works for both. The issue is this:

* For string data, escapes need to be:

'\'' => '\\\'' # or '\\047'
'\\' => '\\\\', # or '\\134'

* For binary data, escapes need to be:

'\'' => '\\\'' # or '\\047'
'\\' => '\\\\\\\\', # or '\\\\134'
"\0" => '\\\\000' # or '\\\\000'

The reason for the doubling of backslashes is that the SQL statement is parsed twice, first by the PostgreSQL parser, and then by the so-called "byteaian" parser. The byteaian parser doesn't care about single quotes, so they're only parsed once, by the PostgreSQL parser. The backslash and null characters, however, are parsed twice, so the backslashes have to be doubled.

* The binary escape for the null character can be used with strings because the null character is considered an end-of-string terminator, anyway, so you wouldn't expect to see it in a string.

* The binary escape for the single quote can be used with strings because it's exactly the same escape sequence.

* The binary escape for the backslash *cannot* be used for strings, because strings are parsed only once! So the doubling up of the backslashes would leave a literal double-slash (or '\\134') in the string.

The upshot is that, because of the difference in the behavior of escaping the backslash, the same escaping *cannot* be done for both binary and string data. (Bruce, please tell me if I'm missing something here). This also means that the current implementation of the quote() method does *not* work for binary data (sorry Crist�v�o). This is fine, according to the DBI documentation, but then the quoting of the null character in quote() is superfluous.

So unfortunately, unless we decide to look up the data type of each column for which data is bound in dbd_st_execute (expensive, I think), binary data still has to be bound to a placeholder explicitly in order to be properly escaped. I'd certainly be interested in alternate approaches, though, as my knowledge of this stuff is limited.

As for quote(), if folks agree it makes sense, I'll update it to do the right thing with binary data when it is passed a second argument ($data_type) indicating binary data (DBI::SQL_BINARY).

Regards,

David

--
David Wheeler AIM: dwTheory
[EMAIL PROTECTED] ICQ: 15726394
http://david.wheeler.net/ Yahoo!: dew7e
Jabber: [EMAIL PROTECTED]

Reply via email to