On 20.12.2014 23:10, Tim Bunce wrote:
On Sat, Dec 20, 2014 at 05:35:55PM +0100, Alexander Foken wrote:
On 20.12.2014 15:38, Tim Bunce wrote:
Can you, or anyone else, think of any situation where a backslash before
a ? or :foo (or even $1) style placeholder might be valid SQL?
[...]
Do you plan to escape the escape character, i.e. use a double
backslash at DBI level to represent a single backslash at database
level?
That's a good question. I'm not sure. I think the answer has to be no.
I'd welcome any input on that.

The current problem with any kind of DBI placeholders is that there is no way to prevent DBI (including any DBDs) to interpret them as placeholders. This hasn't changed since 2000. The patches from 2000 disable placeholders by type (?, $1, :foo) or entirely, but it is not possible to pass :foo unchanged and uninterpreted to the database while at the same time having a named placeholder :bar.

The proposed backslash escape allows exactly this, so you can have a named placeholder :bar and can pass :foo unchanged and uninterpreted to the database, but you have to add a backslash in front of :foo. No problems with old code for this usage, because this was simply not possible.

But with the proposed backslash escape, the backslash in SQL statements passed to the DBI will suddenly have a new meaning, were it is currently passed unmodified to the database. This will break old code.

Making the backslash escape special only directly in front of possible placeholders will probably break less code, because all other backslashes will behave as before. At the same time, the rules when a backslash is needed will become more complicated, and they may depend on the DBD, because different DBDs implement differrent placeholders (as far as I understand DBI).

Also, the rule that a backslash in front of a possible placeholder passes the placeholder, but not the backslash, unmodified to the database prevents passing a backslash followed by an actual placeholder to the database. This is possible with the current DBI code.

So, the backslash-is-only-special-in-front-of-placeholders rule breaks some existing code, makes it impossible to have a backslask in front of a placeholder, and the rules for when to use a backslash depend on the DBD implementation. This will likely lead to another change that will again changes the rules and break old code.

Can we do better? I think that we can:

As there is currently no escaping mechanism for placeholders in DBI, any new escaping mechanism will break some existing code. Breaking old code that uses DBI has happened before, e.g. in DBI 1.20, 24th August 2001 (selectall_hashref) DBI 1.19, 20th July 2001 (fetchall_arrayref), DBI 0.91, 10th December 1997 (connect).

1. Use a clean escaping mechanism that won't need future changes that will again break old code:

Make all backslashes in SQL special. Require double backslashes for a single backslash at database level. Require escaping characters that otherwise might be parsed as placeholders (:, ?, $). Make unknown escape sequences fatal errors. This allows mixing placeholders and placeholder-like SQL sequences as before, it allows placing backslashes in front of placeholders, and it allows future extensions using the currently unknown escape sequences.

Existing code just has to double each and every backslash in SQL code.

2. Slowly introduce the new mechanism:

First, the backslash-escape must explicitly be enabled using an attribute BackslashEscape during connect(). Default is off, i.e. no backslash escaping. No old code should have problems. New code can enable backslash-escapes when needed by using connect() with BackslashEscape=on.

After a while, change the default to warn, so that backslash escaping stays off, but any backslash in SQL statements will issue a compatibility warning. Old code will still run unmodified, but will likely issue some warnings. Those warnings can be disabled by adding BackslashEscape=off to connect().

After some more time, make changing BackslashEscape to off cause a compatibility warning. Old unmaintained code will still run unmodified, but issue some warnings. Maintained old code that sets BackslashEscape to off will issue one warning per connect.

After even more time, change the default to on, so backslash escaping is enabled. This will intentionally break old unmaintained code that was not modified to switch off BackslashEscape. Old maintained code will continue to run, but issue one warning per connect.

Finally, remove the non-backslash-escape code and make swiching to that code (off or warn) a fatal error. This will intentionally break old code that had a one-time maintainance but was not changed to use backslashes properly.

The above is the user view of DBI.

DBI developer view:

As far as I understand the DBI, DBDs may also parse SQL code. They need to be changed to respect the BackslashEscape attribute. Using old DBDs that can't handle the BackslashEscape attribute must cause a fatal error when the user attempts to set BackslashEscape to on.

Because the backslash has no special meaning in existing SQL code, it can safely be doubled and passed to a new parser when BackslashEscape is not on. There is no need to keep the old parser around, neither in DBI nor in any DBD.

Alexander


Tim.


--
Alexander Foken
mailto:alexan...@foken.de  http://www.foken.de/alexander/

Reply via email to