On 20.12.2014 23:10, Tim Bunce wrote:
On Sat, Dec 20, 2014 at 05:35:55PM +0100, Alexander Foken wrote:
On 20.12.2014 15:38, Tim Bunce wrote:
Can you, or anyone else, think of any situation where a backslash before
a ? or :foo (or even $1) style placeholder might be valid SQL?
[...]
Do you plan to escape the escape character, i.e. use a double
backslash at DBI level to represent a single backslash at database
level?
That's a good question. I'm not sure. I think the answer has to be no.
I'd welcome any input on that.
The current problem with any kind of DBI placeholders is that there is
no way to prevent DBI (including any DBDs) to interpret them as
placeholders. This hasn't changed since 2000. The patches from 2000
disable placeholders by type (?, $1, :foo) or entirely, but it is not
possible to pass :foo unchanged and uninterpreted to the database while
at the same time having a named placeholder :bar.
The proposed backslash escape allows exactly this, so you can have a
named placeholder :bar and can pass :foo unchanged and uninterpreted to
the database, but you have to add a backslash in front of :foo. No
problems with old code for this usage, because this was simply not possible.
But with the proposed backslash escape, the backslash in SQL statements
passed to the DBI will suddenly have a new meaning, were it is currently
passed unmodified to the database. This will break old code.
Making the backslash escape special only directly in front of possible
placeholders will probably break less code, because all other
backslashes will behave as before. At the same time, the rules when a
backslash is needed will become more complicated, and they may depend on
the DBD, because different DBDs implement differrent placeholders (as
far as I understand DBI).
Also, the rule that a backslash in front of a possible placeholder
passes the placeholder, but not the backslash, unmodified to the
database prevents passing a backslash followed by an actual placeholder
to the database. This is possible with the current DBI code.
So, the backslash-is-only-special-in-front-of-placeholders rule breaks
some existing code, makes it impossible to have a backslask in front of
a placeholder, and the rules for when to use a backslash depend on the
DBD implementation. This will likely lead to another change that will
again changes the rules and break old code.
Can we do better? I think that we can:
As there is currently no escaping mechanism for placeholders in DBI, any
new escaping mechanism will break some existing code. Breaking old code
that uses DBI has happened before, e.g. in DBI 1.20, 24th August 2001
(selectall_hashref) DBI 1.19, 20th July 2001 (fetchall_arrayref), DBI
0.91, 10th December 1997 (connect).
1. Use a clean escaping mechanism that won't need future changes that
will again break old code:
Make all backslashes in SQL special. Require double backslashes for a
single backslash at database level. Require escaping characters that
otherwise might be parsed as placeholders (:, ?, $). Make unknown escape
sequences fatal errors. This allows mixing placeholders and
placeholder-like SQL sequences as before, it allows placing backslashes
in front of placeholders, and it allows future extensions using the
currently unknown escape sequences.
Existing code just has to double each and every backslash in SQL code.
2. Slowly introduce the new mechanism:
First, the backslash-escape must explicitly be enabled using an
attribute BackslashEscape during connect(). Default is off, i.e. no
backslash escaping. No old code should have problems. New code can
enable backslash-escapes when needed by using connect() with
BackslashEscape=on.
After a while, change the default to warn, so that backslash escaping
stays off, but any backslash in SQL statements will issue a
compatibility warning. Old code will still run unmodified, but will
likely issue some warnings. Those warnings can be disabled by adding
BackslashEscape=off to connect().
After some more time, make changing BackslashEscape to off cause a
compatibility warning. Old unmaintained code will still run unmodified,
but issue some warnings. Maintained old code that sets BackslashEscape
to off will issue one warning per connect.
After even more time, change the default to on, so backslash escaping is
enabled. This will intentionally break old unmaintained code that was
not modified to switch off BackslashEscape. Old maintained code will
continue to run, but issue one warning per connect.
Finally, remove the non-backslash-escape code and make swiching to that
code (off or warn) a fatal error. This will intentionally break old code
that had a one-time maintainance but was not changed to use backslashes
properly.
The above is the user view of DBI.
DBI developer view:
As far as I understand the DBI, DBDs may also parse SQL code. They need
to be changed to respect the BackslashEscape attribute. Using old DBDs
that can't handle the BackslashEscape attribute must cause a fatal error
when the user attempts to set BackslashEscape to on.
Because the backslash has no special meaning in existing SQL code, it
can safely be doubled and passed to a new parser when BackslashEscape is
not on. There is no need to keep the old parser around, neither in DBI
nor in any DBD.
Alexander
Tim.
--
Alexander Foken
mailto:alexan...@foken.de http://www.foken.de/alexander/