Re: [DB-SIG] In praise of pyformat

Art Protin Mon, 13 Aug 2007 16:10:16 -0700

Dear folks,
   Mike Meyer wrote:

On Sun, 12 Aug 2007 17:05:44 -0400 Carsten Haese <[EMAIL PROTECTED]> wrote:

-1. The problem that your proposal is trying to solve doesn't exist. For
supplying variable values, parameter binding as it is (with the addition
of making qmark and named mandatory as was decided recently) is
perfectly adequate.


Maybe. Maybe not. Not at issue.

For supplying variable table names, column names,
where clauses, and other syntax elements, string formatting seems
perfectly adequate.


It may seem adequate, but it isn't. Table/column names from external
sources have to deal with the exact same set of data injection issues
that values from external sources do.

It is a mistake to say "the exact same set of data injection issues" forwhile they

are data injection issues, they are not exactly the same.
Table names are a different type than are column names
which are different from strings or integers.
A valid string is any string of characters shorter than some maximum
that uses characters from some acceptable character set.
Processing a string parameter involves simply checking the
length and character set.
Processing an integer involves simply checking sign and size constraints.


A column name is much more constrained than that,
no only must its characters be from a more limited alphabet,

the name is not case sensitive, and the name must, depending on theoperation,

either exist or not exist within a specified table definition.

Further, it is generally (but not universally) recognized that while

SELECT * FROM SUPPLIER WHERE City = 'St. Paul'

and

SELECT * FROM SUPPLIER WHERE City = 'New York'

are structurally the same query looking at the same fields of the same table
and doing the same things with them, the query

SELECT * FROM AIRPORT WHERE City = 'St. Paul'

has only a very superficial similarity and next to none of the
operations to answer the first two would be applicable to the third.

The difference with the third include:
the number of columns involved,
the names of the columns involved,
the types of the columns involved,
the indices for the column City (there may be more than one),

the storage location of the table (which files, which partitions, whichdisks).

In short, just about everything.

Maybe you should illustrate the kinds of problems you're encountering in
whatever it is you're doing that makes you feel that the existing way is
inadequate. Right now, you're coming off somewhat trollish by making
proposals for solving non-existent problems.


I didn't say the existing way is inadequate; it can be fine (except
for portability issues). The proposed changes - basically including
killing off the mechanisms that do work - are what aren't fine.

You're right, in that the existing mechanisms *can* deal with the
issues. However, two of the points that comes up over and over again
here is "use parameters, don't build the query strings yourself" and
"we would rather the module authors do the work than the users". I'm
trying to figure out how *either* of those is miscible with "Just use
pythons string substitutions for table/column names", much less *both*
of them.

  <mike

OK, you win this detail.  The advice is misstated, using common assumptions.
You should use parameters for everything that parameters will work for,
namely, as a standin for literals, for values that when changed do not
structurally alter the query.

While I vaguely recollect doing an application that did construct queries,
using data from the GUI to select tables and columns, I have only done that
once and do not remember why that seemed appropriate at the time.
If this were a more common practice, we would be further along in defining
how to do it in a "standard" way.  Sorry.

   Thank you all,
   Art Protin

_______________________________________________
DB-SIG maillist  -  DB-SIG@python.org
http://mail.python.org/mailman/listinfo/db-sig

Re: [DB-SIG] In praise of pyformat

Reply via email to