Re: [sqlite] host parameters / bind variables - further workouts

Dennis Cote Wed, 25 Aug 2004 17:31:41 -0700

Darren Duncan wrote:
> As an internally-implemented thin wrapper on top of this, #1 would
> lead to the addition of one conceptual function or two actual
> functions like this:
>
> int sqlite3_host_param_name(sqlite3_stmt*, int, const char*,
> int n, void(*)(void*));
> int sqlite3_host_param_name16(sqlite3_stmt*, int, const
> void*, int n, void(*)(void*));
>
> Calling one of those functions would map a character string / "text"
> to one of the "named integer" host parameters so that they
> effectively become synonyms.  For example, calling:
>
> sqlite3_host_param_name(stmt, 1, 'foo', ...);
>
> ... would result in a statement "SELECT :foo AS a, ?1 AS b" having
> the same value in 'a' and 'b'.
>
> A caller application would use #1 or #2 in exactly the same way,
> except #1 has a single extra step of calling host_param_name also.
> Of course, it doesn't matter whether host_param_name() or bind() is
> called first; they both just have to be before the execute().


The application should not be setting the mapping between the parameter
names and their index numbers. This should be done automatically by SQLite
as it parses the SQL statement. As each named parameter is encountered
SQLite should scan the parameter table to see if this name has already be
assigned an index number. If it has, then that number is used for this
parameter in the VDBE code that is being generated. If not, then this is a
new parameter which is assigned the next highest unused index number
automatically.

If there is a concern with cost of scanning the table for each named
parameter, this could be replaced with a hash based lookup using SQLite's
existing hash functions. I doubt that this will beneficial for anything but
extreme cases with very many named parameters.

In either case this process only takes time when preparing the statement.
There is no additional overhead when executing a precompiled statement
multiple times.

The ?n style parameters, what I call numbered paramaters, are simply a way
to reuse the same parameter multiple times in a statement without having to
bind it multiple times. ?1 refers to the first parameter, ?2 the second,
etc.

Using this approach ?1 would be a synonym for the first named parameter if
it appeared after the named parameter, but an independent variable with no
name if it appeared before a named parameter in the SQL statement.

I don't know if there is much use for the numbered parameters in human
generated statements, but they could be useful in machine generated
statements. I think humnas will generally prefer to use named parameters
rather than numbered parameters.

Another problem with the ?n parameters is that a user can create a statement
with a large parameter number, say ?666, which would have to allocate and
initialize that many parameter values even if all the values between 1 and
665 are not used. For convenience, the ?n mechanism should allow a user to
enter the parameters in the query in any order (i.e. ?3 can appear before
?1), and the user should be able to edit a statement and remove a parameter
that is no longer needed without being required to renumber all the other
parameters. The pathological case results from these features and the fact
that the statement doesn't actually use anywhere near as many parameters as
it "declared".

I know this can be handled with sparse array techniques, but I'm not sure if
that is really a good idea. Instead of a simple array access, all parameter
array lookups that are done to execute the VDBE code would have to scan the
array to find the location and value of a parameter given its index number.
This will slow down the execution of every statement even if it uses a small
dense array of parameters.

I think it's best to allow the ?n style parameters for machine generated
code, and let the user suffer if he generates silly SQL statements using
these parameters.

The unnumbered ? parameters, what I call positional parameters, don't have
any of these problems and can safely be combined with named parameters.

I don't know why anyone would want to mix the different types of parameters
in a single statement, but it should always have a reasonable and well
defined behavior if someone does because it is allowed by the API.

I haven't looked at the implementation of the named parameters in SQLite
yet, so this may already be done this way, but I think it would be a mistake
to require the user to set the mapping between numbered and named
parameters.

Re: [sqlite] host parameters / bind variables - further workouts

Reply via email to