As promised, here's a summary of revisions I made to the material I sent out in
December. Attached is the complete material with revisions. Thanks to all of
you who responded. I believe that I have your feedback incorporated in the
revisions in some form or another.
The perldoc update...
1) Change the example's column name from "name" to "fullname".
2) Changed use of "parameter" to "placeholder".
3) Fixed the iterating script example.
4) Addressed the issue that if Example 0 works, Examples 4, 5, and 6 work, even
if their features aren't fully supported.
5) Updated the database-engine/working-example matrix.
6) A few misc grammar fixes.
The accompanying perl script...
1) Documented issue with Sybase columns defaulting to NOT NULL.
2) Padded char data with blanks for databases (e.g. Oracle) that don't pad char
values automatically.
3) Added display of row ids for styles that return rows, but not the expected
rows.
4) The script no longer relies on output from the "rows" method, and issues a
warning if it doesn't "work" as expected.
-----Original Message-----
From: CAMPBELL, BRIAN D (BRIAN) [mailto:[EMAIL PROTECTED]
Sent: Thursday, December 16, 2004 10:33 AM
To: [email protected]
Subject: Perl DBI Documentation Proposal: Using NULLs in Placeholders
Fellow DBI Users,
Tim Bunce has asked me to help revise the NULLs subsection in the "Placeholders
and Bind Values" section in the DBI documention for DBI 1.47. Attached is the
proposed text to be included, along with an associated perl script. You are
invited to review them. Please send comments back directly to me (not the DBI
users mailing list). I'll post a summary in January.
The revision addresses the issue of testing for NULL values in a WHERE clause.
There's no "one size fits all" solution, so several options are given. In
particular, several examples of WHERE clauses are presented that can support
both NULL and non-NULL tests. But the examples tend to rely on database
specific features or how flexibly they support placeholders (e.g. Informix
supports the NVL function but not with a placeholder).
Near the bottom of the attached text is a table that lists some database
engines and which WHERE clause examples work with them. The table is sparse.
Only Informix has been properly tested so far. Any other entries you see are
educated guesses. You can help us fill out the table. How? Use the attached
perl script that you can run to see which examples actually work on your
favorite database engine. I invite you to run it and send the results back to
me (not the DBI users mailing list). I'll post a summary. View the top of the
perl script for specific instructions. You'll likely want to change the suffix
of the script's file name to .pl.
B<NULL Values>
Undefined values, or C<undef>, are used to indicate NULL values.
You can insert and update columns with a NULL value as you would a
non-NULL value. These examples insert and update the column
C<age> with a NULL value:
$sth = $dbh->prepare(qq{
INSERT INTO people (fullname, age) VALUES (?, ?)
});
$sth->execute("Joe Bloggs", undef);
$sth = $dbh->prepare(qq{
UPDATE people SET age = ? WHERE fullname = ?
});
$sth->execute(undef, "Joe Bloggs");
However, care must be taken when trying to use NULL values in a
C<WHERE> clause. Consider:
SELECT fullname FROM people WHERE age = ?
Binding an C<undef> (NULL) to the placeholder will I<not> select rows
which have a NULL C<age>! At least for database engines that
conform to the SQL standard. Refer to the SQL manual for your database
engine or any SQL book for the reasons for this. To explicitly select
NULLs you have to say "C<WHERE age IS NULL>".
A common issue is to have a code fragment handle a value that could be
either C<defined> or C<undef> (non-NULL or NULL) at runtime.
A simple technique is to prepare the appropriate statement as needed,
and substitute the placeholder for non-NULL cases:
$sql_clause = defined $age? "age = ?" : "age IS NULL";
$sth = $dbh->prepare(qq{
SELECT fullname FROM people WHERE $sql_clause
});
$sth->execute(defined $age ? $age : ());
The following technique illustrates qualifying a C<WHERE> clause with
several columns, whose associated values (C<defined> or C<undef>) are
in a hash %h:
for my $col ("age", "phone", "email")
{
if (defined $h{$col})
{push @sql_qual, "$col = ?"; push @sql_bind, $h{$col};}
else
{push @sql_qual, "$col IS NULL";}
}
$sql_clause = join(" AND ", @sql_qual);
$sth = $dbh->prepare(qq{
SELECT fullname FROM people WHERE $sql_clause
});
$sth->execute(@sql_bind);
The techniques above prepare the SQL statement with each call to
execute. Because calls to prepare can be expensive, performance
can suffer when an application iterates many times over statements
like the above.
A better solution is a single C<WHERE> clause that supports both
NULL and non-NULL comparisons. Its SQL statement would need to be
prepared only once for all cases, thus improving performance.
Several examples of C<WHERE> clauses that support this are presented
below. But each example lacks portability, robustness, or simplicity.
Whether an example is supported on your database engine depends on
what SQL extensions it provides, and where it supports the C<?>
placeholder in a statement.
0) age = ?
1) NVL(age, xx) = NVL(?, xx)
2) ISNULL(age, xx) = ISNULL(?, xx)
3) DECODE(age, ?, 1, 0) = 1
4) age = ? OR (age IS NULL AND ? IS NULL)
5) age = ? OR (age IS NULL AND SP_ISNULL(?) = 1)
6) age = ? OR (age IS NULL AND ? = 1)
Statements formed with the above C<WHERE> clauses require execute
statements as follows. The arguments are required, whether their
values are C<defined> or C<undef>.
0-3) $sth->execute($age);
4,5) $sth->execute($age, $age);
6) $sth->execute($age, defined($age)?0:1);
Example 0 should not work (as mentioned earlier), but may work on
a few database engines anyway (e.g. Sybase). Example 0 is part
of examples 4, 5, and 6, so if example 0 works, these other
examples may work, even if the engine does not properly support
the right hand side of the C<OR> expression.
Examples 1 and 2 are not robust: they require that you provide a
valid column value xx (e.g. '~') which is not present in any row.
That means you must have some notion of what data won't be stored
in the column, and expect clients to adhere to that.
Example 5 requires that you provide a stored procedure (SP_ISNULL
in this example) that acts as a function: it checks whether a value
is null, and returns 1 if it is, or 0 if not.
Example 6, the least simple, is probably the most portable, i.e., it
should work with with most, if not all, database engines.
Here is a table that indicates which examples above are known to
work on various database engines:
-----Examples------
0 1 2 3 4 5 6
- - - - - - -
Oracle 9 N Y N Y Y ? Y
Informix IDS 9 N N N Y N Y Y
MS SQL N N Y N Y ? Y
Sybase Y N N N N N Y
AnyData,DBM,CSV Y N N N Y Y* Y
* Works only because Example 0 works.
DBI provides a sample perl script that will test the examples above
on your database engine and tell you which ones work. It is located
in the F<ex/> subdirectory of the DBI source distribution, or here:
L<http://svn.perl.org/modules/dbi/trunk/ex/perl_dbi_nulls_test.pl>
Please use the script to help us fill-in and maintain this table.
#! /usr/bin/perl -w
# This script checks which style of WHERE clause(s) will support both
# null and non-null values. Refer to the NULL Values sub-section
# of the "Placeholders and Bind Values" section in the DBI
# documention for more information on this issue. The clause styles
# and their numbering (0-6) map directly to the examples in the
# documentation.
#
# To use this script, update the connect method arguments to support
# your database engine and database. Set PrintError to 1 if you want
# see the reason WHY your engine won't support a particular style.
#
# If your database does not support NULL columns by default
# (e.g. Sybase) find and edit the CREATE TABLE statement accordingly.
#
# To properly test style #5, you need the capability to create the
# stored procedure SP_ISNULL that acts as a function: it tests its
# argument and returns 1 if it is null, 0 otherwise. For example,
# using Informix, a definition would look like:
#
# CREATE PROCEDURE SP_ISNULL (arg VARCHAR(32)) RETURNING INTEGER;
# IF arg IS NULL THEN RETURN 1;
# ELSE RETURN 0;
# END IF;
# END PROCEDURE;
#
# Warning: This script will attempt to create a table named by the
# $tablename variable (default dbi__null_test_tmp) and WILL DESTROY
# any pre-existing table so named.
use strict;
use DBI;
my $tablename = "dbi__null_test_tmp"; # DESTROYs this table. Must be obscure
die "DBI_DSN environment variable not defined"
unless $ENV{DBI_DSN};
my $dbh = DBI->connect(undef, undef, undef,
{
RaiseError => 0,
PrintError => 1
}
) || die DBI->errstr;
printf "Using %s, db version %s\n", $ENV{DBI_DSN}, $dbh->get_info(18) ||
"unknown";
my $sth;
my @ok;
print "=> Drop table '$tablename', if it already exists...\n";
$sth = $dbh->do("DROP TABLE $tablename");
print "=> Create table '$tablename'...\n";
$sth = $dbh->prepare("CREATE TABLE $tablename (myid int NOT NULL, mycol
char(8))");
# Use this if your database does not support NULL columns by default.
#$sth = $dbh->prepare("CREATE TABLE $tablename (myid int NOT NULL, mycol
char(8) NULL)");
$sth->execute()
|| $sth->errstr;
print "=> Insert 4 rows into the table...\n";
my @stv = ('slow', undef, 'quick', undef);
$sth = $dbh->prepare("INSERT INTO $tablename (myid, mycol) VALUES (?,?)");
for my $i (0..3)
{
$sth->execute($i+1, $stv[$i])
|| $sth->errstr;
}
# Define the SQL statements with the various WHERE clause styles we want to
test.
my @sel = (
qq{WHERE mycol = ?},
qq{WHERE NVL(mycol, '-') = NVL(?, '-')},
qq{WHERE ISNULL(mycol, '-') = ISNULL(?, '-')},
qq{WHERE DECODE(mycol, ?, 1, 0) = 1},
qq{WHERE mycol = ? OR (mycol IS NULL AND ? IS NULL)},
qq{WHERE mycol = ? OR (mycol IS NULL AND SP_ISNULL(?) = 1)},
qq{WHERE mycol = ? OR (mycol IS NULL AND ? = 1)},
);
# Define the execute method argument lists for non-null values.
# The order must map one to one with the above SQL statements.
my @nonnull_args = (
['quick '],
['quick '],
['quick '],
['quick '],
['quick ','quick '],
['quick ','quick '],
['quick ', 0],
);
# Define the execute method argument lists for null values.
# The order must map one to one with the above SQL statements.
my @null_args = (
[undef],
[undef],
[undef],
[undef],
[undef, undef],
[undef, undef],
[undef, 1],
);
# Run the tests...
for my $i (0..$#sel)
{
print "\n=> Testing clause style $i: $sel[$i]\n";
$sth = $dbh->prepare("SELECT myid,mycol FROM $tablename $sel[$i]")
or next;
$sth->execute(@{$nonnull_args[$i]})
or next;
my $r1 = $sth->fetchall_arrayref();
my $n1r = $sth->rows;
my $n1 = @$r1;
$sth->execute(@{$null_args[$i]})
or next;
my $r2 = $sth->fetchall_arrayref();
my $n2r = $sth->rows;
my $n2 = @$r2;
# Complain a bit...
print "Your DBD driver doesn't support the 'rows' method very well.\n"
unless ($n1r == $n1 && $n2r == $n2);
# Did we get back the expected "n"umber of rows?
# Did we get back the specific "r"ows we expected as identifed by the myid
column?
if ( $n1 == 1
&& $n2 == 2
&& $r1->[0][0] == 3
&& $r2->[0][0] == 2
&& $r2->[1][0] == 4)
{
print "=> WHERE clause style $i is supported.\n";
push @ok, "$i: $sel[$i]";
}
else {
print "=> WHERE clause style $i returned incorrect results.\n";
if ($n1 > 0)
{
print " Non-Null test rows returned: ";
print " ", $r1->[$_][0] for (0..$#{$r1});
print "\n";
}
if ($n2 > 0)
{
print " Null test rows returned: ";
print " ", $r2->[$_][0] for (0..$#{$r2});
print "\n";
}
}
}
$dbh->disconnect();
printf "\n%d styles are supported\n", scalar @ok;
print "$_\n" for @ok;
print "\n";