I had a while tonight and thought I'd take this one on with perhaps more
than you actually asked for. This is a hasty re-write of something I wrote
and use at work. Basically, whenever I build a SQL statement I try to
develop a $where, a $tablename, a $columnlist and an @values - whether I
get the column list from a DBI method or some other way. If I follow that
basic rule when preparing statements, it makes functions like these easy to
use. I made one modification to what I use at work: because I work with an
RDBMS that can handle sub-selects, and I don't really care about being db
independent, I normally use an EXISTS in the select. MySQL will not handle
sub-selects, so a COUNT is how I would make this compatible - not as fast
as an EXISTS, but more db independent.
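
For what it's worth, the EXISTS flavour I'd use on MS SQL looks roughly
like this - just a sketch, not part of the snippet below, and it assumes
the same $dbh1, $tablename, $where and @exvals that get_dups builds
(T-SQL happens to allow a SELECT with no FROM clause):

# Sketch only: EXISTS variant for a DBMS with sub-select support.
my $exists_sql = qq{SELECT CASE WHEN EXISTS
                    (SELECT * FROM $tablename $where) THEN 1 ELSE 0 END};
my $existsh = $dbh1->prepare($exists_sql)
    || die "Can't prepare\n$exists_sql\n$DBI::errstr\n";
$existsh->execute(@exvals);
my ($dup) = $existsh->fetchrow_array;   # 1 if a matching row exists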

Additionally, I would normally use system catalogues or stored procedures
to look up primary keys and unique constraints. This differs among DBMSes.
On the particular DBMS I use most, I would probably use the sp_helpindex
stored procedure to find all primary key and unique constraint columns. The
lookup will differ from DBMS to DBMS, as will the existence of unique
constraints that are not primary keys. So far as I know, neither MySQL nor
Access has such a constraint, MS SQL definitely does, and I'll let someone
else comment on whether other DBMSes have unique constraints that are not
primary keys. Checking for unique constraints is why there is the null
checking in the foreach loop in the get_dups subroutine - a unique
constraint can have a single null entry, whereas a primary key can never
have a null entry.
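
If your DBI and driver are recent enough to support the metadata methods,
the primary key columns can be looked up without touching the system
catalogues directly. A sketch only - many drivers don't implement it, in
which case you're back to sp_helpindex or the catalogues - using the same
$dbh1 and $tablename as in the snippet below:

# Sketch: primary key columns via DBI metadata, if the driver supports it.
my $pkh = $dbh1->primary_key_info(undef, undef, $tablename);
if ($pkh) {
    my @pkcols;
    while (my $row = $pkh->fetchrow_hashref) {
        push @pkcols, $row->{COLUMN_NAME};
    }
    my $keylist = join ', ', @pkcols;    # e.g. 'CustomerID' for customers
}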

Anyway, I've told you a few of its problems, and yes, I'm counting (rather
than just checking existence), and this is a hasty re-write, but hopefully
it shows what I think is a good method of avoiding a key conflict error.

Steve H.

<snip>

###################################
# Preliminary stuff like connect
# and define test data for demo
# this is test data only. Use a more elegant method
# for real application.
###################################

use strict;
use DBI;

my $tablename  = 'customers';
my $columnlist = 'CustomerID, CompanyName, ContactName, ContactTitle, '
               . 'Address, City, Region, PostalCode, Country, Phone, Fax';
my $keylist    = 'CustomerID';
my @keyvals    = ('ALFKZ');
my @values     = ('ALFKZ', 'Alfreds Futterkiste', 'Maria Anders',
                  'Sales Representative', 'Obere Str. 58', 'Berlin',
                  undef, '12209', 'Germany', '030-0074321', '030-0076545');

# One handle for the duplicate check and one for the insert
# (they could just as well be a single connection).
my $dsn  = 'dbi:ODBC:northwind';
my $dbh  = DBI->connect($dsn, 'sa', '') || die $DBI::errstr;
my $dbh1 = DBI->connect($dsn, 'sa', '') || die $DBI::errstr;

###################################
#Call the sub to check for dups
###################################

my $conflict = get_dups($keylist, \@keyvals, $tablename);

$conflict ? conflict_exists()
          : do_insert($columnlist, \@values, $tablename);

sub get_dups {
    # Build a WHERE clause from the key column list and key values,
    # treating undef as IS NULL (a unique constraint may hold one null).
    my ($keycols, $keyvals, $table) = @_;
    my @columns = split /\s*,\s*/, $keycols;
    my $where   = 'WHERE ';
    my @exvals;
    foreach my $i (0 .. $#{$keyvals}) {
        if (defined $keyvals->[$i]) {
            $where .= "$columns[$i] = ? AND ";
            push @exvals, $keyvals->[$i];
        } else {
            $where .= "$columns[$i] IS NULL AND ";
        }
    }
    $where =~ s/AND\s*$//;              # get rid of the trailing AND

    # COUNT(*) rather than EXISTS so that MySQL can run it too.
    my $select  = qq{SELECT COUNT(*) FROM $table $where};
    my $selecth = $dbh1->prepare($select)
        || die "Can't prepare\n$select\n$DBI::errstr\n";
    $selecth->execute(@exvals)
        || die "Can't execute\n$select\n$DBI::errstr\n";
    return ($selecth->fetchrow_array);
}   #end sub get_dups

sub do_insert {
    my ($columns, $vals, $table) = @_;

    ##################################
    # Create the insert statement:
    # one placeholder per value.
    ##################################
    my $placeholders = '?' . ',?' x $#{$vals};
    my $insert = qq{INSERT INTO $table ($columns) VALUES ($placeholders)};

    #################################
    # prepare and execute
    #################################
    my $inserth = $dbh->prepare($insert)
        || die "Can't prepare\n$insert\n$DBI::errstr\n";
    $inserth->execute(@{$vals})
        || die "Can't execute\n$insert\n$DBI::errstr\n";
}       #end sub do_insert


sub conflict_exists {
    # do something to prevent an error
}

</snip>



-----Original Message-----
From: Hardy Merrill [mailto:[EMAIL PROTECTED]]
Sent: Friday, September 28, 2001 4:06 PM
To: [EMAIL PROTECTED]
Subject: db independent way of detecting duplicates?


I'm wondering what other people have developed as database
independent way(s) of determining if the insert or update being
done results in a duplicate key situation.

I can think of 2 methods to handle this:
  1. before doing the insert or update, for *EACH AND EVERY*
     unique key defined for a table, do a select with the where
     clause looking for equality on all the fields in the unique
     key - if any of those selects find rows, then your insert
     or update would cause a duplicate.

  2. allow the database to detect the "duplicate key" error -
     when you have RaiseError turned on and you do the insert
     or update in an eval, then the "duplicate" error message
     is manifested in the $@ variable.

       In this case, after the eval you would test the value
       of $@ like this:

         eval {
           $sth = $dbh->prepare($insert_sql);
           $rows_affected = $sth->execute($a, $b, $c);
         };
         if ($@) {
            ### eval had a problem - how do I(in a database
            ### independent way) determine if the problem was
            ### a "duplicate key" situation?
         }


I prefer number 2, but it has a big problem - what do you do
about the fact that different databases return different error
numbers and different error messages for the "duplicate" key
situation?  Do you just build a regular expression that attempts
to catch all the different duplicate key error messages that
could come back from different db's?
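
Something like this is the best I can come up with - check the SQLSTATE if
the driver sets one (class 23 is the standard's "integrity constraint
violation") and fall back to matching message text - but I have no idea how
reliable that is across drivers:

  eval {
    $sth = $dbh->prepare($insert_sql);
    $rows_affected = $sth->execute($a, $b, $c);
  };
  if ($@) {
     ### sketch only - $dbh->state returns the SQLSTATE, and class '23'
     ### is "integrity constraint violation" in the SQL standard
     my $state = $dbh->state;
     if ((defined $state && $state =~ /^23/)
         || $DBI::errstr =~ /duplicate|unique/i) {
        ### treat as a duplicate key situation
     } else {
        die $@;    ### some other problem
     }
  }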

TIA.

--
Hardy Merrill
Mission Critical Linux, Inc.
http://www.missioncriticallinux.com
