I have a big directory with a CSV dump of a database.

The folder has a lot of table dumps in CSV format (among many other
database-related files), and all those files have a lovely
recognizable extension: .csv

So I found it stupid of DBD::CSV to have a table.csv file and then
force me to select from it using "table.csv" instead of "table".

Having written some Perl code in the past, I thought: how hard could it
be to look for "open" and "opendir" and make them use "table.csv" if it
exists and, if not, fall back to the default "table"?

Wrong. DBD::CSV is just a subclass of DBD::File, which does not support
this. Dumb dumber dumbest.

I'm sure Jeff wouldn't mind me taking over DBD::CSV if this were easy
to fix there, but it is not: the changes needed in DBD::CSV are trivial
compared to the changes needed in DBD::File.

Attached patch implements the following:

  f_dir => "/data/csv/foo",
  f_ext => ".csv",
  f_ext => ".csv/i",  # Ignore case of extension; use it as given on write
  f_ext => ".csv/r",  # Extension is required; other files in f_dir are ignored
  f_ext => ".csv/ri", # Options can be combined
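
For illustration, here is a minimal sketch of how these attributes might be
passed to connect. It assumes DBD::CSV is installed on top of a DBD::File
that supports f_ext (i.e. with this patch applied). The temporary directory
and the table dump "foo.csv" are made up for the example.

```perl
use strict;
use warnings;
use DBI;
use File::Temp qw( tempdir );

# Assumption: a throw-away directory stands in for /data/csv/foo, and
# "foo.csv" is a made-up table dump.
my $dir = tempdir (CLEANUP => 1);
open my $fh, ">", "$dir/foo.csv" or die "foo.csv: $!";
print $fh "id,name\n1,alpha\n";
close $fh;

my $dbh = DBI->connect ("dbi:CSV:", undef, undef, {
    f_dir      => $dir,
    f_ext      => ".csv/r",	# only files ending in .csv are tables
    RaiseError => 1,
    PrintError => 0,
    });

# The table is addressed as "foo", not as "foo.csv"
my $rows = $dbh->selectall_arrayref ("select id, name from foo");
print join (",", @$_), "\n" for @$rows;
$dbh->disconnect;
```

With "/r" any other file in the directory is simply not visible as a table.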

A suggested implementation is attached.
Still TODO: tests and file creation (which I haven't written yet).

       f_ext
           This attribute sets the file extension used when (CSV) files
           are opened. There are several possibilities.

               DBI:CSV:f_dir=data;f_ext=.csv

           In this case, DBD::File will open only "table.csv" if both
           "table.csv" and "table" exist in the datadir. The table will still
           be named "table". If your datadir has files with extensions, and
           you do not pass this attribute, your table is named "table.csv",
           which is probably not what you wanted.

               DBI:CSV:f_dir=data;f_ext=.csv/i

           Same as above, but file name matching is done case-insensitively.

               DBI:CSV:f_dir=data;f_ext=.csv/r

           In this case the extension is required, and all filenames that do
           not match are ignored.

               DBI:CSV:f_dir=data;f_ext=.csv/ri

           Same as above, but file name matching is done case-insensitively.
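
To make the four variants concrete, here is a small stand-alone sketch of the
file name to table name mapping described above. It is not the code from the
patch: tables_for is a hypothetical helper that works on a plain list of file
names instead of reading f_dir.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the patch: given an f_ext value and a
# list of file names, return a hash ref of table name => file name.
sub tables_for {
    my ($f_ext, @files) = @_;
    my ($ext, $opt) = split m{/}, $f_ext, 2;
    defined $ext or $ext = "";
    defined $opt or $opt = "";
    my $ic  = length ($ext) && $opt =~ /i/i ? 1 : 0;	# ignore case
    my $req = length ($ext) && $opt =~ /r/i ? 1 : 0;	# extension required
    my $re  = $ic ? qr/\Q$ext\E$/i : qr/\Q$ext\E$/;

    my %map;
    # First pass: files that carry the extension; strip it for the table name
    foreach my $file (sort @files) {
        length ($ext) && $file =~ $re or next;
        (my $tbl = $file) =~ s/$re//;
        $map{$tbl} = $file;
    }
    # Second pass: other files, unless the extension is required or a
    # file with the extension already claimed the same table name
    unless ($req) {
        foreach my $file (sort @files) {
            length ($ext) && $file =~ $re and next;
            exists $map{$file} and next;
            $map{$file} = $file;
        }
    }
    return \%map;
}

my @files = ("table", "table.csv", "other", "Caps.CSV");
foreach my $f_ext (".csv", ".csv/i", ".csv/r", ".csv/ri") {
    my $map = tables_for ($f_ext, @files);
    print "$f_ext: ", join (", ", map { "$_ => $map->{$_}" } sort keys %$map), "\n";
}
```

Note that the extension is quoted with \Q...\E, so the dot in ".csv" only
matches a literal dot.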


I have tested this with DBD::CSV and
$ENV{DBI_DSN} = "DBI:CSV:f_dir=dta;f_ext=.csv"



-- 
H.Merijn Brand          Amsterdam Perl Mongers  http://amsterdam.pm.org/
using & porting perl 5.6.2, 5.8.x, 5.10.x, 5.11.x on HP-UX 10.20, 11.00,
11.11, 11.23, and 11.31, SuSE 10.1, 10.2, and 10.3, AIX 5.2, and Cygwin.
http://mirrors.develooper.com/hpux/           http://www.test-smoke.org/
http://qa.perl.org      http://www.goldmark.org/jeff/stupid-disclaimers/
--- lib/DBD/File.pm.org	2008-09-16 17:26:59.000000000 +0200
+++ lib/DBD/File.pm	2008-09-16 18:43:34.000000000 +0200
@@ -29,7 +29,7 @@ package DBD::File;
 
 use vars qw(@ISA $VERSION $drh $valid_attrs);
 
-$VERSION = '0.35';
+$VERSION = '0.36';
 
 $drh = undef;		# holds driver handle(s) once initialised
 
@@ -83,6 +83,8 @@ sub connect ($$;$$$) {
     if ($this) {
 	my($var, $val);
 	$this->{f_dir} = $haveFileSpec ? File::Spec->curdir() : '.';
+	$this->{f_ext} = "";
+	$this->{f_map} = {};
 	while (length($dbname)) {
 	    if ($dbname =~ s/^((?:[^\\;]|\\.)*?);//s) {
 		$var = $1;
@@ -97,9 +99,10 @@ sub connect ($$;$$$) {
 	    }
 	}
         $this->{f_valid_attrs} = {
-            f_version    => 1  # DBD::File version
-          , f_dir        => 1  # base directory
-          , f_tables     => 1  # base directory
+            f_version	=> 1  # DBD::File version
+          , f_dir	=> 1  # base directory
+          , f_ext	=> 1  # file extension
+          , f_tables	=> 1  # base directory
         };
         $this->{sql_valid_attrs} = {
             sql_handler           => 1  # Nano or S:S
@@ -341,11 +344,45 @@ sub type_info_all ($) {
 	    return undef;
 	}
 	my($file, @tables, %names);
+	my($ext, $req, $ic) = ("", 0, 0);
+	if ($dbh->{f_ext}) {
+	    ($ext, my $opt) = split m{/}, $dbh->{f_ext}, 2;
+	    if ($ext && defined $opt) {
+		$opt =~ /i/i and $ic  = 1;
+		$opt =~ /r/i and $req = 1;
+	    }
+	}
+	my $user = eval { getpwuid((stat($dir))[4]) };
 	while (defined($file = readdir($dirh))) {
-	    if ($file ne '.'  &&  $file ne '..'  &&  -f "$dir/$file") {
-		my $user = eval { getpwuid((stat(_))[4]) };
-		push(@tables, [undef, $user, $file, "TABLE", undef]);
+	    $file eq '.' || $file eq '..'	and next;
+	    -f "$dir/$file"			or  next;
+	    my $f = $file;
+	    if ($ext) {
+		if ($req) {
+		    # File extension required
+		    if ($ic) {
+			$f =~ s/\Q$ext\E$//i	or  next;
+		    }
+		    else {
+			$f =~ s/\Q$ext\E$//	or  next;
+		    }
+		}
+		else {
+		    # File extension optional, skip if file with extension exists
+		    if ($ic) {
+			$file !~ /\Q$ext\E$/i && grep (/\Q$ext\E$/i, glob "$dir/$file*") and next;
+			$f =~ s/\Q$ext\E$//i;
+		    }
+		    else {
+			-f "$dir/$file$ext"	and next;
+			$f =~ s/\Q$ext\E$//;
+		    }
+		}
 	    }
+
+	    $dbh->{f_map}{$f} = $file;
+	    $dbh->trace_msg("Found $file => $f (ext:$ext/req:$req/ic:$ic)\n", 2);
+	    push(@tables, [undef, $user, $f, "TABLE", undef]);
 	}
 	if (!closedir($dirh)) {
 	    $dbh->set_err($DBI::stderr, "Cannot close directory $dir: $!");
@@ -550,9 +587,10 @@ sub get_file_name($$$) {
      and $file !~ m!^[/\\]!   # root
      and $file !~ m!^[a-z]\:! # drive letter
     ) {
+	my $realfile = $data->{Database}{f_map}{$table} || $table;
 	$file = $haveFileSpec ?
-	    File::Spec->catfile($data->{Database}->{'f_dir'}, $table)
-		: $data->{Database}->{'f_dir'} . "/$table";
+	    File::Spec->catfile($data->{Database}{f_dir}, $realfile)
+		: $data->{Database}{f_dir} . "/$realfile";
     }
     return($table,$file);
 }
@@ -735,8 +773,32 @@ This attribute is used for setting the d
 opened. Usually you set it in the dbh, it defaults to the current
 directory ("."). However, it is overwritable in the statement handles.
 
-=back
+=item f_ext
+
+This attribute sets the file extension used when (CSV) files are
+opened. There are several possibilities.
+
+    DBI:CSV:f_dir=data;f_ext=.csv
+
+In this case, DBD::File will open only C<table.csv> if both C<table.csv> and
+C<table> exist in the datadir. The table will still be named C<table>. If
+your datadir has files with extensions, and you do not pass this attribute,
+your table is named C<table.csv>, which is probably not what you wanted.
 
+    DBI:CSV:f_dir=data;f_ext=.csv/i
+
+Same as above, but file name matching is done case-insensitively.
+
+    DBI:CSV:f_dir=data;f_ext=.csv/r
+
+In this case the extension is required, and all filenames that do not match
+are ignored.
+
+    DBI:CSV:f_dir=data;f_ext=.csv/ri
+
+Same as above, but file name matching is done case-insensitively.
+
+=back
 
 =head2 Driver private methods
 
