I'm trying to import databases from CSV into an in-memory database.
These CSV files contain quoted delimiters (a,"b,c",d is three columns),
plus they contain a header row.  These two issues are not handled by the
SQLite Tcl extension's copy command.  The SQLite shell's .import command
used to have the same limitations but was improved, yet the SQLite Tcl
extension copy command has not kept pace.

Is there any interest in updating the SQLite Tcl extension copy command
to match the capability of the SQLite shell's .import command?

It makes sense to me, but the details are challenging.  Adding CSV
support to the shell's .import command meant adding a nontrivial amount
of code, and I don't like code duplication, so I'd prefer that most of
the underlying implementation be shared.  Yet it's inappropriate for CSV
parsing to be part of the SQLite core, and I don't think there's a
common shell library at this point, so either one would have to be made,
or CSV would have to be spun off to be its own object file, linked into
both the shell and the Tcl extension.

Yet as complicated as all that sounds, there's more.  The shell's
.import command needs to directly call stdio functions like fopen(), but
a modern implementation of the Tcl extension copy command should use the
Tcl I/O subsystem to leverage Tcl channels.  The result would be
virtualization not only of the input file format ("ASCII" vs. CSV field
readers) but also the underlying I/O.  Thankfully, the only I/O function
called by the field readers is fgetc(), so it shouldn't be too much work
to replace that with a call to a function pointer inside the ImportCtx
structure, through which any reasonable VFS can be invoked.

There are more details to be discussed, e.g. backward compatibility and
what to do about zFile, but for now I'm trying to raise awareness, gauge
interest, and document my first impressions.

See also my post from 2005 which got no replies.  I've asked about this
basic issue before.

Back to my immediate situation.  I could go off and try implementing the
above, but I could also get what I need by not having the database be
in-memory.  If it's on disk, I can alternate between [exec]'ing the
SQLite shell to import and loading it using the Tcl extension.  Far from
clean, but it would work right now.

And last, a question.  Are there any other functionalities common to the
SQLite shell and Tcl extension which could become common code?

Andy Goth | <andrew.m.goth/at/gmail/dot/com>

Attachment: signature.asc
Description: OpenPGP digital signature

sqlite-users mailing list

Reply via email to