These are various issues I had and some suggestions based on
writing a Python wrapper for SQLite 3.  Overall I was very
impressed by the quality of the doc, and of course SQLite 3
itself.  Overall the wrapper ended up being just under 3,000
lines of code and provides all the functionality of the C API
including the experimental functions.

Python supports both a String type (basically char[]) and a Unicode
type. I could use the plain or -16 suffix versions of the apis
as appropriate. Except that Python can store unicode as UTF16
or as UTF32. Several internal Python functions also put in byte order markers (BOMs). In some cases I have to convert from
the String type to Unicode (the encoding of the String type is
rarely UTF8) and then for UTF32 strings convert back to UTF8.
The SQLite source only copes with UTF16 BOMs, but BOMs exist
for UTF8, UTF16, UTF32 and even UTF64. To help get around
all this confusion, I suggest another series of APIs for
SQLite when dealing with strings. They take a void pointer
and the length in bytes. There is also a parameter saying
what the encoding is. This encoding is ignored if a BOM
is found.


I had to use SQLITE_TRANSIENT for all supplied byte data.
This is because the pointer I supplied is not the pointer
on which the free function needs to be called.  The free
function needs to be called on the containing object (in
my case a PyObject*) of which the pointer is a field.  I
don't have suggestions for how to improve this since they
would just make the code more complicated either for the
wrapper writer or the SQLite author.

Unfortunately I ended up with four places (two times two)
of almost duplicate code.


- When implementing user defined functions, I had to
convert a sqlite3_value* to a Python type using the
sqlite3_value_TYPE functions. For returning results from a row, an sqlite3_value* is not available so I
had to use the sqlite3_column_TYPE functions. If possible it would be nice to get a sqlite3_value* for
a column (assuming it is stored that way) and then
I can use one piece of code for converting from SQLite types to Python types


- A similar situation occurs when going from Python types
to SQLite types which happens when setting the results
of a user defined function and setting bindings, using sqlite3_result_TYPE and sqlite3_bind_TYPE respectively.
Being able to allocate a typed sqlite3_value* and then giving that to sqlite3_result/sqlite3_bind would reduce
things to one piece of code.
It is very hard for me to do garbage collection when a
registered function or collation overwrites one of the
same name. For example if I register a function "foo"
and then register "FOO" I would assume the earlier one
is no longer used in SQLite. I currently hold on to
my per callback blocks of data until the whole database
is closed, since I never know when an earlier function has
been release. The simplest solution would be if I could ask
SQLite if a particular void*userdata still belongs to
any registered function. That way I would know when SQLite
is done with it.


It would be nice to have an error code allocated for
user/wrapper. For example when there is an error in a Python callback, I have to set an error code in SQLite.
Currently I just use SQLITE_ERROR but later on it isn't
clear who generated that code. If there was something
like SQLITE_WRAPPER or SQLITE_USER then it would be
unambiguous.


The only API I didn't use was sqlite3_collation_needed.
It is very difficult to use in a wrapper library since
I would have to have a way of tracking from the sqlite3*
back to the encapsulating PyObject.  I also remain unconvinced
of its utility.

There are several overlapping ranges of constants. The textual
names don't have unique prefixes. For example 2 could be
SQLITE_INTERNAL, SQLITE_FLOAT, SQLITE_CREATE_TABLE, SQLITE_IGNORE or
one of SQLITE_UTF16BE, SQLITE_UTF16 (the latter two have conflicting definitions in the doc for create_collation and create_function).
This makes it somewhat messy converting a number back to a name
since you have to be very aware of the context and which textual
names are part of which series of constants.


  For example if I wanted to convert the error code 2 back into
  the error code name, I can't just find the first name begining
  with SQLITE_ that has a value of two.  If error code names had
  a unique prefix such as SQLITE_E_ then it would be easy.  This
  issue arises because all the constants share the same name space.
  In my wrapper you just say apsw.SQLITE_INTERNAL rather than
  apsw.errors.SQLITE_INTERNAL.

Roger

Reply via email to