Hi Python-Ideas,
I was browsing the source code to the sqlite3 standard library module
(as one does over the Christmas holidays) and was surprised to find that
the eviction mechanism for cached prepared statements is fairly broken
... and has been for the last 15 years! This is just from reading the
source code, so hopefully I've not made a fool of myself by misreading
it. I've reported the problem to the upstream pysqlite repository:
https://github.com/ghaering/pysqlite/issues/141
This got me thinking about potential fixes and improvements, which I
thought might be worth discussing here, especially as maybe not all
changes from that repo are still merged into mainline Python.
1. My first suggestion is to replace the odd cache logic with a
straightforward classic LRU cache. I'm hoping this is uncontroversial,
but I can imagine there might be programs that somehow depend on the
current behaviour. But my suspicion is that many more suffer and just no
one looked hard enough to notice.
2. My next suggestion is to allow disabling the cache (removing it
entirely isn't an option for backwards compatibility) and instead cache
one statement per cursor so long as consecutive calls to `execute()` or
`executemany()` use identical SQL. This neatly avoids changing the
interface while still allowing devs complete control over how they cache
statements: you can put them in your own LRU cache structure, or cache
specific ones by reusing just those cursors, or do anything else that
you like.
This API would also cleanly mirror how prepared statements work in the
underlying SQLite C library. My first thought was to have a separate
class representing a prepared statement, and allow that to be used as a
factory for creating cursors; but SQLite prepared statements can only be
executed and iterated once at a time before being reused, so using the
Cursor object itself for caching the statement makes more sense.
3. My final suggestion is a small backwards-compatible change to the
interface to make the previous idea particularly clear and easy to use:
allow the SQL parameter to `execute()` and `executemany()` to be omitted
(or None) to mean the same query as last used on that cursor, which
guarantees it will use the cached prepared statement. Even better, also
allow SQL to be passed in when creating the cursor. This leads to
particularly straightforward statement reuse code:
conn = sqlite3.connect(path_to_db, isolation_level=None)
cur = conn.cursor(sql="SELECT * FROM Foo WHERE i > ?")
print(cur.execute(parameters=(3,)).fetchall())
for row in cur.execute(parameters=(2,)):
print(row)
What do you think?
Thanks,
James
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/XG4XYK2JERNHV4QF5SVVYAK33VXSDYIG/
Code of Conduct: http://python.org/psf/codeofconduct/