[Python-ideas] Fixing and improving statement caching in sqlite3

James Oldfield Wed, 30 Dec 2020 15:42:39 -0800


Hi Python-Ideas,

I was browsing the source code to the sqlite3 standard library module(as one does over the Christmas holidays) and was surprised to find thatthe eviction mechanism for cached prepared statements is fairly broken... and has been for the last 15 years! This is just from reading thesource code, so hopefully I've not made a fool of myself by misreadingit. I've reported the problem to the upstream pysqlite repository:


https://github.com/ghaering/pysqlite/issues/141

This got me thinking about potential fixes and improvements, which Ithought might be worth discussing here, especially as maybe not allchanges from that repo are still merged into mainline Python.

1. My first suggestion is to replace the odd cache logic with astraightforward classic LRU cache. I'm hoping this is uncontroversial,but I can imagine there might be programs that somehow depend on thecurrent behaviour. But my suspicion is that many more suffer and just noone looked hard enough to notice.

2. My next suggestion is to allow disabling the cache (removing itentirely isn't an option for backwards compatibility) and instead cacheone statement per cursor so long as consecutive calls to `execute()` or`executemany()` use identical SQL. This neatly avoids changing theinterface while still allowing devs complete control over how they cachestatements: you can put them in your own LRU cache structure, or cachespecific ones by reusing just those cursors, or do anything else thatyou like.

This API would also cleanly mirror how prepared statements work in theunderlying SQLite C library. My first thought was to have a separateclass representing a prepared statement, and allow that to be used as afactory for creating cursors; but SQLite prepared statements can only beexecuted and iterated once at a time before being reused, so using theCursor object itself for caching the statement makes more sense.

3. My final suggestion is a small backwards-compatible change to theinterface to make the previous idea particularly clear and easy to use:allow the SQL parameter to `execute()` and `executemany()` to be omitted(or None) to mean the same query as last used on that cursor, whichguarantees it will use the cached prepared statement. Even better, alsoallow SQL to be passed in when creating the cursor. This leads toparticularly straightforward statement reuse code:


    conn = sqlite3.connect(path_to_db, isolation_level=None)
    cur = conn.cursor(sql="SELECT * FROM Foo WHERE i > ?")
    print(cur.execute(parameters=(3,)).fetchall())
    for row in cur.execute(parameters=(2,)):
        print(row)

What do you think?

Thanks,
James
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/XG4XYK2JERNHV4QF5SVVYAK33VXSDYIG/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Fixing and improving statement caching in sqlite3

Reply via email to