M.-A. Lemburg <m...@egenix.com> wrote:

> On 18.12.2014 23:19, Michael Bayer wrote:
>>> On Dec 18, 2014, at 3:57 PM, Michael Bayer <mike...@zzzcomputing.com> wrote:
>>>
>>>> On Dec 18, 2014, at 3:39 PM, M.-A. Lemburg <m...@egenix.com> wrote:
>>>>
>>>>> That would make the entire feature a non-starter for me. SQLAlchemy
>>>>> doesn't hold cursors open beyond a single statement. My users would
>>>>> very much want a prepared-statement-per-transaction object.
>>>>
>>>> Perhaps you ought to reconsider this approach. Creating and closing
>>>> cursors all the time does involve some overhead.
>>>
>>> I will attempt to try this, though I anticipate that DBAPIs are going
>>> to be problematic with this approach. One concrete example is the case
>>> of psycopg2, where we offer the option to use a "named" cursor, which
>>> on psycopg2 has the effect of maintaining the state of that cursor on
>>> the server side. However, psycopg2 throws an error if such a cursor is
>>> used for anything other than a SELECT statement. So right there, we
>>> need more than one cursor depending on the contents of the SQL. This
>>> is a very specific situation, though; I'll see if the approach in
>>> general produces issues.
>>
>> So I tried this, and pleasantly, there's not *too* much side effect: a
>> quick test against a few databases didn't lead to many issues. The
>> issues that did appear were in the tests relating to connection
>> invalidation within a 2pc context; I didn't dig into what they are,
>> but it's possible that the MySQL and psycopg2 DBAPIs have some more
>> quirks with cursors when 2pc is used (or my changes were just missing
>> some edge cases).
>>
>> However, if I were to change this for real, that small bump in
>> stability gets sent out to everyone, running on databases I don't even
>> have regular access to, such as Sybase and DB2, with whatever quirks
>> of reusing cursors I've not been able to test. Many years of effort
>> and user feedback have gone into making our Connection class stable
>> and predictable in an extremely wide variety of situations (and we're
>> talking here about failure modes: disconnects, deadlocks, timeouts,
>> intercepting these conditions perfectly and getting the system back
>> into a stable state as efficiently and crash-free as possible). Here
>> we're asked to weigh that against the overhead of opening and closing
>> many cursors, rather than keeping one around for... I would presume
>> the transaction scope.
>>
>> This is exactly what I was getting at in my other email. We are
>> considering a significant change in a key area of stability in the
>> name of "reducing overhead", so is it really worth it?
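[Editorial note: for context, the two cursor-management strategies being
compared are sketched below. This is an illustrative sketch only, using
the stdlib sqlite3 module as a stand-in DBAPI; the thread itself
concerns psycopg2 and the MySQL drivers, but the pattern is the same
for any DBAPI-compliant driver.]

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5)])

# Strategy 1: cursor-per-statement (SQLAlchemy's current behavior):
# open and close a fresh cursor around every execute().
for _ in range(3):
    cursor = conn.cursor()
    cursor.execute("SELECT id FROM t")
    cursor.fetchall()
    cursor.close()

# Strategy 2: reuse one cursor for the whole transaction scope,
# closing it only when the work is done.
cursor = conn.cursor()
for _ in range(3):
    cursor.execute("SELECT id FROM t")
    cursor.fetchall()
cursor.close()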
>> For the drivers that the vast majority of my users care about, at
>> least, the effect would appear to be negligible, hitting barely a 1%
>> difference even with the pure-Python drivers, which have much bigger
>> performance problems just by being in pure Python:
>>
>> psycopg2 single cursor: 6.159881 (10000 executions)
>> psycopg2 multi cursor: 6.173749 (10000 executions)
>>
>> pg8000 single cursor: 28.213494 (1000 executions)
>> pg8000 multi cursor: 28.620359 (1000 executions)
>>
>> mysqldb single cursor (10000 executions): 11.702930
>> mysqldb multi cursor (10000 executions): 11.809935
>>
>> mysql connector single cursor (1000 executions): 25.707400
>> mysql connector multi cursor (1000 executions): 26.096313
>
> The results are somewhat biased, since your test spends most of its
> time fetching data from the database, not creating and deallocating
> cursors.
>
> You should try this with a table that has only e.g. 10-100 entries,
> or a query which yields only a few result rows, for comparison.
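[Editorial note: the benchmark script itself isn't shown in the thread;
the following is a hypothetical reconstruction of this kind of
single-vs-multi-cursor timing loop, using sqlite3 and a five-row table
so that it runs anywhere. Swapping in psycopg2 or MySQLdb would
approximate the numbers below.]

import sqlite3
import timeit

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5)])

def multi_cursor():
    # Open and close a fresh cursor per statement.
    c = conn.cursor()
    c.execute("SELECT id FROM t")
    c.fetchall()
    c.close()

reused = conn.cursor()

def single_cursor():
    # Reuse one long-lived cursor for every statement.
    reused.execute("SELECT id FROM t")
    reused.fetchall()

n = 100000
print("single cursor: %f (%d executions)" %
      (timeit.timeit(single_cursor, number=n), n))
print("multi cursor: %f (%d executions)" %
      (timeit.timeit(multi_cursor, number=n), n))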
Good point. Here are the results with five rows fetched and the number
of runs multiplied by 10:

psycopg2 single cursor: 9.307149 (100000 executions)
psycopg2 multi cursor: 9.339134 (100000 executions)

pg8000 single cursor: 3.406491 (10000 executions)
pg8000 multi cursor: 3.391371 (10000 executions)

mysqldb single cursor (100000 executions): 8.512559
mysqldb multi cursor (100000 executions): 9.355145

mysql connector single cursor (10000 executions): 3.218654
mysql connector multi cursor (10000 executions): 4.089793

Differences are still negligible for PostgreSQL, but for MySQL they are
now tangible, though less so for the C API driver, which is already an
order of magnitude faster than the pure-Python one. That is, the speed
hit in MySQL Connector is nothing compared to the slowness introduced
by it being in pure Python.

>> psycopg2.ProgrammingError: can't call .execute() on named cursors
>> more than once
>>
>> Wow! So that's definitely that :).
>
> I think there's a misunderstanding here. A named cursor refers to a
> database cursor pointing into a result set. Once that result set has
> been created and the cursor points to it, you cannot run another
> .execute() against it, unless you first close the named cursor,
> simply because it is already in use.

This is in fact why SQLAlchemy works the way it does in the first
place; cursors have always been, to me, something you allocate for a
single statement's result set. In the old ASP days, Microsoft's tools
always handed us a scrollable, updatable cursor, in fact. PG's
named-cursor behavior seems natural to me, and it is surprising that
you're suggesting cursor-per-statement is a poor practice.
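[Editorial note: to make the named-cursor constraint concrete, here is
a short sketch of the psycopg2 behavior described above. It requires
psycopg2 and a reachable PostgreSQL database; the DSN is a placeholder.]

import psycopg2

conn = psycopg2.connect("dbname=test")  # placeholder DSN

# A named cursor is a server-side cursor: PostgreSQL keeps the
# result-set state on the server, and rows are fetched from it in
# batches rather than all at once.
cur = conn.cursor(name="my_server_side_cursor")
cur.execute("SELECT generate_series(1, 1000)")
print(cur.fetchmany(10))

# A second execute() on the same named cursor fails, because the
# cursor is already bound to a result set:
try:
    cur.execute("SELECT 1")
except psycopg2.ProgrammingError as err:
    print(err)  # can't call .execute() on named cursors more than once

# Closing releases the server-side cursor; a new named cursor can then
# be declared for the next statement.
cur.close()
conn.close()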