Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-31 Thread Ken MacKenzie
On a side note the literal_column change worked fine performance wise allowing me to remove the literal_bind/compile step. I guess I am 50/50 on which is cleaner. Is either one more advantageous from a SQL injection perspective? SQL Injection is not a primary concern at the moment for this

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-31 Thread Ken MacKenzie
Tried doing various conversions on the pk values as they enter the statement: 1. to bytes 2. to ascii 3. to latin1 (technically the same encoding as the extract source before entering the db) None of which yielded a performance improvement for the non-compiled version. I have read that this

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-31 Thread Ken MacKenzie
So inspecting the elements of the tuple, they are both str, so hence unicode. Are you saying that if I convert those values to bytes it could improve performance? > I'd not bother with the literal_binds and just use a literal value: > > pkf = [(col == literal_column("'%s'" % v)) for (col,

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-31 Thread Mike Bayer
On Thu, Aug 31, 2017 at 2:19 PM, Ken MacKenzie wrote: > So here is my solution as of now. As far as I can tell hits the marks of: > > 1. DB Neutral > 2. Performance (9k records deleted in 5 seconds) > 3. Generic > > (excuse the print statements still there for testing) > >

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-31 Thread Ken MacKenzie
So here is my solution as of now. As far as I can tell hits the marks of: 1. DB Neutral 2. Performance (9k records deleted in 5 seconds) 3. Generic (excuse the print statements still there for testing) Would welcome comments on this proposed solution. I have learned a lot from the advice in

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-31 Thread Ken MacKenzie
Ending output from batch size of 10: 501 - 0:24:35.558103 491 - 0:24:37.197976 481 - 0:24:38.909318 471 - 0:24:40.645801 461 - 0:24:41.954807 451 - 0:24:43.628955 441 - 0:24:45.246469 431 - 0:24:46.662228 421 - 0:24:48.192343 411 - 0:24:49.839395 401 - 0:24:51.513340 391 - 0:24:53.153668 381 -

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-31 Thread Ken M
I tried a batch size as small as ten and each step took about 2 seconds. I set it up to show me the time between each batch for that run. Lowering the batch size actually seems to worsen performance. Which is not what I expected. Sent from my iPad > On Aug 31, 2017, at 5:42 AM, Simon King

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-31 Thread Ken M
Yes the double quote was a typo, sorry about that. Your point about non varchar pk, that is why I will double check other dbs. SQL server lets it fly but I think you are right about Postgres. Sent from my iPad > On Aug 31, 2017, at 5:36 AM, Simon King wrote: > > The

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-31 Thread Simon King
I'm pretty sure the bad performance is due to pyodbc (or maybe SQL Server) not handling *huge* numbers of bind parameters gracefully. You are generating a query with (batch_size * n_primary_key_columns) bind parameters, which even in your original version meant 2000 bind parameters. Try with a

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-31 Thread Simon King
The generic version would look something like this: table = cls.__table__ pkconditions = [] for pk in cls.SQL_PK: pkconditions.append(table.c[pk] == sa.bindparam(pk) condition = sa.and_(*pkconditions) statement = sa.delete(table, whereclause=condition) batchparams = [dict(zip(cls.SQL_PK, v))

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken M
When I get in tomorrow I will take a look at some of these options. I also intend to spin up a quick MySQL or postgresql db to test if my textual solution can work against an alternate target. Sent from my iPad > On Aug 30, 2017, at 6:20 PM, Jonathan Vanasco wrote: > >

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Jonathan Vanasco
Looking at your code and considering the size of your database, I've run into performance issues under PostgreSQL a few times that may be related... they're all usually because very slight text changes to a query can drastically alter the server's query planner. The cause/fixes were: 1.

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Mike Bayer
When you call create_engine: e = create_engine(url, echo=True) include the "echo=True" part. Then look at the SQL being emitted. That's the important part here. On Wed, Aug 30, 2017 at 5:09 PM, Ken MacKenzie wrote: > So the big difference between the methods when

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken MacKenzie
So the big difference between the methods when seen with print(q) Textual SQL is building a statement with the values inline such as where college=888 and check=1234567 The methods using the column object are building with parameters where college = (?) and check = (?) That is all I can see

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken MacKenzie
With a batch size of 1 based on the original selection, no improvement. I am going to see what sql this is creating. 2328928 function calls (2085124 primitive calls) in 1084.925 seconds Ordered by: internal time ncalls tottime percall cumtime percall

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken MacKenzie
I am noticing the 16 execute calls. There should only be 10 with a batch size of 1000 and a record size of 9,000 and some change. Are you just wanting to see the sql via a print(q)? On Wednesday, August 30, 2017 at 2:01:31 PM UTC-4, Mike Bayer wrote: > > it looks like you have 16 SQL

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken MacKenzie
I tried an alternate version: @classmethod def dict_delete2(cls, dict_in): """ Delete records from the database based on a dictionary keyed by PK tuple """ s = dbc.get_session() keys_in = ["-".join(k) for k in dict_in.keys()]

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Mike Bayer
it looks like you have 16 SQL statements that take a very different amount of time depending on format. Can you please paste the output with echo=True on create_engine() so that the difference in SQL statements can be seen? On Wed, Aug 30, 2017 at 12:30 PM, Ken MacKenzie

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken MacKenzie
Not sure how I would iterate through a non predetermined number of primary keys. I guess part of me is wondering that although textual sql is not inherently db neutral how different between the db targets is the where field = 'value' syntax? On Wednesday, August 30, 2017 at 12:07:52 PM UTC-4,

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Simon King
You could also try using executemany: http://docs.sqlalchemy.org/en/latest/core/tutorial.html#executing-multiple-statements I think it would look something like this: table = cls.__table__ condition = sa.and_( table.c.pk1 == sa.bindparam('pk1'), table.c.pk2 == sa.bindparam('pk2'), )

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken MacKenzie
Sorted Textual SQL profile ncalls tottime percall cumtime percall filename:lineno(function) 161.8400.1151.8400.115 {method 'execute' of 'pyodbc.Cursor' objects} 95720.1960.0000.7880.000 elements.py:1860(_construct) 577000.1890.000

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken MacKenzie
Sorted: ncalls tottime percall cumtime percall filename:lineno(function) 16 1053.794 65.862 1053.794 65.862 {method 'execute' of 'pyodbc.Cursor' objects} 348916/3483110.5390.0000.5560.000 {built-in method isinstance} 160920/1609180.3130.0000.313

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken MacKenzie
After the current sorted profile finishes I will revert to the textual version and run a profile on that. I expect another 10-15 minutes for this to finish right now. At present the batch size is set to 1000, total record count is just over 9000 in these tests. The reason for 1000 was at

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken MacKenzie
After I finish the current profile run I will revert and profile that one for comparison. I am expecting another 10-15 minutes for this to finish. On Wednesday, August 30, 2017 at 11:18:13 AM UTC-4, Simon King wrote: > > It would be interesting to see the profile of the textual SQL version. >

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Simon King
It would be interesting to see the profile of the textual SQL version. It looks like most of the time is being spent inside pyodbc, rather than SQLAlchemy, so I guess it must be something to do with the processing of bind parameters. How many parameters are being sent in per query? ie. what is

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken MacKenzie
So I implemented this version with one change. I moved this line: cols = [getattr(cls, colname) for colname in cls.SQL_PK] To the beginning before the loop, actually before the outer loop that parses the batch. However, the new version is MUCH slower than the original, so I profiled it and

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Simon King
On Tue, Aug 29, 2017 at 9:49 PM, Ken MacKenzie wrote: > I have a query I have constructed and I had to deal with a composite primary > key to select items; > > q = s.query(cls) > or_cond = [] > for x in id_batch: > > pkf =