Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-31 Thread Ken MacKenzie
On a side note, the literal_column change worked fine performance-wise,
allowing me to remove the literal_binds/compile step.

I guess I am 50/50 on which one is cleaner. Is either more advantageous
from a SQL injection perspective?

SQL injection is not a primary concern for this process at the moment, since
what we are dealing with is a data replication service pulling from a U2
data store into SQL Server. The delete is part of batch cleanup testing.
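
For reference, a minimal sketch of the two variants being weighed here, written against the names used in the code further down the thread (cols, a batch of PK tuples, and the session's bind); this is a sketch only, and neither variant escapes embedded single quotes, so both share the same injection caveat:

from sqlalchemy import and_, or_, literal_column

def delete_batch_inline(session, table, cols, id_batch):
    # Variant 1: inline the PK values via literal_column (no bind parameters at all).
    or_cond = [and_(*[(col == literal_column("'%s'" % v))
                      for (col, v) in zip(cols, x)])
               for x in id_batch]
    session.execute(table.delete(or_(*or_cond)))

def delete_batch_compiled(session, table, cols, id_batch):
    # Variant 2: build with bound parameters, then render them as literals at compile time.
    or_cond = [and_(*[(col == v) for (col, v) in zip(cols, x)])
               for x in id_batch]
    stmt = table.delete(or_(*or_cond))
    compiled = stmt.compile(dialect=session.bind.dialect,
                            compile_kwargs={"literal_binds": True})
    session.execute(str(compiled))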

On Thursday, August 31, 2017 at 5:13:58 PM UTC-4, Ken MacKenzie wrote:
>
> Tried doing various conversions on the pk values as they enter the 
> statement:
> 1.  to bytes
> 2.  to ascii
> 3. to latin1 (technically the same encoding as the extract source before 
> entering the db)
>
> None of which yielded a performance improvement for the non-compiled 
> version.
>
> I have read that this can be an issue with pyodbc and that there are 
> engine settings related to it.  Also perhaps I should try using the pymssql 
> driver to see if that changes anything.
>
> On Thursday, August 31, 2017 at 5:00:47 PM UTC-4, Ken MacKenzie wrote:
>>
>> So inspecting the elements of the tuple, they are both str, so hence 
>> unicode.
>>
>> Are you saying that if I convert those values to bytes it could improve 
>> performance?
>>
>>  
>>
>>> I'd not bother with the literal_binds and just use a literal value: 
>>>
>>> pkf = [(col == literal_column("'%s'" % v)) for (col, v) in zip(cols, x)] 
>>>
>>> but also I'd look to see what the nature of "v" is, if it's like a 
>>> Unicode object or something, you might be getting bogged down on the 
>>> decode/encode or something like that.   Sending as bytes() perhaps 
>>> might change that. 
>>>
>>



Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-31 Thread Ken MacKenzie
I tried various conversions on the PK values as they enter the statement:
1. to bytes
2. to ASCII
3. to latin-1 (technically the same encoding as the extract source before
entering the db)

None of these yielded a performance improvement for the non-compiled version.

I have read that this can be an issue with pyodbc and that there are engine
settings related to it. Perhaps I should also try the pymssql driver to see
if that changes anything.
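
For what it's worth, the pyodbc settings alluded to above are usually its per-connection encoding hooks; a rough sketch, with the URL and the encodings being assumptions rather than anything verified for this schema:

import pyodbc
from sqlalchemy import create_engine, event

engine = create_engine("mssql+pyodbc://user:pass@my_dsn")  # hypothetical URL

@event.listens_for(engine, "connect")
def _set_pyodbc_encodings(dbapi_conn, connection_record):
    # How pyodbc decodes CHAR/VARCHAR data coming back, and encodes str parameters going in.
    dbapi_conn.setdecoding(pyodbc.SQL_CHAR, encoding="latin-1")
    dbapi_conn.setencoding(encoding="utf-8")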

On Thursday, August 31, 2017 at 5:00:47 PM UTC-4, Ken MacKenzie wrote:
>
> So inspecting the elements of the tuple, they are both str, so hence 
> unicode.
>
> Are you saying that if I convert those values to bytes it could improve 
> performance?
>
>  
>
>> I'd not bother with the literal_binds and just use a literal value: 
>>
>> pkf = [(col == literal_column("'%s'" % v)) for (col, v) in zip(cols, x)] 
>>
>> but also I'd look to see what the nature of "v" is, if it's like a 
>> Unicode object or something, you might be getting bogged down on the 
>> decode/encode or something like that.   Sending as bytes() perhaps 
>> might change that. 
>>
>



Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-31 Thread Ken MacKenzie
So, inspecting the elements of the tuple: they are both str, and hence unicode.

Are you saying that if I convert those values to bytes it could improve 
performance?

 

> I'd not bother with the literal_binds and just use a literal value: 
>
> pkf = [(col == literal_column("'%s'" % v)) for (col, v) in zip(cols, x)] 
>
> but also I'd look to see what the nature of "v" is, if it's like a 
> Unicode object or something, you might be getting bogged down on the 
> decode/encode or something like that.   Sending as bytes() perhaps 
> might change that. 
>



Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-31 Thread Mike Bayer
On Thu, Aug 31, 2017 at 2:19 PM, Ken MacKenzie  wrote:
> So here is my solution as of now.  As far as I can tell hits the marks of:
>
> 1. DB Neutral
> 2. Performance (9k records deleted in 5 seconds)
> 3. Generic
>
> (excuse the print statements still there for testing)
>
> Would welcome comments on this proposed solution.  I have learned a lot from
> the advice in this thread and can always stand to learn more for the sake of
> my code.
>
> @classmethod
> def dict_delete(cls, dict_in):
>     """
>     Delete records from the database
>     based on a dictionary keyed by
>     PK tuple
>     """
>     s = dbc.get_session()
>
>     d_start = datetime.datetime.now()
>     keys_in = [k for k in dict_in.keys()]
>     batch_size = 1000
>     cols = [getattr(cls, colname) for colname in cls.SQL_PK]
>     while len(keys_in):
>         id_batch = keys_in[:batch_size]
>         del keys_in[:batch_size]
>
>         or_cond = []
>         for x in id_batch:
>
>             pkf = [(col == v) for (col, v) in zip(cols, x)]
>             and_cond = and_(*pkf)
>             or_cond.append(and_cond)
>
>         stmt = cls.__table__.delete(or_(*or_cond))
>         stmt_comp = str(stmt.compile(dialect=s.bind.dialect,
>                                      compile_kwargs={"literal_binds": True}))

I'd not bother with the literal_binds and just use a literal value:

pkf = [(col == literal_column("'%s'" % v)) for (col, v) in zip(cols, x)]

but also I'd look to see what the nature of "v" is, if it's like a
Unicode object or something, you might be getting bogged down on the
decode/encode or something like that.   Sending as bytes() perhaps
might change that.
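
As a rough illustration of that last suggestion (names follow the quoted code; the latin-1 encoding is an assumption based on the extract source mentioned elsewhere in the thread):

# Check what is actually being sent as bind values for one key tuple.
sample = id_batch[0]
print([type(v) for v in sample])   # e.g. [<class 'str'>, <class 'str'>]

# One experiment: pass the PK values as bytes instead of str.
id_batch_bytes = [tuple(v.encode("latin-1") for v in x) for x in id_batch]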


>
>         print(stmt_comp)
>         print(len(keys_in), "-", datetime.datetime.now() - d_start)
>
>         s.execute(stmt_comp)
>
>         s.commit()
>
>     s.close()
>
>



Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-31 Thread Ken MacKenzie
So here is my solution as of now. As far as I can tell, it hits the marks of:

1. DB Neutral
2. Performance (9k records deleted in 5 seconds)
3. Generic

(Excuse the print statements still there for testing.)

I would welcome comments on this proposed solution. I have learned a lot
from the advice in this thread and can always stand to learn more for the
sake of my code.

@classmethod
def dict_delete(cls, dict_in):
    """
    Delete records from the database
    based on a dictionary keyed by
    PK tuple
    """
    s = dbc.get_session()

    d_start = datetime.datetime.now()
    keys_in = [k for k in dict_in.keys()]
    batch_size = 1000
    cols = [getattr(cls, colname) for colname in cls.SQL_PK]
    while len(keys_in):
        id_batch = keys_in[:batch_size]
        del keys_in[:batch_size]

        or_cond = []
        for x in id_batch:

            pkf = [(col == v) for (col, v) in zip(cols, x)]
            and_cond = and_(*pkf)
            or_cond.append(and_cond)

        stmt = cls.__table__.delete(or_(*or_cond))
        stmt_comp = str(stmt.compile(dialect=s.bind.dialect,
                                     compile_kwargs={"literal_binds": True}))

        print(stmt_comp)
        print(len(keys_in), "-", datetime.datetime.now() - d_start)

        s.execute(stmt_comp)

        s.commit()

    s.close()
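
For illustration, a hypothetical call to the method above, assuming SQL_PK = ('college', 'check') as described elsewhere in the thread; only the dictionary keys matter to dict_delete:

to_delete = {
    ("888", "1234567"): None,   # values are ignored; the PK-tuple keys drive the delete
    ("888", "1234568"): None,
}
InvoiceDetail.dict_delete(to_delete)   # InvoiceDetail is a hypothetical mapped class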




Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-31 Thread Ken MacKenzie
Ending output from batch size of 10:

501 - 0:24:35.558103
491 - 0:24:37.197976
481 - 0:24:38.909318
471 - 0:24:40.645801
461 - 0:24:41.954807
451 - 0:24:43.628955
441 - 0:24:45.246469
431 - 0:24:46.662228
421 - 0:24:48.192343
411 - 0:24:49.839395
401 - 0:24:51.513340
391 - 0:24:53.153668
381 - 0:24:54.836501
371 - 0:24:56.372331
361 - 0:24:58.034510
351 - 0:24:59.714655
341 - 0:25:01.401695
331 - 0:25:03.026082
321 - 0:25:04.657724
311 - 0:25:06.324783
301 - 0:25:07.946133
291 - 0:25:09.347907
281 - 0:25:11.000936
271 - 0:25:12.640046
261 - 0:25:14.281258
251 - 0:25:15.939942
241 - 0:25:17.651838
231 - 0:25:19.307522
221 - 0:25:20.931354
211 - 0:25:22.589152
201 - 0:25:24.223949
191 - 0:25:25.868513
181 - 0:25:27.540003
171 - 0:25:29.198259
161 - 0:25:30.492486
151 - 0:25:31.778539
141 - 0:25:33.292928
131 - 0:25:34.966223
121 - 0:25:36.623493
111 - 0:25:38.276205
101 - 0:25:39.996527
91 - 0:25:41.374338
81 - 0:25:43.011192
71 - 0:25:44.639656
61 - 0:25:46.293672
51 - 0:25:47.954129
41 - 0:25:49.663463
31 - 0:25:51.333620
21 - 0:25:52.842479
11 - 0:25:54.489094
1 - 0:25:56.109371
0 - 0:25:56.431443


Basically, the only conclusion I can draw is that it comes down to the bind
params.

So I know this is not recommended, but to retain DB neutrality can I take the
output from query.statement.compile(dialect=s.bind.dialect,
compile_kwargs={"literal_binds": True}) and feed that output straight back
into execute?
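
For clarity, the round trip being asked about is just this (names as in the snippets above), i.e. compiling to a dialect-specific string with the values rendered inline and executing that string with no bind parameters:

compiled = query.statement.compile(dialect=s.bind.dialect,
                                   compile_kwargs={"literal_binds": True})
s.execute(str(compiled))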


On Thursday, August 31, 2017 at 6:41:17 AM UTC-4, Ken M wrote:
>
> I tried a batch size as small as ten and each step took about 2 seconds. I 
> set it up to show me the time between each batch for that run.  Lowering 
> the batch size actually seems to worsen performance. Which is not what I 
> expected. 
>
> Sent from my iPad 
>
>



Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-31 Thread Ken M
I tried a batch size as small as ten, and each step took about 2 seconds. I set
it up to show me the time between each batch for that run. Lowering the batch
size actually seems to worsen performance, which is not what I expected.

Sent from my iPad

> On Aug 31, 2017, at 5:42 AM, Simon King  wrote:
> 
> I'm pretty sure the bad performance is due to pyodbc (or maybe SQL
> Server) not handling *huge* numbers of bind parameters gracefully. You
> are generating a query with (batch_size * n_primary_key_columns) bind
> parameters, which even in your original version meant 2000 bind
> parameters.
> 
> Try with a batch sizes of 10, 50 and 100. You'll run more queries but
> I suspect each query will perform better.
> 
> Simon
> 
>> On Wed, Aug 30, 2017 at 10:02 PM, Ken MacKenzie  wrote:
>> With a batch size of 1 based on the original selection, no improvement.
>> 
>> I am going to see what sql this is creating.
>> 

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-31 Thread Ken M
Yes, the double quote was a typo; sorry about that.

As for your point about a non-varchar PK: that is why I will double-check other
DBs. SQL Server lets it fly, but I think you are right about Postgres.



Sent from my iPad

> On Aug 31, 2017, at 5:36 AM, Simon King  wrote:
> 
> The generic version would look something like this:
> 
> table = cls.__table__
> pkconditions = []
> for pk in cls.SQL_PK:
>     pkconditions.append(table.c[pk] == sa.bindparam(pk))
> condition = sa.and_(*pkconditions)
> statement = sa.delete(table, whereclause=condition)
> batchparams = [dict(zip(cls.SQL_PK, v)) for v in id_batch]
> session.execute(statement, batchparams)
> 
> As for the portability of the textual version: your original code
> produced fragments like this:
> 
>  field == 'value'
> 
> a) Is the double-equals a typo, or is it syntax that SQL Server
> expects? I've never used double-equals in MySQL.
> b) Are the primary key fields always strings? You later said that they
> were, but if you ever had an integer primary key column, comparing it
> against a quoted value would be an error on postgres (I believe).
> c) Can the values that you are comparing against ever contain single
> quotes? That would break your query and potentially leave you open to
> SQL injection.
> d) Are your values unicode strings or byte strings? If unicode, does
> the encoding match what the database expects?
> 
> If none of those are issues for you, the textual version is probably
> pretty safe.
> 
> Simon
> 
>> On Wed, Aug 30, 2017 at 5:30 PM, Ken MacKenzie  wrote:
>> Not sure how I would iterate through a non predetermined number of primary
>> keys.
>> 
>> I guess part of me is wondering that although textual sql is not inherently
>> db neutral how different between the db targets is the where field = 'value'
>> syntax?
>> 
>>> On Wednesday, August 30, 2017 at 12:07:52 PM UTC-4, Simon King wrote:
>>> 
>>> You could also try using executemany:
>>> 
>>> 
>>> http://docs.sqlalchemy.org/en/latest/core/tutorial.html#executing-multiple-statements
>>> 
>>> I think it would look something like this:
>>> 
>>> table = cls.__table__
>>> condition = sa.and_(
>>>table.c.pk1 == sa.bindparam('pk1'),
>>>table.c.pk2 == sa.bindparam('pk2'),
>>> )
>>> statement = sa.delete(table, whereclause=condition)
>>> batchparams = [{'pk1': v[0], 'pk2': v[1]} for v in id_batch]
>>> session.execute(statement, batchparams)
>>> 
>>> Simon
>>> 
 On Wed, Aug 30, 2017 at 4:28 PM, Ken MacKenzie  wrote:
 After the current sorted profile finishes I will revert to the textual
 version and run a profile on that.  I expect another 10-15 minutes for
 this
 to finish right now.
 
 At present the batch size is set to 1000, total record count is just
 over
 9000 in these tests.
 
 The reason for 1000 was at first I was looking at doing this as a
 tuple_(fld, fld).in_((val, val),(val,val)) format.  The 1000 should keep
 me
 under most DB restrictions on the in statement.
 
 However since SQL Server does not seem to support the tuple_ usage I
 reverted to this method.
 
 I technically have one more method and that is a concat_ in_ where I
 concat
 the fields.
 
 Other specifics, the table in question has 2 fields for the PK, both are
 varchar, one length 3, the other length 10.  There are 5 non key fields,
 3
 short varchars, one decimal at 14,2 precision and one varchar(800) which
 contains description text.
 
 Total record count of the table before any deletion is about 1.05
 million.
 
 Python version is 3.4.5, running on a modest CentOS desktop and to be
 fair
 the SQL Server instance is sub optimal for development.
 
> On Wednesday, August 30, 2017 at 11:18:13 AM UTC-4, Simon King wrote:
> 
> It would be interesting to see the profile of the textual SQL version.
> It looks like most of the time is being spent inside pyodbc, rather
> than SQLAlchemy, so I guess it must be something to do with the
> processing of bind parameters. How many parameters are being sent in
> per query? ie. what is len(id_batch) * len(cls.SQL_PK)?
> 
> You could try playing with your batch sizes to see what sort of effect
> that has.
> 
> Simon
 

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-31 Thread Simon King
I'm pretty sure the bad performance is due to pyodbc (or maybe SQL
Server) not handling *huge* numbers of bind parameters gracefully. You
are generating a query with (batch_size * n_primary_key_columns) bind
parameters, which even in your original version meant 2000 bind
parameters.

Try with batch sizes of 10, 50 and 100. You'll run more queries but
I suspect each query will perform better.

Simon
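
To make the arithmetic concrete, a small sketch of how the bind-parameter count scales with batch size, using the two-column primary key described in this thread:

n_pk_columns = 2   # (college, check)
for batch_size in (10, 50, 100, 1000):
    print(batch_size, "rows/query ->", batch_size * n_pk_columns, "bind parameters")
# 1000 rows/query -> 2000 bind parameters, the case that performed badly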

On Wed, Aug 30, 2017 at 10:02 PM, Ken MacKenzie  wrote:
> With a batch size of 1 based on the original selection, no improvement.
>
> I am going to see what sql this is creating.
>

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-31 Thread Simon King
The generic version would look something like this:

table = cls.__table__
pkconditions = []
for pk in cls.SQL_PK:
    pkconditions.append(table.c[pk] == sa.bindparam(pk))
condition = sa.and_(*pkconditions)
statement = sa.delete(table, whereclause=condition)
batchparams = [dict(zip(cls.SQL_PK, v)) for v in id_batch]
session.execute(statement, batchparams)

As for the portability of the textual version: your original code
produced fragments like this:

  field == 'value'

a) Is the double-equals a typo, or is it syntax that SQL Server
expects? I've never used double-equals in MySQL.
b) Are the primary key fields always strings? You later said that they
were, but if you ever had an integer primary key column, comparing it
against a quoted value would be an error on postgres (I believe).
c) Can the values that you are comparing against ever contain single
quotes? That would break your query and potentially leave you open to
SQL injection.
d) Are your values unicode strings or byte strings? If unicode, does
the encoding match what the database expects?

If none of those are issues for you, the textual version is probably
pretty safe.

Simon
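
For point (c), a rough sketch of the minimal mitigation if the textual route is kept, i.e. doubling embedded single quotes; this guards against accidental breakage but is still weaker than real bind parameters:

def quote_literal(value):
    # Standard SQL escaping: double any embedded single quotes.
    return "'" + str(value).replace("'", "''") + "'"

# Hypothetical use with the PK names and one key tuple from the thread's code:
condition = " AND ".join("%s = %s" % (colname, quote_literal(v))
                         for colname, v in zip(cls.SQL_PK, x))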

On Wed, Aug 30, 2017 at 5:30 PM, Ken MacKenzie  wrote:
> Not sure how I would iterate through a non predetermined number of primary
> keys.
>
> I guess part of me is wondering that although textual sql is not inherently
> db neutral how different between the db targets is the where field = 'value'
> syntax?
>
> On Wednesday, August 30, 2017 at 12:07:52 PM UTC-4, Simon King wrote:
>>
>> You could also try using executemany:
>>
>>
>> http://docs.sqlalchemy.org/en/latest/core/tutorial.html#executing-multiple-statements
>>
>> I think it would look something like this:
>>
>> table = cls.__table__
>> condition = sa.and_(
>> table.c.pk1 == sa.bindparam('pk1'),
>> table.c.pk2 == sa.bindparam('pk2'),
>> )
>> statement = sa.delete(table, whereclause=condition)
>> batchparams = [{'pk1': v[0], 'pk2': v[1]} for v in id_batch]
>> session.execute(statement, batchparams)
>>
>> Simon
>>
>> On Wed, Aug 30, 2017 at 4:28 PM, Ken MacKenzie  wrote:
>> > After the current sorted profile finishes I will revert to the textual
>> > version and run a profile on that.  I expect another 10-15 minutes for
>> > this
>> > to finish right now.
>> >
>> > At present the batch size is set to 1000, total record count is just
>> > over
>> > 9000 in these tests.
>> >
>> > The reason for 1000 was at first I was looking at doing this as a
>> > tuple_(fld, fld).in_((val, val),(val,val)) format.  The 1000 should keep
>> > me
>> > under most DB restrictions on the in statement.
>> >
>> > However since SQL Server does not seem to support the tuple_ usage I
>> > reverted to this method.
>> >
>> > I technically have one more method and that is a concat_ in_ where I
>> > concat
>> > the fields.
>> >
>> > Other specifics, the table in question has 2 fields for the PK, both are
>> > varchar, one length 3, the other length 10.  There are 5 non key fields,
>> > 3
>> > short varchars, one decimal at 14,2 precision and one varchar(800) which
>> > contains description text.
>> >
>> > Total record count of the table before any deletion is about 1.05
>> > million.
>> >
>> > Python version is 3.4.5, running on a modest CentOS desktop and to be
>> > fair
>> > the SQL Server instance is sub optimal for development.
>> >
>> > On Wednesday, August 30, 2017 at 11:18:13 AM UTC-4, Simon King wrote:
>> >>
>> >> It would be interesting to see the profile of the textual SQL version.
>> >> It looks like most of the time is being spent inside pyodbc, rather
>> >> than SQLAlchemy, so I guess it must be something to do with the
>> >> processing of bind parameters. How many parameters are being sent in
>> >> per query? ie. what is len(id_batch) * len(cls.SQL_PK)?
>> >>
>> >> You could try playing with your batch sizes to see what sort of effect
>> >> that has.
>> >>
>> >> Simon
>> >

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken M
When I get in tomorrow I will take a look at some of these options. I also
intend to spin up a quick MySQL or PostgreSQL DB to test whether my textual
solution can work against an alternate target.

Sent from my iPad

> On Aug 30, 2017, at 6:20 PM, Jonathan Vanasco  wrote:
> 
> Looking at your code and considering the size of your database, I've run into 
> performance issues under PostgreSQL a few times that may be related... 
> they're all usually because very slight text changes to a query can 
> drastically alter the server's query planner. 
> 
> The cause/fixes were:
> 
> 1. Indexes often got ignored, ending up with lots of row scans.  To fix, I 
> would wrap the "deletion" work with some sql to setup & tear down a new index 
> with all the columns present in the query.  That usually tricked it into 
> using the new index.
> 2. Too many fkey checks onto the table for deleted items.  Deferring 
> constraint checks often helped.
> 3. The position of the 'commit' ended up clearing the db's internal cache 
> each loop, so a reread of rows/indexes would happen each iteration.  
> Expanding the memory allocation for caching on the server helped handle that.
> 4. Sometimes there was a mix of all the above and/or an issue with the size 
> of the query (so many values).  What often helped was replacing the DELETE 
> with INSERTS of ids to delete into a new temporary table, then deleting from 
> that table.
> 



Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Jonathan Vanasco
Looking at your code and considering the size of your database, I've run 
into performance issues under PostgreSQL a few times that may be related... 
they're all usually because very slight text changes to a query can 
drastically alter the server's query planner. 

The cause/fixes were:

1. Indexes often got ignored, ending up with lots of row scans.  To fix, I 
would wrap the "deletion" work with some SQL to set up and tear down a new 
index with all the columns present in the query.  That usually tricked it 
into using the new index.
2. Too many fkey checks onto the table for deleted items.  Deferring 
constraint checks often helped.
3. The position of the 'commit' ended up clearing the db's internal cache 
each loop, so a reread of rows/indexes would happen each iteration. 
 Expanding the memory allocation for caching on the server helped handle 
that.
4. Sometimes there was a mix of all the above and/or an issue with the size 
of the query (so many values).  What often helped was replacing the DELETE 
with INSERTS of ids to delete into a new temporary table, then deleting 
from that table.
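
A sketch of the temp-table variant in item 4, written as textual SQL for PostgreSQL; the table and column names here are hypothetical stand-ins for the real schema:

# Stage the keys in a temporary table, then delete via one set-based statement
# instead of a huge DELETE ... WHERE (...) OR (...).
s.execute("CREATE TEMPORARY TABLE del_keys (college VARCHAR(3), chk VARCHAR(10))")
s.execute("INSERT INTO del_keys (college, chk) VALUES (:college, :chk)",
          [{"college": c, "chk": k} for (c, k) in id_batch])
s.execute("DELETE FROM invoice_detail "
          "WHERE (college, chk) IN (SELECT college, chk FROM del_keys)")
s.commit()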




Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Mike Bayer
When you call create_engine:


e = create_engine(url, echo=True)


include the "echo=True" part.

Then look at the SQL being emitted.

That's the important part here.
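
Equivalently, the same output can be switched on through the standard logging module, which is handy when the engine is created somewhere less convenient to edit:

import logging

logging.basicConfig()
# INFO logs each SQL statement and its parameters; echo=True is shorthand for this.
logging.getLogger("sqlalchemy.engine").setLevel(logging.INFO)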



On Wed, Aug 30, 2017 at 5:09 PM, Ken MacKenzie  wrote:
> So the big difference between the methods when seen with print(q)
>
> Textual SQL is building a statement with the values inline
>
> such as where college=888 and check=1234567
>
> The methods using the column object are building with parameters
> where college = (?) and check = (?)
>
> That is all I can see different.
>
>
> On Wednesday, August 30, 2017 at 4:41:38 PM UTC-4, Ken MacKenzie wrote:
>>
>> I am noticing the 16 execute calls.  There should only be 10 with a batch
>> size of 1000 and a record size of 9,000 and some change.
>>
>> Are you just wanting to see the sql via a print(q)?
>>
>> On Wednesday, August 30, 2017 at 2:01:31 PM UTC-4, Mike Bayer wrote:
>>>
>>> it looks like you have 16 SQL statements that take a very different
>>> amount of time depending on format.  Can you please paste the output
>>> with echo=True on create_engine() so that the difference in SQL
>>> statements can be seen?
>



Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken MacKenzie
So the big difference between the methods, when seen with print(q):

Textual SQL builds a statement with the values inline, such as:

    where college=888 and check=1234567

The methods using the column object build it with bind parameters:

    where college = (?) and check = (?)

That is all I can see that is different.


On Wednesday, August 30, 2017 at 4:41:38 PM UTC-4, Ken MacKenzie wrote:
>
> I am noticing the 16 execute calls.  There should only be 10 with a batch 
> size of 1000 and a record size of 9,000 and some change.
>
> Are you just wanting to see the sql via a print(q)?
>
> On Wednesday, August 30, 2017 at 2:01:31 PM UTC-4, Mike Bayer wrote:
>>
>> it looks like you have 16 SQL statements that take a very different 
>> amount of time depending on format.  Can you please paste the output 
>> with echo=True on create_engine() so that the difference in SQL 
>> statements can be seen? 
>>
>



Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken MacKenzie
With a batch size of 1 based on the original selection, no improvement.

I am going to see what sql this is creating.

         2328928 function calls (2085124 primitive calls) in 1084.925 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       26 1074.733   41.336 1074.733   41.336 {method 'execute' of 'pyodbc.Cursor' objects}
        2    0.871    0.436    0.872    0.436 {built-in method connect}
 349128/348523    0.556    0.000    0.573    0.000 {built-in method isinstance}
       26    0.502    0.019    0.502    0.019 {method 'close' of 'pyodbc.Cursor' objects}
       20    0.485    0.024    0.485    0.024 {method 'commit' of 'pyodbc.Connection' objects}
 161062/161060    0.317    0.000    0.318    0.000 {built-in method getattr}
    19123    0.259    0.000    0.789    0.000 elements.py:861(__init__)
     9567    0.258    0.000    0.258    0.000 {method 'fetchone' of 'pyodbc.Cursor' objects}
 76492/19123    0.248    0.000    2.842    0.000 operators.py:297(__eq__)
 67012/24    0.236    0.000    2.601    0.108 visitors.py:75(_compiler_dispatch)
    19123    0.234    0.000    0.424    0.000 elements.py:2927(__init__)
     9582    0.213    0.000    1.104    0.000 elements.py:1860(_construct)
    19132    0.205    0.000    0.435    0.000 compiler.py:676(visit_column)
    19123    0.189    0.000    1.806    0.000 compiler.py:1040(_generate_generic_binary)
    19123    0.183    0.000    1.263    0.000 default_comparator.py:290(_check_literal)
 57389/19143    0.173    0.000    2.732    0.000 {built-in method eq}
    19123    0.171    0.000    1.912    0.000 default_comparator.py:22(_boolean_compare)
    28684    0.169    0.000    0.346    0.000 operators.py:1007(is_precedent)
    19126    0.143    0.000    0.352    0.000 elements.py:3962(__new__)
    19123    0.129    0.000    0.795    0.000 compiler.py:1164(visit_bindparam)
    19123    0.128    0.000    0.918    0.000 elements.py:3818(_bind_param)
    19153    0.126    0.000    0.581    0.000 base.py:1128(decorate)
        1    0.125    0.125 1084.925 1084.925 del_test.py:1()
   116845    0.121    0.000    0.121    0.000 {method 'get' of 'dict' objects}
 38264/9581    0.115    0.000    2.567    0.000 compiler.py:806()
    19123    0.113    0.000    2.048    0.000 compiler.py:1006(visit_binary)
    19125    0.113    0.000    0.174    0.000 compiler.py:1265(_process_anon)
    19125    0.109    0.000    0.451    0.000 compiler.py:1246(_truncated_identifier)
    19123    0.108    0.000    2.457    0.000 annotation.py:100(__eq__)
    19123    0.107    0.000    2.190    0.000 base.py:1306(visit_binary)
    19123    0.104    0.000    2.015    0.000 type_api.py:60(operate)
    38283    0.101    0.000    0.167    0.000 compiler.py:2901(quote)
    19123    0.095    0.000    0.577    0.000 compiler.py:1233(_truncate_bindparam)
    19206    0.094    0.000    0.163    0.000 elements.py:3918(__new__)
    19125    0.093    0.000    0.323    0.000 elements.py:4073(apply_map)
    38246    0.091    0.000    0.091    0.000 elements.py:640(self_group)
    88460    0.088    0.000    0.088    0.000 {method 'append' of 'list' objects}
    19123    0.084    0.000    2.621    0.000 properties.py:269(operate)
 38264/9581    0.084    0.000    2.586    0.000 compiler.py:804()
    19123    0.082    0.000    0.732    0.000 annotation.py:78(_compiler_dispatch)
 12703/3142    0.076    0.000    2.605    0.001 {method 'join' of 'str' objects}
    19123    0.075    0.000    0.129    0.000 compiler.py:959(_get_operator_dispatch)
        1    0.071    0.071 1082.250 1082.250 db_base.py:172(dict_delete)
    19123    0.070    0.000    0.089    0.000 compiler.py:1271(bindparam_string)
    19123    0.070    0.000    2.085    0.000 :1()
     9581    0.067    0.000    0.520    0.000 elements.py:1883()
     9561    0.064    0.000    2.905    0.000 db_base.py:208()
 23611/23323    0.062    0.000    0.067    0.000 {built-in method hasattr}
    28705    0.061    0.000    0.159    0.000 elements.py:4216(_expression_literal_as_text)
    28746    0.058    0.000    0.098    0.000 elements.py:4220(_literal_as_text)
    19123    0.058    0.000    0.130    0.000 type_api.py:504(coerce_compared_value)
     9562    0.056    0.000    0.416    0.000 result.py:1156(fetchone)
    19125    0.056    0.000    0.230    0.000 _collections.py:728(__missing__)
    19123    0.054    0.000    0.296    0.000 elements.py:2986(self_group)




Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken MacKenzie
I am noticing the 16 execute calls.  There should only be 10, with a batch
size of 1000 and a record count of 9,000 and some change.

Are you just wanting to see the SQL via a print(q)?

On Wednesday, August 30, 2017 at 2:01:31 PM UTC-4, Mike Bayer wrote:
>
> it looks like you have 16 SQL statements that take a very different 
> amount of time depending on format.  Can you please paste the output 
> with echo=True on create_engine() so that the difference in SQL 
> statements can be seen? 
>
> On Wed, Aug 30, 2017 at 12:30 PM, Ken MacKenzie  > wrote: 
> > Not sure how I would iterate through a non predetermined number of 
> primary 
> > keys. 
> > 
> > I guess part of me is wondering that although textual sql is not 
> inherently 
> > db neutral how different between the db targets is the where field = 
> 'value' 
> > syntax? 
> > 
> > On Wednesday, August 30, 2017 at 12:07:52 PM UTC-4, Simon King wrote: 
> >> 
> >> You could also try using executemany: 
> >> 
> >> 
> >> 
> http://docs.sqlalchemy.org/en/latest/core/tutorial.html#executing-multiple-statements
>  
> >> 
> >> I think it would look something like this: 
> >> 
> >> table = cls.__table__ 
> >> condition = sa.and_( 
> >> table.c.pk1 == sa.bindparam('pk1'), 
> >> table.c.pk2 == sa.bindparam('pk2'), 
> >> ) 
> >> statement = sa.delete(table, whereclause=condition) 
> >> batchparams = [{'pk1': v[0], 'pk2': v[1]} for v in id_batch] 
> >> session.execute(statement, batchparams) 
> >> 
> >> Simon 
> >> 
> >> On Wed, Aug 30, 2017 at 4:28 PM, Ken MacKenzie  
> wrote: 
> >> > After the current sorted profile finishes I will revert to the 
> textual 
> >> > version and run a profile on that.  I expect another 10-15 minutes 
> for 
> >> > this 
> >> > to finish right now. 
> >> > 
> >> > At present the batch size is set to 1000, total record count is just 
> >> > over 
> >> > 9000 in these tests. 
> >> > 
> >> > The reason for 1000 was at first I was looking at doing this as a 
> >> > tuple_(fld, fld).in_((val, val),(val,val)) format.  The 1000 should 
> keep 
> >> > me 
> >> > under most DB restrictions on the in statement. 
> >> > 
> >> > However since SQL Server does not seem to support the tuple_ usage I 
> >> > reverted to this method. 
> >> > 
> >> > I technically have one more method and that is a concat_ in_ where I 
> >> > concat 
> >> > the fields. 
> >> > 
> >> > Other specifics, the table in question has 2 fields for the PK, both 
> are 
> >> > varchar, one length 3, the other length 10.  There are 5 non key 
> fields, 
> >> > 3 
> >> > short varchars, one decimal at 14,2 precision and one varchar(800) 
> which 
> >> > contains description text. 
> >> > 
> >> > Total record count of the table before any deletion is about 1.05 
> >> > million. 
> >> > 
> >> > Python version is 3.4.5, running on a modest CentOS desktop and to be 
> >> > fair 
> >> > the SQL Server instance is sub optimal for development. 
> >> > 
> >> > On Wednesday, August 30, 2017 at 11:18:13 AM UTC-4, Simon King wrote: 
> >> >> 
> >> >> It would be interesting to see the profile of the textual SQL 
> version. 
> >> >> It looks like most of the time is being spent inside pyodbc, rather 
> >> >> than SQLAlchemy, so I guess it must be something to do with the 
> >> >> processing of bind parameters. How many parameters are being sent in 
> >> >> per query? ie. what is len(id_batch) * len(cls.SQL_PK)? 
> >> >> 
> >> >> You could try playing with your batch sizes to see what sort of 
> effect 
> >> >> that has. 
> >> >> 
> >> >> Simon 
> >> > 

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken MacKenzie
I tried an alternate version:

@classmethod
def dict_delete2(cls, dict_in):
    """
    Delete records from the database
    based on a dictionary keyed by
    PK tuple
    """
    s = dbc.get_session()
    keys_in = ["-".join(k) for k in dict_in.keys()]
    batch_size = 1000
    cols = [getattr(cls, colname) for colname in cls.SQL_PK]
    cols_dash = []
    for col in cols:
        cols_dash.append(col)
        cols_dash.append('-')

    cols_dash = cols_dash[:-1]

    print(cols_dash)
    while len(keys_in):
        id_batch = keys_in[:batch_size]
        # print(id_batch)
        del keys_in[:batch_size]
        q = s.query(cls)

        q = q.filter(func.concat(*cols_dash).in_(id_batch)).delete(synchronize_session=False)

        s.commit()

    s.close()


Performance was even worse?

         673960 function calls (659542 primitive calls) in 2315.534 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       16 2312.810  144.551 2312.810  144.551 {method 'execute' of 'pyodbc.Cursor' objects}
     9567    0.186    0.000    0.186    0.000 {method 'fetchone' of 'pyodbc.Cursor' objects}
     9572    0.166    0.000    0.433    0.000 elements.py:861(__init__)
        1    0.119    0.119 2315.536 2315.536 del_test.py:1()
 62442/61837    0.101    0.000    0.118    0.000 {built-in method isinstance}
     9572    0.068    0.000    0.407    0.000 compiler.py:1164(visit_bindparam)
     9575    0.062    0.000    0.171    0.000 elements.py:3962(__new__)
     9561    0.060    0.000    0.492    0.000 functions.py:436(_bind_param)
     9574    0.059    0.000    0.090    0.000 compiler.py:1265(_process_anon)
     9574    0.054    0.000    0.227    0.000 compiler.py:1246(_truncated_identifier)
       10    0.052    0.005    0.738    0.074 default_comparator.py:110(_in_impl)
     9562    0.051    0.000    0.319    0.000 result.py:1156(fetchone)
     9655    0.049    0.000    0.084    0.000 elements.py:3918(__new__)
     9572    0.047    0.000    0.291    0.000 compiler.py:1233(_truncate_bindparam)
        2    0.047    0.023    0.048    0.024 {built-in method connect}
     9574    0.047    0.000    0.163    0.000 elements.py:4073(apply_map)
       10    0.042    0.004    0.042    0.004 {method 'commit' of 'pyodbc.Connection' objects}
    40649    0.040    0.000    0.040    0.000 {method 'append' of 'list' objects}
     9572    0.039    0.000    0.049    0.000 compiler.py:1271(bindparam_string)
  9696/14    0.039    0.000    0.521    0.037 visitors.py:75(_compiler_dispatch)
     9611    0.039    0.000    0.485    0.000 compiler.py:806()
 14090/13802    0.037    0.000    0.042    0.000 {built-in method hasattr}
 17697/17695    0.037    0.000    0.038    0.000 {built-in method getattr}
     9562    0.037    0.000    0.058    0.000 result.py:1085(process_rows)
     9561    0.036    0.000    0.080    0.000 elements.py:4254(_is_literal)
    28683    0.036    0.000    0.036    0.000 del_test.py:14()
     9562    0.030    0.000    0.068    0.000 type_api.py:504(coerce_compared_value)
       22    0.029    0.001    0.103    0.005 elements.py:1784()
    12713    0.027    0.000    0.536    0.000 {method 'join' of 'str' objects}
     9572    0.026    0.000    0.038    0.000 sqltypes.py:2608(_resolve_value_to_type)
       16    0.026    0.002    0.026    0.002 {method 'close' of 'pyodbc.Cursor' objects}
     9574    0.026    0.000    0.116    0.000 _collections.py:728(__missing__)
     9584    0.026    0.000    0.067    0.000 compiler.py:494()
    21325    0.025    0.000    0.025    0.000 {method 'get' of 'dict' objects}
      125    0.025    0.000    0.025    0.000 {built-in method loads}
  750/724    0.025    0.000    0.338    0.000 {built-in method __build_class__}
     9572    0.025    0.000    0.041    0.000 type_api.py:452(_cached_bind_processor)
     9562    0.024    0.000    0.343    0.000 result.py:868(__iter__)
     9564    0.022    0.000    0.209    0.000 result.py:1053(_fetchone_impl)
     9603    0.022    0.000    0.022    0.000 elements.py:640(self_group)
    10206    0.021    0.000    0.021    0.000 {method 'split' of 'str' objects}
     9562    0.021    0.000    0.021    0.000 result.py:1098()
     9611    0.021    0.000    0.506    0.000 compiler.py:804()
    10421    0.020    0.000    0.020    0.000 {built-in method __new__ of type object at 0x7fc6ee607e40}
     9603    0.020    0.000    0.053    0.000 elements.py:4216(_expression_literal_as_text)
     9624    0.019    0.000    0.033    0.000 elements.py:4220(_literal_as_text)
 18953/18838    0.019    0.000    0.019    0.000 {built-in method len}
    10209    0.016    0.000    0.016    0.000 weakref.py:364(__getitem__)
    290/1    0.016    0.000 2315.538 2315.538 {built-in method exec}
       68    0.016    0.000    0.042    0.001 inspect.py:264(getmembers)
      361    0.014    0.000
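
As an aside, the row-value form that was ruled out earlier because SQL Server does not support it would look roughly like this on a backend that does (e.g. PostgreSQL); a hypothetical sketch reusing the same names, with keys_in holding the original PK tuples rather than dash-joined strings:

from sqlalchemy import tuple_

while len(keys_in):
    id_batch = keys_in[:batch_size]
    del keys_in[:batch_size]
    s.query(cls).filter(tuple_(*cols).in_(id_batch)).delete(synchronize_session=False)
    s.commit()

s.close()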

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Mike Bayer
It looks like you have 16 SQL statements that take a very different
amount of time depending on format.  Can you please paste the output
with echo=True on create_engine() so that the difference in SQL
statements can be seen?

On Wed, Aug 30, 2017 at 12:30 PM, Ken MacKenzie  wrote:
> Not sure how I would iterate through a non predetermined number of primary
> keys.
>
> I guess part of me is wondering that although textual sql is not inherently
> db neutral how different between the db targets is the where field = 'value'
> syntax?
>
> On Wednesday, August 30, 2017 at 12:07:52 PM UTC-4, Simon King wrote:
>>
>> You could also try using executemany:
>>
>>
>> http://docs.sqlalchemy.org/en/latest/core/tutorial.html#executing-multiple-statements
>>
>> I think it would look something like this:
>>
>> table = cls.__table__
>> condition = sa.and_(
>> table.c.pk1 == sa.bindparam('pk1'),
>> table.c.pk2 == sa.bindparam('pk2'),
>> )
>> statement = sa.delete(table, whereclause=condition)
>> batchparams = [{'pk1': v[0], 'pk2': v[1]} for v in id_batch]
>> session.execute(statement, batchparams)
>>
>> Simon
>>
>> On Wed, Aug 30, 2017 at 4:28 PM, Ken MacKenzie  wrote:
>> > After the current sorted profile finishes I will revert to the textual
>> > version and run a profile on that.  I expect another 10-15 minutes for
>> > this
>> > to finish right now.
>> >
>> > At present the batch size is set to 1000, total record count is just
>> > over
>> > 9000 in these tests.
>> >
>> > The reason for 1000 was at first I was looking at doing this as a
>> > tuple_(fld, fld).in_((val, val),(val,val)) format.  The 1000 should keep
>> > me
>> > under most DB restrictions on the in statement.
>> >
>> > However since SQL Server does not seem to support the tuple_ usage I
>> > reverted to this method.
>> >
>> > I technically have one more method and that is a concat_ in_ where I
>> > concat
>> > the fields.
>> >
>> > Other specifics, the table in question has 2 fields for the PK, both are
>> > varchar, one length 3, the other length 10.  There are 5 non key fields,
>> > 3
>> > short varchars, one decimal at 14,2 precision and one varchar(800) which
>> > contains description text.
>> >
>> > Total record count of the table before any deletion is about 1.05
>> > million.
>> >
>> > Python version is 3.4.5, running on a modest CentOS desktop and to be
>> > fair
>> > the SQL Server instance is sub optimal for development.
>> >
>> > On Wednesday, August 30, 2017 at 11:18:13 AM UTC-4, Simon King wrote:
>> >>
>> >> It would be interesting to see the profile of the textual SQL version.
>> >> It looks like most of the time is being spent inside pyodbc, rather
>> >> than SQLAlchemy, so I guess it must be something to do with the
>> >> processing of bind parameters. How many parameters are being sent in
>> >> per query? ie. what is len(id_batch) * len(cls.SQL_PK)?
>> >>
>> >> You could try playing with your batch sizes to see what sort of effect
>> >> that has.
>> >>
>> >> Simon
>> >


Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken MacKenzie
Not sure how I would iterate through a non-predetermined number of primary 
keys.

I guess part of me is wondering: although textual SQL is not inherently 
DB neutral, how different is the "where field = 'value'" syntax between 
the DB targets?
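
(For what it's worth, here is a minimal sketch of how the executemany suggestion quoted below could be generalized to an arbitrary number of key columns taken from cls.SQL_PK; the function name and the way the session is obtained are my own assumptions, not code from this thread.)

import sqlalchemy as sa

def delete_batch(session, cls, id_batch):
    # One bindparam per primary key column named in cls.SQL_PK,
    # e.g. ('pk1', 'pk2') -> "pk1 = :pk1 AND pk2 = :pk2"
    table = cls.__table__
    condition = sa.and_(
        *(table.c[name] == sa.bindparam(name) for name in cls.SQL_PK)
    )
    statement = sa.delete(table).where(condition)

    # One parameter dict per key tuple in the batch; passing a list of
    # dicts makes SQLAlchemy execute the statement executemany-style.
    batchparams = [dict(zip(cls.SQL_PK, values)) for values in id_batch]
    session.execute(statement, batchparams)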

On Wednesday, August 30, 2017 at 12:07:52 PM UTC-4, Simon King wrote:
>
> You could also try using executemany: 
>
>
> http://docs.sqlalchemy.org/en/latest/core/tutorial.html#executing-multiple-statements
>  
>
> I think it would look something like this: 
>
> table = cls.__table__ 
> condition = sa.and_( 
>     table.c.pk1 == sa.bindparam('pk1'), 
>     table.c.pk2 == sa.bindparam('pk2'), 
> ) 
> statement = sa.delete(table, whereclause=condition) 
> batchparams = [{'pk1': v[0], 'pk2': v[1]} for v in id_batch] 
> session.execute(statement, batchparams) 
>
> Simon 
>
> On Wed, Aug 30, 2017 at 4:28 PM, Ken MacKenzie  > wrote: 
> > After the current sorted profile finishes I will revert to the textual 
> > version and run a profile on that.  I expect another 10-15 minutes for this 
> > to finish right now. 
> > 
> > At present the batch size is set to 1000, total record count is just over 
> > 9000 in these tests. 
> > 
> > The reason for 1000 was at first I was looking at doing this as a 
> > tuple_(fld, fld).in_((val, val),(val,val)) format.  The 1000 should keep me 
> > under most DB restrictions on the in statement. 
> > 
> > However since SQL Server does not seem to support the tuple_ usage I 
> > reverted to this method. 
> > 
> > I technically have one more method and that is a concat_ in_ where I concat 
> > the fields. 
> > 
> > Other specifics, the table in question has 2 fields for the PK, both are 
> > varchar, one length 3, the other length 10.  There are 5 non key fields, 3 
> > short varchars, one decimal at 14,2 precision and one varchar(800) which 
> > contains description text. 
> > 
> > Total record count of the table before any deletion is about 1.05 million. 
> > 
> > Python version is 3.4.5, running on a modest CentOS desktop and to be fair 
> > the SQL Server instance is sub optimal for development. 
> > 
> > On Wednesday, August 30, 2017 at 11:18:13 AM UTC-4, Simon King wrote: 
> >> 
> >> It would be interesting to see the profile of the textual SQL version. 
> >> It looks like most of the time is being spent inside pyodbc, rather 
> >> than SQLAlchemy, so I guess it must be something to do with the 
> >> processing of bind parameters. How many parameters are being sent in 
> >> per query? ie. what is len(id_batch) * len(cls.SQL_PK)? 
> >> 
> >> You could try playing with your batch sizes to see what sort of effect 
> >> that has. 
> >> 
> >> Simon 
> > 



Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Simon King
You could also try using executemany:

http://docs.sqlalchemy.org/en/latest/core/tutorial.html#executing-multiple-statements

I think it would look something like this:

table = cls.__table__
condition = sa.and_(
    table.c.pk1 == sa.bindparam('pk1'),
    table.c.pk2 == sa.bindparam('pk2'),
)
statement = sa.delete(table, whereclause=condition)
batchparams = [{'pk1': v[0], 'pk2': v[1]} for v in id_batch]
session.execute(statement, batchparams)

Simon
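
(As a point of reference, when execute() is given a list of parameter dictionaries like this, SQLAlchemy should hand the statement to the DBAPI's executemany(). A rough pyodbc-level equivalent, with a made-up DSN, table and key values purely for illustration, is sketched below.)

import pyodbc

conn = pyodbc.connect("DSN=mydsn;UID=user;PWD=secret")  # hypothetical connection string
cursor = conn.cursor()

# One parameter tuple per row to delete (made-up composite key values).
params = [("001", "A000000001"), ("001", "A000000002")]
cursor.executemany(
    "DELETE FROM item WHERE pk1 = ? AND pk2 = ?",  # hypothetical table/columns
    params,
)
conn.commit()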

On Wed, Aug 30, 2017 at 4:28 PM, Ken MacKenzie  wrote:
> After the current sorted profile finishes I will revert to the textual
> version and run a profile on that.  I expect another 10-15 minutes for this
> to finish right now.
>
> At present the batch size is set to 1000, total record count is just over
> 9000 in these tests.
>
> The reason for 1000 was at first I was looking at doing this as a
> tuple_(fld, fld).in_((val, val),(val,val)) format.  The 1000 should keep me
> under most DB restrictions on the in statement.
>
> However since SQL Server does not seem to support the tuple_ usage I
> reverted to this method.
>
> I technically have one more method and that is a concat_ in_ where I concat
> the fields.
>
> Other specifics, the table in question has 2 fields for the PK, both are
> varchar, one length 3, the other length 10.  There are 5 non key fields, 3
> short varchars, one decimal at 14,2 precision and one varchar(800) which
> contains description text.
>
> Total record count of the table before any deletion is about 1.05 million.
>
> Python version is 3.4.5, running on a modest CentOS desktop and to be fair
> the SQL Server instance is sub optimal for development.
>
> On Wednesday, August 30, 2017 at 11:18:13 AM UTC-4, Simon King wrote:
>>
>> It would be interesting to see the profile of the textual SQL version.
>> It looks like most of the time is being spent inside pyodbc, rather
>> than SQLAlchemy, so I guess it must be something to do with the
>> processing of bind parameters. How many parameters are being sent in
>> per query? ie. what is len(id_batch) * len(cls.SQL_PK)?
>>
>> You could try playing with your batch sizes to see what sort of effect
>> that has.
>>
>> Simon
>



Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken MacKenzie
Sorted Textual SQL profile

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        16    1.840    0.115    1.840    0.115 {method 'execute' of 'pyodbc.Cursor' objects}
      9572    0.196    0.000    0.788    0.000 elements.py:1860(_construct)
     57700    0.189    0.000    0.189    0.000 {method 'sub' of '_sre.SRE_Pattern' objects}
      9567    0.185    0.000    0.185    0.000 {method 'fetchone' of 'pyodbc.Cursor' objects}
100329/99724    0.125    0.000    0.142    0.000 {built-in method isinstance}
     19123    0.120    0.000    0.200    0.000 elements.py:1225(__init__)
         1    0.117    0.117    5.042    5.042 del_test.py:1()
     19123    0.100    0.000    0.222    0.000 compiler.py:737(visit_textclause)
38254/9571    0.099    0.000    0.566    0.000 compiler.py:806()
  28738/14    0.096    0.000    0.598    0.043 visitors.py:75(_compiler_dispatch)
     19123    0.086    0.000    0.286    0.000 elements.py:1240(_create_text)
38254/9571    0.079    0.000    0.585    0.000 compiler.py:804()
12693/3132    0.068    0.000    0.603    0.000 {method 'join' of 'str' objects}
         1    0.065    0.065    3.811    3.811 db_base.py:138(dict_delete)
      9561    0.062    0.000    0.398    0.000 db_base.py:178()
      9571    0.060    0.000    0.240    0.000 elements.py:1883()
     28695    0.060    0.000    0.156    0.000 elements.py:4216(_expression_literal_as_text)
     28716    0.058    0.000    0.096    0.000 elements.py:4220(_literal_as_text)
      9562    0.052    0.000    0.317    0.000 result.py:1156(fetchone)
      9562    0.051    0.000    0.105    0.000 operators.py:1007(is_precedent)
     50154    0.050    0.000    0.050    0.000 {method 'append' of 'list' objects}
     19123    0.050    0.000    0.336    0.000 :1(text)
      9572    0.048    0.000    0.203    0.000 elements.py:1865()
27063/27061    0.047    0.000    0.048    0.000 {built-in method getattr}
         2    0.044    0.022    0.045    0.022 {built-in method connect}
   9571/10    0.037    0.000    0.595    0.059 compiler.py:797(visit_clauselist)
      9562    0.037    0.000    0.534    0.000 elements.py:1894(and_)
     28683    0.036    0.000    0.036    0.000 del_test.py:14()
      9562    0.036    0.000    0.057    0.000 result.py:1085(process_rows)
      9561    0.029    0.000    0.159    0.000 elements.py:1971(self_group)
28504/28389    0.028    0.000    0.028    0.000 {built-in method len}
       125    0.026    0.000    0.026    0.000 {built-in method loads}
   750/724    0.026    0.000    0.340    0.000 {built-in method __build_class__}
      9561    0.025    0.000    0.560    0.000 :1(and_)
      9561    0.024    0.000    0.130    0.000 elements.py:1818(self_group)
      9562    0.024    0.000    0.341    0.000 result.py:868(__iter__)
      9564    0.022    0.000    0.207    0.000 result.py:1053(_fetchone_impl)
      9572    0.022    0.000    0.032    0.000 _collections.py:788(coerce_generator_arg)
      9562    0.021    0.000    0.021    0.000 result.py:1098()
     19122    0.021    0.000    0.021    0.000 elements.py:1601(self_group)
     21232    0.021    0.000    0.021    0.000 {method 'get' of 'dict' objects}
        10    0.019    0.002    0.019    0.002 {method 'commit' of 'pyodbc.Connection' objects}
     19123    0.019    0.000    0.019    0.000 compiler.py:734(post_process_text)
     290/1    0.016    0.000    5.044    5.044 {built-in method exec}
        68    0.015    0.000    0.042    0.001 inspect.py:264(getmembers)
       361    0.014    0.000    0.103    0.000 inspect.py:943(getfullargspec)
      2884    0.014    0.000    0.020    0.000 sre_parse.py:197(__next)
     10420    0.013    0.000    0.013    0.000 {built-in method __new__ of type object at 0x7f9abdca4e40}
      9561    0.012    0.000    0.012    0.000 result.py:168(keys)
       361    0.012    0.000    0.065    0.000 inspect.py:2383(from_function)
       714    0.010    0.000    0.018    0.000 inspect.py:1056(formatargspec)
       296    0.010    0.000    0.041    0.000 :2016(find_spec)
 4486/4198    0.010    0.000    0.015    0.000 {built-in method hasattr}
    130/38    0.010    0.000    0.044    0.001 sre_parse.py:448(_parse)
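
(For anyone reproducing these listings outside IPython: a profile sorted like the one above can be produced with cProfile/pstats directly. The module name below is the one used with %prun elsewhere in the thread; the dump filename is arbitrary.)

import cProfile
import pstats

# Profile the import, dump raw stats to a file, then print the top 50
# entries sorted by internal time (the same ordering as the listing above).
cProfile.run("import db_api_lib.del_test", "del_test.prof")
stats = pstats.Stats("del_test.prof")
stats.sort_stats("tottime").print_stats(50)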




Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken MacKenzie
Sorted:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        16 1053.794   65.862 1053.794   65.862 {method 'execute' of 'pyodbc.Cursor' objects}
348916/348311    0.539    0.000    0.556    0.000 {built-in method isinstance}
160920/160918    0.313    0.000    0.313    0.000 {built-in method getattr}
     19123    0.282    0.000    0.471    0.000 elements.py:2927(__init__)
  66982/14    0.234    0.000    2.561    0.183 visitors.py:75(_compiler_dispatch)
     19123    0.231    0.000    0.702    0.000 elements.py:861(__init__)
76492/19123    0.228    0.000    2.740    0.000 operators.py:297(__eq__)
      9572    0.213    0.000    1.092    0.000 elements.py:1860(_construct)
     19123    0.201    0.000    1.878    0.000 default_comparator.py:22(_boolean_compare)
     19132    0.194    0.000    0.424    0.000 compiler.py:676(visit_column)
      9567    0.190    0.000    0.190    0.000 {method 'fetchone' of 'pyodbc.Cursor' objects}
     19123    0.183    0.000    1.761    0.000 compiler.py:1040(_generate_generic_binary)
     19123    0.180    0.000    1.155    0.000 default_comparator.py:290(_check_literal)
     28684    0.171    0.000    0.347    0.000 operators.py:1007(is_precedent)
57379/19133    0.157    0.000    2.634    0.000 {built-in method eq}
     19123    0.133    0.000    0.786    0.000 compiler.py:1164(visit_bindparam)
     19143    0.123    0.000    0.566    0.000 base.py:1128(decorate)
    116835    0.120    0.000    0.120    0.000 {method 'get' of 'dict' objects}
         1    0.117    0.117 1061.906 1061.906 del_test.py:1()
     19123    0.116    0.000    2.005    0.000 compiler.py:1006(visit_binary)
     19123    0.114    0.000    0.816    0.000 elements.py:3818(_bind_param)
     19123    0.113    0.000    2.154    0.000 base.py:1306(visit_binary)
     19125    0.111    0.000    0.173    0.000 compiler.py:1265(_process_anon)
38254/9571    0.111    0.000    2.528    0.000 compiler.py:806()
     38273    0.102    0.000    0.167    0.000 compiler.py:2901(quote)
     19123    0.099    0.000    1.976    0.000 type_api.py:60(operate)
     19123    0.098    0.000    0.565    0.000 compiler.py:1233(_truncate_bindparam)
     19125    0.097    0.000    0.438    0.000 compiler.py:1246(_truncated_identifier)
     19126    0.097    0.000    0.295    0.000 elements.py:3962(__new__)
     38246    0.093    0.000    0.093    0.000 elements.py:640(self_group)
     19123    0.092    0.000    2.376    0.000 annotation.py:100(__eq__)
     19125    0.092    0.000    0.321    0.000 elements.py:4073(apply_map)
     19206    0.089    0.000    0.155    0.000 elements.py:3918(__new__)
     88398    0.087    0.000    0.087    0.000 {method 'append' of 'list' objects}
38254/9571    0.084    0.000    2.547    0.000 compiler.py:804()
     19123    0.082    0.000    2.526    0.000 properties.py:269(operate)
     19123    0.075    0.000    0.128    0.000 compiler.py:959(_get_operator_dispatch)
12693/3132    0.073    0.000    2.566    0.001 {method 'join' of 'str' objects}
     19123    0.069    0.000    0.704    0.000 annotation.py:78(_compiler_dispatch)
     19123    0.068    0.000    0.087    0.000 compiler.py:1271(bindparam_string)
         1    0.068    0.068 1060.665 1060.665 db_base.py:138(dict_delete)
      9571    0.065    0.000    0.520    0.000 elements.py:1883()
     19123    0.065    0.000    2.041    0.000 :1()
      9561    0.062    0.000    2.802    0.000 db_base.py:174()
23610/23322    0.062    0.000    0.068    0.000 {built-in method hasattr}
     28695    0.061    0.000    0.156    0.000 elements.py:4216(_expression_literal_as_text)
     19123    0.059    0.000    0.129    0.000 type_api.py:504(coerce_compared_value)
     28716    0.057    0.000    0.095    0.000 elements.py:4220(_literal_as_text)
     19125    0.057    0.000    0.229    0.000 _collections.py:728(__missing__)
     19123    0.053    0.000    0.294    0.000 elements.py:2986(self_group)
      9562    0.053    0.000    0.325    0.000 result.py:1156(fetchone)
     19135    0.053    0.000    0.134    0.000 compiler.py:494()
     19123    0.050    0.000    0.070    0.000 sqltypes.py:2608(_resolve_value_to_type)
     19123    0.050    0.000    0.082    0.000 type_api.py:452(_cached_bind_processor)





Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken MacKenzie
After the current sorted profile finishes I will revert to the textual 
version and run a profile on that.  I expect another 10-15 minutes for this 
to finish right now.

At present the batch size is set to 1000, total record count is just over 
9000 in these tests.

The reason for 1000 was at first I was looking at doing this as a 
tuple_(fld, fld).in_((val, val),(val,val)) format.  The 1000 should keep me 
under most DB restrictions on the in statement.

However since SQL Server does not seem to support the tuple_ usage I 
reverted to this method.

I technically have one more method and that is a concat_ in_ where I concat 
the fields.
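
(For illustration, a sketch of what the tuple_ and concat variants described above might look like against the two-column varchar key; the helper name is made up, and the concat form is only safe when the concatenated key parts cannot collide, e.g. fixed-width values.)

import sqlalchemy as sa

def composite_in_conditions(cls, id_batch):
    """Build the two IN-style conditions for a two-column composite PK."""
    table = cls.__table__
    pk1, pk2 = (table.c[name] for name in cls.SQL_PK)

    # Row-value IN -- the tuple_ form that SQL Server reportedly did not accept.
    cond_tuple = sa.tuple_(pk1, pk2).in_(id_batch)

    # Concatenation-based IN over the same batch of key tuples.
    cond_concat = (pk1 + pk2).in_([a + b for a, b in id_batch])

    return cond_tuple, cond_concat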

Other specifics, the table in question has 2 fields for the PK, both are 
varchar, one length 3, the other length 10.  There are 5 non key fields, 3 
short varchars, one decimal at 14,2 precision and one varchar(800) which 
contains description text.

Total record count of the table before any deletion is about 1.05 million.

Python version is 3.4.5, running on a modest CentOS desktop and to be fair 
the SQL Server instance is sub optimal for development.

On Wednesday, August 30, 2017 at 11:18:13 AM UTC-4, Simon King wrote:
>
> It would be interesting to see the profile of the textual SQL version. 
> It looks like most of the time is being spent inside pyodbc, rather 
> than SQLAlchemy, so I guess it must be something to do with the 
> processing of bind parameters. How many parameters are being sent in 
> per query? ie. what is len(id_batch) * len(cls.SQL_PK)? 
>
> You could try playing with your batch sizes to see what sort of effect 
> that has. 
>
> Simon 
>



Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken MacKenzie
After I finish the current profile run I will revert and profile that one 
for comparison.  I am expecting another 10-15 minutes for this to finish.

On Wednesday, August 30, 2017 at 11:18:13 AM UTC-4, Simon King wrote:
>
> It would be interesting to see the profile of the textual SQL version. 
> It looks like most of the time is being spent inside pyodbc, rather 
> than SQLAlchemy, so I guess it must be something to do with the 
> processing of bind parameters. How many parameters are being sent in 
> per query? ie. what is len(id_batch) * len(cls.SQL_PK)? 
>
> You could try playing with your batch sizes to see what sort of effect 
> that has. 
>
> Simon 
>
> On Wed, Aug 30, 2017 at 4:01 PM, Ken MacKenzie  > wrote: 
> > So I implemented this version with one change.  I moved this line: 
> > 
> > cols = [getattr(cls, colname) for colname in cls.SQL_PK] 
> > 
> > To the beginning before the loop, actually before the outer loop that parses 
> > the batch. 
> > 
> > However, the new version is MUCH slower than the original, so I profiled it 
> > and this is what I get: 
> > 
> >  %prun import db_api_lib.del_test 
> >  1421296 function calls (1277536 primitive calls) in 620.434 seconds 
> > 
> >Ordered by: internal time 
> > 
> > [profile listing snipped; it appears in full in the original message below] 

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Simon King
It would be interesting to see the profile of the textual SQL version.
It looks like most of the time is being spent inside pyodbc, rather
than SQLAlchemy, so I guess it must be something to do with the
processing of bind parameters. How many parameters are being sent in
per query? ie. what is len(id_batch) * len(cls.SQL_PK)?

You could try playing with your batch sizes to see what sort of effect that has.

Simon
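
(A trivial helper for experimenting with batch sizes, as suggested above; the 500 in the example is just an arbitrary figure. Each batch contributes len(batch) * len(cls.SQL_PK) bind parameters to the generated statement.)

def chunked(ids, batch_size):
    """Yield successive slices of the id list, batch_size keys at a time."""
    for start in range(0, len(ids), batch_size):
        yield ids[start:start + batch_size]

# Example usage (all_ids being the full list of composite key tuples):
# for id_batch in chunked(all_ids, 500):
#     ...build and execute the delete for this batch...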

On Wed, Aug 30, 2017 at 4:01 PM, Ken MacKenzie  wrote:
> So I implemented this version with one change.  I moved this line:
>
> cols = [getattr(cls, colname) for colname in cls.SQL_PK]
>
> To the beginning before the loop, actually before the outer loop that parses
> the batch.
>
> However, the new version is MUCH slower than the original, so I profiled it
> and this is what I get:
>
>  %prun import db_api_lib.del_test
>  1421296 function calls (1277536 primitive calls) in 620.434 seconds
>
>Ordered by: internal time
>
> [profile listing snipped; it appears in full in the original message below]

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Ken MacKenzie
So I implemented this version with one change.  I moved this line:

cols = [getattr(cls, colname) for colname in cls.SQL_PK] 

To the beginning before the loop, actually before the outer loop that 
parses the batch.

However, the new version is MUCH slower than the original, so I profiled it 
and this is what I get:

 %prun import db_api_lib.del_test
 1421296 function calls (1277536 primitive calls) in 620.434 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        12  615.512   51.293  615.512   51.293 {method 'execute' of 'pyodbc.Cursor' objects}
208876/208271    0.316    0.000    0.332    0.000 {built-in method isinstance}
96876/96874    0.184    0.000    0.185    0.000 {built-in method getattr}
     11123    0.137    0.000    0.242    0.000 elements.py:2927(__init__)
  38970/10    0.136    0.000    1.502    0.150 visitors.py:75(_compiler_dispatch)
     11123    0.128    0.000    0.397    0.000 elements.py:861(__init__)
44492/11123    0.127    0.000    1.504    0.000 operators.py:297(__eq__)
      5568    0.116    0.000    0.617    0.000 elements.py:1860(_construct)
     11132    0.113    0.000    0.246    0.000 compiler.py:676(visit_column)
     11123    0.111    0.000    1.039    0.000 compiler.py:1040(_generate_generic_binary)
      5567    0.108    0.000    0.108    0.000 {method 'fetchone' of 'pyodbc.Cursor' objects}
     11123    0.099    0.000    0.646    0.000 default_comparator.py:290(_check_literal)
     16684    0.095    0.000    0.195    0.000 operators.py:1007(is_precedent)
     11123    0.094    0.000    1.010    0.000 default_comparator.py:22(_boolean_compare)
33375/11129    0.090    0.000    1.445    0.000 {built-in method eq}
     11123    0.075    0.000    0.465    0.000 compiler.py:1164(visit_bindparam)
         1    0.075    0.075  619.467  619.467 db_base.py:138(dict_delete)
     11139    0.073    0.000    0.331    0.000 base.py:1128(decorate)
     68831    0.070    0.000    0.070    0.000 {method 'get' of 'dict' objects}
         1    0.068    0.068  620.471  620.471 del_test.py:1()
22250/5567    0.067    0.000    1.482    0.000 compiler.py:806()
     11125    0.066    0.000    0.102    0.000 compiler.py:1265(_process_anon)
     11123    0.065    0.000    1.179    0.000 compiler.py:1006(visit_binary)
     11123    0.063    0.000    1.263    0.000 base.py:1306(visit_binary)
     11123    0.061    0.000    0.458    0.000 elements.py:3818(_bind_param)
     22269    0.058    0.000    0.097    0.000 compiler.py:2901(quote)
     11125    0.058    0.000    0.261    0.000 compiler.py:1246(_truncated_identifier)
     11123    0.058    0.000    1.299    0.000 annotation.py:100(__eq__)
     11123    0.056    0.000    1.066    0.000 type_api.py:60(operate)
     56386    0.055    0.000    0.055    0.000 {method 'append' of 'list' objects}
     11126    0.055    0.000    0.168    0.000 elements.py:3962(__new__)
     11123    0.055    0.000    0.334    0.000 compiler.py:1233(_truncate_bindparam)
     11125    0.054    0.000    0.192    0.000 elements.py:4073(apply_map)
     11206    0.051    0.000    0.089    0.000 elements.py:3918(__new__)
     22246    0.051    0.000    0.051    0.000 elements.py:640(self_group)
22250/5567    0.049    0.000    1.493    0.000 compiler.py:804()
         2    0.048    0.024    0.049    0.024 {built-in method connect}
     11123    0.047    0.000    1.386    0.000 properties.py:269(operate)
 8689/3128    0.045    0.000    1.508    0.000 {method 'join' of 'str' objects}
     11123    0.045    0.000    0.076    0.000 compiler.py:959(_get_operator_dispatch)
     11123    0.045    0.000    0.056    0.000 compiler.py:1271(bindparam_string)
     11123    0.040    0.000    0.411    0.000 annotation.py:78(_compiler_dispatch)
15610/15322    0.038    0.000    0.043    0.000 {built-in method hasattr}
      5567    0.038    0.000    0.294    0.000 elements.py:1883()
     11125    0.036    0.000    0.138    0.000 _collections.py:728(__missing__)
     16691    0.035    0.000    0.090    0.000 elements.py:4216(_expression_literal_as_text)
      5561    0.035    0.000    1.539    0.000 db_base.py:174()
     16704    0.033    0.000    0.055    0.000 elements.py:4220(_literal_as_text)
     11123    0.033    0.000    0.073    0.000 type_api.py:504(coerce_compared_value)
31584/31469    0.031    0.000    0.032    0.000 {built-in method len}
     11123    0.030    0.000    0.165    0.000 elements.py:2986(self_group)
      5562    0.030    0.000    0.187    0.000 result.py:1156(fetchone)
     11131    0.030    0.000    0.077    0.000 compiler.py:494()
     11123    0.029    0.000    0.047    0.000 type_api.py:452(_cached_bind_processor)

Any guidance on how I can improve the performance of this solution while 
remaining DB neutral?

On Wednesday, August 30, 2017 at 4:43:01 AM UTC-4, Simon King wrote:
>
> On Tue, Aug 29, 2017 at 9:49 PM, Ken MacKenzie  > wrote: 
> > I have a query I have constructed and I had to deal with a 

Re: [sqlalchemy] Is Textual SQL DB neutral

2017-08-30 Thread Simon King
On Tue, Aug 29, 2017 at 9:49 PM, Ken MacKenzie  wrote:
> I have a query I have constructed and I had to deal with a composite primary
> key to select items;
>
> q = s.query(cls)
> or_cond = []
> for x in id_batch:
>
>     pkf = [text(f + "=='" + v + "'") for f, v in zip(cls.SQL_PK, x)]
>     and_cond = (and_(*pkf))
>     or_cond.append(and_cond)
>
>
> q = q.filter(or_(*or_cond)).delete(synchronize_session=False)
>
> cls.SQL_PK is a tuple of the primary key fields for the model described by
> class.  This is a class method that is part of a inherited class to the
> model
>
> The current target is SQL Server.  My concern is using text('field =
> 'value'), is that going to work for other DB targets like say postgres?
>
> The first round of doing this I tried using a
> tuple_(*cls.SQL_PK).in_(id_batch), but that did not work and the resulting
> SQL id not work in SSMS leading me to believe that SQL Server (or at least
> the version we are using) does not support tuples.
>

Textual SQL is not DB-neutral in general. Luckily, in this case you
shouldn't even need it. Try something like this:

for x in id_batch:
    cols = [getattr(cls, colname) for colname in cls.SQL_PK]
    pkf = [(col == v) for (col, v) in zip(cols, x)]
    and_cond = and_(*pkf)
    or_cond.append(and_cond)

ie. use "getattr" to retrieve the actual column property from the class.

Hope that helps,

Simon
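
(Pulling this together with the filter/delete from the original post, a self-contained sketch of the whole batch delete might look like the following; the function name is mine, not from the thread.)

from sqlalchemy import and_, or_

def delete_by_composite_pk(session, cls, id_batch):
    """Delete rows of cls whose composite PK matches any tuple in id_batch."""
    cols = [getattr(cls, colname) for colname in cls.SQL_PK]
    or_cond = []
    for x in id_batch:
        pkf = [(col == v) for (col, v) in zip(cols, x)]
        or_cond.append(and_(*pkf))
    return (
        session.query(cls)
        .filter(or_(*or_cond))
        .delete(synchronize_session=False)
    )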
