Thank you Mike.

I actually tried with the limit raised up, and the performances became actually 
slightly worse even.

So the problem resides in the prepared statements logic of `mysqlclient`, that 
does some additional escaping on parameters. 

I will try to see if there’s a workaround on that… Thank you very much for your 
support!
On 2 novembre 2018 a 18:41:32, Mike Bayer ([email protected]) scritto:

On Fri, Nov 2, 2018 at 1:08 PM Ruben Di Battista  
<[email protected]> wrote:  
>  
> Thanks Mike as always,  
>  
> I'm diving a bit more in the problem. The solution they decided to apply is 
> to modify the code from above, like this:  
>  
> header = 'INSERT INTO `passageData` (`time`, `azimuth`, ' \  
> '`elevation`, `doppler`, `slant`, `passageID`) VALUES '  
>  
> with profiled():  
> for ix in range(n):  
> time_i = times[ix].strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]  
> az_i = round(azimuth[ix], 2)  
> el_i = round(elevation[ix], 2)  
> dopp_i = round(doppler[ix], 10)  
> slant_i = round(slant[ix], 2)  
>  
> header += '(\'{0}\', {1}, {2}, {3}, {4}, {5})'.\  
> format(time_i, az_i, el_i, dopp_i, slant_i, self.id)  
>  
> if ix != n-1:  
> header += ','  
>  
> header += ';'  
> session.execute(header)  
>  
> And I'm profiling the previous one with:  
>  
> with profiled():  
> session.execute(  
> insert_query,  
> [  
> {  
> 'time': times[i],  
> 'elevation': elevation[i],  
> 'azimuth': azimuth[i],  
> 'doppler': doppler[i],  
> 'slant': slant[i],  
> 'passageID': passage_id  
> }  
> for i in six.moves.range(0, n)  
> ]  
> )  
>  
> And these are the lines I get from the profiling in the first case (with ugly 
> string interpolation):  
> 420358 function calls in 1.591 seconds  
>  
> Ordered by: cumulative time  
>  
> ncalls tottime percall cumtime percall filename:lineno(function)  
> 1 0.000 0.000 1.114 1.114 
> /lib/python2.7/site-packages/sqlalchemy/orm/session.py:1047(execute)  
> 1 0.000 0.000 0.971 0.971 
> /lib/python2.7/site-packages/sqlalchemy/engine/base.py:882(execute)  
> 1 0.000 0.000 0.971 0.971 
> /lib/python2.7/site-packages/sqlalchemy/sql/elements.py:267(_execute_on_connection)
>   
> 1 0.000 0.000 0.971 0.971 
> /lib/python2.7/site-packages/sqlalchemy/engine/base.py:1016(_execute_clauseelement)
>   
> 1 0.000 0.000 0.834 0.834 
> /lib/python2.7/site-packages/sqlalchemy/engine/base.py:1111(_execute_context) 
>  
> 1 0.000 0.000 0.789 0.789 
> /lib/python2.7/site-packages/sqlalchemy/engine/default.py:508(do_execute)  
> 1 0.005 0.005 0.789 0.789 
> /lib/python2.7/site-packages/MySQLdb/cursors.py:204(execute)  
> 1 0.000 0.000 0.780 0.780 
> /lib/python2.7/site-packages/MySQLdb/cursors.py:411(_query)  
> 1 0.000 0.000 0.780 0.780 
> /lib/python2.7/site-packages/MySQLdb/cursors.py:372(_do_query)  
> 1 0.780 0.780 0.780 0.780 
> /lib/python2.7/site-packages/MySQLdb/connections.py:267(query)  
> 3 0.267 0.089 0.267 0.089 {method 'sub' of '_sre.SRE_Pattern' objects}  
> 60000 0.205 0.000 0.205 0.000 {method 'strftime' of 'datetime.date' objects}  
> 1 0.011 0.011 0.143 0.143 
> /lib/python2.7/site-packages/sqlalchemy/sql/elements.py:4287(_literal_as_text)
>   
> 1 0.000 0.000 0.137 0.137 <string>:1(<lambda>)  
> 1 0.000 0.000 0.137 0.137 
> /lib/python2.7/site-packages/sqlalchemy/sql/elements.py:379(compile)  
> 1 0.000 0.000 0.137 0.137 
> /lib/python2.7/site-packages/sqlalchemy/sql/elements.py:444(_compiler)  
> 1 0.000 0.000 0.137 0.137 
> /lib/python2.7/site-packages/sqlalchemy/sql/compiler.py:393(__init__)  
> 1 0.000 0.000 0.137 0.137 
> /lib/python2.7/site-packages/sqlalchemy/sql/compiler.py:180(__init__)  
> 1 0.000 0.000 0.137 0.137 
> /lib/python2.7/site-packages/sqlalchemy/sql/compiler.py:244(process)  
> 1 0.000 0.000 0.137 0.137 
> /lib/python2.7/site-packages/sqlalchemy/sql/visitors.py:86(_compiler_dispatch)
>   
> 1 0.000 0.000 0.137 0.137 
> /lib/python2.7/site-packages/sqlalchemy/sql/compiler.py:759(visit_textclause) 
>  
> 1 0.000 0.000 0.132 0.132 
> /lib/python2.7/site-packages/sqlalchemy/sql/elements.py:1259(__init__)  
> 240000 0.120 0.000 0.120 0.000 {round}  
> 60000 0.104 0.000 0.104 0.000 {method 'format' of 'str' objects}  
> 60000 0.047 0.000 0.047 0.000 
> /lib/python2.7/site-packages/sqlalchemy/orm/attributes.py:234(__get__)  
>  
> While in the second case I get:  
>  
> 5103248 function calls (5103183 primitive calls) in 3.829 seconds  
>  
> Ordered by: cumulative time  
>  
> ncalls tottime percall cumtime percall filename:lineno(function)  
> 1 0.000 0.000 3.829 3.829 
> lib/python2.7/site-packages/sqlalchemy/orm/session.py:1047(execute)  
> 1 0.000 0.000 3.829 3.829 
> lib/python2.7/site-packages/sqlalchemy/engine/base.py:882(execute)  
> 1 0.000 0.000 3.829 3.829 
> lib/python2.7/site-packages/sqlalchemy/sql/elements.py:267(_execute_on_connection)
>   
> 1 0.000 0.000 3.829 3.829 
> lib/python2.7/site-packages/sqlalchemy/engine/base.py:1016(_execute_clauseelement)
>   
> 1 0.001 0.001 3.829 3.829 
> lib/python2.7/site-packages/sqlalchemy/engine/base.py:1111(_execute_context)  
> 1 0.000 0.000 2.829 2.829 
> lib/python2.7/site-packages/sqlalchemy/dialects/mysql/mysqldb.py:104(do_executemany)
>   
> 1 0.000 0.000 2.829 2.829 
> lib/python2.7/site-packages/MySQLdb/cursors.py:256(executemany)  
> 1 0.130 0.130 2.829 2.829 
> lib/python2.7/site-packages/MySQLdb/cursors.py:286(_do_execute_many)  
> 60000 0.198 0.000 1.970 0.000 
> lib/python2.7/site-packages/MySQLdb/cursors.py:105(_escape_args)  
> 420000 0.169 0.000 1.245 0.000 
> lib/python2.7/site-packages/MySQLdb/cursors.py:111(<genexpr>)  
> 360000 0.378 0.000 1.076 0.000 
> lib/python2.7/site-packages/MySQLdb/connections.py:302(literal)  
> 1 0.368 0.368 0.998 0.998 
> lib/python2.7/site-packages/sqlalchemy/engine/default.py:595(_init_compiled)  
> 61 0.001 0.000 0.703 0.012 
> lib/python2.7/site-packages/MySQLdb/cursors.py:204(execute)  
> 61 0.000 0.000 0.702 0.012 
> lib/python2.7/site-packages/MySQLdb/cursors.py:411(_query)  
> 61 0.001 0.000 0.701 0.011 
> lib/python2.7/site-packages/MySQLdb/cursors.py:372(_do_query)  
> 61 0.699 0.011 0.699 0.011 
> lib/python2.7/site-packages/MySQLdb/connections.py:267(query)  
> 360000 0.160 0.000 0.543 0.000 {method 'escape' of '_mysql.connection' 
> objects}  
> 60000 0.181 0.000 0.521 0.000 {map}  
> 360000 0.191 0.000 0.339 0.000 
> lib/python2.7/site-packages/MySQLdb/cursors.py:98(_ensure_bytes)  
> 1560325 0.323 0.000 0.323 0.000 {isinstance}  
> 60000 0.313 0.000 0.313 0.000 
> lib/python2.7/site-packages/sqlalchemy/sql/compiler.py:526(construct_params)  
> 240000 0.076 0.000 0.284 0.000 
> lib/python2.7/site-packages/sqlalchemy/sql/type_api.py:1181(process)  
> 240000 0.085 0.000 0.208 0.000 
> /Users/rubendibattista/git/Leaf/pyggdrasill/pyggdrasill/sql/types/small_float.py:10(process_bind_param)
>   
> 60000 0.034 0.000 0.206 0.000 
> lib/python2.7/site-packages/MySQLdb/times.py:125(DateTime2literal)  
> 240000 0.155 0.000 0.155 0.000 
> lib/python2.7/site-packages/MySQLdb/converters.py:68(Float2Str)  
> 60000 0.028 0.000 0.154 0.000 
> lib/python2.7/site-packages/MySQLdb/times.py:39(format_TIMESTAMP)  
> 60000 0.126 0.000 0.126 0.000 {method 'format' of 'str' objects}  
> 240000 0.123 0.000 0.123 0.000 {round}  
> 420015 0.033 0.000 0.033 0.000 {method 'append' of 'list' objects}  
> 60000 0.022 0.000 0.022 0.000 
> lib/python2.7/site-packages/MySQLdb/converters.py:58(Thing2Str)  
> 60000 0.018 0.000 0.018 0.000 {_mysql.string_literal}  
> 180206 0.014 0.000 0.014 0.000 {len}  
> 61 0.001 0.000 0.002 0.000 
> lib/python2.7/site-packages/MySQLdb/cursors.py:182(_do_get_result)  
>  
>  
> So I get a way bigger number of function calls. To me this seems related to 
> the args escaping from MySQLdb cursors.py. Is there any better way to 
> optimize that INSERT query?  

So, the callcounts are inflated by lines like these:  

1 0.130 0.130 2.829 2.829  
lib/python2.7/site-packages/MySQLdb/cursors.py:286(_do_execute_many)  
61 0.001 0.000 0.703 0.012  
lib/python2.7/site-packages/MySQLdb/cursors.py:204(execute)  
61 0.000 0.000 0.702 0.012  
lib/python2.7/site-packages/MySQLdb/cursors.py:411(_query)  
61 0.001 0.000 0.701 0.011  
lib/python2.7/site-packages/MySQLdb/cursors.py:372(_do_query)  
61 0.699 0.011 0.699 0.011  
lib/python2.7/site-packages/MySQLdb/connections.py:267(query)  
60000 0.198 0.000 1.970 0.000  
lib/python2.7/site-packages/MySQLdb/cursors.py:105(_escape_args)  
420000 0.169 0.000 1.245 0.000  
lib/python2.7/site-packages/MySQLdb/cursors.py:111(<genexpr>)  
360000 0.378 0.000 1.076 0.000  
lib/python2.7/site-packages/MySQLdb/connections.py:302(literal)  
360000 0.191 0.000 0.339 0.000  
lib/python2.7/site-packages/MySQLdb/cursors.py:98(_ensure_bytes)  

The first thing we note is that "site-packages/MySQLdb" is the  
mysqlclient package, which as the DBAPI is independent from SQLAlchemy  
and is documented at https://mysqlclient.readthedocs.io/. So when  
your colleague was incorrect, I didn't specify that he was not  
actually observing this. The source of your 65535 number is right  
here in that package:  

https://github.com/PyMySQL/mysqlclient-python/blob/master/MySQLdb/cursors.py#L52
  


This "max_stmt_length" is undocumented outside of the source code, but  
it's an attribute on the cursor, so certainly if you wanted to make  
that huge, you can set it just like this:  

from sqlalchemy import event  

@event.listens_for(my_engine, 'before_cursor_execute')  
def receive_before_cursor_execute(conn, cursor, statement, parameters,  
context, executemany):  
cursor.max_stmt_length = 300000000 # go nuts  

We can see that is the literal answer to your original question.  
But if we look at your profile results, I can tell you right now that  
changing that number is not going to do much for you as far as  
callcounts. As far as actual time spent, it may or may not help,  
because allocating memory in Python is itself often slow, and under  
concurrency the memory use can begin to get problematic more  
generally. Above, your profile shows that it invoked a statement  
61 times rather than one time. But the vast majority of callcounts  
are spent in processing and escaping of individual parameters in the  
statement, e.g. calls like "literal" and "_ensure_bytes", those  
numbers aren't going to change whether you have one 10M statement or  
60 smaller ones. SQLAlchemy is always going to use bound parameters  
for Core constructs like insert(), and the time spent here seems to be  
mostly a product of the expense of bound parameters within the  
mysqlclient driver. So while I encourage you to try the above recipe  
and such, to overcome that overhead you may still be stuck formatting  
the entire string yourself, but I would encourage you to familiarize  
with how MySQLdb is doing this over at  
https://github.com/PyMySQL/mysqlclient-python/ , since that's where  
you're hitting the limitations you are having.  










>  
> On Friday, November 2, 2018 at 5:23:51 PM UTC+1, Mike Bayer wrote:  
>>  
>> On Fri, Nov 2, 2018 at 11:17 AM Ruben Di Battista  
>> <[email protected]> wrote:  
>> >  
>> > Hello,  
>> >  
>> > I have a huge insert of the type:  
>> >  
>> > ```  
>> > session.execute(  
>> > insert_query,  
>> > [  
>> > {  
>> > 'time': times[i],  
>> > 'elevation': elevation[i],  
>> > 'azimuth': azimuth[i],  
>> > 'doppler': doppler[i],  
>> > 'slant': slant[i],  
>> > 'passageID': passage_id  
>> > }  
>> > for i in six.moves.range(0, n)  
>> > ]  
>> > )  
>> >  
>> > ```  
>> >  
>> > where n is huge.  
>> >  
>> > I was told by a colleague that SQLALchemy limits query length to 65536 
>> > chars while MySQL can cappet 33MB big queries. Can I tailor this in 
>> > SQLAlchemy?  
>>  
>> your colleague is mistaken. SQLAlchemy has no limit on the sizes of  
>> query strings or any other string.  
>>  
>>  
>> >  
>> > --  
>> > SQLAlchemy -  
>> > The Python SQL Toolkit and Object Relational Mapper  
>> >  
>> > http://www.sqlalchemy.org/  
>> >  
>> > To post example code, please provide an MCVE: Minimal, Complete, and 
>> > Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
>> > description.  
>> > ---  
>> > You received this message because you are subscribed to the Google Groups 
>> > "sqlalchemy" group.  
>> > To unsubscribe from this group and stop receiving emails from it, send an 
>> > email to [email protected].  
>> > To post to this group, send email to [email protected].  
>> > Visit this group at https://groups.google.com/group/sqlalchemy.  
>> > For more options, visit https://groups.google.com/d/optout.  
>  
> --  
> SQLAlchemy -  
> The Python SQL Toolkit and Object Relational Mapper  
>  
> http://www.sqlalchemy.org/  
>  
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.  
> ---  
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.  
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].  
> To post to this group, send email to [email protected].  
> Visit this group at https://groups.google.com/group/sqlalchemy.  
> For more options, visit https://groups.google.com/d/optout.  

--  
SQLAlchemy -  
The Python SQL Toolkit and Object Relational Mapper  

http://www.sqlalchemy.org/  

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example. See http://stackoverflow.com/help/mcve for a full description.  
---  
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.  
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].  
To post to this group, send email to [email protected].  
Visit this group at https://groups.google.com/group/sqlalchemy.  
For more options, visit https://groups.google.com/d/optout.  

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Attachment: signature.asc
Description: Message signed with OpenPGP using AMPGpg

Reply via email to