OK well of course also, as we have the exact same thing being asked in regards
to Oracle right now in another thread, you can of course always bypass a
"bound" value in the most direct way, using text() or literal_column():
q = s.query(Something).filter(Something.foo = literal_column("'my value'"))
On May 12, 2014, at 8:38 PM, Michael Bayer <[email protected]> wrote:
> well or a FreeTDS issue, more likely, if that's what you're using.
>
> the SQL compiler has a parameter called "literal_binds" that will make it
> render a bound parameter as an inline string, but it only supports a few very
> basic types. As far as getting this parameter set for a general class of
> queries, it depends on when you'd want it to happen and how. It likely
> would require some subclassing and possibly monkey patching.
>
>
> On May 12, 2014, at 8:23 PM, Seth P <[email protected]> wrote:
>
>> pymssql produces the same results as pyodbc. So it looks like a SQL Server
>> issue.
>>
>> On Monday, May 12, 2014 8:06:08 PM UTC-4, Seth P wrote:
>> Fair enough. I'll take a look at pymssql, though I suspect it may be a SQL
>> Server rather than a driver issue.
>>
>>
>> On Monday, May 12, 2014 7:50:03 PM UTC-4, Michael Bayer wrote:
>>
>> On May 12, 2014, at 7:35 PM, Seth P <[email protected]> wrote:
>>
>>> Looks like other people have encountered similar problems with indices
>>> being ignored by prepared sql statements:
>>> http://www.postgresql.org/message-id/[email protected].
>>> (If the diagnosis there is correct, then I'm guessing the server would use
>>> a unique index where all the columns of the index are specified.) Also,
>>> Thierry Florac's post
>>> https://groups.google.com/forum/#!topic/sqlalchemy/k_9ZGI-e85E sounds
>>> similar.
>>> (I suspect my earlier hypothesis about int vs varchar is a red herring.)
>>>
>>> I think it would be useful (albeit risky, if not careful) to have an option
>>> to plug in parameters client-side. I presume not trivial to add to
>>> SQLAlchemy? I don't see such an option for pyodbc.
>>
>> there's mechanisms for this but they aren't very widely advertised since as
>> you know allowing people to do such would be an *enormous* security hole,
>> and I don't have the resources to be responsible for parameter escaping.
>> It would be better if you could try pymssql (much more actively maintained
>> than pyodbc from what i can tell) and/or file a bug with pyodbc.
>>
>>
>>>
>>> On Monday, May 12, 2014 7:09:08 PM UTC-4, Seth P wrote:
>>> Yep, it's not a SQLAlchemy issue. The following code demonstrates the
>>> problem with direct pyodbc access.
>>>
>>> import pyodbc
>>> import time
>>>
>>> def print_timing(func):
>>> def wrapper(*arg):
>>> t1 = time.time()
>>> rows = func(*arg)
>>> t2 = time.time()
>>> print("%30s() len=%d, last=%s, runtime=%0.3fs" % (str(func).split('
>>> at')[0][10:], len(rows), rows[-1], t2 - t1))
>>> return t2 - t1
>>> return wrapper
>>>
>>> if __name__ == '__main__':
>>> cnxn = pyodbc.connect('DRIVER={SQL
>>> Server};SERVER=Compustat;DATABASE=Compustat')
>>> cursor = cnxn.cursor()
>>> sql_select_statement_base = "SELECT datadate, prcod FROM sec_dprc WHERE
>>> gvkey = ? ORDER BY datadate"
>>> key = '001045'
>>>
>>> @print_timing
>>> def execute_explicit_query():
>>> sql_select_statement_explicit =
>>> sql_select_statement_base.replace("?", "'%s'" % key)
>>> rows = cursor.execute(sql_select_statement_explicit).fetchall()
>>> return rows
>>>
>>> @print_timing
>>> def execute_parameterized_query():
>>> rows = cursor.execute(sql_select_statement_base, key).fetchall()
>>> return rows
>>>
>>> num_iterations = 5
>>> explicit_runtime = 0.0
>>> parameterized_runtime = 0.0
>>> for i in range(num_iterations):
>>> explicit_runtime += execute_explicit_query()
>>> parameterized_runtime += execute_parameterized_query()
>>> print("Total runtime for %d explicit queries = %0.3fs." %
>>> (num_iterations, explicit_runtime))
>>> print("Total runtime for %d parameterized queries = %0.3fs." %
>>> (num_iterations, parameterized_runtime))
>>>
>>>
>>> On Monday, May 12, 2014 6:40:48 PM UTC-4, Michael Bayer wrote:
>>>
>>> On May 12, 2014, at 6:33 PM, Seth P <[email protected]> wrote:
>>>
>>>> Is it possible that the (primary key index (which is a composite index
>>>> that begins with gvkey, and is the only index on the table) isn't being
>>>> used because the the gvkey parameter is somehow passed as an integer
>>>> rather than as a string?
>>>
>>> There's nothing in SQLAlchemy that coerces strings to integers. If the
>>> actual type of the column on the DB is an integer, then there might be some
>>> conversion within pyodbc or the ODBC driver.
>>>
>>> if you've got it narrowed down this much the next step is to figure out a
>>> raw pyodbc script that illustrates what the problem is.
>>>
>>>
>>>> The first EXEC below is pretty much instantaneous, whereas the second
>>>> takes about 8 seconds (and produces the same results).
>>>>
>>>> EXEC sp_executesql
>>>> N'SELECT sec_dprc.datadate AS sec_dprc_datadate, sec_dprc.prcod AS
>>>> sec_dprc_prcod
>>>> FROM sec_dprc WHERE sec_dprc.gvkey = @gvkey ORDER BY sec_dprc.datadate',
>>>> N'@gvkey VARCHAR(6)', '001045'
>>>>
>>>> EXEC sp_executesql
>>>> N'SELECT sec_dprc.datadate AS sec_dprc_datadate, sec_dprc.prcod AS
>>>> sec_dprc_prcod
>>>> FROM sec_dprc WHERE sec_dprc.gvkey = @gvkey ORDER BY sec_dprc.datadate',
>>>> N'@gvkey INT', 001045
>>>>
>>>>
>>>>
>>>> On Monday, May 12, 2014 5:00:27 PM UTC-4, Michael Bayer wrote:
>>>>
>>>> well there's only one parameter being processed here so there is clearly
>>>> negligible difference in time spent within Python as far as getting the
>>>> statement ready to execute and then executing it.
>>>>
>>>> So the time is either in what SQL Server spends fetching the rows, or the
>>>> number of rows being fetched (which seems to be the same). Which leaves
>>>> pretty much that SQL Server is making a different choice about the query
>>>> plan for this SELECT statement, this is typically due to an INDEX being
>>>> used or not. You'd need to analyze the plan being used. With SQL
>>>> Server, the option to get a plan within programmatic execution seems to be
>>>> per this answer
>>>> http://stackoverflow.com/questions/7359702/how-do-i-obtain-a-query-execution-plan
>>>> to execute "SET SHOWPLAN_TEXT ON" ahead of time.
>>>>
>>>> Besides that, you can confirm where the time is being spent exactly using
>>>> Python profiling. A description on how to achieve that is here:
>>>> http://stackoverflow.com/questions/1171166/how-can-i-profile-a-sqlalchemy-powered-application/1175677#1175677
>>>>
>>>>
>>>>
>>>> On May 12, 2014, at 3:48 PM, Seth P <[email protected]> wrote:
>>>>
>>>>> After tracking down some extreme slowness in loading a one-to-many
>>>>> relationship (e.g. myobject.foobars), I seem to have isolated the issue
>>>>> to engine.execute() being much slower with parameterized queries than
>>>>> with explicit queries. The following is actual code and output for
>>>>> loading 10,971 rows from a database table. (The actual table has more
>>>>> columns than I'm including here, and is not designed by me.) Note that
>>>>> each explicit query (where I explicitly set the WHERE clause parameter
>>>>> and pass the resulting SQL statement to engine.execute()) runs in under
>>>>> 0.1 seconds, whereas each parameterized query (where I let SQLAlchemy
>>>>> bind the WHERE clause parameter) takes over 8 seconds.
>>>>>
>>>>> The difference in runtimes is smaller when the number of rows returned is
>>>>> smaller, which seems odd since I would have thought that the binding of
>>>>> the WHERE clause parameters is just done once and would be virtually
>>>>> instantaneous.
>>>>>
>>>>> Any thoughts?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Seth
>>>>>
>>>>>
>>>>> import sqlalchemy as sa
>>>>> from sqlalchemy.orm import sessionmaker
>>>>> from sqlalchemy.ext.declarative import declarative_base
>>>>> import time
>>>>>
>>>>> engine = sa.create_engine('mssql+pyodbc://Compustat/Compustat')
>>>>> session = sessionmaker(bind=engine, autoflush=False,
>>>>> expire_on_commit=False)()
>>>>>
>>>>> class FooBar(declarative_base()):
>>>>> __tablename__ = 'sec_dprc'
>>>>> gvkey = sa.Column(sa.String(6), primary_key=True)
>>>>> datadate = sa.Column(sa.DateTime, primary_key=True)
>>>>> value = sa.Column(sa.Float, name='prcod')
>>>>>
>>>>> def print_timing(func):
>>>>> def wrapper(*arg):
>>>>> t1 = time.time()
>>>>> rows = func(*arg)
>>>>> t2 = time.time()
>>>>> print("%30s() len=%d, last=%s, runtime=%0.3fs" %
>>>>> (str(func).split(' at')[0][10:], len(rows), rows[-1], t2 - t1))
>>>>> return t2 - t1
>>>>> return wrapper
>>>>>
>>>>> if __name__ == '__main__':
>>>>>
>>>>> key = '001045'
>>>>> query = session.query(FooBar.datadate,
>>>>> FooBar.value).filter(sa.and_(FooBar.gvkey ==
>>>>> key)).order_by(FooBar.datadate)
>>>>> sql_select_statement_base = str(query)
>>>>> print(sql_select_statement_base)
>>>>>
>>>>> @print_timing
>>>>> def execute_explicit_query():
>>>>> sql_select_statement_explicit =
>>>>> sql_select_statement_base.replace(":gvkey_1", "'%s'" % key)
>>>>> rows =
>>>>> engine.execute(sa.text(sql_select_statement_explicit)).fetchall()
>>>>> return rows
>>>>>
>>>>> @print_timing
>>>>> def execute_parameterized_query():
>>>>> rows = engine.execute(sa.text(sql_select_statement_base),
>>>>> {'gvkey_1':key}).fetchall()
>>>>> return rows
>>>>>
>>>>> num_iterations = 5
>>>>> explicit_runtime = 0.0
>>>>> parameterized_runtime = 0.0
>>>>> for i in range(num_iterations):
>>>>> explicit_runtime += execute_explicit_query()
>>>>> parameterized_runtime += execute_parameterized_query()
>>>>> print("Total runtime for %d explicit queries = %0.3fs." %
>>>>> (num_iterations, explicit_runtime))
>>>>> print("Total runtime for %d parameterized queries = %0.3fs." %
>>>>> (num_iterations, parameterized_runtime))
>>>>>
>>>>>
>>>>> SELECT sec_dprc.datadate AS sec_dprc_datadate, sec_dprc.prcod AS
>>>>> sec_dprc_prcod
>>>>> FROM sec_dprc
>>>>> WHERE sec_dprc.gvkey = :gvkey_1 ORDER BY sec_dprc.datadate
>>>>> execute_explicit_query() len=10971, last=(datetime.datetime(2014,
>>>>> 5, 9, 0, 0), 37.96), runtime=0.082s
>>>>> execute_parameterized_query() len=10971, last=(datetime.datetime(2014,
>>>>> 5, 9, 0, 0), 37.96), runtime=8.852s
>>>>> execute_explicit_query() len=10971, last=(datetime.datetime(2014,
>>>>> 5, 9, 0, 0), 37.96), runtime=0.032s
>>>>> execute_parameterized_query() len=10971, last=(datetime.datetime(2014,
>>>>> 5, 9, 0, 0), 37.96), runtime=8.754s
>>>>> execute_explicit_query() len=10971, last=(datetime.datetime(2014,
>>>>> 5, 9, 0, 0), 37.96), runtime=0.039s
>>>>> execute_parameterized_query() len=10971, last=(datetime.datetime(2014,
>>>>> 5, 9, 0, 0), 37.96), runtime=9.182s
>>>>> execute_explicit_query() len=10971, last=(datetime.datetime(2014,
>>>>> 5, 9, 0, 0), 37.96), runtime=0.028s
>>>>> execute_parameterized_query() len=10971, last=(datetime.datetime(2014,
>>>>> 5, 9, 0, 0), 37.96), runtime=9.416s
>>>>> execute_explicit_query() len=10971, last=(datetime.datetime(2014,
>>>>> 5, 9, 0, 0), 37.96), runtime=0.080s
>>>>> execute_parameterized_query() len=10971, last=(datetime.datetime(2014,
>>>>> 5, 9, 0, 0), 37.96), runtime=8.425s
>>>>> Total runtime for 5 explicit queries = 0.260s.
>>>>> Total runtime for 5 parameterized queries = 44.629s.
>>>>>
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google Groups
>>>>> "sqlalchemy" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send an
>>>>> email to [email protected].
>>>>> To post to this group, send email to [email protected].
>>>>> Visit this group at http://groups.google.com/group/sqlalchemy.
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google Groups
>>>> "sqlalchemy" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send an
>>>> email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at http://groups.google.com/group/sqlalchemy.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups
>>> "sqlalchemy" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an
>>> email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at http://groups.google.com/group/sqlalchemy.
>>> For more options, visit https://groups.google.com/d/optout.
>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "sqlalchemy" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at http://groups.google.com/group/sqlalchemy.
>> For more options, visit https://groups.google.com/d/optout.
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.