So I have replaced the regex-based rewrite with the PGValuesParam approach
that we talked about. One issue with the regex-based rewrite was that in
its current form it breaks for parameter-lists > 10 elements, because the
inner sorted(..) call does string-based sorting:
'IN \\(%\\(parent_ids_1\\)s, %\\(parent_ids_10\\)s, %\\(parent_ids_11\\)s,
...'
whereas the actual sequence in the statement is:
... IN (%(parent_ids_1)s, %(parent_ids_2)s, %(parent_ids_3)s ...)
I'm aware that we could modify the sorting to account for that, but it
starts to feel somewhat tricky. I like the PGValuesParam-idea a lot,
because it is an explicit opt-in approach, that avoids the tricky regex. So
I went with the following:
class PGValuesParam(BindParameter):
@staticmethod
def prepare(parameters):
return [(element,) for element in parameters]
Usage in the query then:
.filter(Item.parent_id.in_(PGValuesParam('parent_ids', expanding=True)))
and passing of parameters as:
query(session).params(parent_ids=PGValuesParam.prepare(parent_ids))
The transformation of each list element to a tuple is done in the
prepare-staticmethod. That way, the rest of the team doesn't have to
remember the details besides "Use the .prepare(...)-method when passing
parameter-lists to PGValuesParam."
The cherry on top would be to do an automatic transformation of the list to
the list-of-tuples for the PGValuesParam, but when I stepped into the
before_execute-event and inspected the data, I couldn't find the
PGValuesParam instance anywhere. I see the list of values as part of
multiparams, but not PGValuesParam itself (which would be needed as the
'signal' for doing the transformation).
Anyway, while I would be interested in how to achieve this (it would make
the whole approach very clean), I don't want to take up too much of your
time. So, thank you very much for your help already, Mike. And I've been
meaning to mention this: While I can feel myself getting somewhat
frustrated with Python's lack of good static typing and gravitating towards
.NET Core because of this, SQLAlchemy is the one library that
singlehandedly keeps me in the Python world, because it's so amazing.
Am Freitag, 8. Februar 2019 18:03:20 UTC+1 schrieb Mike Bayer:
>
> On Fri, Feb 8, 2019 at 10:05 AM Martin Stein <[email protected]
> <javascript:>> wrote:
> >
> > Am Donnerstag, 7. Februar 2019 23:45:09 UTC+1 schrieb Mike Bayer:
> >>
> >>
> >>
> >> note the mentioned issue is closed. IN supports an inline parameter
> now:
> >>
> >>
> https://docs.sqlalchemy.org/en/rel_1_2/orm/extensions/baked.html#baked-in
> >>
> >
> > We had started using the .op('IN') approach around v1.1 and I hadn't
> realized that SQLAlchemy supports this now. Very good to know, thanks.
> >
> > I have integrated your suggested approach with the
> 'before_cursor_execute' rewrite of the statement, and it works nicely,
> thanks again!
> >
> > Simply out of curiosity and because I'm trying to understand the
> internals a bit better:
> >
> > 1) Theoretically, would another possible approach be to implement a
> sub-class of BindParameter (let's say PGValuesBindParameter) and - in
> combination with the .op('IN') approach - overwrite the bind-parameter
> behavior so it passes the values as one long string like this:
> >
> .filter(Item.parent_id.op('IN')(PGValuesBindParameter('parent_ids')))
> >
> > # Parameters that are passed into the statement:
> > {
> > 'parent_ids': "(VALUES ('abc'), ('def'), ('gjh'))"
> > }
>
> so the event recipe I gave you , you probably *want* to do this, so
> that in the event, you not only look at bindparam.expanding, you also
> do isinstance(param, PGValuesBindParameter).
>
>
> >
> > 2) I've also considered using the following construct with the
> array-function from the postgres-dialect. It achieves the same performance
> improvement as the VALUES approach and doesn't require the tricky rewriting
> of the VALUES-part with the inner parentheses:
> >
> > func.unnest(array(parent_ids))
> >
> > but I couldn't get it to work with bindparams (tried also with
> expanding=True).
> >
> > Which one would you suggest/do you think is the cleanest... the VALUES
> statement-rewrite, or one of those other approaches?
>
> ideally the expanding IN system in SQLAlchemy would have a little bit
> of expansion built into it so you could just get in there to add this
> keyword.
>
> the big issue here is that you need to use the baked system so any
> writing of these parameters / arrays / whatever needs to happen
> *after* SQLAlchemy creates the string. the current IN trick you are
> doing is taking advantage of psycopg2's ability to pass a tuple as a
> parameter to libpq. psycopg2 probably can't achieve this with
> "UNNEST" because that's a SQL function, not something that libpq can
> accept as a parameter.
>
> So rewriting the statement after the fact is the best way to go.
> the workings of it are complicated because we have to also take your
> single parameter "foo" and add "foo_1", "foo_2", "foo_3", etc. to the
> parameter dictionary. If you are then passing in a list of tuples,
> then it has to break into a list of tuples. The logic is at
>
> https://github.com/sqlalchemy/sqlalchemy/blob/master/lib/sqlalchemy/engine/default.py#L748,
>
>
> you can see it's pretty tedious, so it's probably best to keep
> leveraging that logic.
>
> Looking at the logic there, I can see we can leverage the existing
> "render a tuple" logic so that we can remove the regexp thing if you
> just pass the params as tuples. This can also be automated if you
> rewrite the parameter dictionary using the before_execute() event, but
> here's the more simple idea:
>
> from sqlalchemy import Column
> from sqlalchemy import create_engine
> from sqlalchemy import Integer
> from sqlalchemy import String
> from sqlalchemy.ext import baked
> from sqlalchemy.ext.compiler import compiles
> from sqlalchemy.ext.declarative import declarative_base
> from sqlalchemy.orm import Session
> from sqlalchemy.sql.expression import BindParameter
>
> class PGValuesParam(BindParameter):
> pass
>
> @compiles(PGValuesParam)
> def _compile_pgvalues(element, compiler, **kw):
> string = compiler.visit_bindparam(element, **kw)
> return "(VALUES %s" % string[1:]
>
>
> Base = declarative_base()
>
>
> class A(Base):
> __tablename__ = "a"
>
> id = Column(Integer, primary_key=True)
> data = Column(String)
>
>
> e = create_engine("postgresql://scott:tiger@localhost/test", echo=True)
> Base.metadata.create_all(e)
>
> s = Session(e)
>
> bakery = baked.bakery()
>
> query = bakery(
> lambda s: s.query(A).filter(A.id.in_(PGValuesParam("ids",
> expanding=True)))
> )
>
> q = query(s).params(ids=[(1,), (2,), (3,)])
>
>
> print(q)
> q.all()
>
>
>
>
>
>
> >
> > --
> > SQLAlchemy -
> > The Python SQL Toolkit and Object Relational Mapper
> >
> > http://www.sqlalchemy.org/
> >
> > To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> description.
> > ---
> > You received this message because you are subscribed to the Google
> Groups "sqlalchemy" group.
> > To unsubscribe from this group and stop receiving emails from it, send
> an email to [email protected] <javascript:>.
> > To post to this group, send email to [email protected]
> <javascript:>.
> > Visit this group at https://groups.google.com/group/sqlalchemy.
> > For more options, visit https://groups.google.com/d/optout.
>
--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper
http://www.sqlalchemy.org/
To post example code, please provide an MCVE: Minimal, Complete, and Verifiable
Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.