Sorry, just looking over my simplified test case, I noticed that I
simplified too much.  The subquery needs to be fully correlated back
to TableA.  In plain English, my query is:  "Give me only the C's with
the maximum sort_key value for a given group_key value which are
related to an A (via B) with a particular value of some_key."

The SQL should look something like this:

SELECT table_c.*
FROM table_c
  JOIN table_b ON table_b.id = table_c.b_id
  JOIN table_a ON table_a.id = table_b.a_id
WHERE table_a.some_key = ?
  AND table_c.sort_key = (
    SELECT max(table_c_1.sort_key)
      FROM table_c AS table_c_1
        JOIN table_b AS table_b_1 ON table_b_1.id = table_c_1.b_id
      WHERE table_c.group_key = table_c_1.group_key
             AND table_b_1.a_id = table_a.some_key)

I think I got that right... anyway, hopefully you know what I mean
now.  When I add that last WHERE clause (that I left out in my
original example) as a filter on the query, SQLAlchemy also adds
table_a to the FROM of the subquery, which is also not what I want.

On Feb 23, 12:08 pm, Randall Nortman <[email protected]> wrote:
> I have a somewhat nasty query problem, which I have abstracted and
> (partially) simplified into the following test case:
>
> ----- test.py -----
> from sqlalchemy.ext.declarative import declarative_base
> from sqlalchemy.orm import scoped_session, sessionmaker, aliased,
> relationship
> from sqlalchemy import Column, Integer, ForeignKey, func,
> create_engine
>
> Session = scoped_session(sessionmaker())
> Base = declarative_base()
>
> class TableA (Base):
>     __tablename__ = 'table_a'
>
>     id = Column(Integer, primary_key=True)
>     some_key = Column(Integer, nullable=False)
>
> class TableB (Base):
>     __tablename__ = 'table_b'
>
>     id = Column(Integer, primary_key=True)
>     a_id = Column(Integer, ForeignKey(TableA.id), nullable=False)
>     a = relationship(TableA)
>
> class TableC (Base):
>     __tablename__ = 'table_c'
>
>     id = Column(Integer, primary_key=True)
>     b_id = Column(Integer, ForeignKey(TableB.id), nullable=False)
>     b = relationship(TableB)
>     group_key = Column(Integer, nullable=False)
>     sort_key = Column(Integer, nullable=False)
>
> def make_data():
>     a = TableA(some_key=42)
>     b1 = TableB(a=a)
>     Session.add(TableC(b=b1, group_key=11, sort_key=0))
>     Session.add(TableC(b=b1, group_key=11, sort_key=1))
>     Session.add(TableC(b=b1, group_key=7, sort_key=0))
>     Session.add(TableC(b=b1, group_key=7, sort_key=1))
>     b2 = TableB(a=a)
>     Session.add(TableC(b=b2, group_key=11, sort_key=2))
>
> def do_query():
>     b1 = aliased(TableB)
>     c1 = aliased(TableC)
>     q = (
>         Session.query(TableC.id, TableC.group_key)
>         .join(TableC.b, TableB.a)
>         .filter(TableA.some_key == 42)
>         .filter(TableC.sort_key ==
>                 Session.query(func.max(c1.sort_key))
>                 .join(b1)
>                 .filter(c1.group_key == TableC.group_key)
>                 .as_scalar())
>         )
>     print q.all()
>
> e = create_engine('sqlite:///', echo=True)
> Session.configure(bind=e)
> Base.metadata.create_all(bind=Session.bind)
>
> make_data()
> do_query()
> ----- test.py -----
>
> This returns one row: (5, 11).  I expect it to return two rows, one
> for each group_key  -- so it should also return (4, 7).  The emitted
> SQL is:
>
> SELECT table_c.id AS table_c_id, table_c.group_key AS
> table_c_group_key
> FROM table_c JOIN table_b ON table_b.id = table_c.b_id JOIN table_a ON
> table_a.id = table_b.a_id
> WHERE table_a.some_key = ? AND table_c.sort_key = (SELECT
> max(table_c_1.sort_key) AS max_1
> FROM table_c, table_c AS table_c_1 JOIN table_b AS table_b_1 ON
> table_b_1.id = table_c_1.b_id
> WHERE table_c.group_key = table_c_1.group_key)
>
> The problem is in the FROM clause of the subquery -- there is an extra
> table_c in there.  If I manually run that query without the first
> table_c in there, then I get the correct result.  So the query should
> be:
>
> SELECT table_c.id AS table_c_id, table_c.group_key AS
> table_c_group_key
> FROM table_c JOIN table_b ON table_b.id = table_c.b_id JOIN table_a ON
> table_a.id = table_b.a_id
> WHERE table_a.some_key = ? AND table_c.sort_key = (SELECT
> max(table_c_1.sort_key) AS max_1
> FROM table_c AS table_c_1 JOIN table_b AS table_b_1 ON table_b_1.id =
> table_c_1.b_id
> WHERE table_c.group_key = table_c_1.group_key)
>
> How do I cajole SQLAlchemy into producing this query?

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.

Reply via email to