Thank you very much for your response. This solution isn't working for me,
and I'm 99% sure I'm not translating what you've written into something
that works for my use case. My situation is somewhat more complicated than
I initially wrote. My first question was in the vain hope that there was
some easy way using declarative to tell anything loaded through a
relationship() to load with a LIMIT. Obviously that isn't the case.
In my situation, I have a relationship which I have defined by hand. There
are not foreign keys between the messages and the objects being loaded
(Connections). When I try to use your lateral solution I get an error
thrown that there are no foreign key relationships, which is true.
Here is my model edited down to what I believe are the minimal relevant
fields. I'm trying to return Connections with a limited number of objects
in the Connection.messages field:
*class Connection(Base):*
* __tablename__ = 'Connection'*
* user_1_id = Column(Integer, ForeignKey('User.uid'))*
* user_2_id = Column(Integer, ForeignKey('User.uid'))*
* messages = relationship('Message',*
* secondary="join(MessageRecipient, Message,
MessageRecipient.message_id == Message.uid)",*
* primaryjoin= 'or_(Connection.user_1_id ==
MessageRecipient.recipient_id,'*
* 'Connection.user_2_id ==
MessageRecipient.recipient_id)',*
* order_by="Message.created.desc()")*
*class MessageRecipient(Base):*
* __tablename__ = 'MessageRecipient'*
* recipient_id = Column(Integer, ForeignKey('User.uid'))*
*class Message(Base):*
* __tablename__ = 'Message'*
* created = Column(DateTime, default=func.current_timestamp())*
* body = Column(String(2000))*
Now, I'm using Postgres, so I did attempt to translate your lateral
example:
* subq = Message.query.\*
* filter(and_(Message.uid == MessageRecipient.message_id,
or_(MessageRecipient.uid == Connection.user_1_id,*
* MessageRecipient.uid ==
Connection.user_2_id))).\*
* order_by(Message.created.desc()).limit(1).subquery().lateral()*
* q = Connection.query.outerjoin(subq). \*
* options(contains_eager(Connection.messages, alias=subq))*
The above throws an error:
Can't find any foreign key relationships between 'Connection' and
'%(140122530861688 anon)s'.
I also tried your non-lateral example
* subq = self.db.session.query(Message.created). \*
* filter(and_(Message.uid == MessageRecipient.message_id,
or_(MessageRecipient.recipient_id == Connection.user_1_id,*
* MessageRecipient.recipient_id ==
Connection.user_2_id))).\*
* order_by(Message.created.desc()). \*
* limit(1).offset(10).correlate(Connection).as_scalar()*
* q = self.db.session.query(Connection).join(*
* MessageRecipient,*
* or_(MessageRecipient.recipient_id == Connection.user_1_id,*
* MessageRecipient.recipient_id == Connection.user_2_id)).\*
* join( Message,*
* and_(MessageRecipient.message_id == Message.uid,
Message.created > subq)*
* ).options(contains_eager(Connection.messages)).all()*
The above does not throw an error, but it also does not return any
entities. I'm going to keep working on the non-lateral example because I
don't understand your example well enough to confidently say I'm being
faithful to it. If you have other advice based on this new information then
I'd really appreciate it. If I figure out what I was doing wrong I'll post
here for posterity.
On Mon, May 1, 2017 at 7:34 PM, mike bayer <[email protected]> wrote:
>
>
> On 05/01/2017 08:05 PM, Isaac Martin wrote:
>
>>
>> I am building an api which can return children of resources if the user
>> requests it. For example, |user| has |messages|. I want the query to be
>> able to limit the number of |message| objects that are returned.
>>
>> I found a useful tip about limiting the number of objects in child
>> collections here <http://stackoverflow.com/ques
>> tions/9148316/how-to-limit-offset-sqlalchemy-orm-relation
>> s-result?noredirect=1&lq=1>. Basically, it indicates the following flow:
>>
>> |classUser(...):# ...messages =relationship('Messages',order
>> _by='desc(Messages.date)',lazy='dynamic')user
>> =User.query.one()users.messages.limit(10)|
>>
>> My use case involves returning sometimes large numbers of users.
>>
>> If I were to follow the advice in that link and used |.limit()| then I
>> would need to iterate over the entire collection of users calling
>> |.limit()| on each one. This is much less efficient then, say, using
>> |LIMIT|in the original sql expression which created the collection.
>>
>> My question is whether it is possible using declarative to
>> efficiently(N+0) load a large collection of objects while limiting the
>> number of children in their child collections using sqlalchemy?
>>
>>
>> To be clear, the below is what I am trying to /avoid/.
>>
>> |users =User.query.all()messages ={}foruser inusers:messages[user.id
>> ]=user.messages.limit(10).all()|
>>
>> I want to do something more like:
>>
>> |users =User.query.option(User.messages.limit(10)).all()|
>>
>
> so you should ignore whether or not it uses "declarative", which has
> nothing to do with querying, and in fact at first ignore Query too, because
> first and foremost this is a SQL problem. You want one SQL statement that
> does this. What query in SQL would load lots of rows from the primary
> table, joined to the first ten rows of the secondary table for each primary?
>
> LIMIT is tricky because it's not actually part of the usual "relational
> algebra" calculation. It's outside of that because it's an artificial
> limit on rows. For example, my first thought on how to do this was wrong:
>
> select * from users left outer join (select * from messages limit
> 10) as anon_1 on users.id = anon_1.user_id
>
> This is wrong because it only gets the first ten messages in the
> aggregate, disregarding user. We want to get the first ten messages for
> each user, which means we need to do this "select from messages limit 10"
> individually for each user. That is, we need to correlate somehow. A
> correlated subquery though is not usually allowed as a FROM element, and is
> only allowed as a SQL expression, it can only return a single column and a
> single row; we can't normally JOIN to a correlated subquery in plain
> vanilla SQL. We can however, correlate inside the ON clause of the JOIN
> to make this possible in vanilla SQL.
>
> But first, if we are on a modern Postgresql version, we *can* break that
> usual rule of correlation and use a keyword called LATERAL, which allows
> correlation in a FROM clause. LATERAL is only supported by modern
> Postgresql versions, and it makes this easy:
>
> select * from users left outer join lateral
> (select * from message where message.user_id = users.id order by
> messages.date desc limit 10) as anon1 on users.id = anon_1.user_id
>
> we support the LATERAL keyword. The query above looks like this:
>
>
> subq = s.query(Messages).\
> filter(Messages.user_id == User.id).\
> order_by(Messages.date.desc()).limit(10).subquery().lateral()
>
> q = s.query(User).outerjoin(subq).\
> options(contains_eager(User.messages, alias=subq))
>
> Note that above, in order to SELECT both users and messages and produce
> them into the User.messages collection, the "contains_eager()" option must
> be used and for that the "dynamic" has to go away. This is not the only
> option, you can for example build a second relationship for User.messages
> that doesn't have the "dynamic" or you can just load from query(User,
> Message) separately and organize the result tuples as needed.
>
> if you aren't using Postgresql, or a version of Postgresql that doesn't
> support LATERAL, the correlation has to be worked into the ON clause of the
> join instead. The SQL looks like:
>
> select * from users left outer join messages on
> users.id = messages.user_id and messages.date > (select date from
> messages where messages.user_id = users.id order by date desc limit 1
> offset 10)
>
> Here, in order to jam the LIMIT in there, we are actually stepping through
> the first 10 rows with OFFSET and then doing LIMIT 1 to get the date that
> represents the lower bound date we want for each user. Then we have to
> join while comparing on that date, which can be expensive if this column
> isn't indexed and also can be inaccurate if there are duplicate dates.
>
> This query looks like:
>
> subq = s.query(Messages.date).\
> filter(Messages.user_id == User.id).\
> order_by(Messages.date.desc()).\
> limit(1).offset(10).correlate(User).as_scalar()
>
> q = s.query(User).join(
> Messages,
> and_(User.id == Messages.user_id, Messages.date >= subq)
> ).options(contains_eager(User.messages))
>
>
> These kinds of queries are the kind that I don't trust without a good
> test, so POC below includes both versions including a sanity check.
>
>
> from sqlalchemy import *
> from sqlalchemy.orm import *
> from sqlalchemy.ext.declarative import declarative_base
> import datetime
>
> Base = declarative_base()
>
>
> class User(Base):
> __tablename__ = 'user'
> id = Column(Integer, primary_key=True)
> messages = relationship(
> 'Messages', order_by='desc(Messages.date)')
>
> class Messages(Base):
> __tablename__ = 'message'
> id = Column(Integer, primary_key=True)
> user_id = Column(ForeignKey('user.id'))
> date = Column(Date)
>
> e = create_engine("postgresql://scott:tiger@localhost/test", echo=True)
> Base.metadata.drop_all(e)
> Base.metadata.create_all(e)
>
> s = Session(e)
>
> s.add_all([
> User(id=i, messages=[
> Messages(id=(i * 20) + j, date=datetime.date(2017, 3, j))
> for j in range(1, 20)
> ]) for i in range(1, 51)
> ])
>
> s.commit()
>
> top_ten_dates = set(datetime.date(2017, 3, j) for j in range(10, 20))
>
>
> def run_test(q):
> all_u = q.all()
> assert len(all_u) == 50
> for u in all_u:
>
> messages = u.messages
> assert len(messages) == 10
>
> for m in messages:
> assert m.user_id == u.id
>
> received = set(m.date for m in messages)
>
> assert received == top_ten_dates
>
> # version 1. no LATERAL
>
> s.close()
>
> subq = s.query(Messages.date).\
> filter(Messages.user_id == User.id).\
> order_by(Messages.date.desc()).\
> limit(1).offset(10).correlate(User).as_scalar()
>
> q = s.query(User).join(
> Messages,
> and_(User.id == Messages.user_id, Messages.date > subq)
> ).options(contains_eager(User.messages))
>
> run_test(q)
>
> # version 2. LATERAL
>
> s.close()
>
> subq = s.query(Messages).\
> filter(Messages.user_id == User.id).\
> order_by(Messages.date.desc()).limit(10).subquery().lateral()
>
> q = s.query(User).outerjoin(subq).\
> options(contains_eager(User.messages, alias=subq))
>
> run_test(q)
>
>
>
>
>
>>
>> SO post here: http://stackoverflow.com/questions/43727268/limit-child-
>> collections-in-initial-query-sqlalchemy/43727705?noredirect
>> =1#comment74499641_43727705
>>
>> --
>> SQLAlchemy -
>> The Python SQL Toolkit and Object Relational Mapper
>>
>> http://www.sqlalchemy.org/
>>
>> To post example code, please provide an MCVE: Minimal, Complete, and
>> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
>> description.
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "sqlalchemy" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected] <mailto:
>> [email protected]>.
>> To post to this group, send email to [email protected] <mailto:
>> [email protected]>.
>> Visit this group at https://groups.google.com/group/sqlalchemy.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
> --
> SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> description.
> --- You received this message because you are subscribed to a topic in the
> Google Groups "sqlalchemy" group.
> To unsubscribe from this topic, visit https://groups.google.com/d/to
> pic/sqlalchemy/lOedjqu61G4/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.
>
--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper
http://www.sqlalchemy.org/
To post example code, please provide an MCVE: Minimal, Complete, and Verifiable
Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.