Thank you very much for your response. This solution isn't working for me,
and I'm 99% sure I'm not translating what you've written into something
that works for my use case. My situation is somewhat more complicated than
I initially wrote. My first question was in the vain hope that there was
some easy way using declarative to tell anything loaded through a
relationship() to load with a LIMIT. Obviously that isn't the case.

In my situation, I have a relationship which I have defined by hand. There
are not foreign keys between the messages and the objects being loaded
(Connections). When I try to use your lateral solution I get an error
thrown that there are no foreign key relationships, which is true.

Here is my model edited down to what I believe are the minimal relevant
fields. I'm trying to return Connections with a limited number of objects
in the Connection.messages field:

*class Connection(Base):*
*    __tablename__ = 'Connection'*

*    user_1_id = Column(Integer, ForeignKey('User.uid'))*
*    user_2_id = Column(Integer, ForeignKey('User.uid'))*
*    messages =  relationship('Message',*
*                             secondary="join(MessageRecipient, Message,
MessageRecipient.message_id == Message.uid)",*
*                             primaryjoin= 'or_(Connection.user_1_id ==
MessageRecipient.recipient_id,'*
*                                           'Connection.user_2_id ==
MessageRecipient.recipient_id)',*
*                             order_by="Message.created.desc()")*

*class MessageRecipient(Base):*
*    __tablename__ = 'MessageRecipient'*

*    recipient_id = Column(Integer, ForeignKey('User.uid'))*

*class Message(Base):*
*    __tablename__ = 'Message'*

*    created = Column(DateTime, default=func.current_timestamp())*

*    body = Column(String(2000))*


Now, I'm using Postgres, so I did attempt to translate your lateral
example:

*        subq = Message.query.\*
*            filter(and_(Message.uid == MessageRecipient.message_id,
or_(MessageRecipient.uid == Connection.user_1_id,*
*                                            MessageRecipient.uid ==
Connection.user_2_id))).\*
*            order_by(Message.created.desc()).limit(1).subquery().lateral()*

*        q = Connection.query.outerjoin(subq). \*
*            options(contains_eager(Connection.messages, alias=subq))*

The above throws an error:
Can't find any foreign key relationships between 'Connection' and
'%(140122530861688 anon)s'.

I also tried your non-lateral example
*        subq = self.db.session.query(Message.created). \*
*            filter(and_(Message.uid == MessageRecipient.message_id,
or_(MessageRecipient.recipient_id == Connection.user_1_id,*
*                            MessageRecipient.recipient_id ==
Connection.user_2_id))).\*
*            order_by(Message.created.desc()). \*
*            limit(1).offset(10).correlate(Connection).as_scalar()*

*        q = self.db.session.query(Connection).join(*
*            MessageRecipient,*
*            or_(MessageRecipient.recipient_id == Connection.user_1_id,*
*                MessageRecipient.recipient_id == Connection.user_2_id)).\*
*            join( Message,*
*            and_(MessageRecipient.message_id == Message.uid,
Message.created > subq)*
*        ).options(contains_eager(Connection.messages)).all()*

The above does not throw an error, but it also does not return any
entities. I'm going to keep working on the non-lateral example because I
don't understand your example well enough to confidently say I'm being
faithful to it. If you have other advice based on this new information then
I'd really appreciate it.  If I figure out what I was doing wrong I'll post
here for posterity.

On Mon, May 1, 2017 at 7:34 PM, mike bayer <[email protected]> wrote:

>
>
> On 05/01/2017 08:05 PM, Isaac Martin wrote:
>
>>
>> I am building an api which can return children of resources if the user
>> requests it. For example, |user| has |messages|. I want the query to be
>> able to limit the number of |message| objects that are returned.
>>
>> I found a useful tip about limiting the number of objects in child
>> collections here <http://stackoverflow.com/ques
>> tions/9148316/how-to-limit-offset-sqlalchemy-orm-relation
>> s-result?noredirect=1&lq=1>. Basically, it indicates the following flow:
>>
>> |classUser(...):# ...messages =relationship('Messages',order
>> _by='desc(Messages.date)',lazy='dynamic')user
>> =User.query.one()users.messages.limit(10)|
>>
>> My use case involves returning sometimes large numbers of users.
>>
>> If I were to follow the advice in that link and used |.limit()| then I
>> would need to iterate over the entire collection of users calling
>> |.limit()| on each one. This is much less efficient then, say, using
>> |LIMIT|in the original sql expression which created the collection.
>>
>> My question is whether it is possible using declarative to
>> efficiently(N+0) load a large collection of objects while limiting the
>> number of children in their child collections using sqlalchemy?
>>
>>
>> To be clear, the below is what I am trying to /avoid/.
>>
>> |users =User.query.all()messages ={}foruser inusers:messages[user.id
>> ]=user.messages.limit(10).all()|
>>
>> I want to do something more like:
>>
>> |users =User.query.option(User.messages.limit(10)).all()|
>>
>
> so you should ignore whether or not it uses "declarative", which has
> nothing to do with querying, and in fact at first ignore Query too, because
> first and foremost this is a SQL problem.  You want one SQL statement that
> does this.  What query in SQL would load lots of rows from the primary
> table, joined to the first ten rows of the secondary table for each primary?
>
> LIMIT is tricky because it's not actually part of the usual "relational
> algebra" calculation.  It's outside of that because it's an artificial
> limit on rows.    For example, my first thought on how to do this was wrong:
>
>         select * from users left outer join (select * from messages limit
> 10) as anon_1 on users.id = anon_1.user_id
>
> This is wrong because it only gets the first ten messages in the
> aggregate, disregarding user.  We want to get the first ten messages for
> each user, which means we need to do this "select from messages limit 10"
> individually for each user. That is, we need to correlate somehow.  A
> correlated subquery though is not usually allowed as a FROM element, and is
> only allowed as a SQL expression, it can only return a single column and a
> single row; we can't normally JOIN to a correlated subquery in plain
> vanilla SQL.   We can however, correlate inside the ON clause of the JOIN
> to make this possible in vanilla SQL.
>
> But first, if we are on a modern Postgresql version, we *can* break that
> usual rule of correlation and use a keyword called LATERAL, which allows
> correlation in a FROM clause.  LATERAL is only supported by modern
> Postgresql versions, and it makes this easy:
>
>         select * from users left outer join lateral
>         (select * from message where message.user_id = users.id order by
> messages.date desc limit 10) as anon1 on users.id = anon_1.user_id
>
> we support the LATERAL keyword.   The query above looks like this:
>
>
>     subq = s.query(Messages).\
>         filter(Messages.user_id == User.id).\
>         order_by(Messages.date.desc()).limit(10).subquery().lateral()
>
>     q = s.query(User).outerjoin(subq).\
>          options(contains_eager(User.messages, alias=subq))
>
> Note that above, in order to SELECT both users and messages and produce
> them into the User.messages collection, the "contains_eager()" option must
> be used and for that the "dynamic" has to go away.  This is not the only
> option, you can for example build a second relationship for User.messages
> that doesn't have the "dynamic" or you can just load from query(User,
> Message) separately and organize the result tuples as needed.
>
> if you aren't using Postgresql, or a version of Postgresql that doesn't
> support LATERAL, the correlation has to be worked into the ON clause of the
> join instead.   The SQL looks like:
>
>         select * from users left outer join messages on
>         users.id = messages.user_id and messages.date > (select date from
> messages where messages.user_id = users.id order by date desc limit 1
> offset 10)
>
> Here, in order to jam the LIMIT in there, we are actually stepping through
> the first 10 rows with OFFSET and then doing LIMIT 1 to get the date that
> represents the lower bound date we want for each user.  Then we have to
> join while comparing on that date, which can be expensive if this column
> isn't indexed and also can be inaccurate if there are duplicate dates.
>
> This query looks like:
>
> subq = s.query(Messages.date).\
>     filter(Messages.user_id == User.id).\
>     order_by(Messages.date.desc()).\
>     limit(1).offset(10).correlate(User).as_scalar()
>
> q = s.query(User).join(
>     Messages,
>     and_(User.id == Messages.user_id, Messages.date >= subq)
> ).options(contains_eager(User.messages))
>
>
> These kinds of queries are the kind that I don't trust without a good
> test, so POC below includes both versions including a sanity check.
>
>
> from sqlalchemy import *
> from sqlalchemy.orm import *
> from sqlalchemy.ext.declarative import declarative_base
> import datetime
>
> Base = declarative_base()
>
>
> class User(Base):
>     __tablename__ = 'user'
>     id = Column(Integer, primary_key=True)
>     messages = relationship(
>         'Messages', order_by='desc(Messages.date)')
>
> class Messages(Base):
>     __tablename__ = 'message'
>     id = Column(Integer, primary_key=True)
>     user_id = Column(ForeignKey('user.id'))
>     date = Column(Date)
>
> e = create_engine("postgresql://scott:tiger@localhost/test", echo=True)
> Base.metadata.drop_all(e)
> Base.metadata.create_all(e)
>
> s = Session(e)
>
> s.add_all([
>     User(id=i, messages=[
>         Messages(id=(i * 20) + j, date=datetime.date(2017, 3, j))
>         for j in range(1, 20)
>     ]) for i in range(1, 51)
> ])
>
> s.commit()
>
> top_ten_dates = set(datetime.date(2017, 3, j) for j in range(10, 20))
>
>
> def run_test(q):
>     all_u = q.all()
>     assert len(all_u) == 50
>     for u in all_u:
>
>         messages = u.messages
>         assert len(messages) == 10
>
>         for m in messages:
>             assert m.user_id == u.id
>
>         received = set(m.date for m in messages)
>
>         assert received == top_ten_dates
>
> # version 1.   no LATERAL
>
> s.close()
>
> subq = s.query(Messages.date).\
>     filter(Messages.user_id == User.id).\
>     order_by(Messages.date.desc()).\
>     limit(1).offset(10).correlate(User).as_scalar()
>
> q = s.query(User).join(
>     Messages,
>     and_(User.id == Messages.user_id, Messages.date > subq)
> ).options(contains_eager(User.messages))
>
> run_test(q)
>
> # version 2.  LATERAL
>
> s.close()
>
> subq = s.query(Messages).\
>     filter(Messages.user_id == User.id).\
>     order_by(Messages.date.desc()).limit(10).subquery().lateral()
>
> q = s.query(User).outerjoin(subq).\
>     options(contains_eager(User.messages, alias=subq))
>
> run_test(q)
>
>
>
>
>
>>
>> SO post here: http://stackoverflow.com/questions/43727268/limit-child-
>> collections-in-initial-query-sqlalchemy/43727705?noredirect
>> =1#comment74499641_43727705
>>
>> --
>> SQLAlchemy -
>> The Python SQL Toolkit and Object Relational Mapper
>>
>> http://www.sqlalchemy.org/
>>
>> To post example code, please provide an MCVE: Minimal, Complete, and
>> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
>> description.
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "sqlalchemy" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected] <mailto:
>> [email protected]>.
>> To post to this group, send email to [email protected] <mailto:
>> [email protected]>.
>> Visit this group at https://groups.google.com/group/sqlalchemy.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
> --
> SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example.  See  http://stackoverflow.com/help/mcve for a full
> description.
> --- You received this message because you are subscribed to a topic in the
> Google Groups "sqlalchemy" group.
> To unsubscribe from this topic, visit https://groups.google.com/d/to
> pic/sqlalchemy/lOedjqu61G4/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.
>

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to