On Fri, Oct 26, 2018 at 12:10 PM Alex Wang <[email protected]> wrote:
>
> I ended up needing one more layer of selectinload(Node.direct_children), but 
> it got rid of all the lazy loads. Thanks for the suggestion; that's 
> definitely a lot better than what I initially had. I was thinking that I 
> needed to specify eager loading separately, but now that I think more about 
> your suggestion the redundancy becomes clearer.
>
> There's one more layer of complications --- if I need to get multiple entries 
> off of each node, is there a clean-ish way of eagerly loading those 
> attributes in addition to the entries mentioned in my original post? Or 
> should I just write a function like what you suggested and use that?

if you want to keep specifying the same path and add more attributes,
defaultload() helps with that:

selectinload(Foo.bar).selectinload(Bar.attribute1)
defaultload(Foo.bar).selectinload(Bar.attribute2)
defaultload(Foo.bar).selectinload(Bar.attribute3)

above, the "defaultload" just means, "here is a token in the path but
don't change the existing loading options on this token".





>
> And just to make sure, selectinloads should be the right choice for this kind 
> of nested collection, right?
>
> Thanks!
>
> On Thursday, October 25, 2018 at 6:54:53 PM UTC-4, Mike Bayer wrote:
>>
>> On Thu, Oct 25, 2018 at 4:21 PM Alex Wang <[email protected]> wrote:
>> >
>> > Hi all!
>> >
>> > I'm trying to write a small script to interface with a database controlled 
>> > by a third-party application, and I'm not sure the way I set up eager 
>> > loading is right.
>> >
>> > The code I have looks something like the following:
>> >
>> > from sqlalchemy.ext.declarative import declarative_base
>> > from sqlalchemy.dialects.mssql import FLOAT, NVARCHAR, UNIQUEIDENTIFIER
>> > from sqlalchemy.orm import foreign, relationship, remote
>> > from sqlalchemy.schema import Column
>> > Base = declarative_base()
>> > class Node(Base):
>> >     NodeID = Column(NVARCHAR(length=65), primary_key=True, nullable=False)
>> >     ParentNodeID = Column(NVARCHAR(length=65), nullable=True)
>> >     direct_children = relationship('Node', primaryjoin=(NodeID == 
>> > remote(foreign(ParentNodeID))))
>> >     entries = relationship('Entry', primaryjoin=('Node.NodeID == 
>> > remote(foreign(Entry.NodeID))')
>> >
>> >     def node_and_all_children(self):
>> >         result = [self]
>> >         for child in self.direct_children:
>> >             result += child.node_and_all_children()
>> >         return result
>> >
>> >     def cost(self):
>> >         entries = [e for p in self.node_and_all_children() for e in 
>> > p.entries]
>> >         return sum(e.Value1 * e.Value2 for e in entries)
>> >
>> > class Entry(Base):
>> >     TEID = Column(UNIQUEIDENTIFIER, primary_key=True, nullable=False)
>> >     NodeID = Column(NVARCHAR(length=65), nullable=False)
>> >     Value1 = Column(FLOAT, nullable=True)
>> >     Value2 = Column(FLOAT, nullable=True)
>> >
>> > I want to write something like this:
>> >
>> > def get_costs(session, node_ids: List[str]):
>> >     nodes = session.query(Node).filter(Node.NodeID.in_(node_ids)).all()
>> >     return {n.NodeID: n.cost() for n in nodes}
>> >
>> > From what I understand, this results in an N+1-ish access pattern. I can 
>> > eagerly load the children easily enough (I think I can guarantee <= 3 
>> > levels of children, so three selectinload() calls are hopefully enough?):
>> >
>> > def get_costs(session, node_ids: List[str]):
>> >     nodes = (
>> >         session.query(Node)
>> >         .options(
>> >             selectinload(Node.direct_children)
>> >             .selectinload(Node.direct_children)
>> >             .selectinload(Node.direct_children)
>> >         )
>> >         .filter(Node.NodeID.in_(node_ids))
>> >         .all()
>> >     )
>> >     return {n.NodeID: n.cost() for n in nodes}
>> >
>> > It's here that I get stuck, though. This eagerly loads the children, but 
>> > doesn't eagerly load each node's entries, which results in a query being 
>> > sent to the database for each child. I can specify that entries should be 
>> > eagerly loaded at each level:
>> >
>> > def get_costs(session, node_ids: List[str]):
>> >     nodes = (
>> >         session.query(Node)
>> >         .options(
>> >             selectinload(Node.direct_children)
>> >             .selectinload(Node.direct_children)
>> >             .selectinload(Node.direct_children),
>> >             selectinload(Node.entries),
>> >             selectinload(Node.direct_children).selectinload(Node.entries),
>> >             
>> > selectinload(Node.direct_children).selectinload(Node.direct_children).selectinload(Node.entries)
>> >         )
>> >         .filter(Node.NodeID.in_(node_ids))
>> >         .all()
>> >     )
>> >     return {n.NodeID: n.cost() for n in nodes}
>> >
>> > But this feels pretty gross, and I'm hoping there is a better way.
>>
>> you don't have to specify selectinload() twice like that, you can say:
>>
>> selectinload(Node.entries),
>> selectinload(Node.direct_children).selectinload(Node.entries),
>> selectinload(Node.direct_children).selectinload(Node.direct_children).selectinload(Node.entries)
>>
>> if you are using query.options(), that's the API.  if you put the
>> selectinload on your relationship, you can use join_depth:
>>
>> direct_children = relationship(Node, lazy="selectin", join_depth=3)
>>
>> The options form of this doesn't have a "join_depth" option right now
>> but you can certainly make yourself a function, something like this:
>>
>> def selectinload_selfref(attr, depth):
>>     opt = selectinload(attr)
>>     for i in range(depth - 1):
>>        opt = opt.selectinload(attr)
>>    return opt
>>
>>
>>
>>
>> >
>> > Is there a cleaner way of specifying that entries should be eagerly loaded 
>> > for each level of children? Or is there a totally different query 
>> > structure that would be even better?
>> >
>> > Thanks!
>> >
>> > --
>> > SQLAlchemy -
>> > The Python SQL Toolkit and Object Relational Mapper
>> >
>> > http://www.sqlalchemy.org/
>> >
>> > To post example code, please provide an MCVE: Minimal, Complete, and 
>> > Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
>> > description.
>> > ---
>> > You received this message because you are subscribed to the Google Groups 
>> > "sqlalchemy" group.
>> > To unsubscribe from this group and stop receiving emails from it, send an 
>> > email to [email protected].
>> > To post to this group, send email to [email protected].
>> > Visit this group at https://groups.google.com/group/sqlalchemy.
>> > For more options, visit https://groups.google.com/d/optout.
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> ---
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to