On Thu, Oct 25, 2018 at 4:21 PM Alex Wang <[email protected]> wrote:
>
> Hi all!
>
> I'm trying to write a small script to interface with a database controlled by 
> a third-party application, and I'm not sure the way I set up eager loading is 
> right.
>
> The code I have looks something like the following:
>
> from sqlalchemy.ext.declarative import declarative_base
> from sqlalchemy.dialects.mssql import FLOAT, NVARCHAR, UNIQUEIDENTIFIER
> from sqlalchemy.orm import foreign, relationship, remote
> from sqlalchemy.schema import Column
> Base = declarative_base()
> class Node(Base):
>     NodeID = Column(NVARCHAR(length=65), primary_key=True, nullable=False)
>     ParentNodeID = Column(NVARCHAR(length=65), nullable=True)
>     direct_children = relationship('Node', primaryjoin=(NodeID == 
> remote(foreign(ParentNodeID))))
>     entries = relationship('Entry', primaryjoin=('Node.NodeID == 
> remote(foreign(Entry.NodeID))')
>
>     def node_and_all_children(self):
>         result = [self]
>         for child in self.direct_children:
>             result += child.node_and_all_children()
>         return result
>
>     def cost(self):
>         entries = [e for p in self.node_and_all_children() for e in p.entries]
>         return sum(e.Value1 * e.Value2 for e in entries)
>
> class Entry(Base):
>     TEID = Column(UNIQUEIDENTIFIER, primary_key=True, nullable=False)
>     NodeID = Column(NVARCHAR(length=65), nullable=False)
>     Value1 = Column(FLOAT, nullable=True)
>     Value2 = Column(FLOAT, nullable=True)
>
> I want to write something like this:
>
> def get_costs(session, node_ids: List[str]):
>     nodes = session.query(Node).filter(Node.NodeID.in_(node_ids)).all()
>     return {n.NodeID: n.cost() for n in nodes}
>
> From what I understand, this results in an N+1-ish access pattern. I can 
> eagerly load the children easily enough (I think I can guarantee <= 3 levels 
> of children, so three selectinload() calls are hopefully enough?):
>
> def get_costs(session, node_ids: List[str]):
>     nodes = (
>         session.query(Node)
>         .options(
>             selectinload(Node.direct_children)
>             .selectinload(Node.direct_children)
>             .selectinload(Node.direct_children)
>         )
>         .filter(Node.NodeID.in_(node_ids))
>         .all()
>     )
>     return {n.NodeID: n.cost() for n in nodes}
>
> It's here that I get stuck, though. This eagerly loads the children, but 
> doesn't eagerly load each node's entries, which results in a query being sent 
> to the database for each child. I can specify that entries should be eagerly 
> loaded at each level:
>
> def get_costs(session, node_ids: List[str]):
>     nodes = (
>         session.query(Node)
>         .options(
>             selectinload(Node.direct_children)
>             .selectinload(Node.direct_children)
>             .selectinload(Node.direct_children),
>             selectinload(Node.entries),
>             selectinload(Node.direct_children).selectinload(Node.entries),
>             
> selectinload(Node.direct_children).selectinload(Node.direct_children).selectinload(Node.entries)
>         )
>         .filter(Node.NodeID.in_(node_ids))
>         .all()
>     )
>     return {n.NodeID: n.cost() for n in nodes}
>
> But this feels pretty gross, and I'm hoping there is a better way.

you don't have to specify selectinload() twice like that, you can say:

selectinload(Node.entries),
selectinload(Node.direct_children).selectinload(Node.entries),
selectinload(Node.direct_children).selectinload(Node.direct_children).selectinload(Node.entries)

if you are using query.options(), that's the API.  if you put the
selectinload on your relationship, you can use join_depth:

direct_children = relationship(Node, lazy="selectin", join_depth=3)

The options form of this doesn't have a "join_depth" option right now
but you can certainly make yourself a function, something like this:

def selectinload_selfref(attr, depth):
    opt = selectinload(attr)
    for i in range(depth - 1):
       opt = opt.selectinload(attr)
   return opt




>
> Is there a cleaner way of specifying that entries should be eagerly loaded 
> for each level of children? Or is there a totally different query structure 
> that would be even better?
>
> Thanks!
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> ---
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to