Hi all!

I'm trying to write a small script to interface with a database controlled 
by a third-party application, and I'm not sure the way I set up eager 
loading is right.

The code I have looks something like the following:

from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.dialects.mssql import FLOAT, NVARCHAR, UNIQUEIDENTIFIER
from sqlalchemy.orm import foreign, relationship, remote
from sqlalchemy.schema import Column
Base = declarative_base()
class Node(Base):
    NodeID = Column(NVARCHAR(length=65), primary_key=True, nullable=False)
    ParentNodeID = Column(NVARCHAR(length=65), nullable=True)
    direct_children = relationship('Node', primaryjoin=(NodeID == remote(
foreign(ParentNodeID))))
    entries = relationship('Entry', primaryjoin=('Node.NodeID == 
remote(foreign(Entry.NodeID))')

    def node_and_all_children(self):
        result = [self]
        for child in self.direct_children:
            result += child.node_and_all_children()
        return result

    def cost(self):
        entries = [e for p in self.node_and_all_children() for e in p.
entries]
        return sum(e.Value1 * e.Value2 for e in entries)

class Entry(Base):
    TEID = Column(UNIQUEIDENTIFIER, primary_key=True, nullable=False)
    NodeID = Column(NVARCHAR(length=65), nullable=False)
    Value1 = Column(FLOAT, nullable=True)
    Value2 = Column(FLOAT, nullable=True)

I want to write something like this:

def get_costs(session, node_ids: List[str]):
    nodes = session.query(Node).filter(Node.NodeID.in_(node_ids)).all()
    return {n.NodeID: n.cost() for n in nodes}

>From what I understand, this results in an N+1-ish access pattern. I can 
eagerly load the children easily enough (I think I can guarantee <= 3 
levels of children, so three selectinload() calls are hopefully enough?):

def get_costs(session, node_ids: List[str]):
    nodes = (
        session.query(Node)
        .options(
            selectinload(Node.direct_children)
            .selectinload(Node.direct_children)
            .selectinload(Node.direct_children)
        )
        .filter(Node.NodeID.in_(node_ids))
        .all()
    )
    return {n.NodeID: n.cost() for n in nodes}

It's here that I get stuck, though. This eagerly loads the children, but 
doesn't eagerly load each node's entries, which results in a query being 
sent to the database for each child. I can specify that entries should be 
eagerly loaded at each level:

def get_costs(session, node_ids: List[str]):
    nodes = (
        session.query(Node)
        .options(
            selectinload(Node.direct_children)
            .selectinload(Node.direct_children)
            .selectinload(Node.direct_children),
            selectinload(Node.entries),
            selectinload(Node.direct_children).selectinload(Node.entries),
            selectinload(Node.direct_children).selectinload(Node.
direct_children).selectinload(Node.entries)
        )
        .filter(Node.NodeID.in_(node_ids))
        .all()
    )
    return {n.NodeID: n.cost() for n in nodes}

But this feels pretty gross, and I'm hoping there is a better way.

Is there a cleaner way of specifying that entries should be eagerly loaded 
for each level of children? Or is there a totally different query structure 
that would be even better?

Thanks!

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to