Hi all!
I'm trying to write a small script to interface with a database controlled
by a third-party application, and I'm not sure the way I set up eager
loading is right.
The code I have looks something like the following:
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.dialects.mssql import FLOAT, NVARCHAR, UNIQUEIDENTIFIER
from sqlalchemy.orm import foreign, relationship, remote
from sqlalchemy.schema import Column
Base = declarative_base()
class Node(Base):
NodeID = Column(NVARCHAR(length=65), primary_key=True, nullable=False)
ParentNodeID = Column(NVARCHAR(length=65), nullable=True)
direct_children = relationship('Node', primaryjoin=(NodeID == remote(
foreign(ParentNodeID))))
entries = relationship('Entry', primaryjoin=('Node.NodeID ==
remote(foreign(Entry.NodeID))')
def node_and_all_children(self):
result = [self]
for child in self.direct_children:
result += child.node_and_all_children()
return result
def cost(self):
entries = [e for p in self.node_and_all_children() for e in p.
entries]
return sum(e.Value1 * e.Value2 for e in entries)
class Entry(Base):
TEID = Column(UNIQUEIDENTIFIER, primary_key=True, nullable=False)
NodeID = Column(NVARCHAR(length=65), nullable=False)
Value1 = Column(FLOAT, nullable=True)
Value2 = Column(FLOAT, nullable=True)
I want to write something like this:
def get_costs(session, node_ids: List[str]):
nodes = session.query(Node).filter(Node.NodeID.in_(node_ids)).all()
return {n.NodeID: n.cost() for n in nodes}
>From what I understand, this results in an N+1-ish access pattern. I can
eagerly load the children easily enough (I think I can guarantee <= 3
levels of children, so three selectinload() calls are hopefully enough?):
def get_costs(session, node_ids: List[str]):
nodes = (
session.query(Node)
.options(
selectinload(Node.direct_children)
.selectinload(Node.direct_children)
.selectinload(Node.direct_children)
)
.filter(Node.NodeID.in_(node_ids))
.all()
)
return {n.NodeID: n.cost() for n in nodes}
It's here that I get stuck, though. This eagerly loads the children, but
doesn't eagerly load each node's entries, which results in a query being
sent to the database for each child. I can specify that entries should be
eagerly loaded at each level:
def get_costs(session, node_ids: List[str]):
nodes = (
session.query(Node)
.options(
selectinload(Node.direct_children)
.selectinload(Node.direct_children)
.selectinload(Node.direct_children),
selectinload(Node.entries),
selectinload(Node.direct_children).selectinload(Node.entries),
selectinload(Node.direct_children).selectinload(Node.
direct_children).selectinload(Node.entries)
)
.filter(Node.NodeID.in_(node_ids))
.all()
)
return {n.NodeID: n.cost() for n in nodes}
But this feels pretty gross, and I'm hoping there is a better way.
Is there a cleaner way of specifying that entries should be eagerly loaded
for each level of children? Or is there a totally different query structure
that would be even better?
Thanks!
--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper
http://www.sqlalchemy.org/
To post example code, please provide an MCVE: Minimal, Complete, and Verifiable
Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.