Re: [sqlalchemy] Mimik joinedload for core tables/queries

Michael Bayer Tue, 03 Sep 2013 21:20:05 -0700

On Sep 3, 2013, at 11:03 PM, gbr <[email protected]> wrote:

> Thanks. That's quite an interesting piece of code. There's a bit of magic 
> happening in this code and it's not quite compatible for my use case (use of 
> queries instead of tables, no ORM mapping), so allow me to ask some 
> questions. I've annotated the code, so perhaps you can correct any of my 
> assumptions that are wrong. My aim is to apply a similar concept to two 
> queries that are not mapped to a class.



keep in mind that any ORM Query has an accessor called .statement, which will 
give you the Core select() construct.


> def disjoint_load(query, attr):
>     # This is just to extract the join condition.
>     target = attr.prop.mapper
>     local_cols, remote_cols = zip(*attr.prop.local_remote_pairs)
>     
>     # As far as I can tell, this creates a SELECT from the original parent
>     # query.
>     # I'm not sure how this join works, as `attr` is a reference to 
>     # `Parent.children` (not a condition), but I guess I could replace it 
>     # with a condition that I pass in to the function.
>     # The `order_by` may not be necessary...
>     # Question: Does this also work if `target` already is a select query
>     # containing a CTE?
>     child_q = query.from_self(target).join(attr).order_by(*remote_cols)

"Parent.children" is as good as a (target, onclause) pair for Query.join(): the 
relationship attribute supplies both the join target and the ON clause.  see 
the examples in the tutorial for how this is used.

as for a CTE, the specifics will affect this, but you can often use 
query.select_entity_from(stmt) and the Query will use "stmt" in place of 
the original entity.


>     if attr.prop.order_by:
>         # No idea why/what this does. Is this necessary?
>         child_q = child_q.order_by(*attr.prop.order_by)

this maintains the "order_by" of the relationship(), if one was present.

>     
>     # This is creating an identity map (parent id -> children list), but how
>     # do we know the `parent.id` at this point? The query hasn't been issued
>     # yet...
>     collections = dict((k, list(v)) for k, v in groupby(
>                     child_q, 
>                     lambda x:tuple([getattr(x, c.key) for c in remote_cols])
>                 ))

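child_q is a Query, which means it's iterable.  groupby() is an itertools 
helper that is itself a lazy iterator.  when dict(...list(v)...) is invoked, 
the iteration runs: child_q is emitted as SQL to the database and the results 
are returned.  the keys aren't read from the parents at all; they come from 
the remote (foreign key) columns on each child row.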
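
A minimal sketch of that laziness, using only the standard library. The 
child_rows() generator is a hypothetical stand-in for iterating child_q (it 
is not SQLAlchemy API), and the rows are made up. It also shows why the 
order_by on the remote columns matters: groupby() only groups consecutive 
rows, so unsorted input silently loses data once the groups go into a dict.

```python
from itertools import groupby

executed = []  # records when the pretend "SQL" actually runs


def child_rows():
    """Hypothetical stand-in for iterating child_q; rows are (parent_id, name)."""
    executed.append("child query")  # stands in for the SQL being emitted
    # Rows arrive ordered by the grouping key, as ORDER BY remote_cols ensures.
    yield from [(1, "c1"), (1, "c2"), (2, "c3")]


# Building the groupby object does not consume the underlying iterator.
grouped = groupby(child_rows(), key=lambda row: (row[0],))
ran_before_dict = list(executed)  # still empty at this point

# dict(...) drives the iteration; only now does the "query" run.
collections = dict((k, list(v)) for k, v in grouped)

# Unsorted input: the second run of key (1,) replaces the first in the dict.
unsorted_rows = [(1, "a"), (2, "b"), (1, "c")]
broken = dict(
    (k, list(v)) for k, v in groupby(unsorted_rows, key=lambda row: (row[0],))
)
```

Here collections maps each parent key tuple to its list of child rows, while 
broken ends up with only the last run of rows for key (1,).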


>   
>     # `engine.echo=True` revealed that this is issuing 2 queries (which is
>     # what I want).
>     # The order is (1) query for children (joining on parent query), (2)
>     # parent query.
>     # How/where is the children query attached to the parent query and where
>     # is it sent?
>     parents = query.all()

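well, in the example here child_q is simply run right above, when the dict is 
built, before query.all() is called.  they are two separate statements; the 
results are only matched up in Python, in the loop that follows.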

>     
>     # This does the final assignment of 'list of children per parent' ->
>     # parent.children
>     for p in parents:
>         attributes.set_committed_value(
>             p, 
>             attr.key, 
>             collections.get(
>                 tuple([getattr(p, c.key) for c in local_cols]), 
>                 ())
>         )
>     return parents
> 
> 
> 
> This is pretty much what I was looking for, but it needs a bit of tinkering 
> to work for me. Do you think it's advisable to use some dummy classes to map 
> the two queries to in order to reuse as much as possible from the above (or 
> adapt it to work with select queries)? What would be the implications in 
> terms of performance (would any of the ORM features such as attribute 
> tracking, identity map, etc. that I don't necessarily need be used in such a 
> case)?

assembling an ORM Query is more expensive than assembling a core select(), but 
not by much.  as for the load overhead, if the Query is told to load individual 
columns rather than full entities, it returns plain tuples and the overhead 
drops to something very comparable to that of the ResultProxy itself.  
alternatively, you can execute the .statement returned by Query using 
execute().
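
For the fully unmapped case, the same two-query pattern can be applied to 
plain result rows with no ORM involved. The following is only a sketch under 
stated assumptions: it uses the standard library's sqlite3 module as a 
stand-in for executing Core statements, and the parent/child tables, their 
columns and the sample rows are all hypothetical.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
    INSERT INTO parent VALUES (1, 'p1'), (2, 'p2'), (3, 'p3');
    INSERT INTO child VALUES (1, 1, 'c1'), (2, 1, 'c2'), (3, 2, 'c3');
""")

# The "parent query": any SELECT that exposes the join column.
parent_sql = "SELECT id, name FROM parent WHERE id <= 3"

# Query 1: children joined against the parent query as a subquery,
# ordered by the grouping key so that groupby() sees each parent once.
child_sql = (
    "SELECT child.parent_id, child.name "
    "FROM (" + parent_sql + ") AS p "
    "JOIN child ON child.parent_id = p.id "
    "ORDER BY child.parent_id, child.id"
)
children_by_parent = {
    key: [name for _, name in rows]
    for key, rows in groupby(conn.execute(child_sql), key=lambda row: row[0])
}

# Query 2: the parents themselves; children are attached in Python.
result = [
    {"id": pid, "name": name, "children": children_by_parent.get(pid, [])}
    for pid, name in conn.execute(parent_sql + " ORDER BY id")
]
```

Parents without children simply get an empty list, which mirrors the () 
default passed to collections.get() in the ORM version.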
