On Sep 3, 2013, at 11:03 PM, gbr <[email protected]> wrote: > Thanks. That's quite an interesting piece of code. There's a bit of magic > happening in this code and it's not quite compatible for my use case (use of > queries instead of tables, no ORM mapping), so allow me to ask some > questions. I've annotated the code, so perhaps you can correct any of my > assumptions that are wrong. My aim is to apply a similar concept to two > queries that are not mapped to a class.
keep in mind any ORM query has an accessor called .statement which will give you the core select() construct. > def disjoint_load(query, attr): > # This is just to extract the join condition. > target = attr.prop.mapper > local_cols, remote_cols = zip(*attr.prop.local_remote_pairs) > > # As far as I can tell, this creates a SELECT from the original parent > query. > # I'm not sure how this join works, as `attr` is a reference to > # `Parent.children` (no a condition), but I guess I could replace it > # with a condition that I pass in to the function. > # The `order_by` may not be necessary... > # Question: Does this also work if `target` already is a select query > containing a CTE? > child_q = query.from_self(target).join(attr).order_by(*remote_cols) "Parent.children" is as good as a (target, onclause) for Query.join() - see the examples in the tutorial for how this is used. as far as a CTE, specifics will affect this but you can often use query.select_entity_from(stmt) and the Query will use "stmt" in the place of the original Entity. > if attr.prop.order_by: > # No idea why/what this does. Is this necessary? > child_q = child_q.order_by(*attr.prop.order_by) this is maintaining the "order_by" of the relationship(), if one was present. > > # This is creating an identity map (parent id -> children list), but how > do we > # know the `parent.id` at this point? The query hasn't been issued yet... > collections = dict((k, list(v)) for k, v in groupby( > child_q, > lambda x:tuple([getattr(x, c.key) for c in remote_cols]) > )) child_q is a Query, which means it's an iterator. groupby() is an itertools helper that also is an iterator. when dict(...list(v)...) is invoked, the iterator is run and child_q is emitted as SQL to the database, results are returned. > > # `engine.echo=True` revealed that this is issuing 2 queries (which is > what I want) > # The order is (1) query for children (joining on parent query), (2) > parent query > # How/where is the children query attached to the parent query and where > is it sent? > parents = query.all() well in the example here the child_q is just run right above before query.all().... > > # This does the final assignment of 'list of children per parent' -> > parent.children > for p in parents: > attributes.set_committed_value( > p, > attr.key, > collections.get( > tuple([getattr(p, c.key) for c in local_cols]), > ()) > ) > return parents > > > > This is pretty much what I was looking for, but it needs a bit of tinkering > to work for me. Do you think it's advisable to use some dummy classes to map > the two queries to in order to reuse as much as possible from the above (or > adapt it to work with select queries)? What would be the implications in > terms of performance (would any of the ORM features such as attribute > tracking, identity map, etc. that I don't necessarily need be used in such a > case)? assembling an ORM Query is more expensive than assembling a core select(), but not much. as far as the load overhead, if the Query is told to load individual columns, that overhead goes down to be very comparable to that of the ResultProxy itself (returns plain tuples), or you can execute the .statement returned by Query using execute().
signature.asc
Description: Message signed with OpenPGP using GPGMail
