On Oct 18, 2011, at 3:40 PM, Russ wrote: > I often mix up the SQL expression language with the use of an ORM session, > and it is great that SQLAlchemy more than supports this. > > But... what are the recommended ways to keep the session in sync with what > you do with the SQL expression stuff?
well pretty much being saavy about expiration is the primary approach. The rows you affect via an execute(), if they've been loaded in the session they'd need to be expired from memory. Unless you can be coarse grained about it and expire everything (as occurs normally after a commit()), or large chunks of things, you'd need to work out specific situations as needed. Of course the best scenario is when you aren't intermingling ORM state with SQL-emitted UPDATE statements so much. Query.update() and Query.delete() were introduced to help with this as they can apply the same filter() criteria to the objects within the Session, expiring those that match - it can evaluate simple SQL criterion in memory for this purpose. > > For example, with the ORM you can't really do a batch/bulk insert (that I > know of), but you can mix in some SQL expression language to achieve it. > Here is a complete example that shows this for the standard Users/Addresses > example: > http://pastebin.com/BArU6hci so there, your options are: 1. expire bob.addresses ( see below, it appears you weren't aware of this) 2. use lazy="dynamic" on bob.addresses, so that SQL is emitted each time the relationship is accessed. > > Although the latter behaviour isn't really surprising, can the query be > avoided somehow? Is there a way to manually update the session/identity-map > with the info? You can put things straight into the identity map using Session.add(), assuming the object has a database identity. Session.merge(, load=False) does it too and is normally used to merge in state from a cache into the Session where the target identities may already be present. But to get identity on the object and have it be "clean" as far as pending changes you need to tinker. The identity key part needs to be via instance_state(obj).key = object_mapper(obj).identity_key_from_instance(obj), so you can see there you're starting to dive into private APIs. The "fixed" attributes with no history you can assign via set_committed_value: http://www.sqlalchemy.org/docs/orm/session.html#sqlalchemy.orm.attributes.set_committed_value , this is more of a public API as it's there to help people write custom loader logic for special cases. There would be nothing controversial about making a more "public" api to assign identity to an object but we don't usually add things without a use case we see fit; otherwise people don't know why such a method is there and if/when they should be using it. The use cases also drive how the API would look, as there are several ways a "put an object in the identity map" API could look here. You can see this approach is starting to tinker with things, it will work completely fine and the patterns there aren't changing anytime soon, but this is not the usual level that people work at. The APIs in this regard are not as well suited (or widely tested) towards the use case of manipulating the Session directly to that fine a degree. That said we already have "make_transient()" which is the opposite of "set_my_identity()", so, not too controversial as long as we don't confuse people with it. > > In general, is there a better way to do what I'm doing? The example is > obviously a simplified one... my actual use case is batch inserting thousands > of records this way and I'd like to be able to keep the ORM and avoid the > extra query if I could. Wondering though why I've never had this use case. Batch install thousands of records with straight INSERT, great. But then you need all of those thousands in memory via an ORM state all at once ? Why's that ? Are doing more "batch" work with all N-thousand items in the ORM sense ? Usually if I write an installer, then its done. The ORM use case comes later, loads in just a handful of those records, then works with them. If I need to batch insert and work with the full set of objects in an ORM sense simultaneously, I work with the ORM all the way through. In modern SQLAlchemy this can be nearly as fast - just assign the primary key value to your objects ahead of time so that no "post fetch of primary key" is needed, and the ORM will batch the INSERT statements just as you're doing with your example. Tons and tons of work has gone into getting the ORM to be better at batch inserts, since I use it in this way myself quite a lot. > Another wrinkle to this is that if I already have relationship data within > the ORM, but then add records outside of the ORM with the expression > language, I can't figure out how to reconcile this efficiently. > As a specific example, if I add the snippet below to my original example, you > can see the ORM falls out of sync. > #I can get them by expiring all of Joe... > # - is there a way to do this without expiring Joe? the set_committed_value() above can do this, though id be careful of complexity here > # - Can the relation be expired directly? absolutely, Session.expire accepts a second parameter that's a list of attribute names (for the things you're looking to do here, and since you have a very good sense of how the ORM does things, start digging into those API docs ! not sure if you looked around but its here: http://www.sqlalchemy.org/docs/orm/session.html#refreshing-expiring as well as docstrings at http://www.sqlalchemy.org/docs/orm/session.html#sqlalchemy.orm.session.Session.expire ) > # - Better would be to to inform the ORM of new data (instead of expiring old > # data), but I'm looking for workarounds. yeah its not something I've dealt with much since I try to find another way to solve the problem without introducing a lot of brittle back and forth. There's a larger problem here, which is why do you really need to use direct SQL to emit inserts and updates within code that is complex enough to otherwise require the ORM. -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
