On Sep 5, 2010, at 7:23 PM, Thadeus Burgess wrote:

> Seems that things break when you use __dict__.... So don't use it.
>
> Use a getattr and a setattr. If you really want, you can implement
> getitem and setitem that just wrap setattr and getattr on your model.
>
>
> Why doesn't sqlalchemy basemodel do this already? Everything in python
> is a dictionary, seems natural to provide dict access as an
> alternative by default.

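The getitem/setitem wrapper suggested in the quoted message can be sketched roughly as below. This is only an illustration, not part of SQLAlchemy: `DictAccessMixin` is a hypothetical name, and `Person` here is a plain class standing in for a mapped model. Because the mixin goes through getattr/setattr rather than __dict__, any descriptors on the class would still see every read and write.

```python
class DictAccessMixin:
    """Hypothetical mixin: dict-style access that delegates to getattr/setattr."""

    def __getitem__(self, key):
        try:
            return getattr(self, key)
        except AttributeError:
            # Behave like a dict for missing keys.
            raise KeyError(key) from None

    def __setitem__(self, key, value):
        # setattr goes through any descriptor defined on the class.
        setattr(self, key, value)


class Person(DictAccessMixin):
    """Plain stand-in for a mapped class (an assumption for this sketch)."""

    def __init__(self, name):
        self.name = name


me = Person("Thadeus")
me["name"] = "Thad"      # same as me.name = "Thad"
current = me["name"]     # same as me.name
```
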
First off, I don't know that it's accurate to say "everything in Python is a dictionary", if you mean a dict with foo['bar'] style access. Everything in Python is certainly based on a type with an attribute namespace, though, and our usage of descriptors is how we instrument that namespace on a user-defined class.

If __dict__ access were supported as an end-user system of working with transparently persisted objects, how would you go about tracking change events? Not to mention allowing stale/unloaded attributes to fire load events. We use Python descriptors for this purpose, as they are designed for just this use case, are the simplest, and perform the best.

There is the concept of replacing obj.__dict__ with a custom dictionary that instruments __getitem__ and __setitem__, but then there's no way to affect the state of an object internally, such as when it's being loaded from the DB, which is an extremely performance-critical block, without triggering those events as well when they're not appropriate, unless additional complex and performance-impacting schemes were devised to circumvent that (suffice to say it's been considered, long ago). Subclasses of dict, even if they do nothing, already perform more poorly than a raw dict due to the way Python optimizes non-subclassed dictionaries. So we prefer not to hardwire the extra latency and complexity of that approach.

Years ago, when there was still a hint of controversy over the __dict__ issue, we had an approach whereby upon load, every attribute would be populated as it is now into __dict__ directly, and additionally into a second, private dictionary. At flush time, every single attribute on every object in the session would be compared, "original" value versus __dict__ value, in order to detect changes, since we tried making the assumption that __dict__ might have been modified by the end user (basically, assuming the use case you're asking about here).

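To make the event-based idea concrete, here is a minimal pure-Python sketch of change tracking with a descriptor. It is not SQLAlchemy's actual instrumentation: `TrackedAttribute`, `Person.load`, `is_dirty` and `_modified` are hypothetical names invented for this illustration. It shows that an ordinary attribute set is recorded as an event, that a "load" can populate __dict__ directly without firing events, and that an end-user write to __dict__ bypasses the descriptor and goes unnoticed.

```python
class TrackedAttribute:
    """Hypothetical data descriptor that records "set" events on the instance."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        try:
            return obj.__dict__[self.name]
        except KeyError:
            raise AttributeError(self.name) from None

    def __set__(self, obj, value):
        obj.__dict__[self.name] = value
        obj._modified.add(self.name)   # the change event


class Person:
    name = TrackedAttribute()
    email = TrackedAttribute()

    def __init__(self):
        self._modified = set()

    @classmethod
    def load(cls, **row):
        """Simulate loading a row: write to __dict__ so no events fire."""
        obj = cls()
        obj.__dict__.update(row)
        return obj

    def is_dirty(self):
        # No per-attribute comparison needed, just check for recorded events.
        return bool(self._modified)


me = Person.load(name="Thadeus", email="old@example.com")
clean_after_load = me.is_dirty()          # loading recorded nothing

me.email = "new@example.com"              # goes through the descriptor
dirty_after_set = me.is_dirty()

other = Person.load(name="Someone", email="a@example.com")
other.__dict__["email"] = "b@example.com" # bypasses the descriptor
dirty_after_dict_write = other.is_dirty()
```

In this sketch the value written through `__dict__` is visible on the next read, but nothing records that it changed, which is the trade-off discussed above.
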
If you've worked with Python for any amount of time you'd know that this kind of compare-everything approach is crushingly slow compared to detecting only actual "set" and "delete" operations as events, both on the populate side and on the "did anything change?" side. Once autoflush was introduced, it became especially critical that we can detect that no changes have occurred within a session in O(1) time. The previous approach was one of the worst examples of horrendous amounts of processing time being spent on an almost completely vapor use case, and the project suffered (the nature of which I leave as an exercise for the reader) until we rewrote all that. In the end nobody really needed __dict__, and it was silly that we weren't using descriptors as they were designed.

So in exchange for not-bone-crushingly-slow performance (actually quite fast), flawless change tracking, and crisp, transparent refreshing of stale attributes from the database, the user has to give up being able to populate and access __dict__ directly for regular operations.

Suffice to say any ORM in Python you use, not to mention any other state management system, uses descriptors, to a lesser or greater degree depending on the system's reliance on intelligent state management. The __dict__ in turn is how the descriptors are usually bypassed.

You can certainly provide dict-like access to your objects using the usual __getitem__/__setitem__ approach. It's also possible to entirely modify how SQLAlchemy persists state on the object, using not __dict__ but some other means, and in fact you could plug in a change-tracking dict of your own and wire it all up, get your __dict__ that works and fires off the events SQLAlchemy looks for, and get all the requisite complexity and performance degradation (see examples/custom_attributes/custom_management.py) ...
but there's no reason to do that unless you were integrating with some other object management system (which is why we even have that extension point... I'd much prefer it wasn't needed). We did an integration with Trellis and one with Zope security policies, both of which required a very open-ended approach where __dict__ is not the usual thing we'd see.

While you may see it as bad news that __dict__ access is never going to be built in as a feature, I'd see it as great news - you're coming to SQLAlchemy over five years into the project, long after we've made and resolved every dumb mistake imaginable and been through all kinds of serious API upheavals and drama, and today SQLAlchemy is a fierce, battle-tested library deployed in thousands of environments with very few issues. Attribute instrumentation took us a really fricking long time to get right, and there's still work to be done.

Regarding commit, it expires all attributes so that new data, which is available once a new transaction begins, gets fetched - this is mentioned at http://www.sqlalchemy.org/docs/orm/session.html#committing (note the navigation is new for the docs). It can be disabled for situations where concurrent modifications to rows are not a concern.

>
> --
> Thadeus
>
>
> On Fri, Sep 3, 2010 at 9:08 PM, Thadeus Burgess <[email protected]> wrote:
>> If I have a record object.
>>
>> me = Person.query.get(id)
>>
>> and I access me.__dict__ everything looks good.
>>
>> However when I execute a db.session.commit()
>>
>> the me.__dict__ disappears and only contains _sa_state_instance
>>
>> The second I access an attribute of the me instance, __dict__ comes back.
>>
>> What is the best way to always make sure the __dict__ instance is
>> always populated with the object data without knowing any of the
>> column names ahead of time ?
>>
>> --
>> Thadeus
>>
>
> --
> You received this message because you are subscribed to the Google Groups
> "sqlalchemy" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/sqlalchemy?hl=en.
>
