I wish I could replicate this easily. However, I have a layer of abstraction on top of sqlalchemy, which makes this harder than it should be. I am actually building a user-level data loader.
The profiler seems to say that get_history is called when "cascade_iterator" is run on the properties. Here is a picture of the output: http://i.imgur.com/mNZoM.png

The other exceptional thing about my mappers is that there is an attribute extension on each non-relation mapped field. I don't know if that would make a difference, but I know it affects get_history. They also all have a version_id field.

I think the issue may be the number of times it is called, not the speed of the method itself. It appears to be called once per "add". As I manage the cascades myself, I probably do more "add"s than are needed.

I have used your patch and it does not make any difference; if anything it makes it very slightly worse. However, I have run the test a few more times and the effect is more modest than in my first tests. Putting the return at the top only seems to shave about 6% off on average. This, however, includes all the other things my application is doing (i.e. it validates each object, etc.), which take up about half the time. So it could mean sqlalchemy is losing about 12% because of it.

I hope you can glean something out of this. If I get time I will try to put together a profilable case. It may take me a while. Maybe setting up the mappers like I have and pathologically "add"ing will do the trick.

On Sep 29, 12:20 am, Michael Bayer <[email protected]> wrote:
> On Sep 28, 2010, at 6:32 PM, kindly wrote:
>
> > I have been doing some profiling on a batch job I have been running.
> >
> > I control all my own cascading, so I set the cascade flag on each
> > relation to "none". Even so mapper.cascade_iterator does quite a lot
> > of work.
> >
> > I did the crudest test by just placing a return at the top of
> > cascade_iterator. It speeds up my job by 10-20%. I imagine this
> > would be more if my relation tree was more complicated.
> >
> > Do you think this is worth having a mapper option for no cascades?
> > Or detecting there are not any and therefore not pre-emptively
> > recursing the relation tree?
>
> I'd need to see a specific example for detail on this. If all of your
> relationships() are configured with no cascade, you'd basically see calls to
> mapper.cascade_iterator() that bundle up all the relationships into a list,
> it then calls "cascade_iterator" on all of those, and they should all exit
> immediately.
>
> There is certainly a case to be made that the check for "cascades" could be
> done inside of mapper.cascade_iterator(), thereby avoiding the call to
> RelationshipProperty. There is further an optimization such that when
> mapper.cascade_iterator() calls upon self._props.itervalues(), that itself
> could be changed to return a cached collection of all RelationshipProperties
> which include the cascade - so in the case of no cascades, that collection
> would be empty, and mapper.cascade_iterator() could be reduced to one call,
> one boolean pull and then it returns. The catch of "StopIteration" is
> certainly a possible bottleneck on this, and we'd have to revisit Ants'
> refactoring here which originally allowed it to work without recursion.
>
> If we were to optimize this, that's how we would do it. A top level mapper
> option would be way too specific to the internals and esoteric to most users.
>
> What we need here though is some code to run profiling on. It's hard to
> understand that the cascade_iterator() call with several no-op calls to
> RelationshipProperty.cascade_iterator() is taking up 20% of a batch operation
> - since when cascade is called it's usually inside of some other operation
> like a merge() or flush() that is overall much more expensive.
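
To check my reading of the optimization described in the quoted reply, here is a toy sketch in plain Python. It is not SQLAlchemy's code: ToyMapper, ToyRelationship, the _cascading cache and the "calls" counter are all hypothetical names made up for illustration, and it only walks one level rather than the whole relation tree. The idea it tries to show is just the cached collection of cascading relationships, so that with no cascades the mapper-level call never reaches the per-relationship cascade_iterator at all.

```python
class ToyRelationship:
    """Hypothetical stand-in for a relationship property (not SQLAlchemy's)."""

    def __init__(self, key, cascades=()):
        self.key = key
        self.cascades = frozenset(cascades)
        self.calls = 0  # counts how often cascade_iterator actually runs

    def cascade_iterator(self, type_, obj):
        self.calls += 1
        if type_ not in self.cascades:
            return
        for child in getattr(obj, self.key, ()):
            yield child


class ToyMapper:
    """Hypothetical mapper that caches which relationships cascade."""

    def __init__(self, relationships):
        self._props = {r.key: r for r in relationships}
        self._cascading = {}  # cascade type -> tuple of cascading relationships

    def _cascading_props(self, type_):
        # Build the filtered collection once per cascade type, then reuse it.
        try:
            return self._cascading[type_]
        except KeyError:
            props = tuple(
                r for r in self._props.values() if type_ in r.cascades
            )
            self._cascading[type_] = props
            return props

    def cascade_iterator(self, type_, obj):
        props = self._cascading_props(type_)
        if not props:
            # No cascades configured: one lookup, one boolean test, return.
            return iter(())
        return self._iterate(type_, obj, props)

    def _iterate(self, type_, obj, props):
        # One level only; a real implementation would walk the whole tree.
        for prop in props:
            for child in prop.cascade_iterator(type_, obj):
                yield child


class Parent:
    def __init__(self, children):
        self.children = children
        self.tags = ["t1"]


parent = Parent(["a", "b"])

# Every relationship configured without cascades, as in my mappers.
no_cascade = ToyMapper([ToyRelationship("children"), ToyRelationship("tags")])
print(list(no_cascade.cascade_iterator("save-update", parent)))  # prints []

# Only "children" cascades, so "tags" is never consulted.
with_cascade = ToyMapper(
    [ToyRelationship("children", ["save-update"]), ToyRelationship("tags")]
)
print(list(with_cascade.cascade_iterator("save-update", parent)))  # prints ['a', 'b']
```

In this toy version the no-cascade mapper never calls cascade_iterator on either relationship, which is the saving I was seeing when I put the return at the top. Whether the real mapper can cache this as simply is an assumption on my part.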
