Hi Michael,
I've ported your modifications on my tests, but I still can't get the same
results: runs 2X slower than with 0.2.8 and I can't figure out why (MySQL
version ???)... Anyway, I've continued my investigation, and may have a
proposal to make it way faster.
Back to my profiling results, I've compared the ncalls between 0.2.8 and
0.3.1. Thats ~700 000 vs. ~ 2 400 000 calls (~ 3X more), running the test. As
you said, this is due the "two-or-three hop path from MapperProperty->eager
loading code", and reducing this path doesn't improve speed that much. But
profiling also shows a interresting thing: "isinstance" is called 10X more in
0.3 than 0.2.8. Looking at SA code, I've found that
orm.attributes.AttributeManager.managed_attributes() is called a *lot* of
time while saving objs. Clearly, this ncalls difference comes from here. This
func, given a class, returns a list (yielded) a InstrumentedAttribute. The
problem is it's called for each *object*, repeatidly searching those
attributes:
for attr in manager.managed_attributes(obj.__class__):
# blabla
If objects are of the same class, this dramatically decrease the speed. I can
suggest to cache the results of this func. In
orm.attributes.AttributeManager.managed_attributes():
class AttributeManager(object):
def __init__(self):
self.attr_cache = {}
# blabla
def managed_attributes(self, class_):
# results in cache ?
if self.attr_cache.has_key(class_):
return self.attr_cache[class_]
# nope. caching...
self.attr_cache[class_] = []
if not isinstance(class_, type):
raise repr(class_) + " is not a type"
for key in dir(class_):
value = getattr(class_, key, None)
if isinstance(value, InstrumentedAttribute):
self.attr_cache[class_].append(value)
return self.attr_cache[class_]
(yes, yield is not used anymore is this code...)
This is memory vs. cpu optimisation, and I didn't track the used memory, but I
bet it's not that much... And that's worth it! Here the results *without*
this modifications (0.3.1 rev 2127 still ~ 2X slower for me):
SA 0.2.8:
2006-12-06 12:13:44.932276: Loading sis
2006-12-06 12:13:45.614073: 377 sis loaded
2006-12-06 12:13:45.614562: Creating indicators
2006-12-06 12:13:45.735367: 127 indicators created
2006-12-06 12:13:45.736245: registering...
2006-12-06 12:13:46.966250: ok done !
total time 2.03361010551
real 0m2.595s
user 0m1.860s
sys 0m0.112s
SA 0.3.1 rev 2127
2006-12-06 12:14:30.619091: Loading sis
2006-12-06 12:14:31.466031: 377 sis loaded
2006-12-06 12:14:31.466499: Creating indicators
2006-12-06 12:14:31.562919: 127 indicators created
2006-12-06 12:14:31.563745: registering...
2006-12-06 12:14:34.798755: ok done !
total time 4.17961120605
real 0m4.770s
user 0m4.152s
sys 0m0.092s
Now with caching, this gives:
SA 0.3.1 rev 2127 + attr caching:
2006-12-06 12:17:28.765317: Loading sis
2006-12-06 12:17:29.487502: 377 sis loaded
2006-12-06 12:17:29.487972: Creating indicators
2006-12-06 12:17:29.583032: 127 indicators created
2006-12-06 12:17:29.583848: registering...
2006-12-06 12:17:31.175295: ok done !
total time 2.40989303589
real 0m3.115s
user 0m2.368s
sys 0m0.100s
What do you think about this suggestion ?
Cheers,
--
Sébastien LELONG
[EMAIL PROTECTED]
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at
http://groups.google.com/group/sqlalchemy?hl=en
-~----------~----~----~----~------~----~------~--~---