I am trying to speed up Model.__init__, but it seems it is pretty well
optimized already. However, there are still some things that could be
done to speed it up even further.
Deprecate pre_init and post_init signals. I wonder if these are
actually used in third-party code. These signals are not the easiest
to use, as they get the field values in *args, or in **kwargs, or part
in *args, part in **kwargs. Django core uses them in generic foreign
keys:
django/contrib/contenttypes/generic.py:

    class GenericForeignKey(object):
        def contribute_to_class(self, cls, name):
            ...
            # For some reason I don't totally understand, using weakrefs
            # here doesn't work.
            signals.pre_init.connect(self.instance_pre_init, sender=cls,
                                     weak=False)
            ...

        def instance_pre_init(self, signal, sender, args, kwargs,
                              **_kwargs):
            """
            Handles initializing an object with the generic FK instead of
            content-type/object-id fields.
            """
            if self.name in kwargs:
                value = kwargs.pop(self.name)
                kwargs[self.ct_field] = self.get_content_type(obj=value)
                kwargs[self.fk_field] = value._get_pk_val()

It is probably possible to fix that use case.
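To illustrate why these receivers are awkward to write, here is a minimal sketch of the lookup a pre_init receiver has to do to find one field's value. It does not use Django; `find_init_value` and the `fields` list are hypothetical names made up for this example, and the field order is assumed to match the positional argument order.

```python
def find_init_value(field_names, name, args, kwargs):
    """Return (found, value) for field `name`.

    field_names is a hypothetical stand-in for the model's field names
    in positional order. The value may have been passed positionally,
    by keyword, or not at all.
    """
    if name in kwargs:
        return True, kwargs[name]
    index = field_names.index(name)
    if index < len(args):
        return True, args[index]
    return False, None


fields = ["id", "title", "body"]

# Value passed positionally.
print(find_init_value(fields, "title", (1, "hello"), {}))
# Value passed by keyword, with part of the values in args.
print(find_init_value(fields, "title", (1,), {"title": "hello"}))
# Value not passed at all.
print(find_init_value(fields, "body", (1,), {"title": "hello"}))
```

The instance_pre_init receiver quoted above only looks in kwargs, which works there because the generic FK name is not one of the positional fields.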
Are there more use cases? I would think the two possible use cases are
manipulating the initialization of third-party models and fields
needing to manipulate the initialization of a model (as in above).
I tested the effect of removing the signals using models T1 and T2. T1
has just an id field; T2 also has 10 text fields.
Fetch of 10000 objects from DB:
T1: 0.16 s -> 0.13 s
T2: 0.36 s -> 0.33 s
Without DB (just a loop with T(args) calls):
T1: 0.10s -> 0.07s
T2: 0.18s -> 0.15s
So, one could save about 10% to 20% using this.
If there are use cases that are hard to do without pre_init and
post_init signals, then 10% to 20% speed loss for allowing those cases
isn't that bad. I just wonder if there are use cases like that?
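For anyone who wants to get a feel for where the signal cost comes from, here is a rough sketch that does not use Django at all. `ToySignal`, `WithSignals` and `WithoutSignals` are hypothetical stand-ins I made up for this example; the real dispatcher does more work, so the numbers will not match the ones above.

```python
import timeit


class ToySignal:
    """Hypothetical stand-in for a signal; not Django's dispatcher."""

    def __init__(self):
        self.receivers = []

    def send(self, sender, **named):
        # Even with no receivers connected, the method call and the
        # keyword-argument packing still cost time on every __init__.
        return [(r, r(sender=sender, **named)) for r in self.receivers]


pre_init = ToySignal()
post_init = ToySignal()


class WithSignals:
    def __init__(self, *args, **kwargs):
        pre_init.send(sender=self.__class__, args=args, kwargs=kwargs)
        self.id = args[0] if args else kwargs.get("id")
        post_init.send(sender=self.__class__, instance=self)


class WithoutSignals:
    def __init__(self, *args, **kwargs):
        self.id = args[0] if args else kwargs.get("id")


def bench(cls, n=10000):
    # Time three rounds of constructing n objects, like the T(args) loop.
    return timeit.timeit(lambda: [cls(i) for i in range(n)], number=3)


print("with signals:    %.4f s" % bench(WithSignals))
print("without signals: %.4f s" % bench(WithoutSignals))
```

Both classes build identical objects; only the two send() calls differ, so the gap between the timings is the dispatch overhead in this toy setup.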
One idea is to change _state from object to dict. This shaves off an
additional 0.01 seconds per 10000 objects. The question here is if
_state should have logic attached to it. Currently it does not have
any logic, it is just a data container.
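The object-versus-dict difference can be sketched in isolation like this. The `State` class and its single `db` attribute are assumptions made for the example, not a copy of the real class, and the timings depend on the Python version.

```python
import timeit


class State:
    """Plain data container with no logic (hypothetical stand-in)."""

    def __init__(self, db=None):
        self.db = db


def make_object_state():
    # Creating an instance runs __init__, which is a Python-level call.
    return State()


def make_dict_state():
    # A dict literal avoids the extra function call.
    return {"db": None}


n = 10000
print("object: %.4f s" % timeit.timeit(make_object_state, number=n))
print("dict:   %.4f s" % timeit.timeit(make_dict_state, number=n))
```

The catch is that attribute access (state.db) would become key access (state["db"]), so any code that touches _state would have to change.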
Still another idea is to get rid of the izip call and use an index
variable instead. That seems to shave off another 0.01 to 0.02 seconds
per 10000 objects. I need to work a little more on that idea still.
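A rough sketch of the two loop styles, outside Django. `Field`, `FIELDS`, `ZipInit` and `IndexInit` are hypothetical names for this example only. It is written for Python 3, where the built-in zip is lazy like izip was on Python 2.

```python
import timeit
from collections import namedtuple

# Hypothetical stand-in for the model's field list.
Field = namedtuple("Field", "attname")
FIELDS = [Field("id"), Field("a"), Field("b")]


class ZipInit:
    def __init__(self, *args):
        # Pair each positional value with its field via zip.
        for val, field in zip(args, FIELDS):
            setattr(self, field.attname, val)


class IndexInit:
    def __init__(self, *args):
        # Same assignment, tracking the position with an index variable.
        i = 0
        for val in args:
            setattr(self, FIELDS[i].attname, val)
            i += 1


def bench(cls, n=10000):
    return timeit.timeit(lambda: [cls(i, "x", "y") for i in range(n)], number=3)


print("zip:   %.4f s" % bench(ZipInit))
print("index: %.4f s" % bench(IndexInit))
```

One behavioural difference to keep in mind: zip silently stops at the shorter sequence, while the index version raises IndexError if more values than fields are passed. Whether the index version is actually faster will depend on the interpreter.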
The total speedup for 10000 T1(id_val) calls is about 50%, and for
fetching 10000 T2 objects from the DB it is around 20%.
Otherwise I can't think of anything to speed up model __init__ without
going to code generation.
For the interested, using cursor.execute("select * from t1");
list(cursor.fetchall()) takes 0.015 seconds. For t2 the time is 0.1
seconds. This is using postgresql. So, the overhead of fetching
objects instead of as-raw-as-possible sql is around 200% with all
optimizations, and up to 250-500+% without any optimizations.
- Anssi