On Tue, 2007-06-12 at 06:16 -0700, Brian Harring wrote:
> On Mon, Jun 11, 2007 at 07:39:08PM +1000, Malcolm Tredinnick wrote:
> >
> > On Sun, 2007-06-10 at 09:07 -0700, Brian Harring wrote:
> > > Curious, how many folks are actually using dispatch at all?
> > >
> > > For my personal usage, I'm actually not using any of the hooks- I
> > > suspect most folks aren't either. That said, I'm paying a fairly
> > > hefty price for them.
> > >
> > > With Model.__init__'s send left in for 53.3k record instantiation
> > > (just a walk of the records), time required is 9.2s. Without the
> > > send, takes 7.0s. Personally, I'd like to get that quarter of the
> > > time slice back. :)
> >
> > Since you already have your own version of the Spanish Inquisition set
> > up for testing, what portion of this overhead is just the function call?
> > If the dispatch function is replaced with just "return", do we save
> > much?
>
> Offhand, replacing the dispatch with just 'return' is actually
> semi tricky, since there are a few receivers required for the django
> internals (class preparation). Basically requires delegating the send
> to the signal in select cases (for *_delete, and request_*, don't see
> much option unless they can be shifted around also).
>
> For __init__ and save however, the wrap trick will fly- meaning don't
> even need the empty function call.
>
> Either way, profile dump follows.
>
> Top 30 via lsprof (cProfile for 2.5); with send left in Model.__init__
>
> >>> ps.sort_stats("ti").print_stats(30)
> Mon Jun 11 02:55:18 2007 dump.stats
>
> 1747388 function calls (1745991 primitive calls) in 18.627 CPU seconds
>
> Ordered by: internal time
> List reduced from 916 to 30 due to restriction <30>
>
> ncalls tottime percall cumtime percall filename:lineno(function)
> 53332 3.314 0.000 9.046 0.000 base.py:97(__init__)
> 106720 2.592 0.000 3.162 0.000 dispatcher.py:271(getAllReceivers)
> 373318 1.995 0.000 3.056 0.000 base.py:38(utf8)
> 2 1.723 0.861 1.723 0.861 base.py:99(execute)
> 53332 1.631 0.000 4.687 0.000 base.py:37(utf8rowFactory)
> 537 1.540 0.003 6.227 0.012 ~:0(<method 'fetchmany' of 'pysqlite2.dbapi2.Cursor' objects>)
> 380348 1.329 0.000 1.329 0.000 ~:0(<setattr>)
> 196950 1.061 0.000 1.061 0.000 ~:0(<method 'encode' of 'unicode' objects>)
> 53334 0.809 0.000 17.958 0.000 query.py:171(iterator)
> 106678 0.808 0.000 3.980 0.000 dispatcher.py:317(send)
> 213374 0.570 0.000 0.570 0.000 ~:0(<id>)
> 116228/115981 0.321 0.000 0.322 0.000 ~:0(<len>)
> 3 0.168 0.056 18.126 6.042 query.py:468(_get_data)
> 53334 0.162 0.000 0.162 0.000 ~:0(<iter>)
> 60 0.060 0.001 0.118 0.002 functional.py:26(__init__)
> 245/60 0.042 0.000 0.158 0.003 sre_parse.py:374(_parse)
> 6660 0.036 0.000 0.036 0.000 functional.py:36(__promise__)
> 3118 0.035 0.000 0.051 0.000 sre_parse.py:182(__next)
> 458/56 0.032 0.000 0.126 0.002 sre_compile.py:27(_compile)
> 1 0.027 0.027 0.032 0.032 sre_compile.py:296(_optimize_unicode)
> 206 0.022 0.000 0.067 0.000 sre_compile.py:202(_optimize_charset)
> 7 0.021 0.003 0.455 0.065 __init__.py:1(?)
> 6360 0.017 0.000 0.017 0.000 ~:0(<method 'append' of 'list' objects>)
> 2573 0.017 0.000 0.059 0.000 sre_parse.py:201(get)
> 634/243 0.014 0.000 0.018 0.000 sre_parse.py:140(getwidth)
> 1 0.014 0.014 18.627 18.627 full-run.py:2(?)
> 19/12 0.012 0.001 0.187 0.016 ~:0(<__import__>)
> 1 0.008 0.008 0.011 0.011 socket.py:43(?)
> 1 0.007 0.007 0.109 0.109 urllib2.py:71(?)
> 182/57 0.007 0.000 0.160 0.003 sre_parse.py:301(_parse_sub)
>
>
> <pstats.Stats instance at 0xb7ce268c>
>
> without
>
> >>> ps.sort_stats("ti").print_stats(30)
> Mon Jun 11 03:02:10 2007 /home/bharring/dump2.stats
>
> 1320732 function calls (1319335 primitive calls) in 13.970 CPU seconds
>
> Ordered by: internal time
> List reduced from 916 to 30 due to restriction <30>
>
> ncalls tottime percall cumtime percall filename:lineno(function)
> 53332 2.428 0.000 4.475 0.000 base.py:97(__init__)
> 373318 2.004 0.000 3.059 0.000 base.py:38(utf8)
> 2 1.715 0.858 1.716 0.858 base.py:99(execute)
> 380348 1.623 0.000 1.623 0.000 ~:0(<setattr>)
> 53332 1.617 0.000 4.676 0.000 base.py:37(utf8rowFactory)
> 537 1.538 0.003 6.215 0.012 ~:0(<method 'fetchmany' of 'pysqlite2.dbapi2.Cursor' objects>)
> 196950 1.055 0.000 1.056 0.000 ~:0(<method 'encode' of 'unicode' objects>)
> 53334 0.744 0.000 13.305 0.000 query.py:171(iterator)
> 116228/115981 0.315 0.000 0.317 0.000 ~:0(<len>)
> 3 0.166 0.055 13.470 4.490 query.py:468(_get_data)
> 53334 0.158 0.000 0.158 0.000 ~:0(<iter>)
> 60 0.060 0.001 0.119 0.002 functional.py:26(__init__)
> 245/60 0.042 0.000 0.158 0.003 sre_parse.py:374(_parse)
> 6660 0.036 0.000 0.036 0.000 functional.py:36(__promise__)
> 3118 0.035 0.000 0.051 0.000 sre_parse.py:182(__next)
> 458/56 0.032 0.000 0.127 0.002 sre_compile.py:27(_compile)
> 1 0.027 0.027 0.032 0.032 sre_compile.py:296(_optimize_unicode)
> 206 0.022 0.000 0.067 0.000 sre_compile.py:202(_optimize_charset)
> 7 0.021 0.003 0.455 0.065 __init__.py:1(?)
> 2573 0.017 0.000 0.059 0.000 sre_parse.py:201(get)
> 6360 0.017 0.000 0.017 0.000 ~:0(<method 'append' of 'list' objects>)
> 634/243 0.014 0.000 0.018 0.000 sre_parse.py:140(getwidth)
> 1 0.014 0.014 13.970 13.970 full-run.py:2(?)
> 19/12 0.011 0.001 0.188 0.016 ~:0(<__import__>)
> 1 0.008 0.008 0.011 0.011 socket.py:43(?)
> 182/57 0.007 0.000 0.160 0.003 sre_parse.py:301(_parse_sub)
> 1 0.007 0.007 0.109 0.109 urllib2.py:71(?)
> 1456 0.007 0.000 0.015 0.000 sre_parse.py:195(match)
> 206 0.007 0.000 0.076 0.000 sre_compile.py:173(_compile_charset)
> 20 0.006 0.000 0.007 0.000 sre_compile.py:253(_mk_bitmap)
>
>
> Model.__init__ is still a bit of a kick in the teeth offhand;
> addressing that one however requires some semi-nasty work shifting
> some of the fields related testing to be cached in _meta; not
> expecting a huge gain out of it, plus it'll likely be fairly nasty so
> I'd rather hold off on that one till a later date.

It's also going to get a little heavier (not noticeable in the common
cases, maybe noticeable in the 10^6 case) with the unicode branch and
when we add Field sub-classing (which will be very soon), since both
require an extra function call per field in the average case. This is
the standard trade-off: functionality costs time and the functionality
in both cases is worth having.

> Not yet advocating it (mainly since digging it out would be ugly), but
> if you take a look at the bits above, having the option to disable
> verification on read *would* have a nice kick in the pants for ORM
> object instantiation when the admin has decided the data is guaranteed
> to be the correct types.

I have a blog post I'm still collecting the data for (making pretty
graphs, mostly), but it seems that in the extreme cases you can actually
make instantiation much faster by just writing an __init__ method on the
Model sub-class that does just the right thing -- customised for the
model fields -- since it's only an attribute populator when you get
right down to it. One can even generate this code automatically. Again,
not something to worry about for the 90% case, but for people wanting to
create 10^6 objects, it's worth the ten minutes to code it up by hand.
That'll appear on the community aggregator when I post it.
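To make the "attribute populator" idea concrete, here is a rough sketch
in plain Python. It deliberately does not use Django: GenericRecord,
Person, GeneratedPerson and make_init are hypothetical names made up for
illustration, and a real Model.__init__ does considerably more than this
generic loop. Treat it as the shape of the idea, not a drop-in
replacement.

```python
# Plain-Python stand-in for the idea; none of these names come from Django.

class GenericRecord:
    """Generic populator: walks the field list for every instance."""
    fields = ()

    def __init__(self, *args):
        for name, value in zip(self.fields, args):
            setattr(self, name, value)


class Person(GenericRecord):
    fields = ("id", "first_name", "last_name")

    def __init__(self, id, first_name, last_name):
        # Hand-written for exactly these fields: no loop, no zip, no setattr().
        self.id = id
        self.first_name = first_name
        self.last_name = last_name


def make_init(fields):
    """Generate an equivalent __init__ from a sequence of field names."""
    args = ", ".join(fields)
    body = "\n".join("    self.%s = %s" % (f, f) for f in fields) or "    pass"
    source = "def __init__(self, %s):\n%s\n" % (args, body)
    namespace = {}
    exec(source, namespace)  # compile the generated source once, at class creation
    return namespace["__init__"]


class GeneratedPerson(GenericRecord):
    fields = ("id", "first_name", "last_name")
    __init__ = make_init(fields)
```

Person(1, "Ada", "Lovelace") and GeneratedPerson(1, "Ada", "Lovelace")
end up with the same attributes as the generic version would set.
Whether the specialised form is actually faster for a given model is
something to measure rather than assume.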
[...]

> > Given that upstream pydispatcher isn't really being maintained, I don't
> > think we should be too hesitant to tweak it for our needs.
>
> Don't spose it could just be thrown out? The code really *is* ugly :)

Agree with the second part (although I'd be less hyperbolic; if I saw it
from a colleague, we'd be having discussions). Keeping the API similar
to what it is (or at least, "routine" to port -- possible to do with a
reg-exp, say) would be worthwhile, since there is a lot of code in the
wild using the signal infrastructure. The main argument for keeping the
implementation as it is now (expressed by Jacob in the past, but it
makes sense) was to ease our maintenance by synchronising with
upstream. Upstream has gone away. It's drought conditions. (As usual,
speaking only for myself, but I don't think this is too highly
controversial.)
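As a rough illustration of "similar API, simpler implementation", the
sketch below keeps connect/send-style calls but returns immediately when
nothing is listening, which is the case the profile above is paying
for. MiniDispatcher and its method signatures are hypothetical: this is
neither the PyDispatcher code nor a proposed Django API, and it ignores
weak references, sender filtering and the other features the real
dispatcher has.

```python
class MiniDispatcher:
    """Toy signal registry with a cheap path for signals nobody listens to."""

    def __init__(self):
        self._receivers = {}  # signal -> list of callables

    def connect(self, receiver, signal):
        self._receivers.setdefault(signal, []).append(receiver)

    def disconnect(self, receiver, signal):
        receivers = self._receivers.get(signal, [])
        if receiver in receivers:
            receivers.remove(receiver)
        if not receivers:
            # Drop empty entries so send() stays on the fast path.
            self._receivers.pop(signal, None)

    def send(self, signal, sender=None, **named):
        receivers = self._receivers.get(signal)
        if not receivers:
            # Fast path: one dict lookup, no receiver resolution at all.
            return []
        # Copy the list so receivers may disconnect themselves while running.
        return [(r, r(signal=signal, sender=sender, **named))
                for r in list(receivers)]
```

With no receivers connected, send() costs a dictionary lookup and a
truth test; with receivers, it returns a list of (receiver, response)
pairs.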
Regards,
Malcolm