This is indeed a very interesting result and I am able to reproduce 
similar ratios for total running time.

However, I think the semilogx result is somewhat of a red herring.  If 
you change the order of the tests in your script, you'll notice that the 
first "*log*" plot always takes the longest to run, whichever one it is.  
If you run each test in a separate process, all of the "*log*" run times 
are approximately equal (with loglog being slightly slower).  The reason for 
this is the caching of mathtext expressions.  I agree that mathtext is 
the bottleneck -- but mathtext expressions are only parsed and rendered 
the first time they are encountered, and simply pulled from a cache 
after that.
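
To take the cache out of the picture, each command can be timed in its own interpreter. The following is only a rough sketch of that idea, not the attached profiling script: the helper names (build_script, time_in_fresh_process), the Agg backend, the PNG output, and the random 5000-point data are all assumptions made for illustration.

```python
import subprocess
import sys

COMMANDS = ("plot", "semilogx", "semilogy", "loglog")


def build_script(command, n_points=5000):
    """Return source for a child process that times one plotting command.

    Hypothetical helper: the child builds its own data, draws one figure
    with the given Axes method, and prints the elapsed seconds.
    """
    return "\n".join([
        "import io, time",
        "import matplotlib",
        "matplotlib.use('Agg')  # non-interactive backend (assumption)",
        "import matplotlib.pyplot as plt",
        "import numpy as np",
        f"y = np.random.rand({n_points}) + 1e-3  # strictly positive data",
        "x = np.arange(1, y.size + 1)",
        "start = time.perf_counter()",
        "fig, ax = plt.subplots()",
        f"ax.{command}(x, y)",
        "fig.savefig(io.BytesIO(), format='png')  # drawing renders the labels",
        "print(time.perf_counter() - start)",
    ])


def time_in_fresh_process(command):
    """Run one command in a new interpreter, so no mathtext cache is shared."""
    result = subprocess.run(
        [sys.executable, "-c", build_script(command)],
        capture_output=True, text=True, check=True,
    )
    # The timing is the last line the child prints.
    return float(result.stdout.strip().splitlines()[-1])
```

Calling time_in_fresh_process(c) for each c in COMMANDS gives one number per command, and since every child starts with an empty cache, the order of the calls should no longer matter.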

It's sort of a "known issue" that mathtext is slow-ish.  It's a very 
function-call heavy and object-oriented bit of code and most attempts at 
optimization seem to lead to too much uglification.  The algorithms 
themselves are from TeX, so I don't know if there's much room for 
improvement, but there is something about the translation from Pascal/C 
to Python that creates a very different performance profile.

An interesting experiment would be to disable the mathtext rendering for 
log plots (by setting the axis formatters to something static) and 
compare those numbers.  That would give a better sense of the overhead of 
merely log-transforming the points and of the transformation system 
itself.  I don't think a factor of 2 is too problematic, given all of the 
extra work that has to be done: maintaining two copies of the data, 
taking extra care to calculate xlim and ylim, etc.
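
Something along these lines might do for that comparison. It is a sketch under assumptions, not a tested benchmark: time_plot is a made-up helper, and FormatStrFormatter("%g") with NullFormatter for the minor ticks is just one way to get plain-text labels that never reach mathtext.

```python
import io
import time

COMMANDS = ("plot", "semilogx", "semilogy", "loglog")


def time_plot(command, static_labels=False, n_points=5000):
    """Time one plotting command, optionally with plain-text tick labels."""
    # Imported here so the backend is selected before pyplot is loaded.
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend (assumption)
    import matplotlib.pyplot as plt
    import numpy as np
    from matplotlib.ticker import FormatStrFormatter, NullFormatter

    y = np.random.rand(n_points) + 1e-3  # strictly positive data
    x = np.arange(1, n_points + 1)

    start = time.perf_counter()
    fig, ax = plt.subplots()
    getattr(ax, command)(x, y)  # ax.plot, ax.semilogx, ax.semilogy or ax.loglog
    if static_labels:
        for axis in (ax.xaxis, ax.yaxis):
            # Plain "%g" strings instead of the default log tick labels.
            axis.set_major_formatter(FormatStrFormatter("%g"))
            axis.set_minor_formatter(NullFormatter())
    fig.savefig(io.BytesIO(), format="png")
    elapsed = time.perf_counter() - start
    plt.close(fig)
    return elapsed
```

With static_labels=True, whatever gap remains between plot and the log variants should mostly be the transform and limit handling.  The static_labels=False timings are still subject to the cache, so those are best taken in separate processes.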

Mike

Andrew Hawryluk wrote:
> I've observed a significant difference in the time required by different
> plotting functions. With a plot of 5000 random data points (all
> positive, non-zero), plt.semilogx takes 3.5 times as long as plt.plot.
> (Data for the case of saving to PDF, ratio changes to about 3.1 for PNG
> on my machine.)
>
> I used cProfile (script attached) and found several significant
> differences between the profiles of each plotting command. On my first
> analysis, it appears that most of the difference is due to increased use
> of mathtext in semilogx:
>
>                                 ==================================
>                                 Plotting command
> ==================================================================
> cumtime (s)                     plot    semilogx  semilogy  loglog
> ==================================================================
> total running time              0.618   2.192     0.953     1.362
> axis.py:181(draw)               0.118   1.500     0.412     0.569
> text.py:504(draw)               0.056   1.353     0.290     0.287
> mathtext.py:2765(__init__)      0.000   1.018     0.104     0.103
> mathtext.py:2772(parse)         ---     1.294     0.143     0.254
> pyparsing.py:1018(parseString)  ---     0.215     0.216     0.221
> pyparsing.py:3129(oneOf)        ---     0.991     ---       ---
> pyparsing.py:3147(<lambda>)     ---     0.358     ---       ---
> lines.py:918(_draw_solid)       0.243   0.358     0.234     0.352
> ==================================================================
>
> It seems that semilogx could be made as fast as semilogy since they have
> to do the same amount of work, but I'm not sure where the differences
> lie. Can anyone suggest where I should look first?
>
> Much thanks,
>
> Andrew Hawryluk
>
> matplotlib.__version__ = '0.99.1'
> Windows XP Professional
> Version 2002, Service Pack 3
> Intel Pentium 4 CPU 3.00 GHz, 2.99 GHz, 0.99 GB of RAM
>   
> ------------------------------------------------------------------------
>
> ------------------------------------------------------------------------------
> Download Intel® Parallel Studio Eval
> Try the new software tools for yourself. Speed compiling, find bugs
> proactively, and fine-tune applications for parallel performance.
> See why Intel Parallel Studio got high marks during beta.
> http://p.sf.net/sfu/intel-sw-dev
> ------------------------------------------------------------------------
>
> _______________________________________________
> Matplotlib-devel mailing list
> Matplotlib-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/matplotlib-devel
>   

-- 
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA

