Re: [matplotlib-devel] performance (speed) of logarithmic plots

2010-03-19 Thread Gökhan Sever
On Thu, Mar 18, 2010 at 6:21 PM, Andrew Hawryluk hawr...@novachem.com wrote:

 I've observed a significant difference in the time required by different
 plotting functions. With a plot of 5000 random data points (all
 positive, non-zero), plt.semilogx takes 3.5 times as long as plt.plot.
 (These figures are for saving to PDF; the ratio changes to about 3.1 for
 PNG on my machine.)

 I used cProfile (script attached) and found several significant
 differences between the profiles of each plotting command. On my first
 analysis, it appears that most of the difference is due to increased use
 of mathtext in semilogx:

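 ====================================================================
                                 Plotting command
 ====================================================================
 cumtime (s)                     plot      semilogx  semilogy  loglog
 ====================================================================
 total running time              0.618     2.192     0.953     1.362
 axis.py:181(draw)               0.118     1.500     0.412     0.569
 text.py:504(draw)               0.056     1.353     0.290     0.287
 mathtext.py:2765(__init__)      0.000     1.018     0.104     0.103
 mathtext.py:2772(parse)         ---       1.294     0.143     0.254
 pyparsing.py:1018(parseString)  ---       0.215     0.216     0.221
 pyparsing.py:3129(oneOf)        ---       0.991     ---       ---
 pyparsing.py:3147(<lambda>)     ---       0.358     ---       ---
 lines.py:918(_draw_solid)       0.243     0.358     0.234     0.352
 ====================================================================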

 It seems that semilogx could be made as fast as semilogy since they have
 to do the same amount of work, but I'm not sure where the differences
 lie. Can anyone suggest where I should look first?

 Much thanks,

 Andrew Hawryluk

 matplotlib.__version__ = '0.99.1'
 Windows XP Professional
 Version 2002, Service Pack 3
 Intel Pentium 4 CPU 3.00 GHz, 2.99 GHz, 0.99 GB of RAM



Hello,

How did you get the cumtime listing? The output of the run doesn't produce a
cumulative sum table as you showed here.
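
For reference, a per-function cumtime listing can be produced with the standard library's cProfile and pstats modules. The sketch below is only an assumption about how such numbers can be collected; the attached script is not reproduced in this thread, and the helper name profile_cumtime and the placeholder workload are hypothetical. A side-by-side table like the one quoted above would still have to be assembled by hand from one such listing per plotting command.

```python
import cProfile
import io
import pstats


def profile_cumtime(func, limit=10):
    """Profile func() and return the stats table sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # "cumulative" sorts by the cumtime column: time spent in a function
    # including everything it calls.
    stats.sort_stats("cumulative").print_stats(limit)
    return out.getvalue()


def work():
    # Placeholder workload (hypothetical); swap in the plotting call to measure.
    return sum(i * i for i in range(100_000))


print(profile_cumtime(work))
```

Each row of the printed listing ends in filename:lineno(function), which is the form of the row labels in the table above.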


Platform   : Linux-2.6.31.9-174.fc12.i686.PAE-i686-with-fedora-12-Constantine
Python     : ('CPython', 'tags/r262', '71600')
NumPy      : 1.5.0.dev8038
Matplotlib : 1.0.svn




-- 
Gökhan
_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-devel


Re: [matplotlib-devel] performance (speed) of logarithmic plots

2010-03-19 Thread Michael Droettboom
This is indeed a very interesting result and I am able to reproduce 
similar ratios for total running time.

However, I think the semilogx result is somewhat of a red herring.  If 
you change the order of the tests in your script, you'll notice that the 
first *log* plot always takes the longest run time.  If you run each 
test in a separate process, all of the *log* run times are 
approximately equal (with loglog being slightly slower).  The reason for 
this is the caching of mathtext expressions.  I agree that mathtext is 
the bottleneck -- but mathtext expressions are only parsed and rendered 
the first time they are encountered, and simply pulled from a cache 
after that.
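
The order effect described above can be reproduced with a toy cache. This is only an illustration of the general mechanism, not matplotlib's actual mathtext code: render_label is a hypothetical stand-in for an expensive parse-and-render step, and the sleep simulates its cost.

```python
import time
from functools import lru_cache


@lru_cache(maxsize=None)
def render_label(expr):
    """Hypothetical stand-in for an expensive parse-and-render step."""
    time.sleep(0.05)  # simulated parsing cost, paid only on a cache miss
    return f"<rendered {expr}>"


def draw_axis(labels):
    """Render every tick label and return the elapsed time in seconds."""
    start = time.perf_counter()
    for expr in labels:
        render_label(expr)
    return time.perf_counter() - start


# Tick labels of the kind a log axis shows: $10^{-3}$ ... $10^{0}$
labels = [f"$10^{{{n}}}$" for n in range(-3, 1)]

first = draw_axis(labels)   # every label is a cache miss
second = draw_axis(labels)  # every label is a cache hit

print(f"first pass:  {first:.3f} s")
print(f"second pass: {second:.3f} s")
```

Whichever "plot" runs first in the process pays for populating the cache, which is why timing each test in a fresh process gives a fairer comparison.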

It's sort of a known issue that mathtext is slow-ish.  It's a very 
function-call heavy and object-oriented bit of code and most attempts at 
optimization seem to lead to too much uglification.  The algorithms 
themselves are from TeX, so I don't know if there's much room for 
improvement, but there is something about the translation from Pascal/C 
to Python that creates a very different performance profile.

An interesting experiment would be to disable the mathtext rendering for 
log plots (by setting the axis formatters to something static) and compare 
those numbers.  That would give a better sense of the overhead of merely 
log-transforming the points and of the transformation system itself.  I 
don't think a factor of 2 is too problematic, given all of the extra 
work that has to be done to maintain two copies of the data, the extra 
care needed to calculate xlim and ylim, etc.
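
A minimal sketch of that experiment, assuming a recent matplotlib and NumPy (plt.subplots and numpy.random.default_rng are not available in 0.99.1). The Agg backend, the in-memory PNG target, the "%g" format string, and the helper name draw_semilogx are all choices made for this illustration, not something taken from the original script.

```python
import io
import time

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for this sketch
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import FormatStrFormatter, NullFormatter


def draw_semilogx(static_labels):
    """Draw one semilogx plot; return (seconds to save, x tick label strings)."""
    rng = np.random.default_rng(0)
    x = rng.random(5000) + 1e-3  # strictly positive, as a log axis requires
    y = rng.random(5000)

    fig, ax = plt.subplots()
    ax.semilogx(x, y)
    if static_labels:
        # Plain-text tick labels, so no mathtext is parsed for this axis.
        ax.xaxis.set_major_formatter(FormatStrFormatter("%g"))
        ax.xaxis.set_minor_formatter(NullFormatter())

    start = time.perf_counter()
    fig.savefig(io.BytesIO(), format="png")
    elapsed = time.perf_counter() - start

    labels = [tick.get_text() for tick in ax.get_xticklabels()]
    plt.close(fig)
    return elapsed, labels


if __name__ == "__main__":
    for static in (False, True):
        elapsed, labels = draw_semilogx(static)
        print(f"static_labels={static}: {elapsed:.3f} s, labels={labels}")
```

As noted earlier in this message, the first log plot in a process also pays the one-time mathtext cost, so each variant is best timed in its own process.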

Mike


-- 
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA




[matplotlib-devel] performance (speed) of logarithmic plots

2010-03-18 Thread Andrew Hawryluk
I've observed a significant difference in the time required by different
plotting functions. With a plot of 5000 random data points (all
positive, non-zero), plt.semilogx takes 3.5 times as long as plt.plot.
(These figures are for saving to PDF; the ratio changes to about 3.1 for
PNG on my machine.)

I used cProfile (script attached) and found several significant
differences between the profiles of each plotting command. On my first
analysis, it appears that most of the difference is due to increased use
of mathtext in semilogx:

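====================================================================
                                Plotting command
====================================================================
cumtime (s)                     plot      semilogx  semilogy  loglog
====================================================================
total running time              0.618     2.192     0.953     1.362
axis.py:181(draw)               0.118     1.500     0.412     0.569
text.py:504(draw)               0.056     1.353     0.290     0.287
mathtext.py:2765(__init__)      0.000     1.018     0.104     0.103
mathtext.py:2772(parse)         ---       1.294     0.143     0.254
pyparsing.py:1018(parseString)  ---       0.215     0.216     0.221
pyparsing.py:3129(oneOf)        ---       0.991     ---       ---
pyparsing.py:3147(<lambda>)     ---       0.358     ---       ---
lines.py:918(_draw_solid)       0.243     0.358     0.234     0.352
====================================================================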

It seems that semilogx could be made as fast as semilogy since they have
to do the same amount of work, but I'm not sure where the differences
lie. Can anyone suggest where I should look first?

Much thanks,

Andrew Hawryluk

matplotlib.__version__ = '0.99.1'
Windows XP Professional
Version 2002, Service Pack 3
Intel Pentium 4 CPU 3.00 GHz, 2.99 GHz, 0.99 GB of RAM


semilogPerformance.py
Description: semilogPerformance.py