Re: [Matplotlib-users] Some remarks/questions about perceived slowness of matplotlib

John Hunter Tue, 12 Dec 2006 09:05:23 -0800

>>>>> "David" == David Cournapeau <[EMAIL PROTECTED]> writes:


    David> Hi, I am a regular user of matplotlib since I moved from
    David> matlab to python/numpy/scipy. Even if I find matplotlib to
    David> be a real help during the transition from matlab to python,
    David> I must confess I found it the most disappointing compared
    David> to other packages (essentially numpy/scipy/ipython).  This is

   Meatloaf: Now don't be sad, cause two out of three ain't bad

If you consider the fact that matplotlib was originally an ipython
patch that was rejected, you can see why we are such a bastard child
of the scientific python world.  There is a seed of truth in this;
Numeric, scipy and ipython were all mature packages in widespread use
before the first line of matplotlib code was written.  So they are
farther along in terms of maturity, documentation, usability,
etc... than matplotlib is.  

But we've achieved a lot in a comparatively short time.  When I started
working on matplotlib there were probably two dozen plotting packages
that people used and recommended.  Now we are down to 5 or 6, with
matplotlib doing most of what most people need.  I've focused on
making something that does most of what people (and I) need rather
than doing it the fastest, so it is too slow for some purposes but
fast enough for most.  When we get a well defined important test case
that is too slow, we typically try to optimize it, sometimes with
dramatic results (e.g., 25-fold speedups); more on this below.

A consequence of trying to support most of the needs of most users is
this: we run on all major operating systems and all major GUIs with
all major array packages.  Consider the combinatorial problem: 5
graphical user interfaces with two or more versions in the wild across
3 operating systems and you will get a feel for the support problem
we have.  This is not an academic point.  Most of the GUI
maintainers for *a single backend* burn out in short order.  Most
graphics packages *solve* this problem by supporting a single output
format (PYX) or GUI (chaco) which is a damned fine and admirable
solution.  But the consequence of this is plotting fragmentation:
people who need GTK cannot use Chaco, people who need SVG cannot use
PYX, and so on, and so they'll write their own plotting library for
their own GUI or output format (the situation before matplotlib).  You
can certainly get closer to bare metal speed by reducing choices and
focusing on a single target -- part of the performance price we pay is
in our abstraction layers, part is in trying to support features that
may be rarely used but cost something (masked array support, rotated
text with newlines), part is because we need to get to work and
optimize the slow parts.

    David> not a rant; I want to know if this slowness is coming from
    David> my lack of matplotlib knowledge or not; I apologize in
    David> advance if the following hurts anyone's feelings :)

    Meatloaf: But -- there ain't no way I'm ever gonna love you

OK, I'll stop now.

    David>     First, I must admit that whereas I took a significant
    David> amount of time to study numpy and scipy, I didn't take that
    David> same time for matplotlib.  So this disappointment may just
    David> be a consequence of this laziness.

I suspect this is partly true; see below.

    David>     My main problem with matplotlib is speed: I find it
    David> really annoying to use in an interactive manner. For
    David> example, when I need to display some 2d information, such
    David> as a spectrogram or correlogram, this takes 1 or 2 seconds
    David> for a small signal (~4500 frames of 256 samples). My
    David> function correlogram (similar to specgram, but compute
    David> correlation instead of log spectrum) uses imshow, and this
    David> function takes 20 times more time than imagesc of matlab
    David> for the same size.  Also, I found changing the size of the

This is where you can help us.  Saying specgram is slow is only
marginally more useful than saying matplotlib is slow or python is
slow.  What is helpful is to post a complete, free-standing script
that we can run, with some attached performance numbers.  For
starters, just run it with the Agg backend so we can isolate
matplotlib from the respective GUIs.  Show us how the performance
scales with the specgram parameters (frames and samples).  specgram is
divided into two parts: if you look at Axes.specgram you will see
that it calls matplotlib.mlab.specgram to do the computation and
Axes.imshow to visualize it.  Which part is slow: the mlab.specgram
computation, the visualization (imshow) part, or both?  You can paste
this function into your own python file and start timing different
parts.  The most helpful "this is slow" posts come with profiler
output so we can see where the bottlenecks are.  
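
A minimal sketch of the kind of free-standing timing script described
above, assuming a current matplotlib and numpy.  The helper names
(timed, time_specgram) and the random test signal are made up for
illustration; only mlab.specgram, imshow and savefig come from
matplotlib itself.

```python
import io
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def time_specgram(frames=4500, samples=256):
    """Time the computation and the rendering halves separately."""
    # Imported lazily so the timing helper above works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # no GUI involved, only the Agg renderer
    import matplotlib.pyplot as plt
    import numpy as np
    from matplotlib import mlab

    # Synthetic stand-in for the real signal: `frames` blocks of `samples`.
    x = np.random.default_rng(0).standard_normal(frames * samples)

    # Part 1: the numerical work only.
    (pxx, freqs, bins), t_compute = timed(
        mlab.specgram, x, NFFT=samples, noverlap=0)

    # Part 2: the visualization only (imshow plus an actual Agg render).
    def render():
        fig, ax = plt.subplots()
        ax.imshow(10 * np.log10(pxx), aspect="auto", origin="lower")
        fig.savefig(io.BytesIO(), format="png")
        plt.close(fig)

    _, t_render = timed(render)
    return {"compute": t_compute, "render": t_render}
```

Calling time_specgram() for a few values of frames and samples, and
posting the two numbers for each, shows which half dominates and how
it scales.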

Such a post by Fernando Perez on "plot" with markers yielded
performance boosts of 25x for large numbers of points when he showed
we were making about one hundred thousand function calls per plot.
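
Profiler output of that kind can be collected with the standard
library's cProfile and pstats modules.  The sketch below is one way to
do it; profile_call is a hypothetical helper name, not part of
matplotlib, and the function you pass in is whatever plotting code you
find slow.

```python
import cProfile
import io
import pstats


def profile_call(fn, *args, top=15, **kwargs):
    """Run fn under cProfile; return (total function calls, text report)."""
    profiler = cProfile.Profile()
    profiler.runcall(fn, *args, **kwargs)

    # Render the `top` most expensive entries, sorted by cumulative time.
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(top)
    return stats.total_calls, buf.getvalue()
```

The first return value is the total number of function calls made
while fn ran, which is the figure to watch when a single plot command
fans out into many small calls; the second is the report to paste into
your post.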

    David> matplotlib window really 'annoying to the eye': I compared
    David> to matlab, and this may be due to the fact that the whole
    David> window is redrawn with matplotlib, including the toolbar,
    David> whereas in matlab, the top toolbar is not redrawn.

It would be nice if we exposed the underlying GTK widgets to you so
you could customize the "expand" and "fill" properties of the gtk
toolbar, but this gets us into the multiple GUI, multiple version
problem discussed above.  Providing an abstract interface to such
details that works across the mpl backends is a lot of work that takes
us away from our core incompetency -- plotting.  What we do is enable
you to write your own widgets and embed mpl in them; see
examples/embedding_in_gtk2.py which shows you how to do this for
GTK/GTKAgg.  You can then customize the toolbar to your heart's
content.

    David> Finally, plotting many data (using plot(X, Y) with X and Y
    David> around 1000/10000 samples) is 'slow' (the '' are because I
    David> don't know much about computer graphics, and I understand
    David> that slow in the rendering is often just a perception)

This shouldn't be slow -- again a test script with some performance
numbers would help so we can compare what we are getting.  One
thought: make sure you are using the numerix layer properly, i.e., if
you are creating arrays with numpy make sure you have numerix set to
numpy (I see below that you set numerix to numpy, but
--verbose-helpful will confirm the setting).  A good way to start is
to write a demonstration script that you find too slow which makes a
call to savefig, and run it with

  > time myscript.py --verbose-helpful -dAgg

and post the output and script.  Then we might be able to help.
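
As a rough sketch, the body of such a demonstration script might look
like the following on a current matplotlib with numpy.  The function
names, the sine test data and the output file name are placeholders,
and the backend is selected inside the code (matplotlib.use) so the
sketch does not depend on the command-line flag.

```python
import time


def time_plot(n_points, outfile="myscript.png"):
    """Plot n_points samples with plot(x, y), save them, return seconds."""
    # Imported lazily so format_report below has no third-party dependency.
    import matplotlib
    matplotlib.use("Agg")  # render off-screen, no GUI toolkit involved
    import matplotlib.pyplot as plt
    import numpy as np

    # Placeholder data: a 5-cycle sine wave sampled at n_points positions.
    x = np.linspace(0.0, 1.0, n_points)
    y = np.sin(2 * np.pi * 5 * x)

    start = time.perf_counter()
    fig, ax = plt.subplots()
    ax.plot(x, y)
    fig.savefig(outfile)  # forces the actual rendering
    plt.close(fig)
    return time.perf_counter() - start


def format_report(timings):
    """Turn [(n_points, seconds), ...] into a small text table."""
    lines = ["%8s  %10s" % ("points", "seconds")]
    for n_points, seconds in timings:
        lines.append("%8d  %10.4f" % (n_points, seconds))
    return "\n".join(lines)


def main(sizes=(1000, 10000)):
    """Time one plot per size and print the table."""
    print(format_report([(n, time_plot(n)) for n in sizes]))
```

Adding a call to main() at the bottom of the file turns this into the
script to run and post, together with the printed table.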

    David> So, is this a current limitation of matplotlib, is
    David> matplotlib optimized for good rendering for publication,
    David> and not for interactive use, or am I just misguided in my
    David> use of matplotlib ?

Many people use it interactively, but a number of power users find it
slow.

JDH

