Re: [Numpy-discussion] slow numpy.clip ?

2006-12-19 Thread David Cournapeau
Gael Varoquaux wrote:
 On Tue, Dec 19, 2006 at 02:10:29PM +0900, David Cournapeau wrote:
 I would really like to see the imshow/show calls go down to the range of a
 few hundred ms; for interactive plotting, this really changes a lot in my
 opinion.

 I think this is strongly dependent on some parameters. I did some
 interactive plotting on both a pentium 2, linux, WxAgg (thus Gtk behind
 Wx), and a pentium 4, windows, WxAgg (thus MFC behind Wx), and there was a
 huge difference between the speeds. The speed difference was a few orders
 of magnitude. I couldn't explain it, but it was a good surprise, as the
 application was developed for the windows box.
I started to investigate the problem because under matlab, plotting a 
spectrogram is negligible compared to computing it, whereas in 
matplotlib with the numpy array backend, plotting it takes as much time as 
computing it, which didn't make sense to me.
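For concreteness, here is a rough sketch of one way to time the two phases
separately. This is illustrative only, not the script behind the figures in
this thread; the signal, the specgram parameters, and the use of the Agg
backend (so the draw can be timed without a GUI event loop) are all arbitrary
choices for the example.

import time
import numpy
import matplotlib
matplotlib.use('Agg')              # non-GUI backend so draw() returns immediately
import pylab
from matplotlib import mlab

x = numpy.random.randn(2 ** 16)    # some test signal

t0 = time.time()
Pxx, freqs, bins = mlab.specgram(x, NFFT=256, Fs=2.0)    # compute only
t1 = time.time()
pylab.imshow(10.0 * numpy.log10(Pxx), origin='lower', aspect='auto')
pylab.gcf().canvas.draw()          # force the actual rendering
t2 = time.time()
print "compute: %.3f s   render: %.3f s" % (t1 - t0, t2 - t1)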

Most of the computing time is spent in code which is independent of 
the backend, that is during the conversion from the rank-2 array to rgba 
(60 % of the time on my fast workstation, 85 % of the time on my laptop 
with a pentium M @ 1.2 GHz), so I don't think the GUI backend makes any 
difference.

cheers,

David
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] slow numpy.clip ?

2006-12-19 Thread David Cournapeau
Eric Firing wrote:
 David Cournapeau wrote:
 Well, this is something I would be willing to try *if* this is the main 
 bottleneck of imshow/show. I am still unsure about the problem, because 
 if I change numpy.clip to my function, including a copy, I really get a 
 big difference myself:

 val = ma.array(nx.clip(val.filled(vmax), vmin, vmax),
 mask=mask)

 vs

 def myclip(b, m, M):
     a = b.copy()
     a[a < m] = m
     a[a > M] = M
     return a
 val = ma.array(myclip(val.filled(vmax), vmin, vmax), mask=mask)

 By trying for the best result, I get 0.888 s vs 0.784 s for a show() call, 
 which is already a 10 % improvement, and I get almost 15 % if I remove 
 the copy. I am updating numpy/scipy/mpl on my laptop to see if this is 
 specific to the CPU of my workstation (big cache, high clock frequency, 
 dual CPU with HT enabled).

 Please try the putmask version without the copy on your machines; I 
 expect it will be quite a bit faster on both machines.  The relative 
 speeds of the versions may differ widely depending on how many values 
 actually get changed, though.
On my workstation (dual Xeon; I ran each corresponding script 5 times 
and took the best result):
- nx.clip takes ~170 ms (of 920 ms for the whole show call)
- your fast clip, with copy: ~50 ms (of ~820 ms)
- mine, with copy: ~50 ms (of ~830 ms)
- yours, without copy: ~30 ms (of 830 ms)
- mine, without copy: ~40 ms (of 830 ms)

Same on my laptop (pentium M @ 1.2 GHz):

- nx.clip takes ~230 ms (of 1460 ms)
- mine, with copy: ~70 ms (of 1200 ms)
- mine, without copy: ~55 ms (of 1300 ms)
- yours, with copy: ~80 ms (of 1300 ms)
- yours, without copy: ~67 ms (of 1300 ms)

Basically, at least from those figures, both versions are pretty 
similar, and not worth improving much anyway for matplotlib. There is 
something funny with the numpy version, though.
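For readers who don't have Eric's earlier message in this thread, the putmask
variant being timed is presumably something along these lines; this is a
sketch with a made-up name (clip_putmask), not his exact code, and the copy()
is the line to drop for the "without copy" figures:

import numpy

def clip_putmask(b, m, M):
    # sketch of a putmask-based clip, not the exact code that was timed
    a = b.copy()                  # drop this copy to clip in place
    numpy.putmask(a, a < m, m)    # raise entries below the lower bound
    numpy.putmask(a, a > M, M)    # lower entries above the upper bound
    return a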

cheers,

David

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] slow numpy.clip ?

2006-12-19 Thread Robert Kern
David Cournapeau wrote:

 Basically, at least from those figures, both versions are pretty 
 similar, and not worth improving much anyway for matplotlib. There is 
 something funny with numpy version, though.

Looking at the code, it's certainly not surprising that the current
implementation of clip() is slow. It is a direct numpy C API translation of the
following (taken from numarray, but it is the same in Numeric):


def clip(m, m_min, m_max):
    """clip() returns a new array with every entry in m that is less than m_min
    replaced by m_min, and every entry greater than m_max replaced by m_max.
    """
    selector = ufunc.less(m, m_min)+2*ufunc.greater(m, m_max)
    return choose(selector, (m, m_min, m_max))


Creating that integer selector array is probably the most expensive part.
Copying the array, then using putmask() or similar is certainly a better
approach, and I can see no drawbacks to it.
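For illustration, in plain numpy terms the translated algorithm amounts to
something like the following sketch (clip_via_choose is just an illustrative
name); note the full-size integer selector array it has to build before
choose() produces yet another full-size result:

import numpy

def clip_via_choose(m, m_min, m_max):
    # 0 = keep the original value, 1 = below m_min, 2 = above m_max;
    # the selector is a temporary as large as the input array
    selector = numpy.less(m, m_min) + 2 * numpy.greater(m, m_max)
    return numpy.choose(selector, (m, m_min, m_max))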

If anyone is up to translating their faster clip() into C, I'm more than happy
to check it in. I might also entertain adding a copy=True keyword argument, but
I'm not entirely certain we should be expanding the API during the 1.0.x series.

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth.
  -- Umberto Eco
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] slow numpy.clip ?

2006-12-19 Thread David Cournapeau
Robert Kern wrote:

 Looking at the code, it's certainly not surprising that the current
 implementation of clip() is slow. It is a direct numpy C API translation of the
 following (taken from numarray, but it is the same in Numeric):

 def clip(m, m_min, m_max):
     """clip() returns a new array with every entry in m that is less than m_min
     replaced by m_min, and every entry greater than m_max replaced by m_max.
     """
     selector = ufunc.less(m, m_min)+2*ufunc.greater(m, m_max)
     return choose(selector, (m, m_min, m_max))

 Creating that integer selector array is probably the most expensive part.
 Copying the array, then using putmask() or similar is certainly a better
 approach, and I can see no drawbacks to it.

 If anyone is up to translating their faster clip() into C, I'm more than happy
 to check it in. I might also entertain adding a copy=True keyword argument, but
 I'm not entirely certain we should be expanding the API during the 1.0.x series.

I would be happy to code the function; for new code to be added to 
numpy, is there a branch other than the current one? What is the 
approach for a 1.1.x version of numpy?

For now, implementing the function with a copy (the current behaviour?) 
would be OK, right? The copy part is a much smaller problem than the 
rest of the function anyway, at least from my modest benchmarking.

David

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] slow numpy.clip ?

2006-12-19 Thread Robert Kern
David Cournapeau wrote:
 I would be happy to code the function; for new code to be added to 
 numpy, is there a branch other than the current one? What is the 
 approach for a 1.1.x version of numpy?

I don't think we've decided on one, yet.

 For now, implementing the function with a copy (the current behaviour?) 
 would be OK, right? The copy part is a much smaller problem than the 
 rest of the function anyway, at least from my modest benchmarking.

I'd prefer that you simply modify PyArray_Clip to use a better approach rather
than make an entirely new function. In that case, it certainly must make a copy.

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth.
  -- Umberto Eco
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Profiling numpy ? (parts written in C)

2006-12-19 Thread Charles R Harris

On 12/19/06, Francesc Altet [EMAIL PROTECTED] wrote:


On Tuesday 19 December 2006 08:12, David Cournapeau wrote:
 Hi,

 [snip]

 My guess is that the real bottleneck is in calling memmove so many times
 (once per element in the array). Perhaps the algorithm can be
 changed to do a block copy at the beginning and then modify only the
 places on which the clip should act (much the same as what you did
 in Python, but at the C level).



IIRC, doing a simple type-specific assignment is faster than either memmove
or memcpy. If speed is really of the essence it would probably be worth
writing a type-specific version of clip. A special function combining clip
with RGB conversion might do even better.
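As a rough Python-level sketch of the kind of fused operation meant here
(clip, normalize to lookup-table indices, and look up rgba values in one short
chain), something like the following; the function gray_to_rgba and its
arguments are made up for illustration, and a real C version would do this in
a single type-specific loop over the data:

import numpy

def gray_to_rgba(x, vmin, vmax, lut):
    # Map a float array to rgba by clipping/normalizing straight to
    # lookup-table indices and indexing the (N, 4) lut once.
    # Illustrative sketch only -- not matplotlib's code.
    n = len(lut) - 1
    idx = numpy.clip((x - vmin) * (n / float(vmax - vmin)), 0, n)
    return lut[idx.astype(numpy.int32)]

# usage: rgba = gray_to_rgba(data, data.min(), data.max(), lut)
# with lut an (N, 4) array of rgba values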

Chuck
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Unexpected output using numpy.ndarray and __radd__

2006-12-19 Thread Mark Hoffmann

Hi,

The following issue has puzzled me for a while. I want to add a
numpy.ndarray and an instance of my own class. I define this operation
by implementing the methods __add__ and __radd__. My programme
(including output) looks like:

#!/usr/local/bin/python

import numpy

class Cyclehist:
    def __init__(self, vals):
        self.valuearray = numpy.array(vals)

    def __str__(self):
        return 'Cyclehist object: valuearray = ' + str(self.valuearray)

    def __add__(self, other):
        print "__add__ : ", self, other
        return self.valuearray + other

    def __radd__(self, other):
        print "__radd__ : ", self, other
        return other + self.valuearray

c = Cyclehist([1.0, -21.2, 3.2])
a = numpy.array([-1.0, 2.2, -2.2])
print c + a
print a + c

# -- OUTPUT --
#
# addprob $ addprob.py
# __add__ :  Cyclehist object: valuearray = [  1.  -21.2   3.2] [-1.   2.2  -2.2]
# [  0. -19.   1.]
# __radd__ :  Cyclehist object: valuearray = [  1.  -21.2   3.2] -1.0
# __radd__ :  Cyclehist object: valuearray = [  1.  -21.2   3.2] 2.2
# __radd__ :  Cyclehist object: valuearray = [  1.  -21.2   3.2] -2.2
# [[  0.  -22.2   2.2] [  3.2 -19.    5.4] [ -1.2 -23.4   1. ]]
# addprob $
#
# 


I expected the output of c+a and a+c to be identical; however, the
output of a+c gets nested in an elementwise fashion. Can anybody
explain this? Is it a bug or a feature? I'm using Python 2.4.4c1 and
numpy 1.0. I tried the programme using an older version of Python and
numpy, and there the results of c+a and a+c were identical.
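A guess at what is going on, offered as illustration rather than a definitive
answer: ndarray.__add__ does not return NotImplemented for the unknown
right-hand operand; instead numpy wraps the Cyclehist instance as a
zero-dimensional object array and broadcasts the addition over the elements of
a, so __radd__ is called once per element and the three returned arrays end up
nested. A quick check, using the c and a defined in the script above:

import numpy

wrapped = numpy.array(c)
print wrapped.shape, wrapped.dtype    # expected: () object (0-d object array)
print a + c.valuearray                # adding the underlying array directly
                                      # gives the same result as c + a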


Regards,

Mark Hoffmann

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] (no subject)

2006-12-19 Thread Bandler, Derek
Hi,

I would like to get information on the software licenses for numpy &
numeric.  On the sourceforge home for the packages, the listed license
is OSI-Approved Open Source.  Is it possible to get more information
on this?  A copy of the document would be useful.  Thank you.
Best regards,
Derek Bandler 
Best regards,
Derek Bandler 

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] (no subject)

2006-12-19 Thread Greg Willden

Hi Derek,
Like all Free & Open Source Software (FOSS) projects, the license is
distributed with the source code.
There is a file called LICENSE.txt in the numpy tar archive.
Here are the contents of that file.
<license>
Copyright (c) 2005, NumPy Developers
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:

   * Redistributions of source code must retain the above copyright
  notice, this list of conditions and the following disclaimer.

   * Redistributions in binary form must reproduce the above
  copyright notice, this list of conditions and the following
  disclaimer in the documentation and/or other materials provided
  with the distribution.

   * Neither the name of the NumPy Developers nor the names of any
  contributors may be used to endorse or promote products derived
  from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
</license>

Greg

On 12/19/06, Bandler, Derek [EMAIL PROTECTED] wrote:


 Hi,

I would like to get information on the software licenses for numpy &
numeric.  On the sourceforge home for the packages, the listed license is
*OSI-Approved Open Source*.  Is it possible to get more information on
this?  A copy of the document would be useful.  Thank you.

Best regards,
Derek Bandler

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion






--
Linux.  Because rebooting is for adding hardware.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] (no subject)

2006-12-19 Thread Robert Kern
Bandler, Derek wrote:
 Hi,
 
 I would like to get information on the software licenses for numpy &
 numeric.  On the sourceforge home for the packages, the listed license
 is _OSI-Approved Open Source_.  Is it possible to get
 more information on this?  A copy of the document would be useful. 
 Thank you.

They are both BSD-like licenses.

http://projects.scipy.org/scipy/numpy/browser/trunk/LICENSE.txt
http://projects.scipy.org/scipy/scipy/browser/trunk/LICENSE.txt

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth.
  -- Umberto Eco
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] slow numpy.clip ?

2006-12-19 Thread Travis Oliphant
David Cournapeau wrote:
 Robert Kern wrote:
   
  Looking at the code, it's certainly not surprising that the current
  implementation of clip() is slow. It is a direct numpy C API translation of the
  following (taken from numarray, but it is the same in Numeric):

  def clip(m, m_min, m_max):
      """clip() returns a new array with every entry in m that is less than m_min
      replaced by m_min, and every entry greater than m_max replaced by m_max.
      """
      selector = ufunc.less(m, m_min)+2*ufunc.greater(m, m_max)
      return choose(selector, (m, m_min, m_max))

  Creating that integer selector array is probably the most expensive part.
  Copying the array, then using putmask() or similar is certainly a better
  approach, and I can see no drawbacks to it.

  If anyone is up to translating their faster clip() into C, I'm more than happy
  to check it in. I might also entertain adding a copy=True keyword argument, but
  I'm not entirely certain we should be expanding the API during the 1.0.x series.

 I would be happy to code the function; for new code to be added to
 numpy, is there a branch other than the current one? What is the
 approach for a 1.1.x version of numpy?
   
The idea is to make a 1.0.x branch as soon as the trunk changes the C-API. 

The guarantee is that extension modules won't have to be rebuilt until 
1.1.  I don't know that we've specified if there will be *no* API 
changes.  For example, there have already been some backward-compatible 
extensions to the 1.0.X series. 

I like the idea of being able to add functions to the 1.0.X series but 
without breaking compatibility.  I also don't mind adding new keywords 
to functions (but not to C-API calls as that would require a re-compile 
of extension modules).


-Travis


___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] slow numpy.clip ?

2006-12-19 Thread Travis Oliphant
Robert Kern wrote:
 David Cournapeau wrote:

   
  Basically, at least from those figures, both versions are pretty
  similar, and not worth improving much anyway for matplotlib. There is
  something funny with the numpy version, though.

 Looking at the code, it's certainly not surprising that the current
 implementation of clip() is slow. It is a direct numpy C API translation of the
 following (taken from numarray, but it is the same in Numeric):

 def clip(m, m_min, m_max):
     """clip() returns a new array with every entry in m that is less than m_min
     replaced by m_min, and every entry greater than m_max replaced by m_max.
     """
     selector = ufunc.less(m, m_min)+2*ufunc.greater(m, m_max)
     return choose(selector, (m, m_min, m_max))

   

There are a lot of functions that are essentially this.  Many things 
were done just to get something working.  It would seem like a good idea 
to re-code many of these to speed them up.
 Creating that integer selector array is probably the most expensive part.
 Copying the array, then using putmask() or similar is certainly a better
 approach, and I can see no drawbacks to it.

 If anyone is up to translating their faster clip() into C, I'm more than happy
 to check it in. I might also entertain adding a copy=True keyword argument, but
 I'm not entirely certain we should be expanding the API during the 1.0.x series.

   
The problem with the copy=True keyword is that it would imply needing to 
expand the C-API for PyArray_Clip and should not be done until 1.1 IMHO.

We would probably be better off not expanding the keyword arguments to 
methods as well until that time.

-Travis

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] slow numpy.clip ?

2006-12-19 Thread Robert Kern
Travis Oliphant wrote:
 The problem with the copy=True keyword is that it would imply needing to 
 expand the C-API for PyArray_Clip and should not be done until 1.1 IMHO.

I don't think we have to change the signature of PyArray_Clip() at all.
PyArray_Clip() takes an out argument. Currently, this is only set to something
other than NULL if explicitly provided as a keyword out= argument to
numpy.ndarray.clip(). All we have to do is modify the implementation of
array_clip() to parse a copy= argument and set out = self before calling
PyArray_Clip().
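In other words, at the Python level the in-place behaviour is already reachable
today through that same out= keyword; a small illustration, assuming the
method signature described above:

import numpy

a = numpy.random.randn(1000)

b = a.clip(-0.5, 0.5)            # current default: allocates and returns a new array
a.clip(-0.5, 0.5, out=a)         # clips in place; copy=False would essentially be
                                 # a convenient spelling of this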

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth.
  -- Umberto Eco
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] slow numpy.clip ?

2006-12-19 Thread Robert Kern
Travis Oliphant wrote:
 There are a lot of functions that are essentially this.   Many things 
 were done to just get something working.  It would seem like a good idea 
 to re-code many of these to speed them up.

Off the top of your head, do you have a list of these?

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth.
  -- Umberto Eco
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Profiling numpy ? (parts written in C)

2006-12-19 Thread David Cournapeau
Charles R Harris wrote:





 My guess is that the real bottleneck is in calling memmove so many times
 (once per element in the array). Perhaps the algorithm can be
 changed to do a block copy at the beginning and then modify only the
 places on which the clip should act (much the same as what you did
 in Python, but at the C level).


 IIRC, doing a simple type-specific assignment is faster than either 
 memmove or memcpy. If speed is really of the essence it would probably 
 be worth writing a type-specific version of clip. A special function 
 combining clip with RGB conversion might do even better.
In the end, in the original context (speeding up the drawing of 
spectrograms), this is the problem. Even if the choice of backend/toolkit 
obviously has an impact on performance, I really don't see why a numpy 
function to convert an array to an RGB representation should be 10-20 
times slower than matlab on the same machine.

I will take into account all those helpful messages, and hopefully come 
up with something by the end of the week :),

cheers

David
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Profiling numpy ? (parts written in C)

2006-12-19 Thread John Hunter
 "David" == David Cournapeau [EMAIL PROTECTED] writes:

 David> In the end, in the original context (speeding up the drawing
 David> of spectrograms), this is the problem. Even if the choice of
 David> backend/toolkit obviously has an impact on performance, I
 David> really don't see why a numpy function to convert an array to
 David> an RGB representation should be 10-20 times slower than
 David> matlab on the same machine.

This isn't exactly right.  When matplotlib converts a 2D grayscale
array to rgba, a lot goes on under the hood.  It's all numpy, but it's
far from a single function and it involves many passes through the
data.  In principle, this could be done with one or two passes through
the data.  In practice, our normalization and colormapping abstractions
are so abstract that it is difficult (though not impossible) to
special case and optimize.

The top-level routine is

def to_rgba(self, x, alpha=1.0):
    '''Return a normalized rgba array corresponding to x.
    If x is already an rgb or rgba array, return it unchanged.
    '''
    if hasattr(x, 'shape') and len(x.shape) > 2: return x
    x = ma.asarray(x)
    x = self.norm(x)
    x = self.cmap(x, alpha)
    return x

which implies at a minimum two passes through the data, one for norm
and one for cmap.

In 99% of the use cases, cmap is a LinearSegmentedColormap though
users can define their own as long as it is callable.  My guess is
that the expensive part is Colormap.__call__, the base class for
LinearSegmentedColormap.  We could probably write some extension code
that does the following routine in one pass through the data.  But it
would be hairy.  In a quick look and rough count, I see about 10
passes through the data in the function below.

If you are interested in optimizing colormapping in mpl, I'd start
here.  I suspect there may be some low hanging fruit.

def __call__(self, X, alpha=1.0):
    """
    X is either a scalar or an array (of any dimension).
    If scalar, a tuple of rgba values is returned, otherwise
    an array with the new shape = oldshape+(4,). If the X-values
    are integers, then they are used as indices into the array.
    If they are floating point, then they must be in the
    interval (0.0, 1.0).
    Alpha must be a scalar.
    """
    if not self._isinit: self._init()
    alpha = min(alpha, 1.0) # alpha must be between 0 and 1
    alpha = max(alpha, 0.0)
    self._lut[:-3, -1] = alpha
    mask_bad = None
    if not iterable(X):
        vtype = 'scalar'
        xa = array([X])
    else:
        vtype = 'array'
        xma = ma.asarray(X)
        xa = xma.filled(0)
        mask_bad = ma.getmask(xma)
    if typecode(xa) in typecodes['Float']:
        putmask(xa, xa==1.0, 0.999) #Treat 1.0 as slightly less than 1.
        xa = (xa * self.N).astype(Int)
    # Set the over-range indices before the under-range;
    # otherwise the under-range values get converted to over-range.
    putmask(xa, xa > self.N-1, self._i_over)
    putmask(xa, xa < 0, self._i_under)
    if mask_bad is not None and mask_bad.shape == xa.shape:
        putmask(xa, mask_bad, self._i_bad)
    rgba = take(self._lut, xa)
    if vtype == 'scalar':
        rgba = tuple(rgba[0,:])
    return rgba
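If someone wants to see which of those passes dominates for a given image
size, a quick cProfile run over the colormap call is one way to find out; a
sketch, assuming a stock colormap such as cm.jet and an arbitrary profile
file name:

import cProfile
import pstats
import numpy
import matplotlib.cm as cm

x = numpy.random.rand(1000, 1000)            # normalized data in [0, 1]
cProfile.run('rgba = cm.jet(x)', 'cmap.prof')
pstats.Stats('cmap.prof').sort_stats('cumulative').print_stats(15)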



 

 David> I will take into account all those helpful messages, and
 David> hopefully come up with something by the end of the week :),

 David> cheers

 David> David

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Profiling numpy ? (parts written in C)

2006-12-19 Thread Eric Firing
John,

The current version of __call__ already includes substantial speedups 
prompted by David's profiling, and if I understand correctly the present 
bottleneck is actually the numpy take function.  That is not to say that 
other improvements can't be made, of course.
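Not from this thread, but for anyone who wants to poke at that: a quick timeit
sketch comparing numpy.take against the equivalent fancy-indexing lookup on a
colormap-sized table (the sizes are arbitrary, and which spelling wins will
depend on the numpy version and machine):

import timeit

setup = """
import numpy
lut = numpy.random.rand(256, 4)                    # colormap lookup table
xa = numpy.random.randint(0, 256, (1000, 1000))    # precomputed integer indices
"""

for stmt in ("numpy.take(lut, xa, axis=0)", "lut[xa]"):
    print stmt, min(timeit.Timer(stmt, setup).repeat(3, 10))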

Eric

John Hunter wrote:
 This isn't exactly right.  When matplotlib converts a 2D grayscale
 array to rgba, a lot goes on under the hood.  It's all numpy, but it's
 far from a single function and it involves many passes through the
 data. [...]

 If you are interested in optimizing colormapping in mpl, I'd start
 here.  I suspect there may be some low hanging fruit. [...]

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Profiling numpy ? (parts written in C)

2006-12-19 Thread David Cournapeau
Francesc Altet wrote:



 So, cProfile is only showing where the time is spent in the first-level calls
 at the extension level. If we want more introspection into the C stack, and
 you are running on Linux, oprofile (http://oprofile.sourceforge.net) is a very
 nice profiler. Here are the outputs for the above routines on my machine.

 For clip1:

 Profiling through timer interrupt
 samples  %        image name       symbol name
 643      54.6769  libc-2.3.6.so    memmove
 151      12.8401  multiarray.so    PyArray_Choose
 35        2.9762  umath.so         BYTE_multiply
 34        2.8912  umath.so         DOUBLE_greater
 32        2.7211  mtrand.so        rk_random
 32        2.7211  umath.so         DOUBLE_less
 30        2.5510  libc-2.3.6.so    memcpy

 For clip2:

 Profiling through timer interrupt
 samples  %        image name       symbol name
 188      24.5111  libc-2.3.6.so    memmove
 143      18.6441  multiarray.so    _nonzero_indices
 126      16.4276  multiarray.so    PyArray_MapIterNext
 37        4.8240  umath.so         DOUBLE_greater
 36        4.6936  mtrand.so        rk_gauss
 33        4.3025  umath.so         DOUBLE_less
 24        3.1291  libc-2.3.6.so    memcpy
Could you give some detail on how you did the profiling with oprofile? I don't 
manage to get the same results as you (that is, on a per-application basis 
when the application is a python script and not a 'binary' program).

Thank you,

David
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion