Re: [Numpy-discussion] multiprocessing shared arrays and numpy

2010-03-05 Thread Francesc Alted
Yeah, a 10% improvement from using multiple cores is an expected figure for memory-bound
problems.  This is something people must know: if their computations are
memory bound (and this is much more common than one may initially think), then
they should not expect significant speed-ups from their parallel codes.
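
As a rough, back-of-the-envelope illustration of this (the array size is arbitrary and
the numbers vary by machine), an element-wise add moves about 24 bytes per element for a
single flop, so its speed is set by memory bandwidth rather than by how many cores run it:

import numpy as np
import time

n = 20 * 1000 * 1000
a = np.ones(n)
b = np.ones(n)

t0 = time.time()
c = a + b                 # ~1 flop per 24 bytes moved (read a, read b, write c)
elapsed = time.time() - t0

bytes_moved = 3 * n * 8   # three float64 streams
print('effective bandwidth: %.1f GB/s' % (bytes_moved / elapsed / 1e9))
# If this figure is already close to the machine's memory bandwidth,
# adding cores cannot make the loop much faster.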

Thanks for sharing your experience anyway,
Francesc

On Thursday 04 March 2010 18:54:09, Nadav Horesh wrote:
 I cannot give a reliable answer yet, since I have some more improvements to
  make. The application is an analysis of a stereoscopic-movie raw-data
  recording (both channels are recorded in the same file). I treat the data
  as a huge memory-mapped file. The idea was to process each channel (left
  and right) on a different core. Right now the application is I/O bound
  since I do classical numpy operations, so each channel (which is handled as
  one array) is scanned several times. The improvement now over a single
  process is 10%, but I hope to achieve 10% more after trivial optimizations.
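
A minimal sketch of that per-channel, memory-mapped layout (the file name, dtype and
frame geometry below are assumptions, since they are not given in the thread):

import numpy as np
from multiprocessing import Process

# Illustrative only: the real file layout is not stated in the thread.
raw = np.memmap('stereo_recording.raw', dtype=np.uint16, mode='r')
frames = raw.reshape((-1, 2, 480, 640))   # (frame, channel, rows, cols)

def analyse(channel):
    view = frames[:, channel]             # left (0) or right (1) channel
    print('channel %d mean: %f' % (channel, view.mean()))  # stand-in for the real analysis

if __name__ == '__main__':
    procs = [Process(target=analyse, args=(c,)) for c in (0, 1)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()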
 
  I used this application as an excuse to dive into multi-processing. I hope
  that the code I posted here would help someone.
 
   Nadav.
 
 
 -Original Message-
 From: numpy-discussion-boun...@scipy.org on behalf of Francesc Alted
 Sent: Thu 04-Mar-10 15:12
 To: Discussion of Numerical Python
 Subject: Re: [Numpy-discussion] multiprocessing shared arrays and numpy
 
 What kind of calculations are you doing with this module?  Can you please
  send some examples and the speed-ups you are getting?
 
 Thanks,
 Francesc
 
  On Thursday 04 March 2010 14:06:34, Nadav Horesh wrote:
  Extended module that I used for some useful work.
  Comments:
1. Sturla's module is better designed, but did not work with very large
   (although sub-GB) arrays.
2. Tested on 64-bit Linux (amd64) + python-2.6.4 + numpy-1.4.0.
 
Nadav.
 
 
  -Original Message-
  From: numpy-discussion-boun...@scipy.org on behalf of Nadav Horesh
  Sent: Thu 04-Mar-10 11:55
  To: Discussion of Numerical Python
  Subject: RE: [Numpy-discussion] multiprocessing shared arrays and numpy
 
  Maybe the attached file can help. Adapted and tested on amd64 Linux.
 
Nadav
 
 
  -Original Message-
  From: numpy-discussion-boun...@scipy.org on behalf of Nadav Horesh
  Sent: Thu 04-Mar-10 10:54
  To: Discussion of Numerical Python
  Subject: Re: [Numpy-discussion] multiprocessing shared arrays and numpy
 
  There is work by Sturla Molden: look for multiprocessing-tutorial.pdf
  and sharedmem-feb13-2009.zip. The tutorial includes what was dropped from
  the cookbook page. I am looking into the same issue and am going to test it today.
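
The core idea behind such shared-array modules can be sketched with nothing but the
standard library and numpy; this is only an illustration of the general technique, not
Sturla's code, and the dtype and shape are arbitrary:

import numpy as np
import multiprocessing as mp

def worker(shared, shape):
    # Re-wrap the shared buffer as an ndarray inside the child process.
    a = np.frombuffer(shared, dtype=np.float64).reshape(shape)
    a *= 2.0                                  # modifies the shared memory in place

if __name__ == '__main__':
    shape = (4, 3)
    shared = mp.RawArray('d', shape[0] * shape[1])   # unsynchronised shared memory
    a = np.frombuffer(shared, dtype=np.float64).reshape(shape)
    a[:] = np.arange(12).reshape(shape)
    p = mp.Process(target=worker, args=(shared, shape))
    p.start()
    p.join()
    print(a)    # the parent sees the child's in-place modification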
 
Nadav
 
  On Wed, 2010-03-03 at 15:31 +0100, Jesper Larsen wrote:
   Hi people,
  
   I was wondering about the status of using the standard library
   multiprocessing module with numpy. I found a cookbook example last
   updated one year ago which states that:
  
   This page was obsolete as multiprocessing's internals have changed.
   More information will come shortly; a link to this page will then be
   added back to the Cookbook.
  
   http://www.scipy.org/Cookbook/multiprocessing
  
   I also found the code that used to be on this page in the cookbook but
   it does not work any more. So my question is:
  
   Is it possible to use numpy arrays as shared arrays in an application
   using multiprocessing and how do you do it?
  
   Best regards,
   Jesper
   ___
   NumPy-Discussion mailing list
   NumPy-Discussion@scipy.org
   http://mail.scipy.org/mailman/listinfo/numpy-discussion
 
  ___
  NumPy-Discussion mailing list
  NumPy-Discussion@scipy.org
  http://mail.scipy.org/mailman/listinfo/numpy-discussion
 

-- 
Francesc Alted
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Is this a bug in numpy.ma.reduce?

2010-03-05 Thread David Goldsmith
Hi!  Sorry for the cross-post, but my own investigation has led me to
suspect that mine is actually a numpy problem, not a matplotlib problem.
I'm getting the following traceback from a call to matplotlib.imshow:

Traceback (most recent call last):
 File
"C:\Users\Fermat\Documents\Fractals\Python\Source\Zodiac\aquarius_test.py",
line 108, in <module>
ax.imshow(part2plot, cmap_name, extent = extent)
 File "C:\Python254\lib\site-packages\matplotlib\axes.py", line 6261, in
imshow
im.autoscale_None()
 File "C:\Python254\lib\site-packages\matplotlib\cm.py", line 236, in
autoscale_None
self.norm.autoscale_None(self._A)
 File "C:\Python254\lib\site-packages\matplotlib\colors.py", line 792, in
autoscale_None
if self.vmin is None: self.vmin = ma.minimum(A)
 File "C:\Python254\Lib\site-packages\numpy\ma\core.py", line , in
__call__
return self.reduce(a)
 File "C:\Python254\Lib\site-packages\numpy\ma\core.py", line 5570, in
reduce
t = self.ufunc.reduce(target, **kargs)
ValueError: zero-size array to ufunc.reduce without identity
Script terminated.

Based on examination of the code, the last "self" is an instance of
ma._extrema_operation (or one of its subclasses) - is there a reason why
this class is unable to deal with a "zero-size array to ufunc.reduce without
identity" (i.e., was it thought that it would - or should - never get one),
or was this merely an oversight?  Either way, there are other instances on the
lists of this error cropping up, so this circumstance should probably be
handled more robustly.  In the meantime, is there a workaround?
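
One possible stop-gap is a minimal guard against empty (or fully masked) input before
handing it to ma.minimum - whether a masked result is acceptable depends on what the
caller does with it, so this is only a sketch:

import numpy.ma as ma

def safe_min(a):
    a = ma.asarray(a)
    if a.count() == 0:      # zero-size or fully masked: nothing to reduce
        return ma.masked    # caller must be prepared for a masked result
    return ma.minimum(a)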

DG
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] multiprocessing shared arrays and numpy

2010-03-05 Thread Gael Varoquaux
On Fri, Mar 05, 2010 at 09:53:02AM +0100, Francesc Alted wrote:
 Yeah, 10% of improvement by using multi-cores is an expected figure for
 memory bound problems.  This is something people must know: if their
 computations are memory bound (and this is much more common than one
 may initially think), then they should not expect significant speed-ups
 on their parallel codes.

Hey Francesc,

Any chance this can be different for NUMA (non uniform memory access)
architectures? AMD multicores used to be NUMA, when I was still following
these problems.

FWIW, I observe very good speedups on my problems (pretty much linear in
the number of CPUs), and I have data parallel problems on fairly large
data (~100Mo a piece, doesn't fit in cache), with no synchronisation at
all between the workers. CPUs are Intel Xeons.
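
That kind of data-parallel, no-synchronisation workload maps naturally onto a plain
process pool; a minimal sketch (the per-chunk work, chunk sizes and worker count below
are placeholders, not the actual code being discussed):

import numpy as np
from multiprocessing import Pool

def process(chunk):
    # Placeholder for the real per-chunk work (e.g. an SVD or a resampling pass).
    u, s, vt = np.linalg.svd(chunk, full_matrices=False)
    return s

if __name__ == '__main__':
    chunks = [np.random.rand(500, 500) for _ in range(8)]   # independent pieces of data
    pool = Pool(processes=4)
    results = pool.map(process, chunks)   # no synchronisation between workers
    pool.close()
    pool.join()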

Gael
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Is this a bug in numpy.ma.reduce?

2010-03-05 Thread Pierre GM
On Mar 5, 2010, at 4:38 AM, David Goldsmith wrote:
 Hi!  Sorry for the cross-post, but my own investigation has led me to suspect 
 that mine is actually a numpy problem, not a matplotlib problem.  I'm getting 
 the following traceback from a call to matplotlib.imshow:
 ...
 Based on examination of the code, the last self is an instance of 
 ma._extrema_operation (or one of its subclasses) - is there a reason why this 
 class is unable to deal with a zero-size array to ufunc.reduce without 
 identity, (i.e., was it thought that it would - or should - never get one) 
 or was this merely an oversight?  Either way, there are other instances on the 
 lists of this error cropping up, so this circumstance should probably be 
 handled more robustly.  In the meantime, workaround?


'm'fraid no. I gonna have to investigate that. Please open a ticket with a 
self-contained example that reproduces the issue.
Thx in advance...
P.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Is this a bug in numpy.ma.reduce?

2010-03-05 Thread Vincent Schut
On 03/05/2010 11:51 AM, Pierre GM wrote:
 On Mar 5, 2010, at 4:38 AM, David Goldsmith wrote:
 Hi!  Sorry for the cross-post, but my own investigation has led me to 
 suspect that mine is actually a numpy problem, not a matplotlib problem.  
 I'm getting the following traceback from a call to matplotlib.imshow:
 ...
 Based on examination of the code, the last self is an instance of 
 ma._extrema_operation (or one of its subclasses) - is there a reason why 
 this class is unable to deal with a zero-size array to ufunc.reduce without 
 identity, (i.e., was it thought that it would - or should - never get one) 
 or was this merely an oversight?  Either way, there are other instances on the 
 lists of this error cropping up, so this circumstance should probably be 
 handled more robustly.  In the meantime, workaround?


 'm'fraid no. I gonna have to investigate that. Please open a ticket with a 
 self-contained example that reproduces the issue.
 Thx in advance...
 P.

This might be completely wrong, but I seem to remember a similar issue,
which I then traced down to having a masked array with a mask that was
set to True or False, instead of being a full-fledged boolean mask array. I
was in a hurry then and completely forgot about it later, so I filed no
bug report whatsoever, for which I apologize.
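
For reference, the difference being described is easy to see, although whether it is
actually what triggers the reported traceback is unverified:

import numpy as np

full = np.ma.masked_invalid(np.array([1.0, np.nan]))
scalar = np.ma.array([1.0, 2.0], mask=False)

print(full.mask)      # [False  True] -- a full boolean mask array
print(scalar.mask)    # False -- the mask collapsed to numpy.ma.nomask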

VS.

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] multiprocessing shared arrays and numpy

2010-03-05 Thread Francesc Alted
Gael,

On Fri, Mar 05, 2010 at 10:51:12AM +0100, Gael Varoquaux wrote:
 On Fri, Mar 05, 2010 at 09:53:02AM +0100, Francesc Alted wrote:
  Yeah, 10% of improvement by using multi-cores is an expected figure for
  memory bound problems.  This is something people must know: if their
  computations are memory bound (and this is much more common than one
  may initially think), then they should not expect significant speed-ups
  on their parallel codes.
 
 Hey Francesc,
 
 Any chance this can be different for NUMA (non uniform memory access)
 architectures? AMD multicores used to be NUMA, when I was still following
 these problems.

As far as I can tell, NUMA architectures work best at accelerating
independent processes that run independently of each other.  In
this case, the hardware is in charge of putting closely-related data in
memory that is 'nearer' to each processor.  This scenario *could*
happen in truly parallel processes too, but as I said, in general it
works best for independent processes (read: multiuser machines).

 FWIW, I observe very good speedups on my problems (pretty much linear in
 the number of CPUs), and I have data parallel problems on fairly large
 data (~100Mo a piece, doesn't fit in cache), with no synchronisation at
 all between the workers. CPUs are Intel Xeons.

Maybe your processes are not as memory-bound as you think.  Do you get
much better speed-up by using NUMA than a simple multi-core machine
with one single path to memory?  I don't think so, but maybe I'm wrong
here.

Francesc
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] multiprocessing shared arrays and numpy

2010-03-05 Thread Gael Varoquaux
On Fri, Mar 05, 2010 at 08:14:51AM -0500, Francesc Alted wrote:
  FWIW, I observe very good speedups on my problems (pretty much linear in
  the number of CPUs), and I have data parallel problems on fairly large
  data (~100Mo a piece, doesn't fit in cache), with no synchronisation at
  all between the workers. CPUs are Intel Xeons.

 Maybe your processes are not as memory-bound as you think. 

That's the only explanation that I can think of. I have two types of
bottlenecks. One is BLAS level 3 operations (mainly SVDs) on large
matrices; the second is resampling, where we repeat the same operation
many times over almost the same chunk of data. In both cases the data is
fairly large, so I expected the operations to be memory bound.

However, thinking about it, I believe that when I timed these operations
carefully, it seemed that processes were alternating between a starving period,
during which they were IO-bound, and a productive period, during which
they were CPU-bound. After a few cycles, the different periods would fall
into a mutually desynchronised alternation, with one process IO-bound and
the others CPU-bound, and it would become fairly efficient. Of course,
this is possible because I have no cross-talk between the processes.

 Do you get much better speed-up by using NUMA than a simple multi-core
 machine with one single path to memory?  I don't think so, but maybe
 I'm wrong here.

I don't know. All the boxes around here have Intel CPUs, and I believe
they are all SMP machines.

Gaël
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] printing structured arrays

2010-03-05 Thread Bruce Schultz
Hi,

I've just started playing with numpy and have noticed that when printing
a structured array that the output is not nicely formatted. Is there a
way to make the formatting look the same as it does for an unstructured
array?

Here an example of what I mean:

data = [ (1, 2), (3, 4.1) ]
dtype = [('x', float), ('y', float)]
print '### ndarray'
a = numpy.array(data)
print a
print '### structured array'
a = numpy.array(data, dtype=dtype)
print a

Output is:
### ndarray
[[ 1.   2. ]
 [ 3.   4.1]]
### structured array
[(1.0, 2.0) (3.0, 4.0996)]


Thanks
Bruce

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Why does np.nan{min, max} clobber my array mask?

2010-03-05 Thread Bruce Southey
On Mon, Feb 15, 2010 at 9:24 PM, Bruce Southey bsout...@gmail.com wrote:
 On Mon, Feb 15, 2010 at 8:35 PM, Pierre GM pgmdevl...@gmail.com wrote:
 On Feb 15, 2010, at 8:51 PM, David Carmean wrote:
 On Sun, Feb 14, 2010 at 03:22:04PM -0500, Pierre GM wrote:


 I'm sorry, I can't follow you. Can you post a simpler self-contained 
 example I can play with ?
 Why using np.nanmin/max ? These functions are designed for ndarrays, to 
 avoid using a masked array: can't you just use min/max on the masked array 
 ?

 I was using np.nanmin/max because I did not yet understand how masked
 arrays worked; perhaps the docs for those methods need a note indicating
 that "If you can take the (small?) memory hit, use Masked Arrays
 instead."  Now that I know differently... I'm going to drop it unless you
 really want to dig into it.


 I'm curious. Can you post an excerpt of your array, so that I can check what 
 goes wrong?

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

 Hi,
 David, please file a bug report.

 I think this occurs with np.nansum, np.nanmin and np.nanmax. Perhaps
 it is something related to the C99 changes, as I think it exists in numpy 1.3.

 I think this code shows the problem with Linux and recent numpy svn:

 import numpy as np
 uut = np.array([[2, 1, 3, np.nan], [5, 2, 3, np.nan]])
 msk = np.ma.masked_invalid(uut)
 msk
 np.nanmin(msk, axis=1)
 msk

 $ python
 Python 2.6 (r26:66714, Nov  3 2009, 17:33:18)
 [GCC 4.4.1 20090725 (Red Hat 4.4.1-2)] on linux2
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import numpy as np
 >>> uut = np.array([[2, 1, 3, np.nan], [5, 2, 3, np.nan]])
 >>> msk = np.ma.masked_invalid(uut)
 >>> msk
 masked_array(data =
  [[2.0 1.0 3.0 --]
  [5.0 2.0 3.0 --]],
             mask =
  [[False False False  True]
  [False False False  True]],
       fill_value = 1e+20)

 >>> np.nanmin(msk, axis=1)
 masked_array(data = [1.0 2.0],
             mask = [False False],
       fill_value = 1e+20)

 >>> msk
 masked_array(data =
  [[2.0 1.0 3.0 nan]
  [5.0 2.0 3.0 nan]],
             mask =
  [[False False False False]
  [False False False False]],
       fill_value = 1e+20)


 Bruce


Hi,
I filed this ticket and hopefully the provided code is sufficient for a test:
http://projects.scipy.org/numpy/ticket/1421

The bug is with the _nanop function because nansum, nanmin, nanmax,
nanargmin and nanargmax have the same issue.
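
Until that is fixed, the earlier suggestion in this thread - reducing the masked array
directly instead of going through the nan-functions - avoids the mask being clobbered:

import numpy as np

uut = np.array([[2, 1, 3, np.nan], [5, 2, 3, np.nan]])
msk = np.ma.masked_invalid(uut)

print(msk.min(axis=1))   # masked minima per row: [1.0 2.0]
print(msk.mask)          # mask is left intact, unlike after np.nanmin(msk, axis=1)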

Bruce



___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] multiprocessing shared arrays and numpy

2010-03-05 Thread Francesc Alted
On Friday 05 March 2010 14:46:00, Gael Varoquaux wrote:
 On Fri, Mar 05, 2010 at 08:14:51AM -0500, Francesc Alted wrote:
   FWIW, I observe very good speedups on my problems (pretty much linear
   in the number of CPUs), and I have data parallel problems on fairly
   large data (~100Mo a piece, doesn't fit in cache), with no
   synchronisation at all between the workers. CPUs are Intel Xeons.
 
  Maybe your processes are not as memory-bound as you think.
 
 That's the only explanation that I can think of. I have two types of
 bottlenecks. One is BLAS level 3 operations (mainly SVDs) on large
 matrices; the second is resampling, where we repeat the same operation
 many times over almost the same chunk of data. In both cases the data is
 fairly large, so I expected the operations to be memory bound.

Not at all.  BLAS 3 operations are mainly CPU-bound, because the algorithms (if
they are correctly implemented, of course, but any decent BLAS 3 library will
do) have many chances to reuse data from caches.  BLAS 1 (and lately 2 too)
are the ones that are memory-bound.

And in your second case, you are repeating the same operation over the same
chunk of data.  If this chunk is small enough to fit in cache, then the
bottleneck is the CPU again (and probably access to the L1/L2 cache), not access
to memory.  But if, as you said, you are seeing periods that are memory-
bound (i.e. CPUs are starving), then it may well be that this chunk size does
not fit well in cache, and then your problem is memory access for this case.
Maybe you can get better performance by reducing your chunk size so that it
fits in cache (L1 or L2).
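
As a minimal sketch of that suggestion (block size, dtype and the per-block operation are
arbitrary placeholders, and a 1-D array is assumed), the repeated passes are done block
by block so each block stays cache-resident:

import numpy as np

def repeated_passes_blocked(data, n_passes=10, block=64 * 1024):
    # Process `data` in blocks small enough to stay in L1/L2 cache, so the
    # repeated passes over a block hit cache instead of main memory.
    out = np.empty_like(data)
    for start in range(0, data.size, block):
        chunk = data[start:start + block].copy()
        for _ in range(n_passes):
            chunk = np.sqrt(chunk * chunk + 1.0)   # stand-in for the real per-pass work
        out[start:start + block] = chunk
    return out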

So, I do not think that NUMA architectures would perform your current 
computations any better than your current SMP platform (and you know that NUMA 
architectures are much more complex and expensive than SMP ones).  But 
experimenting is *always* the best answer to these hairy questions ;-)

-- 
Francesc Alted
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Loading bit strings

2010-03-05 Thread Dan Lenski
Is there a good way in NumPy to convert from a bit string to a boolean
array?

For example, if I have a 2-byte string s='\xfd\x32', I want to get a
16-length boolean array out of it.

Here's what I came up with:

A = fromstring(s, dtype=uint8)
out = empty(A.size * 8, dtype=bool)
for bit in range(0, 8):
  out[bit::8] = A & (1 << bit)

I just can't shake the feeling that there may be a better way to
do this, though...

Dan

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Is this a bug in numpy.ma.reduce?

2010-03-05 Thread David Goldsmith
On Fri, Mar 5, 2010 at 2:51 AM, Pierre GM pgmdevl...@gmail.com wrote:

 On Mar 5, 2010, at 4:38 AM, David Goldsmith wrote:
  Hi!  Sorry for the cross-post, but my own investigation has led me to
 suspect that mine is actually a numpy problem, not a matplotlib problem.
  I'm getting the following traceback from a call to matplotlib.imshow:
  ...
  Based on examination of the code, the last self is an instance of
 ma._extrema_operation (or one of its subclasses) - is there a reason why
 this class is unable to deal with a zero-size array to ufunc.reduce without
 identity, (i.e., was it thought that it would - or should - never get one)
or was this merely an oversight?  Either way, there are other instances on the
 lists of this error cropping up, so this circumstance should probably be
 handled more robustly.  In the meantime, workaround?


 'm'fraid no. I gonna have to investigate that. Please open a ticket with a
 self-contained example that reproduces the issue.


I'll do my best, but since it's a call from matplotlib and I don't really
know what's causing the problem (other than a literal reading of the
exception) I'm not sure I can.

DG


 Thx in advance...
 P.
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Loading bit strings

2010-03-05 Thread Robert Kern
On Fri, Mar 5, 2010 at 11:11, Dan Lenski dlen...@gmail.com wrote:
 Is there a good way in NumPy to convert from a bit string to a boolean
 array?

 For example, if I have a 2-byte string s='\xfd\x32', I want to get a
 16-length boolean array out of it.

 Here's what I came up with:

 A = fromstring(s, dtype=uint8)
 out = empty(A.size * 8, dtype=bool)
 for bit in range(0,8):
  out[bit::8] = A & (1 << bit)

 I just can't shake the feeling that there may be a better way to
 do this, though...

For short enough strings, it probably doesn't really matter. Any
correct way will do.

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Is this a bug in numpy.ma.reduce?

2010-03-05 Thread David Goldsmith
On Fri, Mar 5, 2010 at 9:22 AM, David Goldsmith d.l.goldsm...@gmail.comwrote:

 On Fri, Mar 5, 2010 at 2:51 AM, Pierre GM pgmdevl...@gmail.com wrote:

 On Mar 5, 2010, at 4:38 AM, David Goldsmith wrote:
  Hi!  Sorry for the cross-post, but my own investigation has led me to
 suspect that mine is actually a numpy problem, not a matplotlib problem.
  I'm getting the following traceback from a call to matplotlib.imshow:
  ...
  Based on examination of the code, the last self is an instance of
 ma._extrema_operation (or one of its subclasses) - is there a reason why
 this class is unable to deal with a zero-size array to ufunc.reduce without
 identity, (i.e., was it thought that it would - or should - never get one)
or was this merely an oversight?  Either way, there are other instances on the
 lists of this error cropping up, so this circumstance should probably be
 handled more robustly.  In the meantime, workaround?


 'm'fraid no. I gonna have to investigate that. Please open a ticket with a
 self-contained example that reproduces the issue.


 I'll do my best, but since it's a call from matplotlib and I don't really
 know what's causing the problem (other than a literal reading of the
 exception) I'm not sure I can.


Well, that was easy:

>>> mn = N.ma.core._minimum_operation()
>>> mn.reduce(N.array(()))
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Python254\Lib\site-packages\numpy\ma\core.py", line 5570, in
reduce
t = self.ufunc.reduce(target, **kargs)
ValueError: zero-size array to ufunc.reduce without identity

I'll file a ticket.

DG



 DG


 Thx in advance...
 P.
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion



___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Is this a bug in numpy.ma.reduce?

2010-03-05 Thread David Goldsmith
On Fri, Mar 5, 2010 at 9:43 AM, David Goldsmith d.l.goldsm...@gmail.comwrote:

 On Fri, Mar 5, 2010 at 9:22 AM, David Goldsmith 
 d.l.goldsm...@gmail.comwrote:

 On Fri, Mar 5, 2010 at 2:51 AM, Pierre GM pgmdevl...@gmail.com wrote:

 On Mar 5, 2010, at 4:38 AM, David Goldsmith wrote:
  Hi!  Sorry for the cross-post, but my own investigation has led me to
 suspect that mine is actually a numpy problem, not a matplotlib problem.
  I'm getting the following traceback from a call to matplotlib.imshow:
  ...
  Based on examination of the code, the last self is an instance of
 ma._extrema_operation (or one of its subclasses) - is there a reason why
 this class is unable to deal with a zero-size array to ufunc.reduce without
 identity, (i.e., was it thought that it would - or should - never get one)
or was this merely an oversight?  Either way, there are other instances on the
 lists of this error cropping up, so this circumstance should probably be
 handled more robustly.  In the meantime, workaround?


 'm'fraid no. I gonna have to investigate that. Please open a ticket with
 a self-contained example that reproduces the issue.


 I'll do my best, but since it's a call from matplotlib and I don't really
 know what's causing the problem (other than a literal reading of the
 exception) I'm not sure I can.


 Well, that was easy:

 mn = N.ma.core._minimum_operation()
 mn.reduce(N.array(()))

 Traceback (most recent call last):
   File input, line 1, in module

   File C:\Python254\Lib\site-packages\numpy\ma\core.py, line 5570, in
 reduce
 t = self.ufunc.reduce(target, **kargs)
 ValueError: zero-size array to ufunc.reduce without identity

 I'll file a ticket.


OK, Ticket #1422 filed.

DG
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Loading bit strings

2010-03-05 Thread Zachary Pincus

 Is there a good way in NumPy to convert from a bit string to a boolean
 array?

 For example, if I have a 2-byte string s='\xfd\x32', I want to get a
 16-length boolean array out of it.

numpy.unpackbits(numpy.fromstring('\xfd\x32', dtype=numpy.uint8))
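
One detail worth noting (stated to the best of my knowledge): unpackbits returns uint8
zeros and ones in most-significant-bit-first order, so to match the boolean,
least-significant-bit-first layout of the loop in the original post one can cast and
reverse each byte:

import numpy as np

s = '\xfd\x32'
bits = np.unpackbits(np.fromstring(s, dtype=np.uint8)).astype(bool)   # MSB first
bits_lsb_first = bits.reshape(-1, 8)[:, ::-1].ravel()                 # LSB first, as in the loop version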
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] multiprocessing shared arrays and numpy

2010-03-05 Thread Brian Granger
Francesc,

Yeah, a 10% improvement from using multiple cores is an expected figure for
 memory-bound problems.  This is something people must know: if their
 computations are memory bound (and this is much more common than one may
 initially think), then they should not expect significant speed-ups from
 their parallel codes.


+1

Thanks for emphasizing this.  This is definitely a big issue with multicore.

Cheers,

Brian



___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Iterative Matrix Multiplication

2010-03-05 Thread Friedrich Romstedt
Do you have doublets in the v_array?

In case not, then you owe me a donut.

See attachment.

Friedrich

P.S.: You misunderstood too: the line you wanted to change was in the
context of detecting back-facing triangles, and there one vertex is
sufficient.
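
In the spirit of that remark, a back-face test only needs the triangle normal and a
single vertex to form the view vector; all names and the sign convention below are
assumptions, not the code from the attachment:

import numpy as np

normals    = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]])   # one normal per triangle
v1s        = np.array([[0.0, 0.0, 5.0], [1.0, 0.0, 5.0]])    # one vertex per triangle
some_point = np.array([0.0, 0.0, 0.0])                       # e.g. the camera position

# A triangle faces away from `some_point` when its normal and the vector
# from that point to the vertex point the same way.
inner = np.sum(normals * (v1s - some_point), axis=1)
back_facing = inner > 0.0   # sign depends on the winding convention
print(back_facing)          # [ True False]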


shading.py
Description: Binary data
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] printing structured arrays

2010-03-05 Thread Gökhan Sever
On Fri, Mar 5, 2010 at 8:00 AM, Bruce Schultz bruce.schu...@gmail.comwrote:

  Hi,

 I've just started playing with numpy and have noticed that when printing a
 structured array that the output is not nicely formatted. Is there a way to
 make the formatting look the same as it does for an unstructured array?

 Here an example of what I mean:

 data = [ (1, 2), (3, 4.1) ]
 dtype = [('x', float), ('y', float)]
 print '### ndarray'
 a = numpy.array(data)
 print a
 print '### structured array'
 a = numpy.array(data, dtype=dtype)
 print a

 Output is:
 ### ndarray
 [[ 1.   2. ]
  [ 3.   4.1]]
 ### structured array
 [(1.0, 2.0) (3.0, 4.0996)]


 Thanks
 Bruce


 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


I still couldn't figure out how to make floating point numbers look nice on
screen in cases like yours (i.e., trying numpy.array2string()), but by using
numpy.savetxt(file, array, fmt='%.1f') you can make sure you always have the
specified precision in the written file.
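
For on-screen printing, one option (a sketch that assumes every field shares the same
dtype, as in the example above) is to view the structured array as a plain 2-D array,
which then prints with the usual float formatting:

import numpy

data = [(1, 2), (3, 4.1)]
a = numpy.array(data, dtype=[('x', float), ('y', float)])

b = a.view((float, 2))   # plain (N, 2) float array sharing the same memory
print(b)
# [[ 1.   2. ]
#  [ 3.   4.1]]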

-- 
Gökhan
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Iterative Matrix Multiplication

2010-03-05 Thread Ian Mallett
Cool--this works perfectly now :-)

Unfortunately, it's actually slower :P  Most of the time is spent in the
"removing doubles" section.

Some of the costliest calls:

#takes 0.04 seconds
inner = np.inner(ns, v1s - some_point)

#0.0840001106262
sum_1 = sum.reshape((len(sum), 1)).repeat(len(sum), axis = 1)

#0.032923706
sum_2 = sum.reshape((1, len(sum))).repeat(len(sum), axis = 0)

#0.026504089
comparison_sum = (sum_1 == sum_2)

#0.0909998416901
diff_1 = diff.reshape((len(diff), 1)).repeat(len(diff), axis = 1)

#0.0340001583099
diff_2 = diff.reshape((1, len(diff))).repeat(len(diff), axis = 0)

#0.026504089
comparison_diff = (diff_1 == diff_2)

#0.023019073
same_edges = comparison_sum * comparison_diff

#0.12848502
doublet_count = same_edges.sum(axis = 0)
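
For what it's worth, the reshape(...).repeat(...) pairs above can be expressed with
broadcasting, which avoids materialising the repeated intermediate arrays; a
self-contained sketch with made-up stand-ins for the `sum` and `diff` arrays:

import numpy as np

edge_sum  = np.array([3, 5, 3, 7])   # stand-in for `sum` above
edge_diff = np.array([1, 2, 1, 4])   # stand-in for `diff` above

# Broadcasting builds the pairwise comparison matrices directly.
comparison_sum  = edge_sum[:, np.newaxis] == edge_sum[np.newaxis, :]
comparison_diff = edge_diff[:, np.newaxis] == edge_diff[np.newaxis, :]
same_edges      = comparison_sum & comparison_diff
doublet_count   = same_edges.sum(axis=0)
print(doublet_count)   # [2 1 2 1]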

Ian
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Building Numpy Windows Superpack

2010-03-05 Thread David Cournapeau
On Fri, Mar 5, 2010 at 1:22 PM, Patrick Marsh patrickmars...@gmail.com wrote:


 I've run the Numpy superpack installer for Python 2.6 built with MinGW
 through the dependency walker.  Unfortunately, outside of checking for some
 extremely obvious things, I'm in way over my head in interpreting the
 output (although I'd like to learn).  I've put the output from the program
 here: http://www.patricktmarsh.com/numpy/20100303.py26.superpack.dependencies.txt.
  I can also put the binary up somewhere too, if someone wants to check that.

I have just attempted to build the superpack installer on Windows 7
Ultimate (32 bits), and did not encounter any issue; the test suite
passed everything but a few things unrelated to our problem here.

Could you put your binary somewhere so that I can look at it?

David
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion