Re: [Numpy-discussion] Need faster equivalent to digitize

2010-04-14 Thread Peter Shinners
On 04/14/2010 11:34 PM, Nadav Horesh wrote:
> import numpy as N
> N.repeat(N.arange(len(a)), a)
>
>Nadav
>
> -Original Message-
> From: numpy-discussion-boun...@scipy.org on behalf of Peter Shinners
> Sent: Thu 15-Apr-10 08:30
> To: Discussion of Numerical Python
> Subject: [Numpy-discussion] Need faster equivalent to digitize
>
> I am using digitize to create a list of indices. This is giving me
> exactly what I want, but it's terribly slow. Digitize is obviously not
> the tool I want for this case, but what numpy alternative do I have?
>
> I have an array like np.array((4, 3, 3)). I need to create an index
> array with each index repeated by its value: np.array((0, 0, 0, 0,
> 1, 1, 1, 2, 2, 2)).
>
>   >>>  a = np.array((4, 3, 3))
>   >>>  b = np.arange(np.sum(a))
>   >>>  c = np.digitize(b, a)
>   >>>  print c
> [0 0 0 0 1 1 1 2 2 2]
>
> On an array where a.size==65536 and sum(a)==65536 this is taking over 6
> seconds to compute. As a comparison, using a Python list solution runs
> in 0.08 seconds. That is plenty fast, but I would guess there is a
> faster Numpy solution that does not require a dynamically growing
> container of PyObjects?
>
>   >>>  a = np.array((4, 3, 3))
>   >>>  c = []
>   >>>  for i, v in enumerate(a):
> ... c.extend([i] * v)
>

Excellent. The Numpy version is a bit faster, and I prefer having an 
ndarray as the end result.
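For later readers, a minimal sketch tying the two approaches together (assuming NumPy is imported as np; timings will of course vary):

```python
import numpy as np

a = np.array((4, 3, 3))

# Vectorized: repeat index i exactly a[i] times.
idx = np.repeat(np.arange(len(a)), a)
print(idx)  # [0 0 0 0 1 1 1 2 2 2]

# The pure-Python loop from the original post, for comparison.
ref = []
for i, v in enumerate(a):
    ref.extend([i] * v)
assert idx.tolist() == ref
```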
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Need faster equivalent to digitize

2010-04-14 Thread Nadav Horesh

import numpy as N
N.repeat(N.arange(len(a)), a)

  Nadav

-Original Message-
From: numpy-discussion-boun...@scipy.org on behalf of Peter Shinners
Sent: Thu 15-Apr-10 08:30
To: Discussion of Numerical Python
Subject: [Numpy-discussion] Need faster equivalent to digitize
 
I am using digitize to create a list of indices. This is giving me 
exactly what I want, but it's terribly slow. Digitize is obviously not 
the tool I want for this case, but what numpy alternative do I have?

I have an array like np.array((4, 3, 3)). I need to create an index 
> array with each index repeated by its value: np.array((0, 0, 0, 0,
1, 1, 1, 2, 2, 2)).

 >>> a = np.array((4, 3, 3))
 >>> b = np.arange(np.sum(a))
 >>> c = np.digitize(b, a)
 >>> print c
[0 0 0 0 1 1 1 2 2 2]

On an array where a.size==65536 and sum(a)==65536 this is taking over 6 
seconds to compute. As a comparison, using a Python list solution runs 
in 0.08 seconds. That is plenty fast, but I would guess there is a 
faster Numpy solution that does not require a dynamically growing 
> container of PyObjects?

 >>> a = np.array((4, 3, 3))
 >>> c = []
 >>> for i, v in enumerate(a):
... c.extend([i] * v)


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion



[Numpy-discussion] Need faster equivalent to digitize

2010-04-14 Thread Peter Shinners
I am using digitize to create a list of indices. This is giving me 
exactly what I want, but it's terribly slow. Digitize is obviously not 
the tool I want for this case, but what numpy alternative do I have?

I have an array like np.array((4, 3, 3)). I need to create an index 
array with each index repeated by its value: np.array((0, 0, 0, 0,
1, 1, 1, 2, 2, 2)).

 >>> a = np.array((4, 3, 3))
 >>> b = np.arange(np.sum(a))
 >>> c = np.digitize(b, a)
 >>> print c
[0 0 0 0 1 1 1 2 2 2]

On an array where a.size==65536 and sum(a)==65536 this is taking over 6 
seconds to compute. As a comparison, using a Python list solution runs 
in 0.08 seconds. That is plenty fast, but I would guess there is a 
faster Numpy solution that does not require a dynamically growing 
container of PyObjects?

 >>> a = np.array((4, 3, 3))
 >>> c = []
 >>> for i, v in enumerate(a):
... c.extend([i] * v)


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Math Library

2010-04-14 Thread James Bergstra
On Tue, Apr 6, 2010 at 10:08 AM, David Cournapeau wrote:
> could be put out of multiarray proper. Also, exposing an API for
> things like fancy indexing would be very useful, but I don't know if
> it even makes sense - I think a pure python implementation of fancy
> indexing as a reference would be very useful for array-like classes (I
> am thinking about scipy.sparse, for example).
>
> Unfortunately, I won't be able to help much in the near future (except
> maybe for the fancy indexing as this could be useful for my job),
>
> David

Hi, I am also interested in a pure python implementation of fancy
indexing... at least the array-type fancy indexing, if not the boolean
kind.  If someone knows of an implementation please let me know.  I'll
email the list again if I make any serious progress on it.
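As a starting point, here is a rough toy sketch of 1-D integer fancy indexing (my own illustration, not code from numpy or scipy.sparse; `fancy_take` is a made-up name):

```python
import numpy as np

def fancy_take(arr, indices):
    """Toy reference for 1-D integer fancy indexing: out[i] = arr[indices[i]],
    with numpy-style negative indices and bounds checking."""
    n = len(arr)
    out = []
    for raw in indices:
        i = int(raw)
        if i < 0:
            i += n  # negative indices count from the end
        if not 0 <= i < n:
            raise IndexError("index %d out of bounds for size %d" % (raw, n))
        out.append(arr[i])
    return np.array(out)

x = np.array([10, 20, 30, 40])
assert (fancy_take(x, [3, -1, 0]) == x[[3, -1, 0]]).all()
```

The boolean kind can be layered on top by converting a mask to integer positions first.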

James
-- 
http://www-etud.iro.umontreal.ca/~bergstrj
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-14 Thread Charles R Harris
On Wed, Apr 14, 2010 at 4:39 PM, Keith Goodman  wrote:

> On Wed, Apr 14, 2010 at 3:12 PM, Anne Archibald
>  wrote:
> > On 14 April 2010 16:56, Keith Goodman  wrote:
> >> On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath 
> wrote:
> >>> Keith Goodman  writes:
>  On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman 
> wrote:
> > On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath 
> wrote:
> >> Hello,
> >>
> >> How do I best find out the indices of the largest x elements in an
> >> array?
> >>
> >> Example:
> >>
> >> a = [ [1,8,2], [2,1,3] ]
> >> magic_function(a, 2) == [ (0,1), (1,2) ]
> >>
> >> Since the largest 2 elements are at positions (0,1) and (1,2).
> >
> > Here's a quick way to rank the data if there are no ties and no NaNs:
> 
>  ...or if you need the indices in order:
> 
> >> shape = (3,2)
> >> x = np.random.rand(*shape)
> >> x
>  array([[ 0.52420123,  0.43231286],
> [ 0.97995333,  0.87416228],
> [ 0.71604075,  0.66018382]])
> >> r = x.reshape(-1).argsort().argsort()
> >>>
> >>> I don't understand why this works. Why do you call argsort() twice?
> >>> Doesn't that give you the indices of the sorted indices?
> >>
> >> It is confusing. Let's look at an example:
> >>
>  x = np.random.rand(4)
>  x
> >>   array([ 0.37412289,  0.68248559,  0.12935131,  0.42510212])
> >>
> >> If we call argsort once we get the index that will sort x:
> >>
>  idx = x.argsort()
>  idx
> >>   array([2, 0, 3, 1])
>  x[idx]
> >>   array([ 0.12935131,  0.37412289,  0.42510212,  0.68248559])
> >>
> >> Notice that the first element of idx is 2. That's because element x[2]
> >> is the min of x. But that's not what we want. We want the first
> >> element to be the rank of the first element of x. So we need to
> >> shuffle idx around so that the order aligns with x. How do we do that?
> >> We sort it!
> >
> > Unless I am mistaken, what you are doing here is inverting the
> > permutation returned by the first argsort. The second argsort is an n
> > log n method, though, and permutations can be inverted in linear time:
> >
> > ix = np.argsort(X)
> > ixinv = np.zeros_like(ix)
> > ixinv[ix] = np.arange(len(ix))
> >
> > This works because if ix is a permutation and ixinv is its inverse,
> > A = B[ix]
> > is the same as
> > A[ixinv] = B
> > This also means that you can often do without the actual inverse by
> > moving the indexing operation to the other side of the equal sign.
> > (Not in the OP's case, though.)
>
> That is very nice. And very fast for large arrays:
>
> >> x = np.random.rand(4)
> >> timeit idx = x.argsort().argsort()
> 100 loops, best of 3: 1.45 us per loop
> >> timeit idx = x.argsort(); idxinv = np.zeros_like(idx); idxinv[idx] =
> np.arange(len(idx))
> 10 loops, best of 3: 9.52 us per loop
>
> >> x = np.random.rand(1000)
> >> timeit idx = x.argsort().argsort()
> 1 loops, best of 3: 112 us per loop
> >> timeit idx = x.argsort(); idxinv = np.zeros_like(idx); idxinv[idx] =
> np.arange(len(idx))
> 1 loops, best of 3: 82.9 us per loop
>
> >> x = np.random.rand(10)
> >> timeit idx = x.argsort().argsort()
> 10 loops, best of 3: 20.4 ms per loop
> >> timeit idx = x.argsort(); idxinv = np.zeros_like(idx); idxinv[idx] =
> np.arange(len(idx))
> 100 loops, best of 3: 13.2 ms per loop
>
> > I'll also add that if the OP needs the top m for 1 < m << n, sorting the
> > whole input array is not the most efficient algorithm; there are
> > priority-queue-based schemes that are asymptotically more efficient,
> > but none exists in numpy. Since numpy's sorting is quite fast, I
> > personally would just use the sorting.
>
> Partial sorting would find a lot of uses in the numpy community (like
> in median).
>

Thinking about it... along with a lot of other things.

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] rc2 for NumPy 1.4.1 and Scipy 0.7.2

2010-04-14 Thread Ralf Gommers
2010/4/15 Charles سمير Doutriaux 

> Just downloaded this.
>
> On my mac 10.6, using python 2.6.5  i get:
>

Which download, what build command and what python/gcc versions?

Looks like you're trying to build a 64-bit binary without passing in -arch
x86_64 flags?

Ralf




>
> Running from numpy source directory.
> non-existing path in 'numpy/distutils': 'site.cfg'
> F2PY Version 2
> blas_opt_info:
>   FOUND:
> extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
> define_macros = [('NO_ATLAS_INFO', 3)]
> extra_compile_args = ['-faltivec',
> '-I/System/Library/Frameworks/vecLib.framework/Headers']
>
> lapack_opt_info:
>   FOUND:
> extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
> define_macros = [('NO_ATLAS_INFO', 3)]
> extra_compile_args = ['-faltivec']
>
> running build
> running config_cc
> unifing config_cc, config, build_clib, build_ext, build commands --compiler
> options
> running config_fc
> unifing config_fc, config, build_clib, build_ext, build commands
> --fcompiler options
> running build_src
> build_src
> building py_modules sources
> creating build
> creating build/src.macosx-10.3-fat-2.6
> creating build/src.macosx-10.3-fat-2.6/numpy
> creating build/src.macosx-10.3-fat-2.6/numpy/distutils
> building library "npymath" sources
> customize NAGFCompiler
> Could not locate executable f95
> customize AbsoftFCompiler
> Could not locate executable f90
> Could not locate executable f77
> customize IBMFCompiler
> Could not locate executable xlf90
> Could not locate executable xlf
> customize IntelFCompiler
> Could not locate executable ifort
> Could not locate executable ifc
> customize GnuFCompiler
> Could not locate executable g77
> customize Gnu95FCompiler
> Found executable /usr/local/bin/gfortran
> /Users/doutriaux1/Desktop/numpy-1.4.1rc2/numpy/distutils/fcompiler/gnu.py:125:
> UserWarning: Env. variable MACOSX_DEPLOYMENT_TARGET set to 10.3
>   warnings.warn(s)
> customize Gnu95FCompiler
> customize Gnu95FCompiler using config
> C compiler: gcc -arch ppc -arch i386 -isysroot / -fno-strict-aliasing
> -DNDEBUG
>
> compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core
> -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath
> -Inumpy/core/include -I/lgm/cdat/trunk/include/python2.6 -c'
> gcc: _configtest.c
> gcc _configtest.o -o _configtest
> ld: warning: in _configtest.o, missing required architecture x86_64 in file
> Undefined symbols:
>   "_main", referenced from:
>   __start in crt1.o
> ld: symbol(s) not found
> collect2: ld returned 1 exit status
> ld: warning: in _configtest.o, missing required architecture x86_64 in file
> Undefined symbols:
>   "_main", referenced from:
>   __start in crt1.o
> ld: symbol(s) not found
> collect2: ld returned 1 exit status
> failure.
> removing: _configtest.c _configtest.o
> Traceback (most recent call last):
>   File "setup.py", line 187, in 
> setup_package()
>   File "setup.py", line 180, in setup_package
> configuration=configuration )
>   File "/Users/doutriaux1/Desktop/numpy-1.4.1rc2/numpy/distutils/core.py",
> line 186, in setup
> return old_setup(**new_attr)
>   File "/lgm/cdat/trunk/lib/python2.6/distutils/core.py", line 152, in
> setup
> dist.run_commands()
>   File "/lgm/cdat/trunk/lib/python2.6/distutils/dist.py", line 975, in
> run_commands
> self.run_command(cmd)
>   File "/lgm/cdat/trunk/lib/python2.6/distutils/dist.py", line 995, in
> run_command
> cmd_obj.run()
>   File
> "/Users/doutriaux1/Desktop/numpy-1.4.1rc2/numpy/distutils/command/build.py",
> line 37, in run
> old_build.run(self)
>   File "/lgm/cdat/trunk/lib/python2.6/distutils/command/build.py", line
> 134, in run
> self.run_command(cmd_name)
>   File "/lgm/cdat/trunk/lib/python2.6/distutils/cmd.py", line 333, in
> run_command
> self.distribution.run_command(command)
>   File "/lgm/cdat/trunk/lib/python2.6/distutils/dist.py", line 995, in
> run_command
> cmd_obj.run()
>   File
> "/Users/doutriaux1/Desktop/numpy-1.4.1rc2/numpy/distutils/command/build_src.py",
> line 152, in run
> self.build_sources()
>   File
> "/Users/doutriaux1/Desktop/numpy-1.4.1rc2/numpy/distutils/command/build_src.py",
> line 163, in build_sources
> self.build_library_sources(*libname_info)
>   File
> "/Users/doutriaux1/Desktop/numpy-1.4.1rc2/numpy/distutils/command/build_src.py",
> line 298, in build_library_sources
> sources = self.generate_sources(sources, (lib_name, build_info))
>   File
> "/Users/doutriaux1/Desktop/numpy-1.4.1rc2/numpy/distutils/command/build_src.py",
> line 385, in generate_sources
> source = func(extension, build_dir)
>   File "numpy/core/setup.py", line 657, in get_mathlib_info
> raise RuntimeError("Broken toolchain: cannot link a simple C program")
> RuntimeError: Broken toolchain: cannot link a simple C program
>
> On Apr 11, 2010, at 2:09 AM, Ralf Gommers wrote:
>
> Hi,
>
> I am pleased to announce the second release candidate of both Scipy 0.7.2

Re: [Numpy-discussion] binomial coefficient, factorial

2010-04-14 Thread Christopher Barker
Anne Archibald wrote:
>> The problem is that everyone has a different "basic". What we really
>> need is an easier way for folks to use sub-packages of scipy. I've found
>> myself hand-extracting just what I need, too.
> 
> Maybe I've been spoiled by using Linux, but my answer to this sort of
> thing is always "just install scipy already". If it's not already
> packaged for your operating system (e.g. Linux, OSX, or Windows), is
> it really difficult to compile it yourself?

yes, it is.

I think the Windows builds are pretty good, but with the MS compiler 
mess, that can be a pain, too (64 bit issues have not been resolved, 
though I guess not for numpy either...). And it is certainly not trivial 
to do yourself.

I suffer on OS-X: what with Apple's Python, python.org's one, fink, 
macports, etc, then you have OS-X version 10.4 - 10.6 to support, and 
both PPC and Intel, and all of this with Fortran code, it really is a mess.

Also, I deliver stand-alone tools, built with py2exe and py2app, and 
really don't like to be delivering the entire pile of scipy when I do.

So yes, it's still an issue.

 > (If so, perhaps post a
> report to scipy-user and see if we can fix it so it isn't.)

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-14 Thread Keith Goodman
On Wed, Apr 14, 2010 at 3:12 PM, Anne Archibald
 wrote:
> On 14 April 2010 16:56, Keith Goodman  wrote:
>> On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath  wrote:
>>> Keith Goodman  writes:
 On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman  wrote:
> On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath  wrote:
>> Hello,
>>
>> How do I best find out the indices of the largest x elements in an
>> array?
>>
>> Example:
>>
>> a = [ [1,8,2], [2,1,3] ]
>> magic_function(a, 2) == [ (0,1), (1,2) ]
>>
>> Since the largest 2 elements are at positions (0,1) and (1,2).
>
> Here's a quick way to rank the data if there are no ties and no NaNs:

 ...or if you need the indices in order:

>> shape = (3,2)
>> x = np.random.rand(*shape)
>> x
 array([[ 0.52420123,  0.43231286],
        [ 0.97995333,  0.87416228],
        [ 0.71604075,  0.66018382]])
>> r = x.reshape(-1).argsort().argsort()
>>>
>>> I don't understand why this works. Why do you call argsort() twice?
>>> Doesn't that give you the indices of the sorted indices?
>>
>> It is confusing. Let's look at an example:
>>
 x = np.random.rand(4)
 x
>>   array([ 0.37412289,  0.68248559,  0.12935131,  0.42510212])
>>
>> If we call argsort once we get the index that will sort x:
>>
 idx = x.argsort()
 idx
>>   array([2, 0, 3, 1])
 x[idx]
>>   array([ 0.12935131,  0.37412289,  0.42510212,  0.68248559])
>>
>> Notice that the first element of idx is 2. That's because element x[2]
>> is the min of x. But that's not what we want. We want the first
>> element to be the rank of the first element of x. So we need to
>> shuffle idx around so that the order aligns with x. How do we do that?
>> We sort it!
>
> Unless I am mistaken, what you are doing here is inverting the
> permutation returned by the first argsort. The second argsort is an n
> log n method, though, and permutations can be inverted in linear time:
>
> ix = np.argsort(X)
> ixinv = np.zeros_like(ix)
> ixinv[ix] = np.arange(len(ix))
>
> This works because if ix is a permutation and ixinv is its inverse,
> A = B[ix]
> is the same as
> A[ixinv] = B
> This also means that you can often do without the actual inverse by
> moving the indexing operation to the other side of the equal sign.
> (Not in the OP's case, though.)

That is very nice. And very fast for large arrays:

>> x = np.random.rand(4)
>> timeit idx = x.argsort().argsort()
100 loops, best of 3: 1.45 us per loop
>> timeit idx = x.argsort(); idxinv = np.zeros_like(idx); idxinv[idx] = 
>> np.arange(len(idx))
10 loops, best of 3: 9.52 us per loop

>> x = np.random.rand(1000)
>> timeit idx = x.argsort().argsort()
1 loops, best of 3: 112 us per loop
>> timeit idx = x.argsort(); idxinv = np.zeros_like(idx); idxinv[idx] = 
>> np.arange(len(idx))
1 loops, best of 3: 82.9 us per loop

>> x = np.random.rand(10)
>> timeit idx = x.argsort().argsort()
10 loops, best of 3: 20.4 ms per loop
>> timeit idx = x.argsort(); idxinv = np.zeros_like(idx); idxinv[idx] = 
>> np.arange(len(idx))
100 loops, best of 3: 13.2 ms per loop

> I'll also add that if the OP needs the top m for 1 < m << n, sorting the
> whole input array is not the most efficient algorithm; there are
> priority-queue-based schemes that are asymptotically more efficient,
> but none exists in numpy. Since numpy's sorting is quite fast, I
> personally would just use the sorting.

Partial sorting would find a lot of uses in the numpy community (like
in median).
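(For completeness, the standard library does already ship a priority-queue route via heapq, which is O(n log m) instead of sorting's O(n log n); a sketch, not a numpy feature:)

```python
import heapq
import numpy as np

a = np.array([[1, 8, 2], [2, 1, 3]])
m = 2

# heapq.nlargest keeps only an m-element heap while scanning the data once.
flat_top = heapq.nlargest(m, range(a.size), key=a.ravel().__getitem__)
top = [tuple(int(j) for j in np.unravel_index(i, a.shape)) for i in flat_top]
print(top)  # [(0, 1), (1, 2)]
```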
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] binomial coefficient, factorial

2010-04-14 Thread Anne Archibald
On 14 April 2010 17:37, Christopher Barker  wrote:
> jah wrote:
>> Is there any chance that a binomial coefficent and factorial function
>> can make their way into NumPy?
>
> probably not -- numpy is over-populated already
>
>> I know these exist in Scipy, but I don't
>> want to have to install SciPy just to have something so "basic".
>
> The problem is that everyone has a different "basic". What we really
> need is an easier way for folks to use sub-packages of scipy. I've found
> myself hand-extracting just what I need, too.

Maybe I've been spoiled by using Linux, but my answer to this sort of
thing is always "just install scipy already". If it's not already
packaged for your operating system (e.g. Linux, OSX, or Windows), is
it really difficult to compile it yourself? (If so, perhaps post a
report to scipy-user and see if we can fix it so it isn't.)



Anne

> -Chris
>
>
>
>
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115   (206) 526-6317   main reception
>
> chris.bar...@noaa.gov
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-14 Thread Anne Archibald
On 14 April 2010 16:56, Keith Goodman  wrote:
> On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath  wrote:
>> Keith Goodman  writes:
>>> On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman  wrote:
 On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath  wrote:
> Hello,
>
> How do I best find out the indices of the largest x elements in an
> array?
>
> Example:
>
> a = [ [1,8,2], [2,1,3] ]
> magic_function(a, 2) == [ (0,1), (1,2) ]
>
> Since the largest 2 elements are at positions (0,1) and (1,2).

 Here's a quick way to rank the data if there are no ties and no NaNs:
>>>
>>> ...or if you need the indices in order:
>>>
> shape = (3,2)
> x = np.random.rand(*shape)
> x
>>> array([[ 0.52420123,  0.43231286],
>>>[ 0.97995333,  0.87416228],
>>>[ 0.71604075,  0.66018382]])
> r = x.reshape(-1).argsort().argsort()
>>
>> I don't understand why this works. Why do you call argsort() twice?
>> Doesn't that give you the indices of the sorted indices?
>
> It is confusing. Let's look at an example:
>
>>> x = np.random.rand(4)
>>> x
>   array([ 0.37412289,  0.68248559,  0.12935131,  0.42510212])
>
> If we call argsort once we get the index that will sort x:
>
>>> idx = x.argsort()
>>> idx
>   array([2, 0, 3, 1])
>>> x[idx]
>   array([ 0.12935131,  0.37412289,  0.42510212,  0.68248559])
>
> Notice that the first element of idx is 2. That's because element x[2]
> is the min of x. But that's not what we want. We want the first
> element to be the rank of the first element of x. So we need to
> shuffle idx around so that the order aligns with x. How do we do that?
> We sort it!

Unless I am mistaken, what you are doing here is inverting the
permutation returned by the first argsort. The second argsort is an n
log n method, though, and permutations can be inverted in linear time:

ix = np.argsort(X)
ixinv = np.zeros_like(ix)
ixinv[ix] = np.arange(len(ix))

This works because if ix is a permutation and ixinv is its inverse,
A = B[ix]
is the same as
A[ixinv] = B
This also means that you can often do without the actual inverse by
moving the indexing operation to the other side of the equal sign.
(Not in the OP's case, though.)
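A quick sanity check that the two constructions agree (a sketch):

```python
import numpy as np

x = np.array([0.37, 0.68, 0.13, 0.43])

# O(n log n): rank of each element via double argsort.
rank_slow = x.argsort().argsort()

# O(n): invert the permutation from a single argsort.
ix = np.argsort(x)
rank_fast = np.zeros_like(ix)
rank_fast[ix] = np.arange(len(ix))

assert (rank_slow == rank_fast).all()
print(rank_fast)  # [1 3 0 2]
```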

I'll also add that if the OP needs the top m for 1 < m << n, sorting the
whole input array is not the most efficient algorithm; there are
priority-queue-based schemes that are asymptotically more efficient,
but none exists in numpy. Since numpy's sorting is quite fast, I
personally would just use the sorting.

> >>> idx.argsort()
>   array([1, 3, 0, 2])
>
> The min value of x is x[2], that's why 2 is the first element of idx
> which means that we want ranked(x) to contain a 0 at position 2 which
> it does.
>
> Bah, it's all magic.
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal for new ufunc functionality

2010-04-14 Thread Stephen Simmons
I would really like to see this become a core part of numpy...

For groupby-like summing over arrays, I use a modified version of
numpy.bincount() which has optional arguments that greatly enhance its 
flexibility:
   bincount(bins, weights=, max_bins=, out=)
where:
   *  bins    - numpy array of bin numbers (uint8, int16 or int32).
               [1] Negative bin numbers indicate weights to be ignored.
   *  weights - (opt) numpy array of weights (float or double)
   *  max_bin - (opt) [2] bin numbers greater than this are ignored when
               counting
   *  out     - (opt) numpy output array (int32 or double)

[1]  This is how I support Robert Kern's comment below "If there are some
areas you want to ignore, that's difficult to do with reduceat()."

[2]  Specifying the number of bins up front has two benefits: (i) saves
scanning the bins array to see how big the output needs to be;
and (ii) allows you to control the size of the output array, as you may
want it bigger than the number of bins would suggest.
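For context, stock np.bincount already covers the weights part of this; only the negative-bin, max_bins and out extensions above are from the modified version. A sketch of the stock behavior:

```python
import numpy as np

labels = np.array([0, 1, 0, 2, 1, 0])               # group label per element
weights = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # value per element

sums = np.bincount(labels, weights=weights)  # group-by sum
counts = np.bincount(labels)                 # group-by count

assert sums.tolist() == [10.0, 7.0, 4.0]
assert counts.tolist() == [3, 2, 1]

# minlength (added in numpy 1.6) is the closest stock analogue of
# controlling the output size up front:
padded = np.bincount(labels, weights=weights, minlength=5)
assert padded.tolist() == [10.0, 7.0, 4.0, 0.0, 0.0]
```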


I look forward to the draft NEP!

Best regards
Stephen Simmons



On 13/04/2010 10:34 PM, Robert Kern wrote:
> On Sat, Apr 10, 2010 at 17:59, Robert Kern  wrote:
>
>> On Sat, Apr 10, 2010 at 12:45, Pauli Virtanen  wrote:
>>  
>>> la, 2010-04-10 kello 12:23 -0500, Travis Oliphant kirjoitti:
>>> [clip]
>>>
 Here are my suggested additions to NumPy:
 ufunc methods:
  
>>> [clip]
>>>
* reducein (array, indices, axis=0)
 similar to reduce-at, but the indices provide both the
 start and end points (rather than being fence-posts like reduceat).
  
>>> Is the `reducein` important to have, as compared to `reduceat`?
>>>
>> Yes, I think so. If there are some areas you want to ignore, that's
>> difficult to do with reduceat().
>>  
> And conversely overlapping areas are highly useful but completely
> impossible to do with reduceat.
>
>

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] binomial coefficient, factorial

2010-04-14 Thread Christopher Barker
jah wrote:
> Is there any chance that a binomial coefficent and factorial function 
> can make their way into NumPy? 

probably not -- numpy is over-populated already

> I know these exist in Scipy, but I don't 
> want to have to install SciPy just to have something so "basic".

The problem is that everyone has a different "basic". What we really 
need is an easier way for folks to use sub-packages of scipy. I've found 
myself hand-extracting just what I need, too.
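For what it's worth, exact integer versions need neither numpy nor scipy; a sketch using only the standard library (integer-exact and fine for modest n, though not a substitute for scipy's floating-point versions at large n):

```python
import math

def binom(n, k):
    # Exact binomial coefficient via factorials.
    if k < 0 or k > n:
        return 0
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

assert binom(5, 2) == 10
assert binom(10, 0) == 1
assert math.factorial(6) == 720
```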

-Chris




-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-14 Thread eat
Nikolaus Rath  rath.org> writes:

> 
> Hello,
> 
> How do I best find out the indices of the largest x elements in an
> array?
> 
> Example:
> 
> a = [ [1,8,2], [2,1,3] ]
> magic_function(a, 2) == [ (0,1), (1,2) ]
> 
> Since the largest 2 elements are at positions (0,1) and (1,2).
> 
> Best,
> 
>-Niko
> 
Hi,

Just:

a = np.asarray([[1, 8, 2], [2, 1, 3]])
print np.where((a.T == a.max(axis=1)).T)

However, this will fail if any row contains more than one maximal entry.
Please let me know if that's relevant for you.
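A more general sketch that handles ties and any k (magic_function is the hypothetical name from the original question, built here on argsort plus unravel_index):

```python
import numpy as np

def magic_function(a, k):
    # Positions of the k largest entries, largest first; a stable sort
    # kind keeps tie order predictable.
    a = np.asarray(a)
    flat = a.ravel().argsort(kind='mergesort')[::-1][:k]
    return [tuple(int(j) for j in np.unravel_index(i, a.shape)) for i in flat]

assert magic_function([[1, 8, 2], [2, 1, 3]], 2) == [(0, 1), (1, 2)]
```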

-eat


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-14 Thread Keith Goodman
On Wed, Apr 14, 2010 at 1:56 PM, Keith Goodman  wrote:
> On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath  wrote:
>> Keith Goodman  writes:
>>> On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman  wrote:
 On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath  wrote:
> Hello,
>
> How do I best find out the indices of the largest x elements in an
> array?
>
> Example:
>
> a = [ [1,8,2], [2,1,3] ]
> magic_function(a, 2) == [ (0,1), (1,2) ]
>
> Since the largest 2 elements are at positions (0,1) and (1,2).

 Here's a quick way to rank the data if there are no ties and no NaNs:
>>>
>>> ...or if you need the indices in order:
>>>
> shape = (3,2)
> x = np.random.rand(*shape)
> x
>>> array([[ 0.52420123,  0.43231286],
>>>        [ 0.97995333,  0.87416228],
>>>        [ 0.71604075,  0.66018382]])
> r = x.reshape(-1).argsort().argsort()
>>
>> I don't understand why this works. Why do you call argsort() twice?
>> Doesn't that give you the indices of the sorted indices?
>
> It is confusing. Let's look at an example:
>
>>> x = np.random.rand(4)
>>> x
>   array([ 0.37412289,  0.68248559,  0.12935131,  0.42510212])
>
> If we call argsort once we get the index that will sort x:
>
>>> idx = x.argsort()
>>> idx
>   array([2, 0, 3, 1])
>>> x[idx]
>   array([ 0.12935131,  0.37412289,  0.42510212,  0.68248559])
>
> Notice that the first element of idx is 2. That's because element x[2]
> is the min of x. But that's not what we want. We want the first
> element to be the rank of the first element of x. So we need to
> shuffle idx around so that the order aligns with x. How do we do that?
> We sort it!
>
>>> idx.argsort()
>   array([1, 3, 0, 2])
>
> The min value of x is x[2], that's why 2 is the first element of idx
> which means that we want ranked(x) to contain a 0 at position 2 which
> it does.
>
> Bah, it's all magic.

You can also use rankdata from scipy:

>> from scipy.stats import rankdata
>> rankdata(x)
   array([ 2.,  4.,  1.,  3.])

Note that the smallest rank is 1.

>> rankdata(x) - 1
   array([ 1.,  3.,  0.,  2.])
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-14 Thread Keith Goodman
On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath  wrote:
> Keith Goodman  writes:
>> On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman  wrote:
>>> On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath  wrote:
 Hello,

 How do I best find out the indices of the largest x elements in an
 array?

 Example:

 a = [ [1,8,2], [2,1,3] ]
 magic_function(a, 2) == [ (0,1), (1,2) ]

 Since the largest 2 elements are at positions (0,1) and (1,2).
>>>
>>> Here's a quick way to rank the data if there are no ties and no NaNs:
>>
>> ...or if you need the indices in order:
>>
 shape = (3,2)
 x = np.random.rand(*shape)
 x
>> array([[ 0.52420123,  0.43231286],
>>        [ 0.97995333,  0.87416228],
>>        [ 0.71604075,  0.66018382]])
 r = x.reshape(-1).argsort().argsort()
>
> I don't understand why this works. Why do you call argsort() twice?
> Doesn't that give you the indices of the sorted indices?

It is confusing. Let's look at an example:

>> x = np.random.rand(4)
>> x
   array([ 0.37412289,  0.68248559,  0.12935131,  0.42510212])

If we call argsort once we get the index that will sort x:

>> idx = x.argsort()
>> idx
   array([2, 0, 3, 1])
>> x[idx]
   array([ 0.12935131,  0.37412289,  0.42510212,  0.68248559])

Notice that the first element of idx is 2. That's because element x[2]
is the min of x. But that's not what we want. We want the first
element to be the rank of the first element of x. So we need to
shuffle idx around so that the order aligns with x. How do we do that?
We sort it!

>> idx.argsort()
   array([1, 3, 0, 2])

The min value of x is x[2], that's why 2 is the first element of idx
which means that we want ranked(x) to contain a 0 at position 2 which
it does.

Bah, it's all magic.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-14 Thread Nikolaus Rath
Keith Goodman  writes:
> On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman  wrote:
>> On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath  wrote:
>>> Hello,
>>>
>>> How do I best find out the indices of the largest x elements in an
>>> array?
>>>
>>> Example:
>>>
>>> a = [ [1,8,2], [2,1,3] ]
>>> magic_function(a, 2) == [ (0,1), (1,2) ]
>>>
>>> Since the largest 2 elements are at positions (0,1) and (1,2).
>>
>> Here's a quick way to rank the data if there are no ties and no NaNs:
>
> ...or if you need the indices in order:
>
>>> shape = (3,2)
>>> x = np.random.rand(*shape)
>>> x
> array([[ 0.52420123,  0.43231286],
>[ 0.97995333,  0.87416228],
>[ 0.71604075,  0.66018382]])
>>> r = x.reshape(-1).argsort().argsort()

I don't understand why this works. Why do you call argsort() twice?
Doesn't that give you the indices of the sorted indices?


Thanks,

   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C



Re: [Numpy-discussion] rc2 for NumPy 1.4.1 and Scipy 0.7.2

2010-04-14 Thread Charles سمير Doutriaux
Just downloaded this.

On my mac 10.6, using python 2.6.5  i get:

Running from numpy source directory.
non-existing path in 'numpy/distutils': 'site.cfg'
F2PY Version 2
blas_opt_info:
  FOUND:
extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
define_macros = [('NO_ATLAS_INFO', 3)]
extra_compile_args = ['-faltivec', 
'-I/System/Library/Frameworks/vecLib.framework/Headers']

lapack_opt_info:
  FOUND:
extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
define_macros = [('NO_ATLAS_INFO', 3)]
extra_compile_args = ['-faltivec']

running build
running config_cc
unifing config_cc, config, build_clib, build_ext, build commands --compiler 
options
running config_fc
unifing config_fc, config, build_clib, build_ext, build commands --fcompiler 
options
running build_src
build_src
building py_modules sources
creating build
creating build/src.macosx-10.3-fat-2.6
creating build/src.macosx-10.3-fat-2.6/numpy
creating build/src.macosx-10.3-fat-2.6/numpy/distutils
building library "npymath" sources
customize NAGFCompiler
Could not locate executable f95
customize AbsoftFCompiler
Could not locate executable f90
Could not locate executable f77
customize IBMFCompiler
Could not locate executable xlf90
Could not locate executable xlf
customize IntelFCompiler
Could not locate executable ifort
Could not locate executable ifc
customize GnuFCompiler
Could not locate executable g77
customize Gnu95FCompiler
Found executable /usr/local/bin/gfortran
/Users/doutriaux1/Desktop/numpy-1.4.1rc2/numpy/distutils/fcompiler/gnu.py:125: 
UserWarning: Env. variable MACOSX_DEPLOYMENT_TARGET set to 10.3
  warnings.warn(s)
customize Gnu95FCompiler
customize Gnu95FCompiler using config
C compiler: gcc -arch ppc -arch i386 -isysroot / -fno-strict-aliasing -DNDEBUG

compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core 
-Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath 
-Inumpy/core/include -I/lgm/cdat/trunk/include/python2.6 -c'
gcc: _configtest.c
gcc _configtest.o -o _configtest
ld: warning: in _configtest.o, missing required architecture x86_64 in file
Undefined symbols:
  "_main", referenced from:
  __start in crt1.o
ld: symbol(s) not found
collect2: ld returned 1 exit status
ld: warning: in _configtest.o, missing required architecture x86_64 in file
Undefined symbols:
  "_main", referenced from:
  __start in crt1.o
ld: symbol(s) not found
collect2: ld returned 1 exit status
failure.
removing: _configtest.c _configtest.o
Traceback (most recent call last):
  File "setup.py", line 187, in <module>
setup_package()
  File "setup.py", line 180, in setup_package
configuration=configuration )
  File "/Users/doutriaux1/Desktop/numpy-1.4.1rc2/numpy/distutils/core.py", line 
186, in setup
return old_setup(**new_attr)
  File "/lgm/cdat/trunk/lib/python2.6/distutils/core.py", line 152, in setup
dist.run_commands()
  File "/lgm/cdat/trunk/lib/python2.6/distutils/dist.py", line 975, in 
run_commands
self.run_command(cmd)
  File "/lgm/cdat/trunk/lib/python2.6/distutils/dist.py", line 995, in 
run_command
cmd_obj.run()
  File 
"/Users/doutriaux1/Desktop/numpy-1.4.1rc2/numpy/distutils/command/build.py", 
line 37, in run
old_build.run(self)
  File "/lgm/cdat/trunk/lib/python2.6/distutils/command/build.py", line 134, in 
run
self.run_command(cmd_name)
  File "/lgm/cdat/trunk/lib/python2.6/distutils/cmd.py", line 333, in 
run_command
self.distribution.run_command(command)
  File "/lgm/cdat/trunk/lib/python2.6/distutils/dist.py", line 995, in 
run_command
cmd_obj.run()
  File 
"/Users/doutriaux1/Desktop/numpy-1.4.1rc2/numpy/distutils/command/build_src.py",
 line 152, in run
self.build_sources()
  File 
"/Users/doutriaux1/Desktop/numpy-1.4.1rc2/numpy/distutils/command/build_src.py",
 line 163, in build_sources
self.build_library_sources(*libname_info)
  File 
"/Users/doutriaux1/Desktop/numpy-1.4.1rc2/numpy/distutils/command/build_src.py",
 line 298, in build_library_sources
sources = self.generate_sources(sources, (lib_name, build_info))
  File 
"/Users/doutriaux1/Desktop/numpy-1.4.1rc2/numpy/distutils/command/build_src.py",
 line 385, in generate_sources
source = func(extension, build_dir)
  File "numpy/core/setup.py", line 657, in get_mathlib_info
raise RuntimeError("Broken toolchain: cannot link a simple C program")
RuntimeError: Broken toolchain: cannot link a simple C program

On Apr 11, 2010, at 2:09 AM, Ralf Gommers wrote:

> Hi,
> 
> I am pleased to announce the second release candidate of both Scipy 0.7.2 and 
> NumPy 1.4.1, please test them.
> 
> The issues reported with rc1 should be fixed, and for NumPy there are now 
> Python 2.5 binaries as well. For SciPy there will be no 2.5 binaries - 
> because 0.7.x is built against NumPy 1.2 it does not work on OS X 10.6, and 
> on Windows I see some incomprehensible build failures.
> 
> Binaries, sources and release notes can be found at 
> http://www.filefactory.com/f/4452c50056df

Re: [Numpy-discussion] How to combine a pair of 1D arrays?

2010-04-14 Thread Anne Archibald
On 14 April 2010 11:34, Robert Kern  wrote:
> On Wed, Apr 14, 2010 at 10:25, Peter Shinners  wrote:
>> Is there a way to combine two 1D arrays with the same size into a 2D
>> array? It seems like the internal pointers and strides could be
>> combined. My primary goal is to not make any copies of the data.
>
> There is absolutely no way to get around that, I am afraid.

Well, I'm not quite sure I agree with this.

The best way, of course, is to allocate the original two arrays as
subarrays of one larger array, that is, start with the fused array and
select your two 1D arrays as subarrays. Of course, this depends on how
you're generating the 1D arrays; if they're simply being returned to
you from a black-box function, you're stuck with them. But if it's a
ufunc-like operation, it may have an output argument that lets you
write to a supplied array rather than allocating a new one. If they're
coming from a file, you may be able to map the whole file (or a large
chunk) as an array and select them as subarrays (even if alignment and
type are issues).

You should also keep in mind that allocating arrays and copying data
really isn't very expensive - malloc() is extremely fast, especially
for large arrays which are just grabbed as blocks from the OS - and
copying arrays is also cheap, and can be used to reorder data into a
more cache-friendly order. If the problem is that your arrays almost
fill available memory, you will already have noticed that using numpy
is kind of a pain, because many operations involve copies. But if you
really have to do this, it may be possible.

numpy arrays are specified by giving a data area, and strides into
that data area. The steps along each axis must be uniform, but if you
really have two arrays with the same stride, you may be able to use a
gruesome hack to make it work. Each of your two arrays has a data
pointer, which essentially points to the first element. So if you make
up your two-row array using the same data pointer as your first array,
and give it a stride along the second dimension equal to the
difference between pointers, this should work. Of course, you have to
make sure python doesn't deallocate the second array out from under
you, and you may have to defeat some error checking, but in principle
it should be possible.
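For what it's worth, here is a sketch of that hack using numpy.lib.stride_tricks.as_strided. It is only safe when the two 1D arrays really do sit a constant byte offset apart, which is guaranteed here by carving both out of a single buffer (which must be kept alive); as_strided does no bounds checking, so treat it as a demonstration, not a pattern to copy blindly:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

buf = np.arange(8)       # one allocation...
a, b = buf[:4], buf[4:]  # ...viewed as two 1D arrays

# byte offset between the two "rows" becomes the stride along axis 0
step = b.ctypes.data - a.ctypes.data
c = as_strided(a, shape=(2, 4), strides=(step, a.strides[0]))

a[1] = -2
assert c[0, 1] == -2               # no copy was made: row 0 of c *is* a
assert c[1].tolist() == b.tolist() # and row 1 aliases b
```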

Anne


Re: [Numpy-discussion] How to combine a pair of 1D arrays?

2010-04-14 Thread Zachary Pincus
> On Wed, Apr 14, 2010 at 10:25, Peter Shinners   
> wrote:
>> Is there a way to combine two 1D arrays with the same size into a 2D
>> array? It seems like the internal pointers and strides could be
>> combined. My primary goal is to not make any copies of the data.
>
> There is absolutely no way to get around that, I am afraid.

You could start with the 2d array... instead of:
>  >>> a = np.array((1,2,3,4))
>  >>> b = np.array((11,12,13,14))
>  >>> c = np.magical_fuse(a, b)   # what goes here?

perhaps something like:

  >>> c = np.empty((2,4), int)
  >>> a = c[0]
  >>> b = c[1]

Now a and b are views on c. So if you pass them to, say, some
(strides-aware) C routine that fills them in element by element, c
will get filled in at the same time.

Zach




Re: [Numpy-discussion] Is this a small bug in numpydoc?

2010-04-14 Thread Fernando Perez
On Wed, Apr 14, 2010 at 3:21 AM, Pauli Virtanen  wrote:
>> But I didn't write numpydoc and I'm tired, so I don't want to commit
>> this without a second pair of eyes...
>
> Yeah, it's a bug, I think.
>

Thanks, fixed in r8333.

Cheers,

f


Re: [Numpy-discussion] Find indices of largest elements

2010-04-14 Thread Keith Goodman
On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman  wrote:
> On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath  wrote:
>> Hello,
>>
>> How do I best find out the indices of the largest x elements in an
>> array?
>>
>> Example:
>>
>> a = [ [1,8,2], [2,1,3] ]
>> magic_function(a, 2) == [ (0,1), (1,2) ]
>>
>> Since the largest 2 elements are at positions (0,1) and (1,2).
>
> Here's a quick way to rank the data if there are no ties and no NaNs:
>
>>> shape = (3,2)
>>> x = np.random.rand(*shape)
>>> x
> array([[ 0.83424288,  0.17821326],
>       [ 0.62718311,  0.63514286],
>       [ 0.18373934,  0.90634162]])
>>> r = x.reshape(-1).argsort().argsort().reshape(shape)
>>> r
> array([[4, 0],
>       [2, 3],
>       [1, 5]])
>
> To find the indices you can use where:
>
>>> r < 2
> array([[False,  True],
>       [False, False],
>       [ True, False]], dtype=bool)
>>> np.where(r < 2)
>   (array([0, 2]), array([1, 0]))
>
> ...but the indices will not be in order.

...or if you need the indices in order:

>> shape = (3,2)
>> x = np.random.rand(*shape)
>> x
array([[ 0.52420123,  0.43231286],
   [ 0.97995333,  0.87416228],
   [ 0.71604075,  0.66018382]])
>> r = x.reshape(-1).argsort().argsort()
>> np.unravel_index(r[0], shape)
   (0, 1)
>> np.unravel_index(r[1], shape)
   (0, 0)
>> np.unravel_index(r[2], shape)
   (2, 1)
>> np.unravel_index(r[-1], shape)
   (1, 0)
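Putting those pieces together, a hypothetical magic_function could be sketched like this (largest_indices is an invented name, not a numpy function; ties keep argsort's stable order):

```python
import numpy as np

def largest_indices(a, k):
    # Indices of the k largest elements, largest first.
    a = np.asarray(a)
    flat = a.ravel().argsort()[::-1][:k]  # flat indices, descending by value
    return [np.unravel_index(i, a.shape) for i in flat]

a = [[1, 8, 2], [2, 1, 3]]
assert largest_indices(a, 2) == [(0, 1), (1, 2)]
```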


Re: [Numpy-discussion] Find indices of largest elements

2010-04-14 Thread Gökhan Sever
On Wed, Apr 14, 2010 at 10:16 AM, Nikolaus Rath  wrote:

> Hello,
>
> How do I best find out the indices of the largest x elements in an
> array?
>
> Example:
>
> a = [ [1,8,2], [2,1,3] ]
> magic_function(a, 2) == [ (0,1), (1,2) ]
>
> Since the largest 2 elements are at positions (0,1) and (1,2).
>
I[1]: a = np.array([ [1,8,2], [2,1,3] ])

I[2]: b = a.max(axis=1)[:,np.newaxis]

I[3]: b
O[3]:
array([[8],
   [3]])

I[4]: np.where(a==b)
O[4]: (array([0, 1]), array([1, 2]))



-- 
Gökhan


Re: [Numpy-discussion] Find indices of largest elements

2010-04-14 Thread Keith Goodman
On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath  wrote:
> Hello,
>
> How do I best find out the indices of the largest x elements in an
> array?
>
> Example:
>
> a = [ [1,8,2], [2,1,3] ]
> magic_function(a, 2) == [ (0,1), (1,2) ]
>
> Since the largest 2 elements are at positions (0,1) and (1,2).

Here's a quick way to rank the data if there are no ties and no NaNs:

>> shape = (3,2)
>> x = np.random.rand(*shape)
>> x
array([[ 0.83424288,  0.17821326],
   [ 0.62718311,  0.63514286],
   [ 0.18373934,  0.90634162]])
>> r = x.reshape(-1).argsort().argsort().reshape(shape)
>> r
array([[4, 0],
   [2, 3],
   [1, 5]])

To find the indices you can use where:

>> r < 2
array([[False,  True],
   [False, False],
   [ True, False]], dtype=bool)
>> np.where(r < 2)
   (array([0, 2]), array([1, 0]))

...but the indices will not be in order.


Re: [Numpy-discussion] Find indices of largest elements

2010-04-14 Thread Skipper Seabold
On Wed, Apr 14, 2010 at 11:16 AM, Nikolaus Rath  wrote:
> Hello,
>
> How do I best find out the indices of the largest x elements in an
> array?
>
> Example:
>
> a = [ [1,8,2], [2,1,3] ]
> magic_function(a, 2) == [ (0,1), (1,2) ]
>
> Since the largest 2 elements are at positions (0,1) and (1,2).

Something like this might be made to work, if you want the max
elements along all the rows.

In [3]: a = np.asarray(a)

In [4]: a[range(len(a)),np.argmax(a, axis=1)]
Out[4]: array([8, 3])

Skipper
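Skipper's fancy indexing pulls out the max values themselves; if the (row, column) coordinates are what's wanted instead, a small variation works. Note this gives per-row maxima, which only matches the original question because each row here happens to hold one of the two largest elements:

```python
import numpy as np

a = np.array([[1, 8, 2], [2, 1, 3]])
cols = np.argmax(a, axis=1)                    # column of each row's maximum
coords = list(zip(range(len(a)), cols.tolist()))
assert coords == [(0, 1), (1, 2)]
```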


[Numpy-discussion] Release candidate 2 for NumPy 1.4.1 and SciPy 0.7.2

2010-04-14 Thread Ralf Gommers
Hi,

I am pleased to announce the second release candidate of both Scipy 0.7.2
and NumPy 1.4.1. Please test, and report any problems on the NumPy or SciPy
list. I also want to specifically ask you to report success/failure with
other libraries (Matplotlib, Pygame, ...) based on
NumPy or SciPy.

Binaries, sources and release notes can be found at
https://sourceforge.net/projects/numpy/files/
https://sourceforge.net/projects/scipy/files/


NumPy 1.4.1
===========
The main change over 1.4.0 is that datetime support has been removed. This
fixes the binary incompatibility issues between NumPy and other libraries
like SciPy and Matplotlib.

There are also a number of other bug fixes, and no new features.

Binaries for Python 2.5 and 2.6 are available for both Windows and OS X.


SciPy 0.7.2
===========
The only change compared to 0.7.1 is that the C sources for Cython code have
been regenerated with Cython 0.12.1. This ensures that SciPy 0.7.2 will work
with NumPy 1.4.1, while also retaining backwards compatibility with NumPy
1.3.0.

Note that the 0.7.x branch was created in January 2009, so a lot of fixes
and new functionality in current trunk is not present in this release.

Binaries for Python 2.6 are available for both Windows and OS X. Due to the
age of the code no binaries for Python 2.5 are available.


On behalf of the NumPy and SciPy developers,
Ralf


[Numpy-discussion] Find indices of largest elements

2010-04-14 Thread Nikolaus Rath
Hello,

How do I best find out the indices of the largest x elements in an
array?

Example:

a = [ [1,8,2], [2,1,3] ]
magic_function(a, 2) == [ (0,1), (1,2) ]

Since the largest 2 elements are at positions (0,1) and (1,2).


Best,

   -Niko

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C



Re: [Numpy-discussion] How to combine a pair of 1D arrays?

2010-04-14 Thread Robert Kern
On Wed, Apr 14, 2010 at 10:25, Peter Shinners  wrote:
> Is there a way to combine two 1D arrays with the same size into a 2D
> array? It seems like the internal pointers and strides could be
> combined. My primary goal is to not make any copies of the data.

There is absolutely no way to get around that, I am afraid.

> It
> might be doable with a bit of ctypes if there is not a native numpy call.
>
>  >>> import numpy as np
>  >>> a = np.array((1,2,3,4))
>  >>> b = np.array((11,12,13,14))
>  >>> c = np.magical_fuse(a, b)   # what goes here?

c = np.vstack([a, b])
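A quick check confirms that vstack copies the data rather than creating a view:

```python
import numpy as np

a = np.array((1, 2, 3, 4))
b = np.array((11, 12, 13, 14))
c = np.vstack([a, b])   # shape (2, 4), but the data is copied

a[1] = -2
assert c[0, 1] == 2     # c kept the old value: it is not a view of a
```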

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


[Numpy-discussion] How to combine a pair of 1D arrays?

2010-04-14 Thread Peter Shinners
Is there a way to combine two 1D arrays with the same size into a 2D 
array? It seems like the internal pointers and strides could be 
combined. My primary goal is to not make any copies of the data. It 
might be doable with a bit of ctypes if there is not a native numpy call.

 >>> import numpy as np
 >>> a = np.array((1,2,3,4))
 >>> b = np.array((11,12,13,14))
 >>> c = np.magical_fuse(a, b)   # what goes here?
 >>> print c.shape
(2, 4)
 >>> a == c[0]
True
 >>> a[1] = -2
 >>> a == c[0]
True



Re: [Numpy-discussion] rc2 for NumPy 1.4.1 and Scipy 0.7.2

2010-04-14 Thread Francesc Alted
On Wednesday 14 April 2010 15:36:00, Charles R Harris wrote:
> > /home/faltet/PyTables/pytables/trunk/tables/table.py:38: RuntimeWarning:
> > numpy.dtype size changed, may indicate binary incompatibility
> >
> > I'm using current stable Cython 12.1.  Is the warning above intended or
> > I'm doing something wrong?
> 
> I believe Cython decided to issue a warning instead of bailing, which makes
> it usable but constantly annoying. The warning can probably be caught in
>  the __init__ but it would be nice to have some sort of cython  flag that
>  would disable it completely.

Yeah, a flag would be nice.  Ok, I'll have to manage with that then.

Thanks,

-- 
Francesc Alted


Re: [Numpy-discussion] rc2 for NumPy 1.4.1 and Scipy 0.7.2

2010-04-14 Thread Charles R Harris
On Wed, Apr 14, 2010 at 7:08 AM, Francesc Alted  wrote:

> On Sunday 11 April 2010 11:09:53, Ralf Gommers wrote:
> > Hi,
> >
> > I am pleased to announce the second release candidate of both Scipy 0.7.2
> > and NumPy 1.4.1, please test them.
> >
> > The issues reported with rc1 should be fixed, and for NumPy there are now
> > Python 2.5 binaries as well. For SciPy there will be no 2.5 binaries -
> > because 0.7.x is built against NumPy 1.2 it does not work on OS X 10.6,
> and
> > on Windows I see some incomprehensible build failures.
> >
> > Binaries, sources and release notes can be found at
> > http://www.filefactory.com/f/4452c50056df8bba/ (apologies for the flashy
> >  ads again).
>
> I've had a try at the new 1.4.1rc2 and my existing binaries that were
> linked
> with 1.3.0 runs all the tests without difficulty (so ABI compatibility
> seems
> good), but every time that a Cython extension is loaded, I get the next
> warning:
>
> /home/faltet/PyTables/pytables/trunk/tables/__init__.py:56: RuntimeWarning:
> numpy.dtype size changed, may indicate binary incompatibility
>  from tables.utilsExtension import getPyTablesVersion, getHDF5Version
> /home/faltet/PyTables/pytables/trunk/tables/file.py:43: RuntimeWarning:
> numpy.dtype size changed, may indicate binary incompatibility
>  from tables import hdf5Extension
> /home/faltet/PyTables/pytables/trunk/tables/link.py:32: RuntimeWarning:
> numpy.dtype size changed, may indicate binary incompatibility
>  from tables import linkExtension
> /home/faltet/PyTables/pytables/trunk/tables/table.py:38: RuntimeWarning:
> numpy.dtype size changed, may indicate binary incompatibility
>
> I'm using current stable Cython 12.1.  Is the warning above intended or I'm
> doing something wrong?
>
>
I believe Cython decided to issue a warning instead of bailing, which makes
it usable but constantly annoying. The warning can probably be caught in the
__init__ but it would be nice to have some sort of cython  flag that would
disable it completely.

Chuck


Re: [Numpy-discussion] binomial coefficient, factorial

2010-04-14 Thread Alan G Isaac
On 4/13/2010 11:16 PM, jah wrote:
> binomial coefficient and factorial function

http://code.google.com/p/econpy/source/browse/trunk/pytrix/pytrix.py

fwiw,
Alan Isaac



Re: [Numpy-discussion] rc2 for NumPy 1.4.1 and Scipy 0.7.2

2010-04-14 Thread Francesc Alted
On Sunday 11 April 2010 11:09:53, Ralf Gommers wrote:
> Hi,
> 
> I am pleased to announce the second release candidate of both Scipy 0.7.2
> and NumPy 1.4.1, please test them.
> 
> The issues reported with rc1 should be fixed, and for NumPy there are now
> Python 2.5 binaries as well. For SciPy there will be no 2.5 binaries -
> because 0.7.x is built against NumPy 1.2 it does not work on OS X 10.6, and
> on Windows I see some incomprehensible build failures.
> 
> Binaries, sources and release notes can be found at
> http://www.filefactory.com/f/4452c50056df8bba/ (apologies for the flashy
>  ads again).

I've had a try at the new 1.4.1rc2 and my existing binaries that were linked 
with 1.3.0 runs all the tests without difficulty (so ABI compatibility seems 
good), but every time that a Cython extension is loaded, I get the next 
warning:

/home/faltet/PyTables/pytables/trunk/tables/__init__.py:56: RuntimeWarning: 
numpy.dtype size changed, may indicate binary incompatibility
  from tables.utilsExtension import getPyTablesVersion, getHDF5Version
/home/faltet/PyTables/pytables/trunk/tables/file.py:43: RuntimeWarning: 
numpy.dtype size changed, may indicate binary incompatibility
  from tables import hdf5Extension
/home/faltet/PyTables/pytables/trunk/tables/link.py:32: RuntimeWarning: 
numpy.dtype size changed, may indicate binary incompatibility
  from tables import linkExtension
/home/faltet/PyTables/pytables/trunk/tables/table.py:38: RuntimeWarning: 
numpy.dtype size changed, may indicate binary incompatibility

I'm using current stable Cython 12.1.  Is the warning above intended or I'm 
doing something wrong?

Thanks,

-- 
Francesc Alted


Re: [Numpy-discussion] Is this a small bug in numpydoc?

2010-04-14 Thread Pauli Virtanen
Wed, 14 Apr 2010 01:17:37 -0700, Fernando Perez wrote:
[clip]
> Exception occurred:
>   File "/home/fperez/ipython/repo/trunk-lp/docs/sphinxext/numpydoc.py",
> line 71, in mangle_signature
> 'initializes x; see ' in pydoc.getdoc(obj.__init__)):
> AttributeError: class Bunch has no attribute '__init__' The full
> traceback has been saved in /tmp/sphinx-err-H1VlaY.log, if you want to
> report the issue to the developers.
> 
> so it seems indeed that accessing __init__ without checking it's there
> isn't a very good idea.
> 
> But I didn't write numpydoc and I'm tired, so I don't want to commit
> this without a second pair of eyes...

Yeah, it's a bug, I think.

-- 
Pauli Virtanen



[Numpy-discussion] stdlib docstring format under discussion

2010-04-14 Thread Ralf Gommers
Hi all,

On doc-sig there is a discussion going on about adopting a standard
docstring format for the stdlib. The suggested format is epydoc. If you care
about readability in a terminal then it may be good to join this discussion.
http://mail.python.org/pipermail/doc-sig/2010-April/003819.html

Cheers,
Ralf


Re: [Numpy-discussion] look for value, depending to y position

2010-04-14 Thread Nadav Horesh
I assume that you forgot to specify the range between 300 and 400. But anyway 
this piece of code may give you a direction:

--
import numpy as np

ythreshold = np.repeat(np.arange(4,-1,-1), 100) * 20 +190
bin_image = image > ythreshold[:,None]
--

Anyway, I advise you to look at the image morphology operations in scipy.ndimage.
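Nadav's snippet can be expanded into a runnable sketch. Two assumptions here: the unspecified 300 to 400 band reuses the 190 threshold, and the image is random stand-in data rather than a real file:

```python
import numpy as np

# stand-in for the 500x500 grayscale image from the original post
image = np.random.randint(0, 256, size=(500, 500))

# one threshold per row: 250 for y < 100, 230 for y < 200, 210 for y < 300,
# then 190 (the 300-400 band was left unspecified, so 190 is assumed)
thresholds = np.repeat([250, 230, 210, 190, 190], 100)

# broadcast each row's threshold across its columns
mask = image > thresholds[:, None]

assert mask.shape == (500, 500)
assert not mask[0][image[0] <= 250].any()  # row 0 only flags pixels above 250
```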

  Nadav


-Original Message-
From: numpy-discussion-boun...@scipy.org on behalf of ioannis syntychakis
Sent: Wed 14-Apr-10 10:11
To: Discussion of Numerical Python
Subject: [Numpy-discussion] look for value, depending to y position
 
Hallo everybody

maybe somebody can help with the following:

I'm using NumPy and PIL to find objects in a grayscale image. I make an
array of the image and then look for pixels with values above 230.
Then I convert the array back to an image and I see my objects.

What I want is to make the grayscale threshold dependent on the position in the image.

The image is 500 by 500 pixels.

For example, I want the pixel value the program looks for to decrease in
the y direction:

on position y = 0 to 100 the program looks for pixel values above 250
on position y = 100 to 200 the program looks for pixel values above 230
on position y = 200 to 300 the program looks for pixel values above 210
on position y = 400 to 500 the program looks for pixel values above 190

Is this possible?

thanks in advance.

greetings, Jannis



[Numpy-discussion] Is this a small bug in numpydoc?

2010-04-14 Thread Fernando Perez
Howdy,

in ipython we use numpydoc, and as I was just trying to build the docs
from a clean checkout of ipython's trunk, I kept getting errors that I
was able to fix with this patch:

amirbar[sphinxext]> svn diff
Index: numpydoc.py
===
--- numpydoc.py (revision 8332)
+++ numpydoc.py (working copy)
@@ -73,7 +73,8 @@
 def mangle_signature(app, what, name, obj, options, sig, retann):
 # Do not try to inspect classes that don't define `__init__`
 if (inspect.isclass(obj) and
-'initializes x; see ' in pydoc.getdoc(obj.__init__)):
+(not hasattr(obj, '__init__') or
+'initializes x; see ' in pydoc.getdoc(obj.__init__))):
 return '', ''

 if not (callable(obj) or hasattr(obj, '__argspec_is_invalid_')): return


The errors were always of the type:

Exception occurred:
  File "/home/fperez/ipython/repo/trunk-lp/docs/sphinxext/numpydoc.py",
line 71, in mangle_signature
'initializes x; see ' in pydoc.getdoc(obj.__init__)):
AttributeError: class Bunch has no attribute '__init__'
The full traceback has been saved in /tmp/sphinx-err-H1VlaY.log, if
you want to report the issue to the developers.

so it seems indeed that accessing __init__ without checking it's there
isn't a very good idea.

But I didn't write numpydoc and I'm tired, so I don't want to commit
this without a second pair of eyes...

Cheers,

f


[Numpy-discussion] look for value, depending to y position

2010-04-14 Thread ioannis syntychakis
Hallo everybody

maybe somebody can help with the following:

I'm using NumPy and PIL to find objects in a grayscale image. I make an
array of the image and then look for pixels with values above 230.
Then I convert the array back to an image and I see my objects.

What I want is to make the grayscale threshold dependent on the position in the image.

The image is 500 by 500 pixels.

For example, I want the pixel value the program looks for to decrease in
the y direction:

on position y = 0 to 100 the program looks for pixel values above 250
on position y = 100 to 200 the program looks for pixel values above 230
on position y = 200 to 300 the program looks for pixel values above 210
on position y = 400 to 500 the program looks for pixel values above 190

Is this possible?

thanks in advance.

greetings, Jannis


Re: [Numpy-discussion] how to tally the values seen

2010-04-14 Thread Peter Shinners

On 04/13/2010 11:44 PM, Gökhan Sever wrote:



On Wed, Apr 14, 2010 at 1:34 AM, Warren Weckesser wrote:


Gökhan Sever wrote:
>
>
> On Wed, Apr 14, 2010 at 1:10 AM, Peter Shinners wrote:
>
> I have an array that represents the number of times a value
has been
> given. I'm trying to find a direct numpy way to add into
these sums
> without requiring a Python loop.
>
> For example, say there are 10 possible values. I start with an
> array of
> zeros.
>
> >>> counts = numpy.zeros(10, numpy.int)
>
> Now I get an array with several values in them, I want to
add into
> counts. All I can think of is a for loop that will give my the
> results I
> want.
>
>
> >>> values = numpy.array((2, 8, 1))
> >>> for v in values:
> ...counts[v] += 1
> >>> print counts
> [0 1 1 0 0 0 0 0 1 0]
>
>
> This is easy:
>
> I[3]: a
> O[3]: array([ 0.,  1.,  1.,  0.,  0.,  0.,  0.,  0.,  1.,  0.])
>
> I[4]: a = np.zeros(10)
>
> I[5]: b = np.array((2,8,1))
>
> I[6]: a[b] = 1
>
> I[7]: a
> O[7]: array([ 0.,  1.,  1.,  0.,  0.,  0.,  0.,  0.,  1.,  0.])
>
> Let me think about the other case :)
>
>
> I also need to handle the case where a value is listed more
than once.
> So if values is (2, 8, 1, 2) then count[2] would equal 2.
>


numpy.bincount():


In [1]: import numpy as np

In [2]: x = np.array([2,8,1,2,7,7,2,7,0,2])

In [3]: np.bincount(x)
Out[3]: array([1, 1, 4, 0, 0, 0, 0, 3, 1])



I knew a function exists in numpy for this case too :)

This is also a safer way to handle the given situation, since it prevents
index out of bounds errors.
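Two more options worth noting as a sketch: bincount's minlength argument gives a fixed-size tally in one call, and np.add.at does an unbuffered in-place add so repeated indices accumulate. Both arrived in NumPy releases after this thread (1.6 and 1.8 respectively), so treat this as forward-looking:

```python
import numpy as np

counts = np.zeros(10, dtype=int)
values = np.array([2, 8, 1, 2])

# unbuffered in-place add: repeated indices accumulate,
# unlike plain counts[values] += 1, which counts each index only once
np.add.at(counts, values, 1)
assert counts.tolist() == [0, 1, 2, 0, 0, 0, 0, 0, 1, 0]

# bincount with minlength yields the same fixed-size tally in one call
assert np.bincount(values, minlength=10).tolist() == counts.tolist()
```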


Thanks guys. Numpy always surprises me. It's like you guys have borrowed 
Guido's time machine.

This is running far faster than I ever hoped.

My next problem involves a bit of indexing and reordering. But I'm gonna 
spend a night of my own time to see where I can get with it.