Re: [Numpy-discussion] indexing to sort with argsort(..., axis=1)

2010-04-13 Thread Angus McMorland
On 13 April 2010 04:01, Gökhan Sever gokhanse...@gmail.com wrote:


 On Mon, Apr 12, 2010 at 9:41 PM, Angus McMorland amcm...@gmail.com wrote:

 Hi all,

 I want to sort a 2d array along one dimension, with the indices returned
 by argsort, but the subsequent indexing syntax to get the sorted array is
 not obvious.

 The following works, but I wonder if there is a simpler way:

 a = np.random.random(size=(5,3))
 s = np.argsort(a, axis=1)
 sorted = a[:,s][np.eye(5,5, dtype=bool)]  # it looks like this line could be simpler

 What's the correct, concise way to do this?


 Why not just:

 b = np.sort(a)

 What advantage does argsort provide in this case?


I want to be able to sort another array the same way; calculating b is
really just a check that I was doing the sort correctly.

Thanks Josef, for the reminder about using arange. I realise I've seen it
before, but haven't got it intuitive in my head yet.
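
For reference, the arange-based indexing mentioned above can be sketched as follows. This is a minimal illustration and not necessarily the exact form Josef posted; the names `rows`, `a_sorted` and `other` are only for this example, and `np.take_along_axis` assumes NumPy 1.15 or later.

```python
import numpy as np

np.random.seed(0)
a = np.random.random(size=(5, 3))
other = np.random.random(size=(5, 3))   # hypothetical second array to reorder the same way

s = np.argsort(a, axis=1)

# A column of row numbers broadcasts against s, so element [i, j]
# of the result is a[i, s[i, j]].
rows = np.arange(a.shape[0])[:, np.newaxis]
a_sorted = a[rows, s]
other_sorted = other[rows, s]

# Same result with the helper available in newer NumPy (assumed >= 1.15).
a_sorted2 = np.take_along_axis(a, s, axis=1)
```

The row index has shape (5, 1) and `s` has shape (5, 3), so the pair broadcasts to one (row, column) lookup per output element, which avoids building the intermediate (5, 5, 3) array.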

A.
-- 
AJC McMorland
Post-doctoral research fellow
Neurobiology, University of Pittsburgh
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] [ANN] toydist version 0.0.2

2010-04-13 Thread David Cournapeau
Hi,

I am glad to announce the first release of toydist, the 0.0.2 release:

http://github.com/cournape/toydist/downloads

Toydist is a pythonic, no-nonsense packaging solution for Python software. The
main highlights of this first public release are:

* Package description grammar formalized through PLY, the python
  Lex-yacc package (included in toydist for the time being)
* Toymaker, the command line interface to toydist, can install itself

Although toydist is still very much in its infancy, I have successfully used the
conversion procedure on sphinx, jinja and a few python-only Enthought packages.

How to help?
------------

Toydist is experimental, and should not be used for anything production-worthy
at this point. There are two main reasons why I have released such a
preliminary version:

* To get suggestions/comments on the current file format. I believe the
  current syntax strikes a good balance for human readers/writers, but
  it cannot be parsed with just regular expressions, which means syntax
  highlighting may be difficult (the difficulty is being able to parse
  arbitrary reST for some metadata). The easiest way to get a feel for
  the format is to convert your favorite python package, assuming it is
  simple enough.
* The API for package description is still regularly changed, and the
  command-line interface has no sane API yet, but I certainly welcome
  suggestions from people interested in writing Debian or
  other builders.

I have started documenting toydist; an up-to-date version can be found at
http://cournape.github.com/toydist. Examples of simple commands may be found in
toydist/commands (sdist and install are particularly simple).

The focus for 0.0.3 is hook support (the goal is to have an API robust enough
that future toydist code will be based on hooks), as well as basic support for
waf or scons-based C builds.

cheers,

David


Re: [Numpy-discussion] Proposal for new ufunc functionality

2010-04-13 Thread Travis Oliphant


On Apr 12, 2010, at 5:31 PM, Robert Kern wrote:


We should collect all of these proposals into a NEP.  To clarify what I
mean by group-by behavior.
Suppose I have an array of floats and an array of integers.  Each element
in the array of integers represents a region in the float array of a certain
kind.  The reduction should take place over like-kind values:
Example:
add.reduceby(array=[1,2,3,4,5,6,7,8,9], by=[0,1,0,1,2,0,0,2,2])
results in the calculations:
1 + 3 + 6 + 7
2 + 4
5 + 8 + 9
and therefore the output (notice the two arrays --- perhaps a structured
array should be returned instead...)
[0,1,2],
[17, 6, 22]

The real value is when you have tabular data and you want to do reductions
in one field based on values in another field.  This happens all the time
in relational algebra and would be a relatively straightforward thing to
support in ufuncs.


I might suggest a simplification where the by array must be an array
of non-negative ints such that they are indices into the output. For
example (note that I replace 2 with 3 and have no 2s in the by array):

add.reduceby(array=[1,2,3,4,5,6,7,8,9], by=[0,1,0,1,3,0,0,3,3]) ==
[17, 6, 0, 22]

This basically generalizes bincount() to other binary ufuncs.
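
For the sum case, the behaviour described here can already be emulated. The following is a minimal sketch, assuming `by` holds non-negative integers. `reduceby` is only the name proposed in this thread, so the helper below is hypothetical; `np.bincount` and the ufunc `.at` method (NumPy 1.8 or later) are existing calls.

```python
import numpy as np

def reduceby(ufunc, array, by):
    """Hypothetical stand-in for the proposed ufunc.reduceby (not a NumPy API)."""
    array = np.asarray(array)
    by = np.asarray(by)
    # Start every output slot at the ufunc's identity (0 for add, 1 for
    # multiply); ufuncs without an identity are not handled by this sketch.
    out = np.full(by.max() + 1, ufunc.identity, dtype=array.dtype)
    # Unbuffered in-place operation: repeated indices accumulate correctly.
    ufunc.at(out, by, array)
    return out

values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
by = [0, 1, 0, 1, 3, 0, 0, 3, 3]

print(reduceby(np.add, values, by))       # [17  6  0 22]
print(np.bincount(by, weights=values))    # same sums, returned as floats
```

Slot 2 stays at the identity because no element of `by` equals 2, which matches the `0` in the expected output above.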



Interesting proposal.  I do like having only one output.

I'm particularly interested in reductions with by arrays of strings,
i.e. something like:

add.reduceby([10,11,12,13,14,15,16],
             by=['red','green','red','green','red','blue','blue'])

resulting in:

10+12+14
11+13
15+16

In practice, these would have to be essentially mapped to the kind of
integer array I used in the original example, and so I suppose if we
couple your proposal with the segment function from the rest of my
original proposal, then the same resulting functionality is available
(with perhaps the extra intermediate integer array that may not be
strictly necessary).

But having simple building blocks is usually better in the long run
(and typically leads to better optimizations by human programmers).


Thanks,

-Travis


--
Travis Oliphant
Enthought Inc.
1-512-536-1057
http://www.enthought.com
oliph...@enthought.com







Re: [Numpy-discussion] Proposal for new ufunc functionality

2010-04-13 Thread Travis Oliphant


On Apr 12, 2010, at 5:54 PM, Warren Weckesser wrote:


A bit more generalization of `by` gives behavior like matlab's accumarray
(http://www.mathworks.com/access/helpdesk/help/techdoc/ref/accumarray.html),
which I partly cloned here:
[This would be a link to the scipy cookbook, but scipy.org is not
responding.]


Reading the accumarray docstring, it does seem related, but they use
the subs array as an index into the original array (instead of an
index into the output array like Robert's simplification).  I do
like the Matlab functionality, but would propose a different reduction
function, reduceover, to implement it.

It also feels like we should figure out different kinds of reductions
for generalized ufuncs as well.  If anyone has a primer on
generalized ufuncs, I would love to see it.  Isn't a reduction on a
generalized ufunc just another generalized ufunc?  Perhaps we could
automatically create these reduced generalized ufuncs.

I would love to explore just how general these generalized ufuncs are
and what can be subsumed by them.



-Travis





Re: [Numpy-discussion] Proposal for new ufunc functionality

2010-04-13 Thread josef . pktd
On Tue, Apr 13, 2010 at 10:03 AM, Travis Oliphant
oliph...@enthought.com wrote:

 On Apr 12, 2010, at 5:31 PM, Robert Kern wrote:

 We should collect all of these proposals into a NEP.  To clarify what I
 mean by group-by behavior.
 Suppose I have an array of floats and an array of integers.  Each element
 in the array of integers represents a region in the float array of a certain
 kind.  The reduction should take place over like-kind values:
 Example:
 add.reduceby(array=[1,2,3,4,5,6,7,8,9], by=[0,1,0,1,2,0,0,2,2])
 results in the calculations:
 1 + 3 + 6 + 7
 2 + 4
 5 + 8 + 9
 and therefore the output (notice the two arrays --- perhaps a structured
 array should be returned instead...)
 [0,1,2],
 [17, 6, 22]

 The real value is when you have tabular data and you want to do reductions
 in one field based on values in another field.  This happens all the time
 in relational algebra and would be a relatively straightforward thing to
 support in ufuncs.

 I might suggest a simplification where the by array must be an array
 of non-negative ints such that they are indices into the output. For
 example (note that I replace 2 with 3 and have no 2s in the by array):

 add.reduceby(array=[1,2,3,4,5,6,7,8,9], by=[0,1,0,1,3,0,0,3,3]) ==
 [17, 6, 0, 22]

 This basically generalizes bincount() to other binary ufuncs.


 Interesting proposal.  I do like having only one output.
 I'm particularly interested in reductions with by arrays of strings, i.e.
 something like:
 add.reduceby([10,11,12,13,14,15,16],
 by=['red','green','red','green','red','blue', 'blue']).
 resulting in:
 10+12+14
 11+13
 15+16
 In practice, these would have to be essentially mapped to the kind of
 integer array I used in the original example, and so I suppose if we couple
 your proposal with the segment function from the rest of my original
 proposal, then the same resulting functionality is available (with perhaps
 the extra intermediate integer array that may not be strictly necessary).
 But, having simple building blocks is usually better in the long run (and
 typically leads to better optimizations by human programmers).

Currently I'm using unique with return_inverse to do the recoding into integers

>>> np.unique(['red','green','red','green','red','blue','blue'], return_inverse=True)
(array(['blue', 'green', 'red'],
      dtype='|S5'), array([2, 1, 2, 1, 2, 0, 0]))

and then feed into bincount.
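
Put together, the recoding-plus-bincount approach looks roughly like this. It is a sketch only, using the values from the string example quoted above; the variable names are illustrative.

```python
import numpy as np

values = np.array([10, 11, 12, 13, 14, 15, 16])
keys = np.array(['red', 'green', 'red', 'green', 'red', 'blue', 'blue'])

# labels holds the sorted unique keys; codes maps each element to its label index.
labels, codes = np.unique(keys, return_inverse=True)

# bincount sums the weights that share a code (the result is float).
sums = np.bincount(codes, weights=values)

for label, total in zip(labels, sums):
    print(label, total)
# blue 31.0
# green 24.0
# red 36.0
```

Note that the groups come back in sorted key order rather than in order of first appearance, and that `bincount` only covers addition, which is the limitation the proposed `reduceby` would lift.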

Your plans are a good generalization and speedup.

Josef






[Numpy-discussion] SciPy 2010 News: Specialized track deadline extended

2010-04-13 Thread Amenity Applewhite


Have you been meaning to prepare an abstract to submit for a SciPy 2010
specialized track (http://conference.scipy.org/scipy2010/papers.html#tracks)?

Didn't find the time? Well, you're in luck.
This weekend, we had technical issues with the email submissions for the
specialized tracks. In light of the inconvenience, we've decided to extend
the deadline an additional two weeks until Sunday, April 25th.

If you have an abstract ready for one of the four specialized tracks, please
use the links below to submit it to the program chair. If you previously
submitted one and didn't receive confirmation that we received it, it would
be a great idea to submit it again to ensure we get it.

  * Biomedical/bioinformatics chaired by Glen Otero, Dell
    submit/contact: 2010bioinformat...@scipy.org

  * Financial analysis chaired by Wes McKinney, AQR Capital Management
    submit/contact: 2010fina...@scipy.org

  * Geophysics chaired by Alan Jackson, Shell
    submit/contact: 2010geophys...@scipy.org

  * Parallel processing & cloud computing co-chaired by Ken Elkabany,
    PiCloud & Brian Granger, CalPoly
    submit/contact: 2010paral...@scipy.org


Main Conference Submissions
Submissions for the main SciPy 2010 conference closed Sunday. Thanks to
everyone who submitted. We'll announce the accepted talks Tuesday April
20th.

Student Sponsorships
If you're an academic and contribute to SciPy or related projects, make sure
to apply for one of our student sponsorships. The deadline to apply is
April 18th. We are also accepting nominations.
http://conference.scipy.org/scipy2010/student.html

Don't forget to register...
Registrations are coming in pretty steadily now. Remember that to get early
registration prices you need to register before May 10th!
https://conference.scipy.org/scipy2010/registration.html

The SciPy 2010 Team
@SciPy2010 on Twitter

--
Amenity Applewhite
Enthought, Inc.
Scientific Computing Solutions
www.enthought.com











Re: [Numpy-discussion] Proposal for new ufunc functionality

2010-04-13 Thread Robert Kern
On Sat, Apr 10, 2010 at 17:59, Robert Kern robert.k...@gmail.com wrote:
 On Sat, Apr 10, 2010 at 12:45, Pauli Virtanen p...@iki.fi wrote:
 Sat, 2010-04-10 at 12:23 -0500, Travis Oliphant wrote:
 [clip]
 Here are my suggested additions to NumPy:
 ufunc methods:
 [clip]
       * reducein (array, indices, axis=0)
                similar to reduce-at, but the indices provide both the
                start and end points (rather than being fence-posts like
                reduceat).

 Is the `reducein` important to have, as compared to `reduceat`?

 Yes, I think so. If there are some areas you want to ignore, that's
 difficult to do with reduceat().

And conversely overlapping areas are highly useful but completely
impossible to do with reduceat.
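
To make the difference concrete, here is a rough sketch. `np.add.reduceat` is the existing method; `reducein` exists only as a proposal in this thread, so the helper below is a hypothetical pure-Python stand-in, and its signature (separate `starts` and `stops` sequences) is an assumption made for illustration.

```python
import numpy as np

a = np.arange(10)

# Existing behaviour: the indices are fence-posts, so every element from
# the first index onward falls into exactly one segment.
print(np.add.reduceat(a, [0, 3, 7]))          # [ 3 18 24]

def reducein(ufunc, array, starts, stops):
    """Hypothetical stand-in for the proposed ufunc.reducein (not a NumPy API)."""
    array = np.asarray(array)
    # Each (start, stop) pair is reduced independently, so segments may
    # leave gaps or overlap.
    return np.array([ufunc.reduce(array[i:j]) for i, j in zip(starts, stops)])

print(reducein(np.add, a, [0, 5], [3, 7]))    # gaps: 3, 4 and 7..9 ignored -> [ 3 11]
print(reducein(np.add, a, [0, 2], [5, 8]))    # overlap: 2, 3, 4 used twice -> [10 27]
```

The Python loop is only there to show the semantics; the point of the proposal is to get this behaviour at ufunc speed.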

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco


[Numpy-discussion] binomial coefficient, factorial

2010-04-13 Thread jah
Is there any chance that a binomial coefficient and factorial function can
make their way into NumPy?  I know these exist in SciPy, but I don't want to
have to install SciPy just to have something so basic.
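
In the meantime, both can be had without SciPy. The following is a minimal sketch using only the standard library: `math.factorial` is an existing function, while the `binomial` helper is illustrative and not a NumPy API.

```python
import math

def binomial(n, k):
    """Exact binomial coefficient C(n, k) for non-negative integers."""
    if k < 0 or k > n:
        return 0
    k = min(k, n - k)          # use symmetry to shorten the loop
    result = 1
    for i in range(1, k + 1):
        # After this step result equals C(n - k + i, i), so the
        # integer division is always exact.
        result = result * (n - k + i) // i
    return result

print(math.factorial(5))   # 120
print(binomial(5, 2))      # 10
```

Because it works on Python integers, the result is exact for any size of input; on Python 3.8 or later, `math.comb` provides the same thing directly.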


[Numpy-discussion] patch to pickle np.memmap

2010-04-13 Thread Brent Pedersen
hi, i posted a patch to allow pickling of np.memmap objects.
http://projects.scipy.org/numpy/ticket/1452

currently, it always returns 'r' for the mode.
is that the best thing to do there?
any other changes?
-brent


Re: [Numpy-discussion] patch to pickle np.memmap

2010-04-13 Thread Brent Pedersen
On Tue, Apr 13, 2010 at 8:52 PM, Brent Pedersen bpede...@gmail.com wrote:
 hi, i posted a patch to allow pickling of np.memmap objects.
 http://projects.scipy.org/numpy/ticket/1452

 currently, it always returns 'r' for the mode.
 is that the best thing to do there?
 any other changes?
 -brent


and i guess it should (but does not with that patch) correctly handle:

 a = np.memmap(...)
 b = a[2:]
 cPickle.dumps(b)