Re: [Numpy-discussion] Does a `mergesorted` function make sense?

2014-09-01 Thread Eelco Hoogendoorn
Sure, I'd like to hash things out, but I would also like some
preliminary feedback as to whether this is going in a direction anyone else
sees the point of, whether it conflicts with other plans, and indeed whether
we can agree that numpy is the right place for it; a point which I would very
much like to defend. If there is some obvious no-go that I'm missing, I can do
without the drudgery of writing proper documentation ;).

As for whether this belongs in numpy: yes, I would say so. There is the
extension of functionality to functions already in numpy, which is a
no-brainer (it need not cost anything performance-wise, and I've needed
unique graph edges many, many times), and there is the grouping
functionality, which is the main novelty.
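To make the "unique graph edges" case concrete, here is a minimal sketch using only stock numpy. It is not the code from the repository; the function name `unique_undirected_edges` is made up for illustration, and it relies on the `axis` argument of `np.unique`, which assumes NumPy 1.13 or later.

```python
import numpy as np


def unique_undirected_edges(edges):
    """Return the distinct undirected edges of an (n, 2) array of vertex pairs.

    Hypothetical helper for illustration only.
    """
    edges = np.asarray(edges)
    # Sort each pair so that (a, b) and (b, a) share one canonical form.
    canonical = np.sort(edges, axis=1)
    # Treat every row as a single item and drop duplicate rows.
    return np.unique(canonical, axis=0)


edges = np.array([[0, 1], [1, 0], [2, 1], [1, 2], [0, 2]])
unique_edges = unique_undirected_edges(edges)
# unique_edges is [[0, 1], [0, 2], [1, 2]]
```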

However, note that the grouping functionality itself is a very small
addition, just a few hundred lines of pure Python, given that the indexing
logic has been factored out of the classic arraysetops. At least from a
developer's perspective, it very much feels like a logical extension of the
same 'thing'.

But also from a conceptual numpy perspective, grouping is really more an
'elementary manipulation of an ndarray' than a 'special purpose algorithm'.
It is useful for literally all kinds of programming; hence there is similar
functionality in the python standard library (itertools.groupby); so why
not have an efficient vectorized equivalent in numpy? It belongs there more
than the linalg module, arguably.
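As a rough illustration of what a vectorized counterpart to itertools.groupby could look like, here is a sketch built from a stable argsort plus `ufunc.reduceat`. The name `group_reduce` and its signature are assumptions made for this example; they are not the API of the proposed code.

```python
import numpy as np


def group_reduce(keys, values, ufunc=np.add):
    """Reduce `values` within each group of equal `keys`.

    Hypothetical helper: returns (unique_keys, reduced_values).
    """
    keys = np.asarray(keys)
    values = np.asarray(values)
    if keys.size == 0:
        return keys, values
    # A stable sort brings equal keys together and keeps their input order.
    order = np.argsort(keys, kind="mergesort")
    sorted_keys = keys[order]
    sorted_values = values[order]
    # A group starts wherever the sorted key differs from its predecessor.
    is_start = np.concatenate(([True], sorted_keys[1:] != sorted_keys[:-1]))
    starts = np.flatnonzero(is_start)
    # reduceat applies the ufunc over each slice [starts[i], starts[i + 1]).
    return sorted_keys[starts], ufunc.reduceat(sorted_values, starts)


keys = np.array([3, 1, 3, 2, 1])
values = np.array([10, 20, 30, 40, 50])
unique_keys, sums = group_reduce(keys, values)
_, maxima = group_reduce(keys, values, ufunc=np.maximum)
# unique_keys is [1, 2, 3], sums is [70, 40, 40], maxima is [50, 40, 30]
```

Unlike itertools.groupby, this does not require the input to be pre-sorted, at the cost of one O(n log n) sort.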

Also, from a community perspective, a significant fraction of all
stackoverflow numpy questions are (unknowingly) exactly about 'how to do
grouping in numpy'.


On Mon, Sep 1, 2014 at 4:36 AM, Charles R Harris charlesr.har...@gmail.com
wrote:




 On Sun, Aug 31, 2014 at 1:48 PM, Eelco Hoogendoorn 
 hoogendoorn.ee...@gmail.com wrote:

 I've organized all code I had relating to this subject in a github
 repository https://github.com/EelcoHoogendoorn/Numpy_arraysetops_EP.
 That should facilitate shooting around ideas. I've also added more
 documentation and structure to make it easier to see what is going on.

 Hopefully we can converge on a common vision, and then improve the
 documentation and testing to make it worthy of including in the numpy
 master.

 Note that there is also a complete rewrite of the classic
 numpy.arraysetops, such that they are also generalized to more complex
 input, such as finding unique graph edges, and so on.

 You mentioned getting the numpy core developers involved; are they not
 subscribed to this mailing list? I wouldn't be surprised; you'd hope there
 is a channel of discussion concerning development with higher signal to
 noise.


 There are only about 2.5 of us at the moment. Those for whom this is an
 itch that needs scratching should hash things out and make a PR. The main
 question for me is if it belongs in numpy, scipy, or somewhere else.

 Chuck

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion




[Numpy-discussion] How to install numpy on a box without hardware FPU

2014-09-01 Thread Emel Hasdal
Hello,

Is it possible to configure/install numpy on a box without a hardware FPU?
When I try to install it using pip, I get a bunch of compile errors since
floating-point exceptions (FE_DIVBYZERO etc.) are undefined on this platform.
How do I get numpy installed and working on such a platform?

Thanks,
Emel


Re: [Numpy-discussion] Does a `mergesorted` function make sense?

2014-09-01 Thread Charles R Harris
On Mon, Sep 1, 2014 at 1:49 AM, Eelco Hoogendoorn 
hoogendoorn.ee...@gmail.com wrote:

 Sure, I'd like to hash things out, but I would also like some
 preliminary feedback as to whether this is going in a direction anyone else
 sees the point of, whether it conflicts with other plans, and indeed whether
 we can agree that numpy is the right place for it; a point which I would very
 much like to defend. If there is some obvious no-go that I'm missing, I can do
 without the drudgery of writing proper documentation ;).

 As for whether this belongs in numpy: yes, I would say so. There is the
 extension of functionality to functions already in numpy, which is a
 no-brainer (it need not cost anything performance-wise, and I've needed
 unique graph edges many, many times), and there is the grouping
 functionality, which is the main novelty.

 However, note that the grouping functionality itself is a very small
 addition, just a few hundred lines of pure Python, given that the indexing
 logic has been factored out of the classic arraysetops. At least from a
 developer's perspective, it very much feels like a logical extension of the
 same 'thing'.

 But also from a conceptual numpy perspective, grouping is really more an
 'elementary manipulation of an ndarray' than a 'special purpose algorithm'.
 It is useful for literally all kinds of programming; hence there is similar
 functionality in the python standard library (itertools.groupby); so why
 not have an efficient vectorized equivalent in numpy? It belongs there more
 than the linalg module, arguably.

 Also, from a community perspective, a significant fraction of all
 stackoverflow numpy questions are (unknowingly) exactly about 'how to do
 grouping in numpy'.


What I'm trying to say is that numpy is a community project. We don't have
a central planning committee; the only difference between developers and
everyone else is activity and commit rights. Which is to say, if you develop
and push this topic, it is likely to go in. There certainly seems to be
interest in this functionality. The reason that I brought up scipy is that
there are some graph algorithms there that went in a couple of years ago.

Note that the convention on the list is bottom posting.

<snip>

Chuck


Re: [Numpy-discussion] Does a `mergesorted` function make sense?

2014-09-01 Thread Nathaniel Smith
On Mon, Sep 1, 2014 at 8:49 AM, Eelco Hoogendoorn
hoogendoorn.ee...@gmail.com wrote:
 Sure, I'd like to hash things out, but I would also like some
 preliminary feedback as to whether this is going in a direction anyone else
 sees the point of, whether it conflicts with other plans, and indeed whether
 we can agree that numpy is the right place for it; a point which I would very
 much like to defend. If there is some obvious no-go that I'm missing, I can do
 without the drudgery of writing proper documentation ;).

 As for whether this belongs in numpy: yes, I would say so. There is the
 extension of functionality to functions already in numpy, which is a
 no-brainer (it need not cost anything performance-wise, and I've needed
 unique graph edges many, many times), and there is the grouping
 functionality, which is the main novelty.

 However, note that the grouping functionality itself is a very small
 addition, just a few hundred lines of pure Python, given that the indexing
 logic has been factored out of the classic arraysetops. At least from a
 developer's perspective, it very much feels like a logical extension of the
 same 'thing'.

My 2 cents: I definitely agree that this is very useful fundamental
functionality, and it would be great if numpy had a solution for it
out of the box. My main concern is that this is a fairly complicated
set of functionality and there are a lot of small decisions to be made
in setting up the API for it. IME it's very hard to just read through
an API like this and reason out the best way to do it by pure logic;
usually it needs to get banged on for a bit in real uses before it
becomes clear what the right set of trade-offs is. And numpy itself is
not a great environment for these kinds of iterations. So, IMO the main
challenge is: how do we get the functionality into a state where we
can convince ourselves that it'll be supportable in numpy
indefinitely, and not need to be replaced in a year or two?

Some things that might help with this convincing:
- releasing it as a small standalone package on pypi and getting some
real users to bang on it
- any real code written against the APIs
- feedback from the pandas community since they've spent a lot of time
working on these issues
- ...?

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org


Re: [Numpy-discussion] Does a `mergesorted` function make sense?

2014-09-01 Thread Eelco Hoogendoorn
On Mon, Sep 1, 2014 at 2:05 PM, Charles R Harris charlesr.har...@gmail.com
wrote:




 On Mon, Sep 1, 2014 at 1:49 AM, Eelco Hoogendoorn 
 hoogendoorn.ee...@gmail.com wrote:

 Sure, I'd like to hash things out, but I would also like some
 preliminary feedback as to whether this is going in a direction anyone else
 sees the point of, whether it conflicts with other plans, and indeed whether
 we can agree that numpy is the right place for it; a point which I would very
 much like to defend. If there is some obvious no-go that I'm missing, I can do
 without the drudgery of writing proper documentation ;).

 As for whether this belongs in numpy: yes, I would say so. There is the
 extension of functionality to functions already in numpy, which is a
 no-brainer (it need not cost anything performance-wise, and I've needed
 unique graph edges many, many times), and there is the grouping
 functionality, which is the main novelty.

 However, note that the grouping functionality itself is a very small
 addition, just a few hundred lines of pure Python, given that the indexing
 logic has been factored out of the classic arraysetops. At least from a
 developer's perspective, it very much feels like a logical extension of the
 same 'thing'.

 But also from a conceptual numpy perspective, grouping is really more an
 'elementary manipulation of an ndarray' than a 'special purpose algorithm'.
 It is useful for literally all kinds of programming; hence there is similar
 functionality in the python standard library (itertools.groupby); so why
 not have an efficient vectorized equivalent in numpy? It belongs there more
 than the linalg module, arguably.

 Also, from a community perspective, a significant fraction of all
 stackoverflow numpy questions are (unknowingly) exactly about 'how to do
 grouping in numpy'.


 What I'm trying to say is that numpy is a community project. We don't have
 a central planning committee; the only difference between developers and
 everyone else is activity and commit rights. Which is to say, if you develop
 and push this topic, it is likely to go in. There certainly seems to be
 interest in this functionality. The reason that I brought up scipy is that
 there are some graph algorithms there that went in a couple of years ago.

 Note that the convention on the list is bottom posting.

 <snip>

 Chuck




I understand that numpy is a community project, so that the decision isn't
up to any one particular person; but some early-stage feedback from those
active in the community would be welcome. I am generally confident that
this addition makes sense, but I have not contributed to numpy before,
and you don't know what you don't know and all... Given that there are
multiple suggestions for changing arraysetops, some coordination would be
useful, I think.

Note that I use graph edges merely as an example; the proposed
functionality is much more general than graph algorithms specifically.
The radial reduction example
(https://github.com/EelcoHoogendoorn/Numpy_arraysetops_EP/blob/master/examples.py)
I included on github is particularly illustrative of the general utility of
grouping functionality, I think. Operations like radial reductions are
rather common, and a custom implementation is quite lengthy, very
bug-prone, and potentially very slow.
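For readers unfamiliar with the term, a radial reduction groups the pixels of an image by their distance from a centre and reduces each group. The sketch below covers only the simplest case (rings one pixel wide, mean only) with `np.bincount`; it is not the example from the repository, and the name `radial_mean` is hypothetical.

```python
import numpy as np


def radial_mean(image):
    """Mean of a 2-D image over rings of integer radius around its centre.

    Hypothetical helper for illustration only.
    """
    image = np.asarray(image, dtype=float)
    height, width = image.shape
    y, x = np.indices(image.shape)
    # Distance of every pixel from the geometric centre of the image.
    radius = np.hypot(y - (height - 1) / 2.0, x - (width - 1) / 2.0)
    # The group key of a pixel is its radius rounded down to an integer.
    ring = np.floor(radius).astype(np.intp).ravel()
    sums = np.bincount(ring, weights=image.ravel())
    counts = np.bincount(ring)
    # Rings that contain no pixel come out as NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        return sums / counts


profile = radial_mean(np.ones((5, 5)))
# profile is [1.0, 1.0, 1.0]
```

Anything beyond this (other reductions, non-integer bins, several keys at once) is where a general grouping facility would save the hand-written bookkeeping.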

Thanks for the heads-up on posting convention; I've always let gmail do my
thinking for me, which works well enough for me, but I can see how not
following this convention is annoying to others.


[Numpy-discussion] ENH IncrementalWriter for .npy files

2014-09-01 Thread Gabor Kovacs
Dear All,

I would like to add a class for writing one (possibly big) .npy file
saving multiple (same dtype, compatible shape) arrays. My use case was
the saving of slowly accumulating data regularly for a long time into
one file.

Please find a first implementation under
https://github.com/numpy/numpy/pull/4987 . It currently supports
writing a new file only and only in C order in the file. Opening an
existing file for append and reading back parts from a very big .npy
file would be straightforward next steps for a full featured class.

The .npy file format is only affected by leaving some extra space for
re-writing the header later with a possibly bigger shape field,
respecting the 16-byte alignment.

Example:
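```
import numpy as np

# np.IncrementalWriter is the class proposed in the pull request above.
A = np.array([[0, 1, 2, 3, 4, 5, 6, 7], [8, 9, 10, 11, 12, 13, 14, 15]])
with np.IncrementalWriter("testfile.npy", hdrupdate=True, flush=True) as W:
    W.save(A)
    W.save(A)
```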
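The header-padding idea described above can be sketched independently of the pull request. The following is not the PR's implementation: the class name `AppendingNpyWriter`, the helper `npy_header`, and the 128-byte reservation are all assumptions chosen for illustration. It writes a version 1.0 header into a fixed-size block, appends raw C-order data, and rewrites the header in place with the grown shape.

```python
import os
import struct
import tempfile

import numpy as np

RESERVED = 128  # assumed size of the whole header block; a multiple of 16


def npy_header(dtype, shape):
    """Build a version 1.0 .npy header padded to exactly RESERVED bytes."""
    descr = np.lib.format.dtype_to_descr(np.dtype(dtype))
    text = "{'descr': %r, 'fortran_order': False, 'shape': %r, }" % (
        descr, tuple(shape))
    # 10 bytes precede the text: magic (6), version (2), length field (2).
    padding = RESERVED - 10 - len(text) - 1
    if padding < 0:
        raise ValueError("header does not fit in the reserved space")
    body = (text + " " * padding + "\n").encode("latin1")
    return b"\x93NUMPY\x01\x00" + struct.pack("<H", len(body)) + body


class AppendingNpyWriter:
    """Hypothetical stand-in for the proposed class: new files, C order only."""

    def __init__(self, path):
        self._fp = open(path, "wb")
        self._dtype = None
        self._tail = None  # shape of a single row
        self._rows = 0

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self._fp.close()

    def save(self, array):
        array = np.ascontiguousarray(array)
        if self._dtype is None:
            self._dtype, self._tail = array.dtype, array.shape[1:]
            self._fp.write(npy_header(self._dtype, (0,) + self._tail))
        elif array.dtype != self._dtype or array.shape[1:] != self._tail:
            raise ValueError("dtype or trailing shape differs from earlier data")
        self._fp.write(array.tobytes())
        self._rows += array.shape[0]
        # Rewrite the header in place; its size never changes.
        self._fp.seek(0)
        self._fp.write(npy_header(self._dtype, (self._rows,) + self._tail))
        self._fp.seek(0, os.SEEK_END)


A = np.array([[0, 1, 2, 3, 4, 5, 6, 7], [8, 9, 10, 11, 12, 13, 14, 15]])
path = os.path.join(tempfile.mkdtemp(), "testfile.npy")
with AppendingNpyWriter(path) as W:
    W.save(A)
    W.save(A)
loaded = np.load(path)  # a plain (4, 8) array
```

Because the header block has a fixed size, the data offset stays constant no matter how large the shape field grows, so the file remains readable by `np.load` after every `save`.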

Feel free to comment on this idea.

Cheers,
Gabor


Re: [Numpy-discussion] How to install numpy on a box without hardware FPU

2014-09-01 Thread Julian Taylor
On 01.09.2014 10:33, Emel Hasdal wrote:
 Hello,
 
   Is it possible to configure/install numpy on a box without a hardware
 FPU? When I try to install it using pip,
 I get a bunch of compile errors since  floating-point
 exceptions (FE_DIVBYZERO etc) are undefined on this platform. 
 
 How do I get numpy installed and working on such a platform?
 

If it's just that, you can try replacing all the fenv stuff with stubs
doing nothing. You only lose some runtime warnings about special cases.

Why do you want to run numpy on such a system?
Numpy is not really intended to run on such devices.
But it is possible: the Debian armel port is, as far as I know, softfloat, and
numpy seems to be running fine there, though it probably also emulates fenv.
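To show what kind of runtime warning is meant, here is a small sketch as run on an ordinary build with working fenv support. With the floating-point status flags stubbed out, one would expect the computed result to be the same but the warning list to stay empty; that expectation is an assumption, not something tested on such a platform.

```python
import warnings

import numpy as np

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # "warn" is numpy's default policy for division by zero.
    with np.errstate(divide="warn"):
        result = np.array([1.0]) / np.array([0.0])

# On a normal build the division yields inf and records a RuntimeWarning.
got_warning = any(issubclass(w.category, RuntimeWarning) for w in caught)
```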