Re: [Numpy-discussion] Does a `mergesorted` function make sense?
Sure, I'd like to hash things out, but I would also like some preliminary feedback as to whether this is going in a direction anyone else sees the point of, whether it conflicts with other plans, and indeed whether we can agree that numpy is the right place for it; a point which I would very much like to defend. If there is some obvious no-go that I'm missing, I can do without the drudgery of writing proper documentation ;).

As for whether this belongs in numpy: yes, I would say so. There is the extension of functionality to functions already in numpy, which is a no-brainer (it need not cost anything performance-wise, and I've needed unique graph edges many, many times), and there is the grouping functionality, which is the main novelty.

However, note that the grouping functionality itself is a very small addition, just a few hundred lines of pure Python, given that the indexing logic has been factored out of the classic arraysetops. At least from a developer's perspective, it very much feels like a logical extension of the same 'thing'.

But also from a conceptual numpy perspective, grouping is really more an 'elementary manipulation of an ndarray' than a 'special purpose algorithm'. It is useful for literally all kinds of programming; hence there is similar functionality in the Python standard library (itertools.groupby); so why not have an efficient vectorized equivalent in numpy? It belongs there more than the linalg module, arguably.

Also, from a community perspective, a significant fraction of all stackoverflow numpy questions are (unknowingly) exactly about 'how to do grouping in numpy'.

On Mon, Sep 1, 2014 at 4:36 AM, Charles R Harris charlesr.har...@gmail.com wrote:

> On Sun, Aug 31, 2014 at 1:48 PM, Eelco Hoogendoorn hoogendoorn.ee...@gmail.com wrote:
>
> > I've organized all code I had relating to this subject in a github repository https://github.com/EelcoHoogendoorn/Numpy_arraysetops_EP. That should facilitate shooting around ideas. I've also added more documentation and structure to make it easier to see what is going on.
> >
> > Hopefully we can converge on a common vision, and then improve the documentation and testing to make it worthy of including in the numpy master.
> >
> > Note that there is also a complete rewrite of the classic numpy.arraysetops, such that they are also generalized to more complex input, such as finding unique graph edges, and so on.
> >
> > You mentioned getting the numpy core developers involved; are they not subscribed to this mailing list? I wouldn't be surprised; you'd hope there is a channel of discussion concerning development with a higher signal-to-noise ratio.
>
> There are only about 2.5 of us at the moment. Those for whom this is an itch that needs scratching should hash things out and make a PR. The main question for me is if it belongs in numpy, scipy, or somewhere else.
>
> Chuck

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
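To illustrate the 'unique graph edges' use case mentioned above, here is a minimal sketch. It is not the code from the linked repository; it assumes a NumPy release new enough to have the `axis` argument of `np.unique` (added after this thread was written), and the edge list is made-up example data.

```python
import numpy as np

# Hypothetical edge list: each row is one undirected edge (u, v).
edges = np.array([[0, 1], [1, 0], [1, 2], [2, 1], [0, 1]])

# Sort the endpoints within each row so that (1, 0) and (0, 1) compare equal.
canonical = np.sort(edges, axis=1)

# axis=0 makes np.unique treat every row as a single item.
unique_edges, counts = np.unique(canonical, axis=0, return_counts=True)

print(unique_edges)
print(counts)
```

Sorting within rows is only appropriate for undirected edges; for directed edges the sort step would be dropped.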
[Numpy-discussion] How to install numpy on a box without hardware FPU
Hello,

Is it possible to configure/install numpy on a box without a hardware FPU? When I try to install it using pip, I get a bunch of compile errors since floating-point exceptions (FE_DIVBYZERO etc.) are undefined on this platform.

How do I get numpy installed and working on such a platform?

Thanks,
Emel
Re: [Numpy-discussion] Does a `mergesorted` function make sense?
On Mon, Sep 1, 2014 at 1:49 AM, Eelco Hoogendoorn hoogendoorn.ee...@gmail.com wrote:

> Sure, I'd like to hash things out, but I would also like some preliminary feedback as to whether this is going in a direction anyone else sees the point of, whether it conflicts with other plans, and indeed whether we can agree that numpy is the right place for it; a point which I would very much like to defend. If there is some obvious no-go that I'm missing, I can do without the drudgery of writing proper documentation ;).
>
> As for whether this belongs in numpy: yes, I would say so. There is the extension of functionality to functions already in numpy, which is a no-brainer (it need not cost anything performance-wise, and I've needed unique graph edges many, many times), and there is the grouping functionality, which is the main novelty.
>
> However, note that the grouping functionality itself is a very small addition, just a few hundred lines of pure Python, given that the indexing logic has been factored out of the classic arraysetops. At least from a developer's perspective, it very much feels like a logical extension of the same 'thing'.
>
> But also from a conceptual numpy perspective, grouping is really more an 'elementary manipulation of an ndarray' than a 'special purpose algorithm'. It is useful for literally all kinds of programming; hence there is similar functionality in the Python standard library (itertools.groupby); so why not have an efficient vectorized equivalent in numpy? It belongs there more than the linalg module, arguably.
>
> Also, from a community perspective, a significant fraction of all stackoverflow numpy questions are (unknowingly) exactly about 'how to do grouping in numpy'.

What I'm trying to say is that numpy is a community project. We don't have a central planning committee; the only difference between developers and everyone else is activity and commit rights. Which is to say, if you develop and push this topic, it is likely to go in. There certainly seems to be interest in this functionality. The reason that I brought up scipy is that there are some graph algorithms there that went in a couple of years ago.

Note that the convention on the list is bottom posting.

<snip>

Chuck
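For readers unfamiliar with the 'grouping' idea discussed in this thread, the following is a minimal sketch of a vectorized group-by built only from existing NumPy calls (`np.unique` with `return_inverse`, and `np.bincount`). It is not the API proposed in the repository; the keys and values are made-up data, and the `itertools.groupby` version is shown only for comparison.

```python
from itertools import groupby

import numpy as np

# Hypothetical data: one key and one value per record.
keys = np.array(["b", "a", "b", "c", "a"])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# inverse[i] is the index of keys[i] within unique_keys, i.e. a group id.
unique_keys, inverse = np.unique(keys, return_inverse=True)

# Per-group reductions without a Python loop.
sums = np.bincount(inverse, weights=values)
counts = np.bincount(inverse)
means = sums / counts

# The same sums with itertools.groupby, which needs sorted input.
pairs = sorted(zip(keys.tolist(), values.tolist()))
py_sums = {k: sum(v for _, v in grp)
           for k, grp in groupby(pairs, key=lambda kv: kv[0])}
```

`np.bincount` only covers sums and counts; other reductions (min, max, median) need a sort-based approach, which is presumably part of what a dedicated grouping API would wrap.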
Re: [Numpy-discussion] Does a `mergesorted` function make sense?
On Mon, Sep 1, 2014 at 8:49 AM, Eelco Hoogendoorn hoogendoorn.ee...@gmail.com wrote:

> Sure, I'd like to hash things out, but I would also like some preliminary feedback as to whether this is going in a direction anyone else sees the point of, whether it conflicts with other plans, and indeed whether we can agree that numpy is the right place for it; a point which I would very much like to defend. If there is some obvious no-go that I'm missing, I can do without the drudgery of writing proper documentation ;).
>
> As for whether this belongs in numpy: yes, I would say so. There is the extension of functionality to functions already in numpy, which is a no-brainer (it need not cost anything performance-wise, and I've needed unique graph edges many, many times), and there is the grouping functionality, which is the main novelty.
>
> However, note that the grouping functionality itself is a very small addition, just a few hundred lines of pure Python, given that the indexing logic has been factored out of the classic arraysetops. At least from a developer's perspective, it very much feels like a logical extension of the same 'thing'.

My 2 cents: I definitely agree that this is very useful fundamental functionality, and it would be great if numpy had a solution for it out of the box. My main concern is that this is a fairly complicated set of functionality and there are a lot of small decisions to be made in setting up the API for it. IME it's very hard to just read through an API like this and reason out the best way to do it by pure logic; usually it needs to get banged on for a bit in real uses before it becomes clear what the right set of trade-offs is. And numpy itself is not a great environment for these kinds of iterations.

So, IMO the main challenge is: how do we get the functionality into a state where we can convince ourselves that it'll be supportable in numpy indefinitely, and not need to be replaced in a year or two?

Some things that might help with this convincing:
- releasing it as a small standalone package on pypi and getting some real users to bang on it
- any real code written against the APIs
- feedback from the pandas community, since they've spent a lot of time working on these issues
- ...?

-n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
Re: [Numpy-discussion] Does a `mergesorted` function make sense?
On Mon, Sep 1, 2014 at 2:05 PM, Charles R Harris charlesr.har...@gmail.com wrote:

> On Mon, Sep 1, 2014 at 1:49 AM, Eelco Hoogendoorn hoogendoorn.ee...@gmail.com wrote:
>
> > Sure, I'd like to hash things out, but I would also like some preliminary feedback as to whether this is going in a direction anyone else sees the point of, whether it conflicts with other plans, and indeed whether we can agree that numpy is the right place for it; a point which I would very much like to defend. If there is some obvious no-go that I'm missing, I can do without the drudgery of writing proper documentation ;).
> >
> > As for whether this belongs in numpy: yes, I would say so. There is the extension of functionality to functions already in numpy, which is a no-brainer (it need not cost anything performance-wise, and I've needed unique graph edges many, many times), and there is the grouping functionality, which is the main novelty.
> >
> > However, note that the grouping functionality itself is a very small addition, just a few hundred lines of pure Python, given that the indexing logic has been factored out of the classic arraysetops. At least from a developer's perspective, it very much feels like a logical extension of the same 'thing'.
> >
> > But also from a conceptual numpy perspective, grouping is really more an 'elementary manipulation of an ndarray' than a 'special purpose algorithm'. It is useful for literally all kinds of programming; hence there is similar functionality in the Python standard library (itertools.groupby); so why not have an efficient vectorized equivalent in numpy? It belongs there more than the linalg module, arguably.
> >
> > Also, from a community perspective, a significant fraction of all stackoverflow numpy questions are (unknowingly) exactly about 'how to do grouping in numpy'.
>
> What I'm trying to say is that numpy is a community project. We don't have a central planning committee; the only difference between developers and everyone else is activity and commit rights. Which is to say, if you develop and push this topic, it is likely to go in. There certainly seems to be interest in this functionality. The reason that I brought up scipy is that there are some graph algorithms there that went in a couple of years ago.
>
> Note that the convention on the list is bottom posting.
>
> <snip>
>
> Chuck

I understand that numpy is a community project, so that the decision isn't up to any one particular person; but some early-stage feedback from those active in the community would be welcome. I am generally confident that this addition makes sense, but I have not contributed to numpy before, and you don't know what you don't know and all... Given that there are multiple suggestions for changing arraysetops, some coordination would be useful, I think.

Note that I use graph edges merely as an example; the proposed functionality is much more general than graph algorithms specifically. The radial reduction example (https://github.com/EelcoHoogendoorn/Numpy_arraysetops_EP/blob/master/examples.py) I included on github is particularly illustrative of the general utility of grouping functionality, I think. Operations like radial reductions are rather common, and a custom implementation is quite lengthy, very bug-prone, and potentially very slow.

Thanks for the heads-up on posting convention; I've always let gmail do my thinking for me, which works well enough for me, but I can see how not following this convention is annoying to others.
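As a rough idea of what a radial reduction involves, here is a minimal sketch using plain NumPy. It is not the example from the linked examples.py; the 5x5 image, the choice of center, and the rounding of radii to integer bins are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical 5x5 image whose value is a linear ramp.
image = np.arange(25, dtype=float).reshape(5, 5)

# Distance of every pixel from the (assumed) center pixel at (2, 2).
y, x = np.indices(image.shape)
r = np.hypot(y - 2, x - 2)

# The rounded radius acts as the group key for each pixel.
bins = np.rint(r).astype(int).ravel()

# Grouped sum and count per radius bin, then the mean per bin.
radial_sum = np.bincount(bins, weights=image.ravel())
radial_count = np.bincount(bins)
radial_mean = radial_sum / radial_count
```

Because the ramp is point-symmetric about the center, every radial mean here equals the center value; a real image would give a genuine radial profile.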
[Numpy-discussion] ENH IncrementalWriter for .npy files
Dear All,

I would like to add a class for writing multiple arrays (same dtype, compatible shapes) into one (possibly big) .npy file. My use case was regularly saving slowly accumulating data into one file over a long time.

Please find a first implementation under https://github.com/numpy/numpy/pull/4987 . It currently supports writing a new file only, and only in C order in the file. Opening an existing file for append and reading back parts from a very big .npy file would be straightforward next steps for a full-featured class.

The .npy file format is only affected by leaving some extra space for re-writing the header later with a possibly bigger shape field, respecting the 16-byte alignment.

Example:

```
import numpy as np

A = np.array([[0, 1, 2, 3, 4, 5, 6, 7], [8, 9, 10, 11, 12, 13, 14, 15]])
# np.IncrementalWriter is the class proposed in the PR above
with np.IncrementalWriter("testfile.npy", hdrupdate=True, flush=True) as W:
    W.save(A)
    W.save(A)
```

Feel free to comment on this idea.

Cheers,
Gabor
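The header-padding idea described above can be sketched independently of the PR. The following is not the implementation from the pull request: `AppendingNpyWriter`, `_header_bytes`, and the 128-byte reservation are hypothetical names and choices made for illustration. It assumes a simple (non-structured) dtype and the version 1.0 .npy layout: magic string, two version bytes, a two-byte little-endian header length, then a space-padded dict literal ending in a newline.

```python
import os
import tempfile

import numpy as np

HEADER_TOTAL = 128  # reserved bytes for the whole header; a multiple of 16


def _header_bytes(dtype, shape):
    """Build a fixed-size NPY version 1.0 header for C-ordered data."""
    text = "{'descr': %r, 'fortran_order': False, 'shape': %r, }" % (dtype.str, shape)
    body_len = HEADER_TOTAL - 10  # 6 magic + 2 version + 2 length bytes
    if len(text) + 1 > body_len:
        raise ValueError("reserved header space is too small")
    body = text.ljust(body_len - 1) + "\n"  # pad with spaces, end with newline
    return b"\x93NUMPY\x01\x00" + body_len.to_bytes(2, "little") + body.encode("latin1")


class AppendingNpyWriter:
    """Hypothetical helper: append blocks of rows, rewrite the header on close."""

    def __init__(self, path, dtype, row_shape):
        self.dtype = np.dtype(dtype)
        self.row_shape = tuple(row_shape)
        self.rows = 0
        self.fp = open(path, "wb")
        # Placeholder header with zero rows; overwritten in close().
        self.fp.write(_header_bytes(self.dtype, (0,) + self.row_shape))

    def save(self, arr):
        arr = np.ascontiguousarray(arr, dtype=self.dtype)  # C order only
        if arr.shape[1:] != self.row_shape:
            raise ValueError("incompatible shape")
        self.fp.write(arr.tobytes())
        self.rows += arr.shape[0]

    def close(self):
        # Same header size as before, so the data that follows stays in place.
        self.fp.seek(0)
        self.fp.write(_header_bytes(self.dtype, (self.rows,) + self.row_shape))
        self.fp.close()

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()


A = np.array([[0, 1, 2, 3, 4, 5, 6, 7], [8, 9, 10, 11, 12, 13, 14, 15]])
path = os.path.join(tempfile.mkdtemp(), "testfile.npy")
with AppendingNpyWriter(path, A.dtype, A.shape[1:]) as W:
    W.save(A)
    W.save(A)

loaded = np.load(path)  # reads back as one (4, 8) array
```

Unlike the proposal, this sketch only rewrites the header once, on close, so a crash before `close()` leaves a file whose header still claims zero rows. Updating the header after every `save` would presumably be what the proposal's `hdrupdate` option is for.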
Re: [Numpy-discussion] How to install numpy on a box without hardware FPU
On 01.09.2014 10:33, Emel Hasdal wrote:

> Hello,
>
> Is it possible to configure/install numpy on a box without a hardware FPU? When I try to install it using pip, I get a bunch of compile errors since floating-point exceptions (FE_DIVBYZERO etc.) are undefined on this platform.
>
> How do I get numpy installed and working on such a platform?

If it's just that, you can try replacing all the fenv stuff with stubs that do nothing. You only lose some runtime warnings about special cases.

Why do you want to run numpy on such a system? Numpy is not really intended to run on such devices.

But it is possible: the Debian armel port is, as far as I know, softfloat, and numpy seems to be running fine there, though it probably also emulates fenv.
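For context on the 'runtime warnings about special cases' mentioned above, here is a small sketch of what they look like on a platform with working floating-point exception flags. The assumption (not verified here) is that a build with stubbed-out fenv would compute the same values but stay silent.

```python
import warnings

import numpy as np

x = np.array([1.0, 0.0])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Ask numpy to report divide-by-zero and invalid operations as warnings.
    with np.errstate(divide="warn", invalid="warn"):
        result = x / 0.0  # 1/0 gives inf, 0/0 gives nan

for w in caught:
    print(w.category.__name__, w.message)
```

The computed values (`inf` and `nan`) come from IEEE arithmetic itself; only the reporting depends on the exception flags.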