[Numpy-discussion] proposed change to recarray access

2015-01-14 Thread Allan Haldane
Hello all, I've submitted a pull request on github which changes how string values in recarrays are returned, which may break old code. https://github.com/numpy/numpy/pull/5454 See also: https://github.com/numpy/numpy/issues/3993 Previously, recarray fields of type 'S' or 'U' (ie, strings)

[Numpy-discussion] structured arrays, recarrays, and record arrays

2015-01-18 Thread Allan Haldane
Hello all, Documentation of recarrays is poor and I'd like to improve it. In order to do this I've been looking at core/records.py, and I would appreciate some feedback on my plan. Let me start by describing what I see. In the docs there is some confusion about 'structured arrays' vs 'record

Re: [Numpy-discussion] structured arrays, recarrays, and record arrays

2015-01-18 Thread Allan Haldane
a change to dtype str and repr, which could affect a lot of things. Cheers, Allan On 01/18/2015 11:36 PM, Allan Haldane wrote: Hello all, Documentation of recarrays is poor and I'd like to improve it. In order to do this I've been looking at core/records.py, and I would appreciate some feedback

Re: [Numpy-discussion] Views of a different dtype

2015-01-29 Thread Allan Haldane
getting a loosening of the restrictions wrong is big, so it should be handled with care. Allan Haldane and myself have been looking into this separately and discussing some of the details over at github, and we both think that the only true limitation that has to be imposed is that the offsets

Re: [Numpy-discussion] Views of a different dtype

2015-01-29 Thread Allan Haldane
a loosening of the restrictions wrong is big, so it should be handled with care. Allan Haldane and myself have been looking into this separately and discussing some of the details over at github, and we both think that the only true limitation that has to be imposed is that the offsets of Python

[Numpy-discussion] should views into structured arrays be reversible?

2015-03-17 Thread Allan Haldane
Hello all, I've introduced PR 5548 https://github.com/numpy/numpy/pull/5548 which, through more careful safety checks, allows views of object arrays. However, I had to make 'partial views' into structured arrays irreversible, and I want to check with the list that that's ok. With the PR, if you

[Numpy-discussion] Rename arguments to np.clip and np.put

2015-03-30 Thread Allan Haldane
Hello everyone, What does the list think of renaming the arguments of np.clip and np.put to match those of ndarray.clip/put? Currently the signatures are:
    np.clip(a, a_min, a_max, out=None)
    ndarray.clip(a, min=None, max=None, out=None)
    np.put(a, ind, v, mode='raise')
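The naming mismatch described in this message only affects keyword use. As a minimal sketch (the array and bounds are made up for illustration), passing the bounds positionally gives the same result through the function and the method:

```python
import numpy as np

a = np.arange(10)

# Function form and method form, with the bounds passed positionally so the
# example does not depend on which keyword names a given NumPy version accepts.
clipped_func = np.clip(a, 2, 7)
clipped_method = a.clip(2, 7)

print(clipped_func)
print(np.array_equal(clipped_func, clipped_method))
```

Positional arguments are used here on purpose, since the keyword names are exactly what the thread proposes to change.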

Re: [Numpy-discussion] Rename arguments to np.clip and np.put

2015-03-31 Thread Allan Haldane
On 03/30/2015 07:16 PM, Jaime Fernández del Río wrote: On Mon, Mar 30, 2015 at 3:59 PM, Allan Haldane <allanhald...@gmail.com> wrote: Hello everyone, What does the list think of renaming the arguments of np.clip and np.put to match those

[Numpy-discussion] Should ndarray subclasses support the keepdims arg?

2015-05-05 Thread Allan Haldane
Hello all, A question: Many ndarray methods (eg sum, mean, any, min) have a keepdims keyword argument, but ndarray subclass methods sometimes don't. The 'matrix' subclass doesn't, and numpy functions like 'np.sum' intentionally drop/ignore the keepdims argument when called with an ndarray
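For readers unfamiliar with the keyword under discussion, here is a small sketch of what keepdims does on a plain ndarray (the array is made up for illustration; it does not show subclass behaviour, which is the open question in the thread):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# Without keepdims the reduced axis disappears from the result.
dropped = a.sum(axis=1)

# With keepdims=True the reduced axis is kept with length 1, so the result
# still broadcasts against the original array.
kept = a.sum(axis=1, keepdims=True)
normalized = a / kept

print(dropped.shape, kept.shape, normalized.shape)
```

Keeping the reduced axis is what makes expressions such as `a / a.sum(axis=1, keepdims=True)` work without an explicit reshape.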

Re: [Numpy-discussion] Should ndarray subclasses support the keepdims arg?

2015-05-05 Thread Allan Haldane
it unconditionally, then as soon as someone upgraded numpy all their existing code might break for no good reason. On May 5, 2015 8:13 AM, Allan Haldane <allanhald...@gmail.com> wrote: Hello all, A question: Many ndarray methods (eg sum, mean

Re: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.?

2015-08-03 Thread Allan Haldane
On 08/03/2015 12:25 PM, Chris Barker wrote: 2) The vagaries of the standard C types: int, long, etc (spelled np.intc, which is an int32 on my machine, anyway) [NOTE: is there a C long dtype? I can't find it at the moment...] Numpy does define the platform dependent C integer types short,

[Numpy-discussion] improving structured array assignment

2015-08-06 Thread Allan Haldane
Hello all, I've written up a tentative PR which tidies up structured array assignment, https://github.com/numpy/numpy/pull/6053 It has a backward incompatible change which I'd especially like to get some feedback on: Structure assignment now always works by field position instead of by field

Re: [Numpy-discussion] Development workflow (not git tutorial)

2015-08-14 Thread Allan Haldane
On 08/13/2015 11:52 AM, Anne Archibald wrote: Hi, What is a sensible way to work on (modify, compile, and test) numpy? There is documentation about contributing to numpy at: http://docs.scipy.org/doc/numpy-dev/dev/index.html and:

Re: [Numpy-discussion] Development workflow (not git tutorial)

2015-08-14 Thread Allan Haldane
On 08/14/2015 01:52 PM, Pauli Virtanen wrote: 14.08.2015, 20:45, Allan Haldane kirjoitti: [clip] Related to this, does anyone know how to debug numpy in gdb with proper symbols/source lines, like I can do with other C extensions? I've tried modifying numpy distutils to try to add the right

Re: [Numpy-discussion] constructing record dtypes from the c-api

2015-07-27 Thread Allan Haldane
Hi Jason, As I understand numpy has been set up to mirror C-structs as long as you use the 'align' flag. For example, your struct can be represented as np.dtype('f8,f4,i4,u8', align=True) (assuming 32 bit floats). The offsets of the fields should be exactly the offsets of the elements
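A quick way to check the claim about offsets is to inspect the dtype directly. The sketch below uses the dtype string from the message; the helper `field_offsets` and the second `'u1,f8'` dtype are hypothetical additions for illustration, chosen because padding is visible there.

```python
import numpy as np

def field_offsets(dt):
    """Return the byte offset of each field, in field order (illustrative helper)."""
    return [dt.fields[name][1] for name in dt.names]

# The dtype from the message: with align=True, fields are placed the way a
# C compiler would place the members of the corresponding struct.
aligned = np.dtype('f8,f4,i4,u8', align=True)
print(field_offsets(aligned), aligned.itemsize)

# Hypothetical second example where padding matters: a 1-byte field followed
# by a double. Packed, the double starts at offset 1; aligned, it is padded.
packed = np.dtype('u1,f8')
padded = np.dtype('u1,f8', align=True)
print(field_offsets(packed), packed.itemsize)
print(field_offsets(padded), padded.itemsize)
```

The exact padded offsets depend on the platform's alignment rules, so they are printed rather than stated here.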

Re: [Numpy-discussion] Commit rights for Jonathan J. Helmus

2015-10-28 Thread Allan Haldane
On 10/28/2015 05:27 PM, Nathaniel Smith wrote: Hi all, Jonathan J. Helmus (@jjhelmus) has been given commit rights -- let's all welcome him aboard. -n Welcome Jonathan, happy to have you on the team! Allan ___ NumPy-Discussion mailing list

Re: [Numpy-discussion] A regression in numpy 1.10: VERY slow memory mapped file generation

2015-10-14 Thread Allan Haldane
On 10/14/2015 01:23 AM, Nadav Horesh wrote: > > I have binary files of size range between few MB to 1GB, which I read process > as memory mapped files (via np.memmap). Until numpy 1.9 the creation of > recarray on an existing file (without reading its content) was instantaneous, > and now it

Re: [Numpy-discussion] correct sizeof for ndarray

2015-10-20 Thread Allan Haldane
On 10/20/2015 12:05 AM, Jason Newton wrote: Hi folks, I noticed an unexpected behavior of itemsize for structures with offsets that are larger than that of a packed structure in memory. This matters when parsing in memory structures from C and some others (recently an HDF5/h5py detail got me

Re: [Numpy-discussion] numpy 1.10.1 reduce operation on recarrays

2015-10-16 Thread Allan Haldane
On 10/16/2015 05:31 PM, josef.p...@gmail.com wrote: > > > On Fri, Oct 16, 2015 at 2:21 PM, Charles R Harris > > wrote: > > > > On Fri, Oct 16, 2015 at 12:20 PM, Charles R Harris >

Re: [Numpy-discussion] numpy 1.10.1 reduce operation on recarrays

2015-10-16 Thread Allan Haldane
On 10/16/2015 09:17 PM, josef.p...@gmail.com wrote: On Fri, Oct 16, 2015 at 8:56 PM, Allan Haldane <allanhald...@gmail.com> wrote: On 10/16/2015 05:31 PM, josef.p...@gmail.com wrote: > >

Re: [Numpy-discussion] np.sign and object comparisons

2015-09-03 Thread Allan Haldane
On 08/31/2015 12:09 AM, Jaime Fernández del Río wrote: > There are three ways of fixing this that I see: > > 1. Arbitrarily choose a value to set the return to. This is equivalent > to choosing a default return for `cmp` for comparisons. This > preserves behavior, but feels wrong. > 2.

Re: [Numpy-discussion] Sign of NaN

2015-09-29 Thread Allan Haldane
On 09/29/2015 11:39 AM, josef.p...@gmail.com wrote: > > > On Tue, Sep 29, 2015 at 11:25 AM, Anne Archibald > wrote: > > IEEE 754 has signum(NaN)->NaN. So does np.sign on floating-point > arrays. Why should it be different for object

Re: [Numpy-discussion] Sign of NaN

2015-09-29 Thread Allan Haldane
On 09/29/2015 02:16 PM, Nathaniel Smith wrote: > On Sep 29, 2015 8:25 AM, "Anne Archibald" > wrote: >> >> IEEE 754 has signum(NaN)->NaN. So does np.sign on floating-point > arrays. Why should it be different for object arrays? > > The argument

[Numpy-discussion] new ufunc implementations for object arrays

2015-09-18 Thread Allan Haldane
Hello all, I've just submitted a PR which overhauls the implementation of ufuncs for object types. https://github.com/numpy/numpy/pull/6320 The motivation for this is that many ufuncs (eg all transcendental functions) can't handle objects. This PR will also make object arrays more customizable,

Re: [Numpy-discussion] Commit rights for Allan Haldane

2015-09-22 Thread Allan Haldane
harlesr.har...@gmail.com <mailto:charlesr.har...@gmail.com>> wrote: > > Hi All, > > Allan Haldane has been given commit rights. Here's to the new member > of the team. > > Chuck > > ___ >

Re: [Numpy-discussion] array of random numbers fails to construct

2015-12-06 Thread Allan Haldane
I've also often wanted to generate large datasets of random uint8 and uint16. As a workaround, this is something I have used: np.ndarray(100, 'u1', np.random.bytes(100)) It has also crossed my mind that np.random.randint and np.random.rand could use an extra 'dtype' keyword. It didn't look
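The workaround quoted in this message can be run as-is. The sketch below repeats it and adds two alternatives that are not from the message: `np.frombuffer`, and the `dtype` keyword that later NumPy releases accept in `np.random.randint`. The size `n` is an arbitrary choice.

```python
import numpy as np

n = 100

# Workaround from the message: wrap random bytes in an array without copying.
# The resulting array is read-only because a bytes object is immutable.
a = np.ndarray(n, 'u1', np.random.bytes(n))

# Alternative spelling of the same idea (an addition, not from the message).
b = np.frombuffer(np.random.bytes(n), dtype=np.uint8)

# Newer NumPy versions accept a dtype keyword here, which is what the
# message wishes for; this assumes such a version is installed.
c = np.random.randint(0, 256, size=n, dtype=np.uint8)

print(a.dtype, b.dtype, c.dtype, a.flags.writeable)
```

If a writable array is needed, calling `.copy()` on the first two results is one option.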

Re: [Numpy-discussion] array of random numbers fails to construct

2015-12-08 Thread Allan Haldane
On 12/08/2015 07:40 PM, Stephan Hoyer wrote: > On Sun, Dec 6, 2015 at 3:55 PM, Allan Haldane <allanhald...@gmail.com> wrote: > > > I've also often wanted to generate large datasets of random uint8 > and uint16. As a workar

Re: [Numpy-discussion] array of random numbers fails to construct

2015-12-08 Thread Allan Haldane
On 12/08/2015 08:01 PM, Allan Haldane wrote: > On 12/08/2015 07:40 PM, Stephan Hoyer wrote: >> On Sun, Dec 6, 2015 at 3:55 PM, Allan Haldane <allanhald...@gmail.com> wrote: >> >> >> I've also often want

Re: [Numpy-discussion] Integers to integer powers, let's make a decision

2016-06-10 Thread Allan Haldane
On 06/10/2016 08:10 AM, Alan Isaac wrote: > Is np.arange(10)**10 also "innocent looking" to a Python user? This doesn't bother me much because numpy users have to be aware of overflow issues in lots of other (simple) cases anyway, eg plain addition and multiplication. I'll add my +1 for integer
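The overflow concern behind `np.arange(10)**10` can be seen by fixing the integer width explicitly. This is a minimal sketch; the choice of `int32` and the comparison against an object array are assumptions made for illustration, not part of the message.

```python
import numpy as np

base = np.arange(10, dtype=np.int32)

# Fixed-width integer power: 9**10 does not fit in 32 bits, so the last
# element cannot hold the true value.
wrapped = base ** 10

# Object arrays hold Python ints, which never overflow, so this is exact.
exact = base.astype(object) ** 10

print(wrapped[-1], exact[-1])
```

The same kind of silent overflow occurs for plain addition and multiplication, which is the point made in the message.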

Re: [Numpy-discussion] Integers to integer powers, let's make a decision

2016-06-10 Thread Allan Haldane
On 06/10/2016 01:50 PM, Alan Isaac wrote: > Again, **almost all** integer combinations overflow: that's the point. Don't almost all integer combinations overflow for multiplication as well? I estimate that for unsigned 32 bit integers, only roughly 1 in 2e8 combinations don't overflow. The
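The "1 in 2e8" figure can be sanity-checked by counting exactly how many pairs of unsigned integers have a product that still fits. The sketch below is one way to do that and is not necessarily how the estimate in the message was obtained; the function name is hypothetical.

```python
from math import isqrt

def count_non_overflowing(bits):
    """Count pairs (a, b) of unsigned `bits`-bit integers with a * b < 2**bits."""
    n = 2**bits - 1  # largest representable value
    # sum over a = 1..n of floor(n / a), computed in O(sqrt(n)) steps with
    # the standard divisor-sum (hyperbola) identity.
    r = isqrt(n)
    d = 2 * sum(n // k for k in range(1, r + 1)) - r * r
    # For each a >= 1 there are floor(n / a) + 1 valid b (b = 0 included);
    # for a = 0 all n + 1 values of b are valid.
    return d + n + (n + 1)

# Fraction of all 2**64 pairs of 32-bit unsigned integers that do not overflow.
fraction_32 = count_non_overflowing(32) / 2**64
print(f"about 1 in {1 / fraction_32:.3g}")
```

The count includes pairs where either operand is zero, so slightly different conventions would shift the figure a little without changing its order of magnitude.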

Re: [Numpy-discussion] Integers to integer powers, let's make a decision

2016-06-10 Thread Allan Haldane
On 06/10/2016 03:38 PM, Alan Isaac wrote: np.find_common_type([np.int8],[np.int32]) > dtype('int8') (np.arange(10,dtype=np.int8)+np.int32(2**10)).dtype > dtype('int16') > > And so on. If these other binary operators upcast based > on the scalar value, why wouldn't exponentiation? > I

Re: [Numpy-discussion] Integers to integer powers, let's make a decision

2016-06-13 Thread Allan Haldane
On 06/13/2016 05:05 AM, V. Armando Solé wrote: > On 11/06/2016 02:28, Allan Haldane wrote: >> >> So as an extra twist in this discussion, this means numpy actually >> *does* return a float value for an integer power in a few cases: >> >> &g

Re: [Numpy-discussion] [Suggestion] Labelled Array

2016-02-13 Thread Allan Haldane
I've had a pretty similar idea for a new indexing function 'split_classes' which would help in your case, which essentially does
    def split_classes(c, v):
        return [v[c == u] for u in unique(c)]
Your example could be coded as
    >>> [sum(c) for c in split_classes(label, data)]
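A self-contained version of the `split_classes` idea from this message, with the import it needs. The `label` and `data` arrays are made-up sample values, since the original example data is not shown in the snippet.

```python
import numpy as np

def split_classes(c, v):
    """Split values v into one array per distinct label in c (sorted label order)."""
    return [v[c == u] for u in np.unique(c)]

# Hypothetical sample data: one label per value.
label = np.array([0, 1, 0, 2, 1])
data = np.array([10, 20, 30, 40, 50])

groups = split_classes(label, data)
sums = [int(group.sum()) for group in groups]
print(sums)  # [40, 70, 40]
```

This naive version scans the label array once per distinct label, so its cost grows with the number of classes.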

Re: [Numpy-discussion] [Suggestion] Labelled Array

2016-02-13 Thread Allan Haldane
%timeit split_classes(c,v) 100 loops, best of 3: 4.79 ms per loop In any case, maybe it is useful to Sergio or others. Allan On 02/13/2016 12:11 PM, Allan Haldane wrote: I've had a pretty similar idea for a new indexing function 'split_classes' which would help in your case, which essential

Re: [Numpy-discussion] [Suggestion] Labelled Array

2016-02-19 Thread Allan Haldane
I also want to add a historical note here, that 'groupby' has been discussed a couple times before. Travis Oliphant even made an NEP for it, and Wes McKinney lightly hinted at adding it to numpy. http://thread.gmane.org/gmane.comp.python.numeric.general/37480/focus=37480

Re: [Numpy-discussion] [Suggestion] Labelled Array

2016-02-13 Thread Allan Haldane
Sat, Feb 13, 2016 at 1:01 PM, Allan Haldane <allanhald...@gmail.com> wrote: Sorry to reply to myself here, but looking at it with fresh eyes maybe the performance of the naive version isn't too bad. Here's a comparison

[Numpy-discussion] change to memmap subclass propagation

2016-03-30 Thread Allan Haldane
Hi all, This is a warning for a planned change to np.memmap in https://github.com/numpy/numpy/pull/7406. The return values of ufuncs and fancy slices of a memmap will now be plain ndarrays, since those return values don't point to mem-mapped memory. There is a possibility that if you are
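The behaviour announced here can be checked on a NumPy release that includes the change. The sketch below assumes such a release; the temporary file name and array contents are arbitrary.

```python
import os
import tempfile

import numpy as np

# Create a small memory-mapped array backed by a temporary file.
path = os.path.join(tempfile.mkdtemp(), "example.dat")
m = np.memmap(path, dtype="f8", mode="w+", shape=(4,))
m[:] = [1.0, 2.0, 3.0, 4.0]

basic = m[1:3]        # basic slice: still a view of the mapped memory
fancy = m[[0, 2]]     # fancy index: a copy, no longer backed by the file
ufunc_result = m + 1  # ufunc output: freshly allocated memory

print(type(basic).__name__, type(fancy).__name__, type(ufunc_result).__name__)
```

Code that relied on these results being `np.memmap` instances, for example through `isinstance` checks, is the kind of code the warning is aimed at.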

Re: [Numpy-discussion] Numpy arrays shareable among related processes (PR #7533)

2016-05-11 Thread Allan Haldane
On 05/11/2016 04:29 AM, Sturla Molden wrote: > 4. The reason IPC appears expensive with NumPy is because multiprocessing > pickles the arrays. It is pickle that is slow, not the IPC. Some would say > that the pickle overhead is an integral part of the IPC overhead, but I > will argue that it is

Re: [Numpy-discussion] Numpy arrays shareable among related processes (PR #7533)

2016-05-11 Thread Allan Haldane
On 05/11/2016 06:48 PM, Sturla Molden wrote: > Elliot Hallmark wrote: >> Sturla, this sounds brilliant! To be clear, you're talking about >> serializing the numpy array and reconstructing it in a way that's faster >> than pickle? > > Yes. We know the binary format of

Re: [Numpy-discussion] Numpy arrays shareable among related processes (PR #7533)

2016-05-11 Thread Allan Haldane
On 05/11/2016 06:39 PM, Joe Kington wrote: > > > In python2 it appears that multiprocessing uses pickle protocol 0 which > must cause a big slowdown (a factor of 100) relative to protocol 2, and > uses pickle instead of cPickle. > > > Even on Python 2.x, multiprocessing uses

Re: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve

2016-10-18 Thread Allan Haldane
On 10/17/2016 01:01 PM, Pierre Haessig wrote: > Hi, > > > Le 16/10/2016 à 11:52, Hanno Klemm a écrit : >> When I have similar situations, I usually interpolate between the valid >> values. I assume there are a lot of use cases for convolutions but I have >> difficulties imagining that ignoring

Re: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve

2016-10-18 Thread Allan Haldane
On 10/16/2016 05:52 AM, Hanno Klemm wrote: > > >> On 16 Oct 2016, at 03:21, Allan Haldane <allanhald...@gmail.com> wrote: >> >>> On 10/14/2016 07:49 PM, Juan Nunez-Iglesias wrote: >>> +1 for propagate_mask. That is the only proposal that immediate

Re: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve

2016-10-18 Thread Allan Haldane
On 10/17/2016 01:01 PM, Pierre Haessig wrote: > Le 16/10/2016 à 11:52, Hanno Klemm a écrit : >> When I have similar situations, I usually interpolate between the valid >> values. I assume there are a lot of use cases for convolutions but I have >> difficulties imagining that ignoring a missing

Re: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve

2016-10-15 Thread Allan Haldane
>>> print np.ma.convolve(a, b, propagate_mask=False) [1 2 3 2 2 2 3 2 1 -- 1 2 3 2 1] Allan On 15 Oct. 2016, 5:23 AM +1100, Allan Haldane <allanhald...@gmail.com>, wrote: I think the possibilities that have been mentioned so far (here or in the PR) are: contagious c

[Numpy-discussion] how to name "contagious" keyword in np.ma.convolve

2016-10-14 Thread Allan Haldane
Hi all, Eric Wieser has a PR which defines new functions np.ma.correlate and np.ma.convolve: https://github.com/numpy/numpy/pull/7922 We're deciding how to name the keyword arg which determines whether masked elements are "propagated" in the convolution sums. Currently we are leaning towards
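To illustrate what "propagated" means for the keyword under discussion, here is a small sketch using the name that was eventually adopted in NumPy, `propagate_mask`. It assumes a NumPy version that provides `np.ma.convolve`; the input arrays are made up.

```python
import numpy as np

a = np.ma.array([1, 1, 1, 1], mask=[False, True, False, False])
b = np.array([1, 1])

# propagate_mask=True: any output element whose sum touches a masked input
# is itself masked.
propagated = np.ma.convolve(a, b, propagate_mask=True)

# propagate_mask=False: masked inputs are skipped, and an output element is
# masked only if every input contributing to it was masked.
ignored = np.ma.convolve(a, b, propagate_mask=False)

print(propagated)
print(ignored)
```

With the masked element at index 1, the two output positions that overlap it are masked in the first result and filled from the remaining valid inputs in the second.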

Re: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve

2016-10-14 Thread Allan Haldane
ct 14, 2016 at 1:08 PM, Sebastian Berg <sebast...@sipsolutions.net> wrote: > > On Fr, 2016-10-14 at 13:00 -0400, Allan Haldane wrote: > > Hi all, > > > > Eric Wieser has a PR which define