Re: [Numpy-discussion] new MaskedArray class

2019-06-24 Thread Stephan Hoyer
On Mon, Jun 24, 2019 at 5:36 PM Marten van Kerkwijk < m.h.vankerkw...@gmail.com> wrote: > > > On Mon, Jun 24, 2019 at 7:21 PM Stephan Hoyer wrote: > >> On Mon, Jun 24, 2019 at 3:56 PM Allan Haldane >> wrote: >> >>> I'm not at all set on that behavior and we can do something else. For >>> now, I

Re: [Numpy-discussion] new MaskedArray class

2019-06-24 Thread Marten van Kerkwijk
On Mon, Jun 24, 2019 at 7:21 PM Stephan Hoyer wrote: > On Mon, Jun 24, 2019 at 3:56 PM Allan Haldane > wrote: > >> I'm not at all set on that behavior and we can do something else. For >> now, I chose this way since it seemed to best match the "IGNORE" mask >> behavior. >> >> The behavior you

Re: [Numpy-discussion] new MaskedArray class

2019-06-24 Thread Marten van Kerkwijk
Hi Allan, > The alternative solution in my model would be to replace `np.dot` with a > > masked-specific implementation of what `np.dot` is supposed to stand for > > (in your simple example, `np.add.reduce(np.multiply(m, m))` - more > > generally, add relevant `outer` and `axes`). This would be
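
A minimal sketch of the `np.add.reduce(np.multiply(m, m))` idea quoted above, for the 1-D case only and using plain NumPy arrays rather than the proposed MaskedArray class. The separate data/mask arrays and all variable names are assumptions made for illustration; the `where=` argument of `np.add.reduce` needs NumPy 1.17 or later.

```python
import numpy as np

# Hypothetical data/mask pairs; True in a mask marks a masked element.
a = np.array([1.0, 2.0, 3.0])
a_mask = np.array([False, True, False])
b = np.array([4.0, 5.0, 6.0])
b_mask = np.array([False, False, False])

# Elementwise step: operate on the data, OR the masks together.
product = np.multiply(a, b)
product_mask = a_mask | b_mask

# Reduction step: skip every element whose product is masked.
masked_dot = np.add.reduce(product, where=~product_mask)
```

Here the masked element of `a` simply drops out of the sum; whether the result should instead be masked as a whole is exactly the design question under discussion.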

Re: [Numpy-discussion] new MaskedArray class

2019-06-24 Thread Stephan Hoyer
On Mon, Jun 24, 2019 at 3:56 PM Allan Haldane wrote: > I'm not at all set on that behavior and we can do something else. For > now, I chose this way since it seemed to best match the "IGNORE" mask > behavior. > > The behavior you described further above where the output row/col would > be masked

Re: [Numpy-discussion] new MaskedArray class

2019-06-24 Thread Allan Haldane
On 6/24/19 3:09 PM, Marten van Kerkwijk wrote: > Hi Allan, > > Thanks for bringing up the noclobber explicitly (and Stephan for asking > for clarification; I was similarly confused). > > It does clarify the difference in mental picture. In mine, the operation > would indeed be guaranteed to be

Re: [Numpy-discussion] new MaskedArray class

2019-06-24 Thread Charles R Harris
On Mon, Jun 24, 2019 at 3:40 PM Marten van Kerkwijk < m.h.vankerkw...@gmail.com> wrote: > Hi Eric, > > The easiest definitely is for the mask to just propagate, which means that even > if just one point is masked, all points in the fft will be masked. > > On the direct point I made, I think it is

Re: [Numpy-discussion] new MaskedArray class

2019-06-24 Thread Eric Firing
On 2019/06/24 11:39 AM, Marten van Kerkwijk wrote: Hi Eric, The easiest definitely is for the mask to just propagate, which means that even if just one point is masked, all points in the fft will be masked. This is perfectly reasonable, and consistent with what happens with nans, of course. My
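
The comparison with nans can be checked directly. A small sketch, assuming only stock NumPy (the variable names are hypothetical): a single nan in the input makes every output point of `np.fft.fft` nan, which is the behaviour a propagating mask would mirror.

```python
import numpy as np

# One "bad" point in an otherwise ordinary real signal.
x = np.array([1.0, 2.0, np.nan, 4.0])

spectrum = np.fft.fft(x)

# np.isnan on complex values is True if either the real or imaginary part is nan.
all_nan = bool(np.all(np.isnan(spectrum)))
```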

Re: [Numpy-discussion] new MaskedArray class

2019-06-24 Thread Warren Weckesser
On 6/24/19, Marten van Kerkwijk wrote: > Hi Eric, > > The easiest definitely is for the mask to just propagate, which means that even > if just one point is masked, all points in the fft will be masked. > > On the direct point I made, I think it is correct that since one can think > of the Fourier

Re: [Numpy-discussion] new MaskedArray class

2019-06-24 Thread Marten van Kerkwijk
Hi Eric, The easiest definitely is for the mask to just propagate, which means that even if just one point is masked, all points in the fft will be masked. On the direct point I made, I think it is correct that since one can think of the Fourier transform as a sine/cosine fit, then there is a solution

Re: [Numpy-discussion] new MaskedArray class

2019-06-24 Thread Eric Firing
On 2019/06/24 9:09 AM, Marten van Kerkwijk wrote: Another example of a function for which I think my model is not particularly insightful (and for which it is difficult to know what to do generally) is `np.fft.fft`. Since an fft is equivalent to a sine/cosine fit to data points, the answer

Re: [Numpy-discussion] new MaskedArray class

2019-06-24 Thread Marten van Kerkwijk
Hi Allan, Thanks for bringing up the noclobber explicitly (and Stephan for asking for clarification; I was similarly confused). It does clarify the difference in mental picture. In mine, the operation would indeed be guaranteed to be done on the underlying data, without copy and without

Re: [Numpy-discussion] new MaskedArray class

2019-06-24 Thread Allan Haldane
On 6/24/19 12:16 PM, Stephan Hoyer wrote: > On Mon, Jun 24, 2019 at 8:46 AM Allan Haldane > wrote: > > 1. Making a "no-clobber" guarantee on the underlying data > > > Hi Allan -- could you kindly clarify what you mean by "no-clobber"? > > Is this referring to

Re: [Numpy-discussion] new MaskedArray class

2019-06-24 Thread Stephan Hoyer
On Mon, Jun 24, 2019 at 8:46 AM Allan Haldane wrote: > 1. Making a "no-clobber" guarantee on the underlying data > Hi Allan -- could you kindly clarify what you mean by "no-clobber"? Is this referring to allowing masked arrays to mutate masked data values in-place, even on apparently non-in-place

Re: [Numpy-discussion] new MaskedArray class

2019-06-24 Thread Allan Haldane
On 6/24/19 11:46 AM, Allan Haldane wrote: > A no-clobber guarantee makes your "iterative mask" example solvable in > an efficient (no-copy) way: > > mask, last_mask = False > while True: > dat_mean = np.mean(MaskedArray(data, mask)) > mask, last_mask = np.abs(data - mask)
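
The quoted loop is cut off in this snippet, so the following is only a sketch of the general "iterative mask" idea, not a reconstruction of the original code. It assumes plain NumPy `where=` reductions (NumPy 1.20 or later) in place of the proposed MaskedArray class; `iterative_clip`, `nsigma` and `max_iter` are hypothetical names. The point it illustrates is that the underlying `data` is never modified, so every pass can re-test all points, including ones masked earlier.

```python
import numpy as np

def iterative_clip(data, nsigma=3.0, max_iter=20):
    """Return a boolean mask (True = masked) of points far from the mean.

    The data array is only read, never written, so no copy is needed
    between iterations.
    """
    mask = np.zeros(data.shape, dtype=bool)
    for _ in range(max_iter):
        keep = ~mask
        mean = np.mean(data, where=keep)
        std = np.std(data, where=keep)
        # Re-evaluate every point, masked or not, against the new statistics.
        new_mask = np.abs(data - mean) > nsigma * std
        if np.array_equal(new_mask, mask):
            break  # the mask has converged
        mask = new_mask
    return mask

# Hypothetical sample with one obvious outlier.
sample = np.array([1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 100.0])
sample_mask = iterative_clip(sample, nsigma=2.0)
```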

Re: [Numpy-discussion] new MaskedArray class

2019-06-24 Thread Allan Haldane
On 6/23/19 6:58 PM, Eric Wieser wrote: > I think we’d need to consider separately the operation on the mask > and on the data. In my proposal, the data would always do > |np.sum(array, where=~mask)|, while how the mask would propagate > might depend on the mask itself, > > I quite

Re: [Numpy-discussion] new MaskedArray class

2019-06-24 Thread Allan Haldane
On 6/22/19 11:50 AM, Marten van Kerkwijk wrote: > Hi Allan, > > I'm not sure I would go too much by what the old MaskedArray class did. > It indeed made an effort not to overwrite masked values with a new > result, even to the extent of copying back masked input data elements to > the output

Re: [Numpy-discussion] new MaskedArray class

2019-06-23 Thread Marten van Kerkwijk
Hi Eric, On your other points: I remain unconvinced that Mask classes should behave differently on > different ufuncs. I don’t think np.minimum(ignore_na, b) is any different > to np.add(ignore_na, b) - either both should produce b, or both should > produce ignore_na. I would lean towards

Re: [Numpy-discussion] new MaskedArray class

2019-06-23 Thread Marten van Kerkwijk
Hi Stephan, Eric perhaps explained my concept better than I could! I do agree that, as written, your example would be clearer, but Allan's code and the current MaskedArray code do not have that much resemblance to it, and mine even less, as they deal with operators as whole groups. For mine, it

Re: [Numpy-discussion] new MaskedArray class

2019-06-23 Thread Eric Wieser
I think we’d need to consider separately the operation on the mask and on the data. In my proposal, the data would always do np.sum(array, where=~mask), while how the mask would propagate might depend on the mask itself, I quite like this idea, and I think Stephan’s strawman design is actually
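
To make the split between "operation on the data" and "operation on the mask" concrete, here is a small sketch using plain NumPy. The example values (two valid points and one masked point) and the variable names are assumptions for illustration; `where=` on `np.sum` needs NumPy 1.17 or later.

```python
import numpy as np

data = np.array([1.0, 2.0, 99.0])      # 99.0 is an arbitrary value hidden by the mask
mask = np.array([False, False, True])  # True marks the masked element

# Data side: always reduce over the unmasked elements only.
skipping_sum = np.sum(data, where=~mask)

# Mask side, "propagating" flavour: the result counts as masked
# if any contributing element was masked.
result_is_masked = bool(np.any(mask))
```

A "skipping" mask type would report `skipping_sum` as a valid result, while a "propagating" one would report the result as masked; the data-side computation is the same in both cases.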

Re: [Numpy-discussion] new MaskedArray class

2019-06-23 Thread Stephan Hoyer
On Sun, Jun 23, 2019 at 11:55 PM Marten van Kerkwijk < m.h.vankerkw...@gmail.com> wrote: > Your proposal would be something like np.sum(array, >> where=np.ones_like(array))? This seems rather verbose for a common >> operation. Perhaps np.sum(array, where=True) would work, making use of >>

Re: [Numpy-discussion] new MaskedArray class

2019-06-23 Thread Marten van Kerkwijk
Hi Stephan, In slightly changed order: Let me try to make the API issue more concrete. Suppose we have a > MaskedArray with values [1, 2, NA]. How do I get: > 1. The sum ignoring masked values, i.e., 3. > 2. The sum that is tainted by masked values, i.e., NA. > > Here's how this works with

Re: [Numpy-discussion] new MaskedArray class

2019-06-23 Thread Stephan Hoyer
On Sun, Jun 23, 2019 at 4:07 PM Marten van Kerkwijk < m.h.vankerkw...@gmail.com> wrote: > - If reductions/aggregations default to skipping missing elements, how is >> it possible to express "NA propagating" versions, which are also useful, >> if slightly less common? >> > > I have been playing

Re: [Numpy-discussion] new MaskedArray class

2019-06-23 Thread Marten van Kerkwijk
Hi Tom, I think a sensible alternative mental model for the MaskedArray class is >> that all it does is forward any operations to the data it holds and >> separately propagate a mask, >> > > I'm generally on-board with that mental picture, and agree that the > use-case described by Ben

Re: [Numpy-discussion] new MaskedArray class

2019-06-23 Thread Aldcroft, Thomas
On Sat, Jun 22, 2019 at 11:51 AM Marten van Kerkwijk < m.h.vankerkw...@gmail.com> wrote: > Hi Allan, > > I'm not sure I would go too much by what the old MaskedArray class did. It > indeed made an effort not to overwrite masked values with a new result, > even to the extent of copying back masked

Re: [Numpy-discussion] new MaskedArray class

2019-06-23 Thread Marten van Kerkwijk
> I think a sensible alternative mental model for the MaskedArray class is >> that all it does is forward any operations to the data it holds and >> separately propagate a mask, ORing elements together for binary operations, >> etc., and explicitly skipping masked elements in reductions (ideally
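
The mental model quoted above (forward operations to the data, OR the masks for binary operations, skip masked elements in reductions) can be sketched in a few lines. This is an illustration only: `SimpleMasked` is a hypothetical toy class, not the class proposed in this thread and not `np.ma.MaskedArray`, and it implements just `+` and `sum`.

```python
import numpy as np

class SimpleMasked:
    """Toy container: a data array plus a boolean mask (True = masked)."""

    def __init__(self, data, mask):
        self.data = np.asarray(data)
        self.mask = np.broadcast_to(np.asarray(mask, dtype=bool), self.data.shape)

    def __add__(self, other):
        if isinstance(other, SimpleMasked):
            # Forward the operation to the data; OR the masks together.
            return SimpleMasked(self.data + other.data, self.mask | other.mask)
        # Plain operand: the mask is carried over unchanged.
        return SimpleMasked(self.data + other, self.mask)

    def sum(self):
        # Reductions explicitly skip masked elements.
        return np.sum(self.data, where=~self.mask)

a = SimpleMasked([1, 2, 3], [False, True, False])
b = SimpleMasked([10, 20, 30], [False, False, True])
c = a + b
```

Note that `c.data` holds a computed value at every position, masked or not; only the mask decides what the reduction sees.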

Re: [Numpy-discussion] new MaskedArray class

2019-06-23 Thread Stephan Hoyer
On Sat, Jun 22, 2019 at 6:50 PM Marten van Kerkwijk < m.h.vankerkw...@gmail.com> wrote: > Hi Allan, > > I'm not sure I would go too much by what the old MaskedArray class did. It > indeed made an effort not to overwrite masked values with a new result, > even to the extent of copying back masked

Re: [Numpy-discussion] new MaskedArray class

2019-06-23 Thread Stephan Hoyer
On Thu, Jun 20, 2019 at 7:44 PM Allan Haldane wrote: > On 6/19/19 10:19 PM, Marten van Kerkwijk wrote: > > Hi Allan, > > > > This is very impressive! I could get the tests that I wrote for my class > > to pass with yours using Quantity with what I would consider very minimal > > changes. I only

Re: [Numpy-discussion] new MaskedArray class

2019-06-22 Thread Benjamin Root
"""Third, in the old np.ma.MaskedArray masked positions are very often "effectively" clobbered, in the sense that they are not computed. For example, if you do "c = a+b", and then change the mask of c""" My use-cases don't involve changing the mask of "c". It would involve changing the mask of

Re: [Numpy-discussion] new MaskedArray class

2019-06-22 Thread Marten van Kerkwijk
Hi Allan, I'm not sure I would go too much by what the old MaskedArray class did. It indeed made an effort not to overwrite masked values with a new result, even to the extent of copying back masked input data elements to the output data array after an operation. But the fact that this is

Re: [Numpy-discussion] new MaskedArray class

2019-06-22 Thread Allan Haldane
On 6/21/19 2:37 PM, Benjamin Root wrote: Just to note, data that is masked isn't always garbage. There are plenty of use-cases where one may want to temporarily apply a mask for a set of computations, or possibly want to apply a series of different masks to the data. I haven't read through this

Re: [Numpy-discussion] new MaskedArray class

2019-06-21 Thread Benjamin Root
Just to note, data that is masked isn't always garbage. There are plenty of use-cases where one may want to temporarily apply a mask for a set of computations, or possibly want to apply a series of different masks to the data. I haven't read through this discussion deeply enough, but is this new

Re: [Numpy-discussion] new MaskedArray class

2019-06-20 Thread Allan Haldane
On 6/19/19 10:19 PM, Marten van Kerkwijk wrote: > Hi Allan, > > This is very impressive! I could get the tests that I wrote for my class > to pass with yours using Quantity with what I would consider very minimal > changes. I only could not find a good way to unmask data (I like the > idea of

Re: [Numpy-discussion] new MaskedArray class

2019-06-19 Thread Marten van Kerkwijk
Hi Allan, This is very impressive! I could get the tests that I wrote for my class to pass with yours using Quantity with what I would consider very minimal changes. I only could not find a good way to unmask data (I like the idea of setting the mask on some elements via `ma[item] = X`); is this on

Re: [Numpy-discussion] new MaskedArray class

2019-06-19 Thread Allan Haldane
On 6/18/19 2:04 PM, Marten van Kerkwijk wrote: > > > On Tue, Jun 18, 2019 at 12:55 PM Allan Haldane > wrote: > > > > This may be too much to ask from the initializer, but, if so, it still > > seems most useful if it is made as easy as possible to do,

Re: [Numpy-discussion] new MaskedArray class

2019-06-18 Thread Marten van Kerkwijk
On Tue, Jun 18, 2019 at 12:55 PM Allan Haldane wrote: > > This may be too much to ask from the initializer, but, if so, it still > > seems most useful if it is made as easy as possible to do, say, `class > > MaskedQuantity(Masked, Quantity): `. > > Currently MaskedArray does not accept

Re: [Numpy-discussion] new MaskedArray class

2019-06-18 Thread Allan Haldane
On 6/18/19 10:06 AM, Marten van Kerkwijk wrote: > Hi Allan, > > Thanks for the message and link! In astropy, we've been struggling with > masking a lot, and one of the main conclusions I have reached is that > ideally one has a more abstract `Masked` class that can take any type of > data

Re: [Numpy-discussion] new MaskedArray class

2019-06-18 Thread Marten van Kerkwijk
Hi Allan, Thanks for the message and link! In astropy, we've been struggling with masking a lot, and one of the main conclusions I have reached is that ideally one has a more abstract `Masked` class that can take any type of data (including `ndarray`, of course), and behaves like that data as