On Mon, Mar 30, 2020 at 8:31 AM Daniel Nugent <[email protected]> wrote:
>
> Didn’t want to follow up on this on the Jira issue earlier since it's sort of
> tangential to that bug and more of a usage question. You said:
>
> > I wouldn't recommend building applications based on them nowadays since the
> > level of support / compatibility in other projects is low.
>
> In my case, I am using them since it seemed like a straightforward
> representation of my data that has nulls, the format I’m converting from has
> zero cost numpy representations, and converting from an internal format into
> Arrow in memory structures appears zero cost (or close to it) as well. I
> guess I can just provide the mask as an explicit argument, but my original
> desire to use it came from being able to exploit numpy.ma.concatenate in a
> way that saved some complexity in implementation.
>
> Since Arrow itself supports masking values with a bitfield, is there
> something intrinsic to the notion of array masks that is not well supported?
> Or do you just mean the specific numpy MaskedArray class?
>
I mean just the numpy.ma module. Not many Python computing projects
nowadays treat MaskedArray objects as first-class citizens. Depending
on what you need, it may or may not be a problem. pyarrow supports
ingesting from MaskedArray as a convenience, but in my experience it
would not be common for a library's APIs to return MaskedArrays.

> If this is too much of a numpy question rather than an arrow question, could
> you point me to where I can read up on masked array support or maybe what the
> right place to ask the numpy community about whether what I'm doing is
> appropriate or not.
>
> Thanks,
>
>
> -Dan Nugent
