On Mon, Jan 22, 2024 at 5:14 PM Nathan <nathan.goldb...@gmail.com> wrote:
> Hi all,
>
> I propose we accept NEP 55 and merge PR #25347 implementing the NEP in
> time for the NumPy 2.0 RC:
>
> https://numpy.org/neps/nep-0055-string_dtype.html
> https://github.com/numpy/numpy/pull/25347
>
> The most controversial aspect of the NEP was support for missing strings
> via a user-supplied sentinel object. In the previous discussion on the
> mailing list, Warren Weckesser argued for shipping a missing data sentinel
> with NumPy for use with the DType, while in code review and the PR for the
> NEP, Sebastian expressed concern about the additional complexity of
> including missing data support at all.
>
> I found that supporting missing data is key to efficiently supporting the
> new DType in Pandas. I think that argues that we need some level of missing
> data support to fully replace object string arrays. I believe the
> compromise proposal in the NEP is sufficient for downstream libraries while
> limiting additional complexity elsewhere in NumPy.
>
> Concerns raised in previous discussions about concretely specifying the C
> API to be made public, preventing use-after-free errors in a multithreaded
> context, and uncertainty around the arena allocator implementation have
> been resolved in the latest version of the NEP and the open PR.
> Additionally, due to some excellent and timely work by Lysandros Nikolaou,
> we now have a number of string ufuncs in NumPy and a straightforward plan
> to add more. Loops have been implemented for all the ufuncs added in the
> NumPy 2.0 dev cycle so far.
>
> I would like to see us ship the DType in NumPy 2.0. This will allow us to
> advertise a major new feature, will spur efforts to support new DTypes in
> downstream libraries, and will allow us to get feedback from the community
> that would be difficult to obtain without releasing the code into the wild.
> Additionally, I am funded via a NASA ROSES grant for work related to this
> effort until the end of 2024, so including the DType in NumPy 2.0 will more
> efficiently use my funded time to fix issues.
>
> If there are no substantive objections to this email, then the NEP will be
> considered accepted; see NEP 0 for more details:
> https://numpy.org/neps/nep-0000.html
>

Don't worry too much about the timing; we aren't going to branch without the
new strings unless the cat gets into them, which is unlikely.

Chuck