On Mon, Jan 22, 2024 at 5:14 PM Nathan <nathan.goldb...@gmail.com> wrote:

> Hi all,
>
> I propose we accept NEP 55 and merge PR #25347 implementing the NEP in
> time for the NumPy 2.0 RC:
>
> https://numpy.org/neps/nep-0055-string_dtype.html
> https://github.com/numpy/numpy/pull/25347
>
> The most controversial aspect of the NEP was support for missing strings
> via a user-supplied sentinel object. In the previous discussion on the
> mailing list, Warren Weckesser argued for shipping a missing data sentinel
> with NumPy for use with the DType, while in code review and the PR for the
> NEP, Sebastian expressed concern about the additional complexity of
> including missing data support at all.
>
> I found that supporting missing data is key to efficiently supporting the
> new DType in Pandas. I think that argues that we need some level of missing
> data support to fully replace object string arrays. I believe the
> compromise proposal in the NEP is sufficient for downstream libraries while
> limiting additional complexity elsewhere in NumPy.
>
> Concerns raised in previous discussions about concretely specifying the C
> API to be made public, preventing use-after-free errors in a multithreaded
> context, and uncertainty around the arena allocator implementation have
> been resolved in the latest version of the NEP and the open PR.
> Additionally, due to some excellent and timely work by Lysandros Nikolaou,
> we now have a number of string ufuncs in NumPy and a straightforward plan
> to add more. Loops have been implemented for all the ufuncs added in the
> NumPy 2.0 dev cycle so far.
>
> I would like to see us ship the DType in NumPy 2.0. This will allow us to
> advertise a major new feature, will spur efforts to support new DTypes in
> downstream libraries, and will allow us to get feedback from the community
> that would be difficult to obtain without releasing the code into the wild.
> Additionally, I am funded via a NASA ROSES grant for work related to this
> effort until the end of 2024, so including the DType in NumPy 2.0 will more
> efficiently use my funded time to fix issues.
>
> If there are no substantive objections to this email, then the NEP will be
> considered accepted; see NEP 0 for more details:
> https://numpy.org/neps/nep-0000.html
>

Don't worry too much about the timing; we aren't going to branch without
the new strings unless the cat gets into them, which is unlikely.

Chuck