On Thu, 2021-05-06 at 13:06 +0100, Eric Wieser wrote:
> Another argument for supporting stateful allocators would be compatibility
> with the stateful C++11 allocator API, such as
> https://en.cppreference.com/w/cpp/memory/allocator_traits/allocate.


Python's own allocator API (`PyMemAllocatorEx`) does have a `void *ctx`,
but I am not sure it is actually valuable for the NumPy use-cases.
(Honestly, beyond aligned allocation or memory pinning, I am uncertain
what those use-cases are.)

I had written more, but I will keep it short:

While I like the `PyObject *` idea, I am also not sure that it helps
much.  If we want allocation-specific state, the user should
overallocate and store that state in front of the actual allocation.
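
To make the overallocation idea concrete, here is a minimal sketch.  It
is only an illustration under assumptions: `alloc_header`, `current_tag`,
`header_alloc`, `header_of` and `header_free` are hypothetical names, not
part of NumPy or the NEP.  Only the `(size_t size)` and
`(void *ptr, size_t size)` signatures follow the NEP typedefs quoted
below.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-allocation state; nothing here is NumPy API. */
typedef union {
    struct {
        size_t size;  /* bytes the caller asked for */
        int tag;      /* example state captured at allocation time */
    } info;
    /* Padding members: sizeof(alloc_header) becomes a multiple of their
     * alignment, so the user data behind the header stays aligned as
     * strictly as the header itself. */
    long double pad_ld;
    long long pad_ll;
    void *pad_ptr;
} alloc_header;

/* Hypothetical "current state" that each allocation snapshots. */
static int current_tag = 0;

/* Same shape as PyDataMem_AllocFunc: no state argument needed. */
static void *header_alloc(size_t size)
{
    alloc_header *h;

    if (size > SIZE_MAX - sizeof(alloc_header)) {
        return NULL;  /* header + size would overflow */
    }
    h = malloc(sizeof(alloc_header) + size);
    if (h == NULL) {
        return NULL;
    }
    h->info.size = size;
    h->info.tag = current_tag;
    return h + 1;  /* the user sees only the memory after the header */
}

/* Recover the header stored in front of a pointer from header_alloc. */
static alloc_header *header_of(void *ptr)
{
    assert(ptr != NULL);
    return (alloc_header *)ptr - 1;
}

/* Same shape as PyDataMem_FreeFunc: release header and data together. */
static void header_free(void *ptr, size_t size)
{
    (void)size;  /* the header already records the size */
    if (ptr != NULL) {
        free(header_of(ptr));
    }
}
```

This costs one header per allocation and does not by itself provide
alignment stricter than what `malloc` gives, so an aligned allocator
would need extra padding on top of it.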

I am sure there could be extensions in the future (although I don't
know what exactly).

I am not too worried about it: it's fairly niche, and we can probably
figure out a way to deprecate the old registration mechanism and slowly
replace it with a new one.

But if we don't mind the churn it creates, the only serious idea I
would have right now is using a `FromSpec` API.

The only difference would be that we allocate the struct and (for now)
return something that is fully opaque (we could allow get/set functions
on it though).  In fact, we could even keep the current struct largely
unchanged but make it the main "spec", with no actual slots currently
necessary (it could even have a `void *slots` that is always NULL).
(Slots are a bit unfortunate, since they cast to `void *`, making
compile-time type checking harder, but overall I think it's OK and
something we will be using more anyway for DTypes.)
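
A rough sketch of what such a `FromSpec`-style registration could look
like, written in plain C so it stands alone.  Every name here
(`demo_handler_spec`, `demo_handler_from_spec`, and so on) is
hypothetical and only illustrates the shape: the user fills in a spec,
the library allocates an opaque handler from it, and a reserved `slots`
field must be NULL for now.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; none of this is an existing NumPy API. */
typedef void *(demo_alloc_func)(size_t size);
typedef void (demo_free_func)(void *ptr, size_t size);

/* User-filled spec: roughly the current struct plus a reserved field. */
typedef struct {
    const char *name;
    demo_alloc_func *alloc;
    demo_free_func *free;
    void *slots;  /* reserved for future extensions; must be NULL */
} demo_handler_spec;

/* Opaque to users: only the library would see this layout, so it can
 * grow later (or become a Python object) without breaking callers. */
typedef struct demo_handler demo_handler;
struct demo_handler {
    char name[128];
    demo_alloc_func *alloc;
    demo_free_func *free;
};

/* The library, not the user, allocates the handler from the spec. */
static demo_handler *demo_handler_from_spec(const demo_handler_spec *spec)
{
    demo_handler *h;

    if (spec == NULL || spec->name == NULL || spec->alloc == NULL ||
        spec->free == NULL || spec->slots != NULL) {
        return NULL;  /* incomplete spec, or slots not yet supported */
    }
    h = malloc(sizeof *h);
    if (h == NULL) {
        return NULL;
    }
    strncpy(h->name, spec->name, sizeof h->name - 1);
    h->name[sizeof h->name - 1] = '\0';
    h->alloc = spec->alloc;
    h->free = spec->free;
    return h;
}

/* Example getter; users never touch the struct fields directly. */
static const char *demo_handler_get_name(const demo_handler *h)
{
    assert(h != NULL);
    return h->name;
}

static void demo_handler_destroy(demo_handler *h)
{
    free(h);
}

/* Trivial allocator functions a user might plug into a spec. */
static void *demo_malloc(size_t size)
{
    return malloc(size);
}

static void demo_free(void *ptr, size_t size)
{
    (void)size;
    free(ptr);
}
```

A caller would fill in `demo_handler_spec spec = {"demo", demo_malloc,
demo_free, NULL};` and pass `&spec` to `demo_handler_from_spec`.
Rejecting a non-NULL `slots` today is what leaves room to give it
meaning later.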

I am not sure it is worth it, but if there are no arguments why we
cannot allocate the struct, that seems fine.  If the return value is
opaque, we even have the ability to turn it into a proper Python object
if we want to.

Cheers,

Sebastian


> Adding support for stateful allocators at a later date would almost
> certainly create an ABI breakage or lots of pain around avoiding one.
> 
> I haven't thought very much about the PyCapsule approach (although it
> appears some other reviewers on GitHub considered it at one point), but
> even building it from scratch, the overhead to support statefulness is
> not large.
> As I demonstrate on the GitHub issue (18805), it would amount to changing
> the API from:
> ```C
> // the version in the NEP
> typedef void *(PyDataMem_AllocFunc)(size_t size);
> typedef void *(PyDataMem_ZeroedAllocFunc)(size_t nelems, size_t elsize);
> typedef void (PyDataMem_FreeFunc)(void *ptr, size_t size);
> typedef void *(PyDataMem_ReallocFunc)(void *ptr, size_t size);
> typedef struct {
>     char name[200];
>     PyDataMem_AllocFunc *alloc;
>     PyDataMem_ZeroedAllocFunc *zeroed_alloc;
>     PyDataMem_FreeFunc *free;
>     PyDataMem_ReallocFunc *realloc;
> } PyDataMem_Handler;
> const PyDataMem_Handler * PyDataMem_SetHandler(PyDataMem_Handler *handler);
> const char * PyDataMem_GetHandlerName(PyArrayObject *obj);
> ```
> to
> ```C
> // proposed changes: a `PyObject *self` argument pointing to a `PyDataMem_HandlerObject` and a `PyObject_HEAD`
> typedef void *(PyDataMem_AllocFunc)(PyObject *self, size_t size);
> typedef void *(PyDataMem_ZeroedAllocFunc)(PyObject *self, size_t nelems, size_t elsize);
> typedef void (PyDataMem_FreeFunc)(PyObject *self, void *ptr, size_t size);
> typedef void *(PyDataMem_ReallocFunc)(PyObject *self, void *ptr, size_t size);
> typedef struct {
>     PyObject_HEAD
>     PyDataMem_AllocFunc *alloc;
>     PyDataMem_ZeroedAllocFunc *zeroed_alloc;
>     PyDataMem_FreeFunc *free;
>     PyDataMem_ReallocFunc *realloc;
> } PyDataMem_HandlerObject;
> // steals a reference to handler, caller is responsible for decrefing the result
> PyDataMem_Handler * PyDataMem_SetHandler(PyDataMem_Handler *handler);
> // borrowed reference
> PyDataMem_Handler * PyDataMem_GetHandler(PyArrayObject *obj);
> 
> // some boilerplate that numpy is already full of and doesn't impact users of non-stateful allocators
> PyTypeObject PyDataMem_HandlerType = ...;
> ```
> When constructing an array, the reference count of the handler would be
> incremented before storing it in the array struct.
> 
> Since the extra work now to support this is not awful, but the potential
> for ABI headaches down the road is, I think we should aim to support
> statefulness right from the start.
> The runtime overhead of the stateful approach above vs the NEP approach is
> negligible, and consists of:
> * Some overhead costs for setting up an allocator. This likely only happens
>   near startup, so won't matter.
> * An extra incref on each array allocation
> * An extra pointer argument on the stack for each allocation and
>   deallocation
> * Perhaps around 32 extra bytes per allocator object. Since arrays just
>   store pointers to allocators, this doesn't matter.
> 
> Eric
> 
> 
> On Thu, 6 May 2021 at 12:43, Matti Picus <matti.pi...@gmail.com> wrote:
> 
> > 
> > On 6/5/21 2:07 pm, Eric Wieser wrote:
> > > The NEP looks good, but I worry the API isn't flexible enough. My two
> > > main concerns are:
> > > 
> > > ### Stateful allocators
> > > 
> > > Consider an allocator that aligns to `N` bytes, where `N` is
> > > configurable from a Python call in someone else's extension module.
> > > ...
> > > 
> > > ### Thread and async-local allocators
> > > 
> > > For tracing purposes, I expect it to be valuable to be able to
> > > configure the allocator within a single thread / coroutine.
> > > If we want to support this, we'd most likely want to work with the
> > > PEP 567 ContextVar API rather than a half-baked thread_local solution
> > > that doesn't work for async code.
> > > 
> > > This problem isn't as pressing as the statefulness problem.
> > > Fixing it would amount to extending the `PyDataMem_SetHandler` API,
> > > and would be unlikely to break any code written against the current
> > > version of the NEP, meaning it would be fine to leave as a follow-up.
> > > It might still be worth remarking upon as future work of some kind in
> > > the NEP.
> > > 
> > > 
> > I would prefer to leave both of these to a future extension for the NEP.
> > Setting the alignment from a Python-level call seems to be asking for
> > trouble, and I would need to be convinced that the extra layer of
> > flexibility is worth it.
> > 
> > 
> > It might be worth mentioning that this NEP may be extended in the
> > future, but truthfully I think that is the case for all NEPs.
> > 
> > 
> > Matti
> > 
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion@python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> > 


