Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-20 Thread Victor Stinner
 * Add new GIL-free (no need to hold the GIL) memory allocator functions:

   - ``void* PyMem_RawMalloc(size_t size)``
   - ``void* PyMem_RawRealloc(void *ptr, size_t new_size)``
   - ``void PyMem_RawFree(void *ptr)``
   - the behaviour of requesting zero bytes is not defined: return *NULL*
 or a distinct non-*NULL* pointer depending on the platform.
 (...)
 * Add new functions to get and set internal functions of
   ``PyMem_Malloc()``, ``PyMem_Realloc()`` and ``PyMem_Free()``:

   - ``void PyMem_GetAllocator(PyMemBlockAllocator *allocator)``
   - ``void PyMem_SetAllocator(PyMemBlockAllocator *allocator)``
   - ``malloc(ctx, 0)`` and ``realloc(ctx, ptr, 0)`` must not return
 *NULL*: it would be treated as an error.
   - default allocator: ``malloc()``, ``realloc()``, ``free()``;
 ``PyMem_Malloc(0)`` calls ``malloc(1)``
 and ``PyMem_Realloc(NULL, 0)`` calls ``realloc(NULL, 1)``

Oh, one more question: PyMem_RawMalloc(0) has undefined behaviour,
whereas PyMem_Malloc(0) has well-defined behaviour (it doesn't return
NULL). Adding if (size == 0) size = 1; in the default implementation
of PyMem_RawMalloc() should not have a visible overhead, but it gives
the same guarantee as PyMem_Malloc(0) (it doesn't return NULL). Do you
agree to add the test?

I chose to implement if (size > (size_t)PY_SSIZE_T_MAX) return NULL;
in Py*_Malloc(), whereas if (size == 0) size = 1; is implemented in
the inner function (_PyMem_Malloc). An application may use an
allocator which already has well-defined behaviour (a malloc(0)
that doesn't return NULL) and I expect malloc(1) to allocate more
memory than malloc(0) (malloc(0) may create a singleton) :-)
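
As a rough illustration of the split described above, here is a minimal
sketch in C. The names sketch_malloc(), sketch_inner_malloc() and
SKETCH_SSIZE_T_MAX are hypothetical stand-ins (for Py*_Malloc(),
_PyMem_Malloc() and PY_SSIZE_T_MAX) so that it compiles without Python.h;
it is not the actual CPython code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for PY_SSIZE_T_MAX (assumption: half of the size_t range). */
#define SKETCH_SSIZE_T_MAX ((size_t)(SIZE_MAX >> 1))

/* Inner function: never asks the system allocator for zero bytes,
   so a successful call always yields a distinct non-NULL pointer. */
static void *sketch_inner_malloc(size_t size)
{
    if (size == 0)
        size = 1;
    return malloc(size);
}

/* Outer function: rejects oversized requests, then delegates.
   A replacement inner allocator would not need to repeat this check. */
static void *sketch_malloc(size_t size)
{
    if (size > SKETCH_SSIZE_T_MAX)
        return NULL;
    return sketch_inner_malloc(size);
}

/* Usage: a zero-byte request succeeds, an oversized one fails. */
static void sketch_demo(void)
{
    void *p = sketch_malloc(0);
    assert(p != NULL);
    free(p);
    assert(sketch_malloc(SIZE_MAX) == NULL);
}
```

Keeping the zero-size adjustment in the inner function means a custom
allocator that already handles malloc(0) well can skip it.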

Victor
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-20 Thread Kristján Valur Jónsson
Oh, please don't misunderstand.  I'm not making any demands or requirements;
what I'm trying to do is to make recommendations based on experience that I
have had with embedding.  This sounds altogether too much like I'm trying to
push things one way or the other :)

The api as laid out certainly seems to work, and be adequate for the purpose.

I can add here, as a point of information, that since we work on Windows
there was no need to pass the size argument to the munmap callback.
VirtualFree(address, NULL) will release the entire chunk of memory that
was initially allocated at that place.  Therefore, in our implementation
we can reuse the same allocator struct for those arenas.  But I understand
that munmap doesn't have this feature, so passing in the size is prudent.
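
To make the point about the size argument concrete, here is a minimal
POSIX sketch of an arena allocator. SketchArenaAllocator mirrors the
layout of the PyMemMappingAllocator structure proposed in this thread,
but under a hypothetical name so that it compiles without Python.h;
SketchArenaStats is likewise an assumed piece of embedder state.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on strict glibc builds */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Same three members as the proposed PyMemMappingAllocator. */
typedef struct {
    void *ctx;
    void *(*alloc)(void *ctx, size_t size);
    void (*free)(void *ctx, void *ptr, size_t size);
} SketchArenaAllocator;

/* Hypothetical embedder state: bytes currently mapped for arenas. */
typedef struct {
    size_t mapped_bytes;
} SketchArenaStats;

static void *sketch_arena_alloc(void *ctx, size_t size)
{
    SketchArenaStats *stats = ctx;
    void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (ptr == MAP_FAILED)
        return NULL;
    stats->mapped_bytes += size;
    return ptr;
}

static void sketch_arena_free(void *ctx, void *ptr, size_t size)
{
    SketchArenaStats *stats = ctx;
    /* munmap() must be told the length of the mapping, which is why
       the callback receives the size. */
    if (munmap(ptr, size) == 0)
        stats->mapped_bytes -= size;
}

/* Usage: map one arena, touch it, release it. */
static void sketch_arena_demo(void)
{
    SketchArenaStats stats = {0};
    SketchArenaAllocator arena = {&stats, sketch_arena_alloc,
                                  sketch_arena_free};
    size_t size = 256 * 1024;   /* arbitrary arena size for the sketch */
    void *p = arena.alloc(arena.ctx, size);
    assert(p != NULL && stats.mapped_bytes == size);
    memset(p, 0, size);
    arena.free(arena.ctx, p, size);
    assert(stats.mapped_bytes == 0);
}
```

The byte counter also shows how an embedder can track memory at arena
granularity without hooking individual allocations.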

K

 -Original Message-
 From: Victor Stinner [mailto:victor.stin...@gmail.com]
 Sent: 19. júní 2013 15:59
 To: Kristján Valur Jónsson
 Cc: Python Dev
 Subject: Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python
 memory allocators
 
 Is PyMemMappingAllocator complete enough for your usage at CCP Games?



Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-20 Thread Nick Coghlan
On 20 June 2013 15:37, Victor Stinner victor.stin...@gmail.com wrote:
 Le jeudi 20 juin 2013, Nick Coghlan a écrit :

  Is PyMemMappingAllocator complete enough for your usage at CCP Games?

 Can we go back to calling this the Arena allocator? Or at least
 Mapped? When I see Mapping in the context of Python I think of the
 container API, not a memory allocation API.

 This function is written to be able to use mmap() and VirtualAlloc(). There
 is no Python function to use this allocator directly yet, but I chose the
 "memory mapping" name because it is very different from the heap and it may
 be useful for functions other than pymalloc.

 If I change the name, it would be called PyObject_SetArenaAllocator() with a
 PyObjectArenaAllocator structure. I'm not sure that PyMemMappingAllocator
 API is future-proof, so I'm fine to call it arena again.

Yeah, I think making that API specifically about pymalloc is a good
idea. It also makes it clearer that if you're bypassing pymalloc
entirely (by replacing the
object allocators), then you shouldn't need to worry about those.

  I hope that PEP 445 is flexible enough to allow you to decide
  which functions are hooked and replaced, and which functions will be
  left unchanged. That's why I'm not in favor of the "Make
  PyMem_Malloc() reuse PyMem_RawMalloc() by default" alternative.

 It's also why I'm in favour of the domain API rather than separate
 functions.

 1. In the initial iteration, just have the three basic domains (raw,
 interpreter, objects). Replacing allocators for third party libraries is the
 responsibility of embedding applications.

 2. In a later iteration, add PyMem_AddDomain and PyMem_GetDomains APIs
 so that extension modules can register new domains for wrapped libraries.
 Replacing allocators is still the responsibility of embedding applications,
 but there's a consistent API to do it.

 (Alternatively, we could do both now)

 How would you use an allocator of a new domain? The PyMemBlockAllocator
 structure is not convenient, and if Py_GetAllocator() is only called once,
 you may lose a hook installed later.

Say that, for the current PEP, we assume we configure standard library
extension modules to use the PyMem or PyMem_Raw APIs (depending on GIL
usage), thus allowing those to be redirected automatically to an
externally configured allocator when the new PEP 445 APIs are used.

The notion I had in mind as a possible future change is that extension
modules could register a set_allocator callback with the interpreter
so they will be automatically notified if the allocator they're
interested in changes. However, I also realised that this would
actually be independent of the APIs in the current PEP. You could do
something like:

typedef void (*PyMem_AllocatorSetter)(PyMemAllocator *allocator);

void
PyMem_AddExternalAllocator(
    PyMemAllocatorDomain domain,
    PyMem_AllocatorSetter set_allocator
);

Then, whenever the allocator for the specified domain was changed, the
set_allocator callback would be invoked to set the allocator in the
extension module as well. The setter would also be called immediately
on registration, using the currently defined allocator.

We don't have to do this right away (and we should give the basic API
a chance to establish itself first). I just like it as something that
the memory domain model may allow us to pursue in the future. (That
said, we may end up wanting something like this internally anyway,
even just for the standard library extension modules that do memory
allocations)
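
The registration idea above could be sketched roughly as follows. Every
name here (SketchAllocator, SketchDomain, sketch_add_external_allocator,
sketch_set_allocator, the fixed-size setter table) is hypothetical and
chosen only for illustration; nothing like this exists in the PEP.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    void *ctx;
    void *(*alloc)(void *ctx, size_t size);
    void (*dealloc)(void *ctx, void *ptr);
} SketchAllocator;

typedef enum {
    SKETCH_DOMAIN_RAW, SKETCH_DOMAIN_MEM, SKETCH_DOMAIN_OBJ,
    SKETCH_DOMAIN_COUNT
} SketchDomain;

typedef void (*SketchAllocatorSetter)(const SketchAllocator *allocator);

#define SKETCH_MAX_SETTERS 8

static SketchAllocator sketch_current[SKETCH_DOMAIN_COUNT];
static SketchAllocatorSetter
    sketch_setters[SKETCH_DOMAIN_COUNT][SKETCH_MAX_SETTERS];
static size_t sketch_setter_count[SKETCH_DOMAIN_COUNT];

/* Register a callback; it is invoked immediately with the allocator
   that is currently configured for the domain. */
static int sketch_add_external_allocator(SketchDomain domain,
                                         SketchAllocatorSetter setter)
{
    if ((unsigned)domain >= SKETCH_DOMAIN_COUNT || setter == NULL)
        return -1;
    if (sketch_setter_count[domain] == SKETCH_MAX_SETTERS)
        return -1;
    sketch_setters[domain][sketch_setter_count[domain]++] = setter;
    setter(&sketch_current[domain]);
    return 0;
}

/* Replace a domain's allocator and notify every registered setter. */
static int sketch_set_allocator(SketchDomain domain,
                                const SketchAllocator *allocator)
{
    if ((unsigned)domain >= SKETCH_DOMAIN_COUNT || allocator == NULL)
        return -1;
    sketch_current[domain] = *allocator;
    for (size_t i = 0; i < sketch_setter_count[domain]; i++)
        sketch_setters[domain][i](&sketch_current[domain]);
    return 0;
}

/* Hypothetical extension module mirroring the allocator it is given. */
static SketchAllocator sketch_ext_allocator;
static int sketch_ext_notifications;

static void sketch_ext_set_allocator(const SketchAllocator *allocator)
{
    assert(allocator != NULL);
    sketch_ext_allocator = *allocator;
    sketch_ext_notifications++;
}
```

With this shape, an extension never caches a stale allocator: a hook
installed later reaches it through the same callback.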

Cheers,
Nick.

--
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-19 Thread Antoine Pitrou
Le Tue, 18 Jun 2013 22:40:49 +0200,
Victor Stinner victor.stin...@gmail.com a écrit :
 
 Other changes
 -
 
[...]
 
 * Configure external libraries like zlib or OpenSSL to allocate memory
   using ``PyMem_RawMalloc()``

Why so, and is it done by default?

 Only one get/set function for block allocators
 --
 
 Replace the 6 functions:
 
 * ``void PyMem_GetRawAllocator(PyMemBlockAllocator *allocator)``
 * ``void PyMem_GetAllocator(PyMemBlockAllocator *allocator)``
 * ``void PyObject_GetAllocator(PyMemBlockAllocator *allocator)``
 * ``void PyMem_SetRawAllocator(PyMemBlockAllocator *allocator)``
 * ``void PyMem_SetAllocator(PyMemBlockAllocator *allocator)``
 * ``void PyObject_SetAllocator(PyMemBlockAllocator *allocator)``
 
 with 2 functions with an additional *domain* argument:
 
 * ``int PyMem_GetBlockAllocator(int domain, PyMemBlockAllocator
 *allocator)``
 * ``int PyMem_SetBlockAllocator(int domain, PyMemBlockAllocator
 *allocator)``

I would much prefer this solution.

 Drawback: the caller has to check if the result is 0, or handle the
 error.

Or you can just call Py_FatalError() if the domain is invalid.

 If a hook is used to track memory usage, the ``malloc()`` memory
 will not be seen. Remaining ``malloc()`` calls may allocate a lot of memory
 and so would be missed in reports.

A lot of memory? In main()?

Regards

Antoine.




Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-19 Thread Victor Stinner
2013/6/19 Antoine Pitrou solip...@pitrou.net:
 Le Tue, 18 Jun 2013 22:40:49 +0200,
 Victor Stinner victor.stin...@gmail.com a écrit :

 Other changes
 -

 [...]

 * Configure external libraries like zlib or OpenSSL to allocate memory
   using ``PyMem_RawMalloc()``

 Why so, and is it done by default?

(Oh, I realized that PyMem_Malloc() may be used instead of
PyMem_RawMalloc() if we are sure that the library will only be used
when the GIL is held.)

is it done by default?

First, it would be safer to only reuse the PyMem_RawMalloc() allocator if
PyMem_SetRawAllocator() was called, just to avoid regressions in Python
3.4.

Then, it depends on the library: if the allocator can be replaced for
one library object (ex: expat supports this), it can always be
replaced. Otherwise, we should only replace the library allocator if
Python is a standalone program (don't replace the library allocator if
Python is embedded). That's why I asked if it is possible to check if
Python is embedded or not.

Why so,

For the "track memory usage" use case, it is important to track memory
allocated in external libraries to have accurate reports, because
these allocations may be huge.

 Only one get/set function for block allocators
 --

 Replace the 6 functions:

 * ``void PyMem_GetRawAllocator(PyMemBlockAllocator *allocator)``
 * ``void PyMem_GetAllocator(PyMemBlockAllocator *allocator)``
 * ``void PyObject_GetAllocator(PyMemBlockAllocator *allocator)``
 * ``void PyMem_SetRawAllocator(PyMemBlockAllocator *allocator)``
 * ``void PyMem_SetAllocator(PyMemBlockAllocator *allocator)``
 * ``void PyObject_SetAllocator(PyMemBlockAllocator *allocator)``

 with 2 functions with an additional *domain* argument:

 * ``int PyMem_GetBlockAllocator(int domain, PyMemBlockAllocator
 *allocator)``
 * ``int PyMem_SetBlockAllocator(int domain, PyMemBlockAllocator
 *allocator)``

 I would much prefer this solution.

I don't have a strong preference between these two choices.

Oh, one argument in favor of one generic function is that code using
these functions would be simpler. Extract of the unit test of the
implementation (_testcapi.c):

+    if (api == 'o')
+        PyObject_SetAllocator(hook.alloc);
+    else if (api == 'r')
+        PyMem_SetRawAllocator(hook.alloc);
+    else
+        PyMem_SetAllocator(hook.alloc);

With a generic function, this block can be replaced with a single
function call.

 Drawback: the caller has to check if the result is 0, or handle the
 error.

 Or you can just call Py_FatalError() if the domain is invalid.

I don't like Py_FatalError(), especially when Python is embedded. It's
safer to return -1 and expect the caller to check for the error case.
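
For illustration only, a domain-based get/set pair that reports an
unknown domain with -1 might look like the sketch below. The structure,
the SKETCH_* domain constants and all function names are hypothetical;
they are not the API of the PEP.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    void *ctx;
    void *(*alloc)(void *ctx, size_t size);
    void (*dealloc)(void *ctx, void *ptr);
} SketchBlockAllocator;

enum { SKETCH_RAW = 0, SKETCH_MEM = 1, SKETCH_OBJ = 2, SKETCH_NDOMAINS = 3 };

static SketchBlockAllocator sketch_table[SKETCH_NDOMAINS];

/* Both functions return 0 on success and -1 for an unknown domain,
   leaving the decision about what to do next to the caller. */
static int sketch_get_block_allocator(int domain,
                                      SketchBlockAllocator *allocator)
{
    if (domain < 0 || domain >= SKETCH_NDOMAINS || allocator == NULL)
        return -1;
    *allocator = sketch_table[domain];
    return 0;
}

static int sketch_set_block_allocator(int domain,
                                      const SketchBlockAllocator *allocator)
{
    if (domain < 0 || domain >= SKETCH_NDOMAINS || allocator == NULL)
        return -1;
    sketch_table[domain] = *allocator;
    return 0;
}

/* Map the one-letter API code from the unit test to a domain. */
static int sketch_domain_from_api(char api)
{
    switch (api) {
    case 'o': return SKETCH_OBJ;
    case 'r': return SKETCH_RAW;
    default:  return SKETCH_MEM;
    }
}

/* The if/else chain from the unit test collapses into one call. */
static void sketch_install_hook(char api, const SketchBlockAllocator *hook)
{
    int rc = sketch_set_block_allocator(sketch_domain_from_api(api), hook);
    assert(rc == 0);   /* the mapping above only yields known domains */
    (void)rc;
}
```

An embedding application can treat -1 as "this Python does not know the
domain" and fall back, rather than having the process aborted for it.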

 If a hook is used to track memory usage, the ``malloc()`` memory
 will not be seen. Remaining ``malloc()`` calls may allocate a lot of memory
 and so would be missed in reports.

 A lot of memory? In main()?

Not in main(). The Python expat and zlib modules call malloc()
directly and may allocate large blocks. External libraries like
OpenSSL or bz2 may also allocate large blocks.

See issues #18203 and #18227.

Victor


Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-19 Thread Kristján Valur Jónsson
Right, think of the ctx as a "this" pointer from C++.
If you have an allocator object that you got from some C++ API, and want to
ask Python to use it, you need to be able to thunk the "this" pointer to get
at the particular allocator instance.
It used to be a common mistake when writing C callback APIs to forget to add an
opaque context pointer along with the callback function.
This omission makes it difficult (but not impossible) to attach C++ methods to
such callbacks.
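
A small C sketch of that idea, assuming a hypothetical SketchPool object
owned by the embedding application and a SketchBlockAllocator structure
shaped like the one under discussion (neither name comes from the PEP):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator instance with its own state. */
typedef struct {
    size_t live_blocks;     /* blocks handed out and not yet freed */
    size_t budget_blocks;   /* refuse allocations beyond this count */
} SketchPool;

/* Thunks: recover the instance from the opaque context pointer. */
static void *sketch_pool_malloc(void *ctx, size_t size)
{
    SketchPool *self = ctx;   /* plays the role of "this" */
    assert(self != NULL);
    if (self->live_blocks >= self->budget_blocks)
        return NULL;
    void *ptr = malloc(size ? size : 1);
    if (ptr != NULL)
        self->live_blocks++;
    return ptr;
}

static void sketch_pool_free(void *ctx, void *ptr)
{
    SketchPool *self = ctx;
    assert(self != NULL);
    if (ptr != NULL) {
        free(ptr);
        self->live_blocks--;
    }
}

typedef struct {
    void *ctx;
    void *(*alloc)(void *ctx, size_t size);
    void (*dealloc)(void *ctx, void *ptr);
} SketchBlockAllocator;
```

Two SketchBlockAllocator values can share the same pair of functions and
differ only in ctx, which a callback without a context argument cannot
express short of using global state.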

K

 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Scott Dial
 Sent: 19. júní 2013 04:34
 To: ncogh...@gmail.com
 Cc: Python-Dev@python.org
 Subject: Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python
 memory allocators
 
 On 6/18/2013 11:32 PM, Nick Coghlan wrote:
  Agreed more of that rationale needs to be moved from the issue tracker
  into the PEP, though.
 
 Thanks for the clarification. I hadn't read the issue tracker at all. At face
 value, I didn't see what purpose it served, but having read Kristján's
 comments on the issue tracker, he would like to store state for the allocators
 in that ctx pointer.[1] Previously, I thought its only utility was
 distinguishing which domain it was (a small, glorified enumeration), but
 his use case makes sense and is definitely informative to have in the PEP,
 because the utility of that wasn't obvious to me.
 
 Thanks,
 -Scott
 
 [1] http://bugs.python.org/issue3329#msg190529
 
 One particular trick we have been using, which might be of interest, is to be
 able to tag each allocation with a context id.  This is then set according 
 to a
 global sys.memcontext variable, which the program will modify according to
 what it is doing.  This can then be used to track memory usage by different
 parts of the program.
 
 
 --
 Scott Dial
 sc...@scottdial.com


Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-19 Thread Antoine Pitrou
On Wed, 19 Jun 2013 17:24:21 +0200
Victor Stinner victor.stin...@gmail.com wrote:
 
 For the track memory usage use case, it is important to track memory
 allocated in external libraries to have accurate reports, because
 these allocations may be huge.
[...]
 Not in main(). The Python expat and zlib modules call malloc()
 directly and may allocate large blocks. External libraries like
 OpenSSL or bz2 may also allocate large blocks.

Fair enough.

  Drawback: the caller has to check if the result is 0, or handle the
  error.
 
  Or you can just call Py_FatalError() if the domain is invalid.
 
 I don't like Py_FatalError(), especially when Python is embedded. It's
 safer to return -1 and expect the caller to check for the error case.

I don't think you need to check for errors. The domain is always one of
the existing constants, i.e. it should be hard-coded in the source, not
computed.

Regards

Antoine.


Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-19 Thread Victor Stinner
2013/6/19 Antoine Pitrou solip...@pitrou.net:
 On Wed, 19 Jun 2013 17:24:21 +0200
  Drawback: the caller has to check if the result is 0, or handle the
  error.
 
  Or you can just call Py_FatalError() if the domain is invalid.

 I don't like Py_FatalError(), especially when Python is embedded. It's
 safer to return -1 and expect the caller to check for the error case.

 I don't think you need to check for errors. The domain is always one of
 the existing constants, i.e. it should be hard-coded in the source, not
 computed.

Imagine that PyMem_GetBlockAllocator() is part of the stable ABI and
that a new domain is added to Python 3.5. An application is written
for Python 3.5 and is run with Python 3.4: how would the application
notice that PyMem_GetBlockAllocator() does not know the new domain?

I don't think you need to check for errors.

Do you mean that an unknown domain should be simply ignored?

Victor


Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-19 Thread Kristján Valur Jónsson
Oh, it should be public, in my opinion.
We do exactly that when we embed Python into UnrealEngine.  We keep Python's
internal PyObject_Mem allocator, but have it ask UnrealEngine for its arenas.
That way, we can still keep track of Python's memory usage from within the
larger application, even if the granularity of memory is now on an arena
level, rather than individual allocs.

K

 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Victor Stinner
 Sent: 18. júní 2013 21:20
 To: Python Dev
 Subject: Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python
 memory allocators
 
 typedef struct {
 /* user context passed as the first argument
to the 2 functions */
 void *ctx;
 
 /* allocate a memory mapping */
 void* (*alloc) (void *ctx, size_t size);
 
 /* release a memory mapping */
 void (*free) (void *ctx, void *ptr, size_t size);
 } PyMemMappingAllocator;
 
 The PyMemMappingAllocator structure is very specific to the pymalloc
 allocator. There is no resize, lock nor protect method. There is no way
 to configure protection or flags of the mapping. The
 PyMem_SetMappingAllocator() function was initially called
 _PyObject_SetArenaAllocator(). I'm not sure that the structure and the
 2 related functions should be public. Can an extension module call private
 (_Py*) functions or use a private structure?
 
 Or the structure might be renamed to indicate that it is specific to arenas?
 
 What do you think?
 
 Victor


Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-19 Thread Victor Stinner
2013/6/19 Kristján Valur Jónsson krist...@ccpgames.com:
 Oh, it should be public, in my opinion.

Ok. And do you think that the PyMemMappingAllocator structure is
complete, or that we should add something to be future-proof? At
least, PyMemMappingAllocator is enough for pymalloc usage :-)

Is PyMemMappingAllocator complete enough for your usage at CCP Games?

 We do exactly that when we embed Python into UnrealEngine.  We keep Python's
 internal PyObject_Mem allocator, but have it ask UnrealEngine for its arenas.
 That way, we can still keep track of Python's memory usage from within the
 larger application, even if the granularity of memory is now on an arena
 level, rather than individual allocs.

I hope that PEP 445 is flexible enough to allow you to decide
which functions are hooked and replaced, and which functions will be
left unchanged. That's why I'm not in favor of the "Make
PyMem_Malloc() reuse PyMem_RawMalloc() by default" alternative.

Victor


Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-19 Thread Terry Reedy

On 6/19/2013 11:24 AM, Victor Stinner wrote:

2013/6/19 Antoine Pitrou solip...@pitrou.net:

Le Tue, 18 Jun 2013 22:40:49 +0200,
Victor Stinner victor.stin...@gmail.com a écrit :



Only one get/set function for block allocators
--

Replace the 6 functions:

* ``void PyMem_GetRawAllocator(PyMemBlockAllocator *allocator)``
* ``void PyMem_GetAllocator(PyMemBlockAllocator *allocator)``
* ``void PyObject_GetAllocator(PyMemBlockAllocator *allocator)``
* ``void PyMem_SetRawAllocator(PyMemBlockAllocator *allocator)``
* ``void PyMem_SetAllocator(PyMemBlockAllocator *allocator)``
* ``void PyObject_SetAllocator(PyMemBlockAllocator *allocator)``

with 2 functions with an additional *domain* argument:

* ``int PyMem_GetBlockAllocator(int domain, PyMemBlockAllocator
*allocator)``
* ``int PyMem_SetBlockAllocator(int domain, PyMemBlockAllocator
*allocator)``


I would much prefer this solution.


I do too. The two names can be remembered as one pair that differs only
in get/set.


--
Terry Jan Reedy




Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-19 Thread Antoine Pitrou
On Wed, 19 Jun 2013 17:49:02 +0200
Victor Stinner victor.stin...@gmail.com wrote:

 2013/6/19 Antoine Pitrou solip...@pitrou.net:
  On Wed, 19 Jun 2013 17:24:21 +0200
   Drawback: the caller has to check if the result is 0, or handle the
   error.
  
   Or you can just call Py_FatalError() if the domain is invalid.
 
  I don't like Py_FatalError(), especially when Python is embedded. It's
  safer to return -1 and expect the caller to check for the error case.
 
  I don't think you need to check for errors. The domain is always one of
  the existing constants, i.e. it should be hard-coded in the source, not
  computed.
 
 Imagine that PyMem_GetBlockAllocator() is part of the stable ABI and
 that a new domain is added to Python 3.5. An application is written
 for Python 3.5 and is run with Python 3.4: how would the application
 notice that PyMem_GetBlockAllocator() does not know the new domain?

That's a good question. I don't know what guidelines Martin used when
designing the stable ABI, but I would expect important high-level
functions to end up there, not memory allocation debugging.

Regards

Antoine.


Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-19 Thread Victor Stinner
PyMem_RawMalloc()/Realloc/Free should be part of the stable ABI. I agree
that all other new functions and structures should not.

Victor


Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-19 Thread Nick Coghlan
On 20 Jun 2013 02:03, Victor Stinner victor.stin...@gmail.com wrote:

 2013/6/19 Kristján Valur Jónsson krist...@ccpgames.com:
  Oh, it should be public, in my opinion.

 Ok. And do you think that the PyMemMappingAllocator structure is
 complete, or that we should add something to be future-proof? At
 least, PyMemMappingAllocator is enough for pymalloc usage :-)

 Is PyMemMappingAllocator complete enough for your usage at CCP Games?

Can we go back to calling this the Arena allocator? Or at least Mapped?
When I see Mapping in the context of Python I think of the container API,
not a memory allocation API.


  We do exactly that when we embed Python into UnrealEngine.  We keep
Python's internal PyObject_Mem allocator, but have it ask UnrealEngine for
its arenas.  That way, we can still keep track of Python's memory usage
from within the larger application, even if the granularity of memory is now
on an arena level, rather than individual allocs.

 I hope that PEP 445 is flexible enough to allow you to decide
 which functions are hooked and replaced, and which functions will be
 left unchanged. That's why I'm not in favor of the "Make
 PyMem_Malloc() reuse PyMem_RawMalloc() by default" alternative.

It's also why I'm in favour of the domain API rather than separate
functions.

1. In the initial iteration, just have the three basic domains (raw,
interpreter, objects). Replacing allocators for third party libraries is
the responsibility of embedding applications.

2. In a later iteration, add PyMem_AddDomain and PyMem_GetDomains APIs
so that extension modules can register new domains for wrapped libraries.
Replacing allocators is still the responsibility of embedding applications,
but there's a consistent API to do it.

(Alternatively, we could do both now)

And agreed PyMem_Raw* are the only new APIs that should be added to the
stable ABI.

Cheers,
Nick.


 Victor


Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-19 Thread Victor Stinner
Le jeudi 20 juin 2013, Nick Coghlan a écrit :

  Is PyMemMappingAllocator complete enough for your usage at CCP Games?

 Can we go back to calling this the Arena allocator? Or at least
 Mapped? When I see Mapping in the context of Python I think of the
 container API, not a memory allocation API.

This function is written to be able to use mmap() and VirtualAlloc(). There
is no Python function to use this allocator directly yet, but I chose the
"memory mapping" name because it is very different from the heap and it may
be useful for functions other than pymalloc.

If I change the name, it would be called PyObject_SetArenaAllocator() with
a PyObjectArenaAllocator structure. I'm not sure that PyMemMappingAllocator
API is future-proof, so I'm fine to call it arena again.

  I hope that PEP 445 is flexible enough to allow you to decide
  which functions are hooked and replaced, and which functions will be
  left unchanged. That's why I'm not in favor of the "Make
  PyMem_Malloc() reuse PyMem_RawMalloc() by default" alternative.

 It's also why I'm in favour of the domain API rather than separate
 functions.

 1. In the initial iteration, just have the three basic domains (raw,
 interpreter, objects). Replacing allocators for third party libraries is
 the responsibility of embedding applications.

 2. In a later iteration, add PyMem_AddDomain and PyMem_GetDomains APIs
 so that extension modules can register new domains for wrapped libraries.
 Replacing allocators is still the responsibility of embedding applications,
 but there's a consistent API to do it.

 (Alternatively, we could do both now)

How would you use an allocator of a new domain? The PyMemBlockAllocator
structure is not convenient, and if Py_GetAllocator() is only called once,
you may lose a hook installed later.

Victor


Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-18 Thread Victor Stinner
typedef struct {
/* user context passed as the first argument
   to the 2 functions */
void *ctx;

/* allocate a memory mapping */
void* (*alloc) (void *ctx, size_t size);

/* release a memory mapping */
void (*free) (void *ctx, void *ptr, size_t size);
} PyMemMappingAllocator;

The PyMemMappingAllocator structure is very specific to the pymalloc
allocator. There is no resize, lock nor protect method. There is
no way to configure protection or flags of the mapping. The
PyMem_SetMappingAllocator() function was initially called
_PyObject_SetArenaAllocator(). I'm not sure that the structure and the
2 related functions should be public. Can an extension module call
private (_Py*) functions or use a private structure?

Or the structure might be renamed to indicate that it is specific to arenas?

What do you think?

Victor


Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-18 Thread Scott Dial
On 6/18/2013 4:40 PM, Victor Stinner wrote:
 No context argument
 ---
 
 Simplify the signature of allocator functions, remove the context
 argument:
 
 * ``void* malloc(size_t size)``
 * ``void* realloc(void *ptr, size_t new_size)``
 * ``void free(void *ptr)``
 
 It is likely for an allocator hook to be reused for
 ``PyMem_SetAllocator()`` and ``PyObject_SetAllocator()``, or even
 ``PyMem_SetRawAllocator()``, but the hook must call a different function
 depending on the allocator. The context is a convenient way to reuse the
 same custom allocator or hook for different Python allocators.

I think there is a lack of justification for the extra argument, and the
extra argument is not free. The typical use-case for doing this
continuation-passing style is when the set of contexts is either
unknown, arbitrarily large, or infinite. In other words, when it would
be either impossible or impractical to enumerate all of the contexts.
However, in this case, we have only 3.

Your proposal already puts forward having 3 pairs of Get/Set functions,
so there is no distinct advantage in having a single typedef instance
that you pass in to all 3 of them. And, having all 3 pairs use the same
typedef is a bit of an attractive nuisance, in that one could pass the
wrong allocators to the wrong setter. With that, I could argue that
there should be 3 typedefs to prevent coding errors.

Nevertheless, the ctx argument buys the implementer nothing if they have
to begin their alloc function with if(ctx == X). In other words, there
is nothing simpler about:


void *_alloc(void *ctx, size_t size) {
  if(ctx == PYALLOC_PYMEM)
    return _alloc_pymem(size);
  else if(ctx == PYALLOC_PYMEM_RAW)
    return _alloc_pymem_raw(size);
  else if(ctx == PYALLOC_PYOBJECT)
    return _alloc_pyobject(size);
  else
    abort();
}

PyMemBlockAllocator pymem_allocator =
  {.ctx=PYALLOC_PYMEM, .alloc=_alloc, .free=_free};
PyMemBlockAllocator pymem_raw_allocator =
  {.ctx=PYALLOC_PYMEM_RAW, .alloc=_alloc, .free=_free};
PyMemBlockAllocator pyobject_allocator =
  {.ctx=PYALLOC_PYOBJECT, .alloc=_alloc, .free=_free};


In comparison to:


PyMemBlockAllocator pymem_allocator =
  {.alloc=_alloc_pymem, .free=_free_pymem};
PyMemBlockAllocator pymem_raw_allocator =
  {.alloc=_alloc_pymem_raw, .free=_free_pymem_raw};
PyMemBlockAllocator pyobject_allocator =
  {.alloc=_alloc_pyobject, .free=_free_pyobject};


And in the latter case, there is no extra indirect branching in the
hot-path of the allocators.

Also, none of the external libraries cited introduce this CPS/ctx stuff.

-- 
Scott Dial
sc...@scottdial.com


Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-18 Thread Nick Coghlan
On 19 June 2013 09:23, Scott Dial scott+python-...@scottdial.com wrote:
 On 6/18/2013 4:40 PM, Victor Stinner wrote:
 No context argument
 ---

 Simplify the signature of allocator functions, remove the context
 argument:

 * ``void* malloc(size_t size)``
 * ``void* realloc(void *ptr, size_t new_size)``
 * ``void free(void *ptr)``

 It is likely for an allocator hook to be reused for
 ``PyMem_SetAllocator()`` and ``PyObject_SetAllocator()``, or even
 ``PyMem_SetRawAllocator()``, but the hook must call a different function
 depending on the allocator. The context is a convenient way to reuse the
 same custom allocator or hook for different Python allocators.

 I think there is a lack of justification for the extra argument, and the
 extra argument is not free. The typical use-case for doing this
 continuation-passing style is when the set of contexts is either
 unknown, arbitrarily large, or infinite. In other words, when it would
 be either impossible or impractical to enumerate all of the contexts.
 However, in this case, we have only 3.

Note that the context is part of the BlockAllocator structure, NOT
predefined by Python.

 Your proposal already puts forward having 3 pairs of Get/Set functions,
 so there is no distinct advantage in having a single typedef instance
 that you pass in to all 3 of them. And, having all 3 pairs use the same
 typedef is a bit of an attractive nuisance, in that one could pass the
 wrong allocators to the wrong setter. With that, I could argue that
 there should be 3 typedefs to prevent coding errors.

I'm not sure we *should* be restricting this to the CPython internal
domains indefinitely. If we use a domain based model from the start,
then that will allow us in the future to let extension modules declare
additional domains rather than having to employ library specific logic
in either the CPython core or in embedding applications.

 Nevertheless, the ctx argument buys the implementer nothing if they have
 to begin their alloc function with if(ctx == X). In other words, there
 is nothing simpler about:

 
 void *_alloc(void *ctx, size_t size) {
   if(ctx == PYALLOC_PYMEM)
     return _alloc_pymem(size);
   else if(ctx == PYALLOC_PYMEM_RAW)
     return _alloc_pymem_raw(size);
   else if(ctx == PYALLOC_PYOBJECT)
     return _alloc_pyobject(size);
   else
     abort();
 }

 PyMemBlockAllocator pymem_allocator =
   {.ctx=PYALLOC_PYMEM, .alloc=_alloc, .free=_free};
 PyMemBlockAllocator pymem_raw_allocator =
   {.ctx=PYALLOC_PYMEM_RAW, .alloc=_alloc, .free=_free};
 PyMemBlockAllocator pyobject_allocator =
   {.ctx=PYALLOC_PYOBJECT, .alloc=_alloc, .free=_free};
 

Why would anyone do that? The context is so embedding applications can
distinguish the CPython runtime from their *other* domains that use
the same allocator functions. If you wanted to use completely
different allocators for each domain, you would just do that and
ignore the context argument entirely.
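
To make that concrete, here is a minimal sketch of one pair of allocator
functions shared by two domains, with the context carrying per-domain
state. Everything in it is an assumption for illustration: block_allocator
is a hypothetical stand-in for the proposed structure (reduced to alloc and
free), and domain_stats, counting_alloc and counting_free are invented
names, not part of the PEP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the proposed allocator structure. */
typedef struct {
    void *ctx;
    void *(*alloc)(void *ctx, size_t size);
    void (*free)(void *ctx, void *ptr);
} block_allocator;

/* Per-domain state owned by the embedding application. */
typedef struct {
    const char *name;    /* label chosen by the application */
    size_t live_blocks;  /* blocks allocated and not yet freed */
} domain_stats;

/* One implementation for every domain: ctx says whose counter to bump. */
static void *counting_alloc(void *ctx, size_t size)
{
    domain_stats *stats = ctx;
    void *ptr = malloc(size ? size : 1);  /* never request zero bytes */
    if (ptr != NULL)
        stats->live_blocks++;
    return ptr;
}

static void counting_free(void *ctx, void *ptr)
{
    domain_stats *stats = ctx;
    if (ptr == NULL)
        return;
    assert(stats->live_blocks > 0);
    stats->live_blocks--;
    free(ptr);
}

static domain_stats python_stats = {"python", 0};
static domain_stats render_stats = {"renderer", 0};

/* Same functions, different contexts: no if(ctx == X) dispatch needed. */
static block_allocator python_allocator =
    {&python_stats, counting_alloc, counting_free};
static block_allocator render_allocator =
    {&render_stats, counting_alloc, counting_free};
```

An application would hand python_allocator to the Python setter and keep
render_allocator for its own subsystem; the two counters then stay
separate even though the code is shared.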

Agreed, more of that rationale needs to be moved from the issue tracker
into the PEP, though.

Cheers,
Nick.

--
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-18 Thread Scott Dial
On 6/18/2013 11:32 PM, Nick Coghlan wrote:
 Agreed more of that rationale needs to be moved from the issue tracker
 into the PEP, though.

Thanks for the clarification. I hadn't read the issue tracker at all. At
face value, I didn't see what purpose the ctx argument served, but having
read Kristján's comments on the issue tracker, I see that he would like to
store state for the allocators in that ctx pointer.[1] Previously, I
thought its only utility was distinguishing which domain it was (a small,
glorified enumeration). His use-case makes sense and would definitely be
informative to have in the PEP, because the utility of the argument wasn't
obvious to me.

Thanks,
-Scott

[1] http://bugs.python.org/issue3329#msg190529

One particular trick we have been using, which might be of interest, is
to be able to tag each allocation with a context id.  This is then set
according to a global sys.memcontext variable, which the program will
modify according to what it is doing.  This can then be used to track
memory usage by different parts of the program.
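
A rough sketch of how such tagging could work, under my own assumptions:
current_context plays the role of the sys.memcontext variable, each block
gets a small header recording the context it was allocated under, and
bytes_in_use, tagged_malloc, tagged_free and set_memcontext are
hypothetical names. This is not the code described in the issue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_CONTEXTS 8

/* Hypothetical global standing in for sys.memcontext. */
static unsigned current_context = 0;

/* Bytes currently allocated, per context id. */
static size_t bytes_in_use[MAX_CONTEXTS];

/* Header placed in front of every block. The extra union members only
   force an alignment suitable for the user data that follows. */
typedef union {
    struct {
        unsigned context;  /* context id active at allocation time */
        size_t size;       /* user-visible size of the block */
    } tag;
    long double align_ld;
    long long align_ll;
    void *align_ptr;
} block_header;

static void set_memcontext(unsigned id)
{
    assert(id < MAX_CONTEXTS);
    current_context = id;
}

static void *tagged_malloc(size_t size)
{
    block_header *header;

    if (size > (size_t)-1 - sizeof(block_header))
        return NULL;  /* header + size would overflow */
    header = malloc(sizeof(block_header) + size);
    if (header == NULL)
        return NULL;
    header->tag.context = current_context;
    header->tag.size = size;
    bytes_in_use[current_context] += size;
    return header + 1;  /* user data starts right after the header */
}

static void tagged_free(void *ptr)
{
    block_header *header;

    if (ptr == NULL)
        return;
    header = (block_header *)ptr - 1;
    /* Credit the context that allocated the block, not the current one. */
    assert(bytes_in_use[header->tag.context] >= header->tag.size);
    bytes_in_use[header->tag.context] -= header->tag.size;
    free(header);
}
```

The program calls set_memcontext() as it moves between phases, and
bytes_in_use[] then shows how much memory each phase is holding.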


-- 
Scott Dial
sc...@scottdial.com


Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-18 Thread Victor Stinner
On Wednesday, 19 June 2013, Scott Dial wrote:

 On 6/18/2013 4:40 PM, Victor Stinner wrote:
  No context argument

 I think there is a lack of justification for the extra argument, and the
 extra argument is not free. The typical use-case for doing this
 continuation-passing style is when the set of contexts is either
 unknown, arbitrarily large, or infinite. In other words, when it would
 be either impossible or impractical to enumerate all of the contexts.
 However, in this case, we have only 3.


See use case 3 in the examples. Without the context argument, you have to
copy and paste each function 3 times: 3 functions become 9 functions. I
don't like having to copy and paste code; it sounds like bad design.
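
To illustrate the point (this is a simplified sketch, not the code from the
PEP's examples), a single set of three hook functions can wrap any number
of domains when the context points at the allocator that was installed
before the hook. The block_allocator type, its field names, hook_state,
install_hook and the three *_domain variables are hypothetical stand-ins
for the proposed API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the proposed allocator structure. */
typedef struct {
    void *ctx;
    void *(*malloc_fn)(void *ctx, size_t size);
    void *(*realloc_fn)(void *ctx, void *ptr, size_t size);
    void (*free_fn)(void *ctx, void *ptr);
} block_allocator;

/* State for one hooked domain: the previous allocator plus a counter. */
typedef struct {
    block_allocator previous;
    unsigned long calls;
} hook_state;

/* The only three hook functions, whatever the number of domains. */
static void *hook_malloc(void *ctx, size_t size)
{
    hook_state *state = ctx;
    state->calls++;
    return state->previous.malloc_fn(state->previous.ctx, size);
}

static void *hook_realloc(void *ctx, void *ptr, size_t size)
{
    hook_state *state = ctx;
    state->calls++;
    return state->previous.realloc_fn(state->previous.ctx, ptr, size);
}

static void hook_free(void *ctx, void *ptr)
{
    hook_state *state = ctx;
    state->calls++;
    state->previous.free_fn(state->previous.ctx, ptr);
}

/* Save the domain's current allocator in state, then install the hook. */
static void install_hook(block_allocator *domain, hook_state *state)
{
    assert(domain != NULL && state != NULL);
    state->previous = *domain;
    state->calls = 0;
    domain->ctx = state;
    domain->malloc_fn = hook_malloc;
    domain->realloc_fn = hook_realloc;
    domain->free_fn = hook_free;
}

/* Default allocator used by the stand-in domains; it ignores ctx. */
static void *sys_malloc(void *ctx, size_t size)
{
    (void)ctx;
    return malloc(size ? size : 1);
}

static void *sys_realloc(void *ctx, void *ptr, size_t size)
{
    (void)ctx;
    return realloc(ptr, size ? size : 1);
}

static void sys_free(void *ctx, void *ptr)
{
    (void)ctx;
    free(ptr);
}

/* Three stand-in domains (raw, mem, object) and one state per domain. */
static block_allocator raw_domain = {NULL, sys_malloc, sys_realloc, sys_free};
static block_allocator mem_domain = {NULL, sys_malloc, sys_realloc, sys_free};
static block_allocator obj_domain = {NULL, sys_malloc, sys_realloc, sys_free};
static hook_state raw_hook, mem_hook, obj_hook;
```

Calling install_hook() once per domain is enough. Without the ctx
parameter, each hook function would need one copy per domain to know which
previous allocator to call.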


 And in the latter case, there is no extra indirect branching in the
 hot-path of the allocators.


Are you concerned about performance? See the Performances section:
according to the benchmark suite, there is no overhead.



 Also, none of the external libraries cited introduce this CPS/ctx stuff.


Oops, the list is incomplete. Copy/paste from the issue:


Some customizable memory allocators I know have an extra parameter
void *opaque that is passed to all functions:

- in zlib: zalloc and zfree: http://www.zlib.net/manual.html#Usage
- same thing for bz2.
- lzma's ISzAlloc: http://www.asawicki.info/news_1368_lzma_sdk_-_how_to_use.html
- Oracle's OCI:
http://docs.oracle.com/cd/B10501_01/appdev.920/a96584/oci15re4.htm

OTOH, expat, libxml, libmpdec don't have this extra parameter.

Victor