Re: [Python-Dev] Modify PyMem_Malloc to use pymalloc for performance

2016-04-22 Thread Victor Stinner
Hi,

My pull request has been merged into numpy. numpy now uses
PyMem_RawMalloc() rather than PyMem_Malloc() since it uses the memory
allocator without holding the GIL:
https://github.com/numpy/numpy/pull/7404

It was proposed to modify numpy to hold the GIL. Maybe it will be done later.
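
For readers who want to see the distinction in practice, here is a minimal sketch (not numpy's actual code) that calls PyMem_RawMalloc() with the GIL released, driven from Python through ctypes. It assumes a CPython build whose C API symbols are reachable through ctypes.pythonapi; the names nogil_api, block and allocated are mine, for illustration only.

```python
import ctypes

# ctypes.pythonapi is a PyDLL: the GIL stays held during its calls.
# Re-wrapping the same library handle as a CDLL gives functions that
# release the GIL around each call, which is the situation numpy was in.
nogil_api = ctypes.CDLL(ctypes.pythonapi._name,
                        handle=ctypes.pythonapi._handle)

nogil_api.PyMem_RawMalloc.restype = ctypes.c_void_p
nogil_api.PyMem_RawMalloc.argtypes = [ctypes.c_size_t]
nogil_api.PyMem_RawFree.restype = None
nogil_api.PyMem_RawFree.argtypes = [ctypes.c_void_p]

# The "raw" domain is the one that may be used without holding the GIL.
block = nogil_api.PyMem_RawMalloc(64)
allocated = bool(block)
if allocated:
    ctypes.memset(block, 0, 64)      # touch the memory, then release it
    nogil_api.PyMem_RawFree(block)   # same domain for the free
print("allocated without the GIL:", allocated)
```

Doing the same with PyMem_Malloc() instead would be exactly the misuse that the GIL check is meant to catch.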

It means that there are no longer any C extensions known to use the
Python memory allocators incorrectly. So I pushed my change in CPython
to use the pymalloc memory allocator in PyMem_Malloc():
https://hg.python.org/cpython/rev/68b2a43d8653

I documented that porting C extensions to Python 3.6 requires running
tests with PYTHONMALLOC=debug. This environment variable enables
checks at runtime to validate the usage of Python memory allocators,
including checks on the GIL. PYTHONMALLOC=debug and the check on the
GIL are new in Python 3.6.
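
As a rough sketch of what "run tests with PYTHONMALLOC=debug" can look like, the snippet below starts a child interpreter with the variable set. The helper name run_under_debug_hooks and the trivial child command are placeholders of mine; a real project would start its own test runner instead.

```python
import os
import subprocess
import sys


def run_under_debug_hooks(args):
    """Run a child Python with the debug hooks installed (Python 3.6+)."""
    env = dict(os.environ, PYTHONMALLOC="debug")
    return subprocess.run([sys.executable] + list(args), env=env,
                          capture_output=True, text=True)


# Placeholder workload: replace with the arguments that start your tests.
result = run_under_debug_hooks(["-c", "print('tests would run here')"])
print("exit status:", result.returncode)
print(result.stdout.strip())
```

A non-zero exit status together with a "Fatal Python error" on stderr is the hint that an extension misused an allocator.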

By the way, I modified the code to log the fatal error. If a buffer
overflow/underflow is detected in a free function like PyObject_Free()
and tracemalloc is enabled, the traceback where the memory block was
allocated is now displayed:
https://docs.python.org/dev/whatsnew/3.6.html#pythonmalloc-environment-variable

Moreover, the warning logger now also logs where files, sockets, etc.
were allocated on ResourceWarning:
https://docs.python.org/dev/whatsnew/3.6.html#warnings
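
A small sketch of the ResourceWarning part, assuming CPython 3.6 or newer (where the recorded warning carries a `source` attribute). The function leak_file and the file name are made up for the example; the warning is captured instead of printed so that the allocation traceback can be looked up by hand.

```python
import gc
import os
import tempfile
import tracemalloc
import warnings

tracemalloc.start(5)  # remember up to 5 frames per allocation


def leak_file(path):
    f = open(path, "w")   # deliberately never closed
    f.write("x")


with tempfile.TemporaryDirectory() as tmp:
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always", ResourceWarning)
        leak_file(os.path.join(tmp, "demo.txt"))
        gc.collect()      # in case the file was not finalized yet

resource_warnings = [w for w in caught
                     if issubclass(w.category, ResourceWarning)]
for w in resource_warnings:
    print(w.message)
    # Ask tracemalloc where the leaked object was allocated.
    tb = (tracemalloc.get_object_traceback(w.source)
          if w.source is not None else None)
    if tb is not None:
        print("\n".join(tb.format()))

tracemalloc.stop()
```

Without catch_warnings(), the default warning logger prints the same allocation traceback by itself when tracemalloc is enabled.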

It looks like Python 3.6 will help developers ;-)

Victor

2016-04-20 1:33 GMT+02:00 Victor Stinner :
> Ping? Is someone still opposed to my change #26249 "Change
> PyMem_Malloc to use pymalloc allocator"? If no, I think that I will
> push my change.
>
> My change only changes two lines, so it can be easily reverted before
> CPython 3.6 if we detect major issues in third-party extensions. And
> maybe it's better to push such change today to get more time to play
> with it, than pushing it late in the development of CPython 3.6.
>
> The new PYTHONMALLOC=debug feature allows to quickly and easily check
> the usage of the PyMem_Malloc() API, even if Python is compiled in
> release mode.
>
> I checked multiple Python extensions written in C. I only found one
> bug in numpy and I sent a patch (not merged yet).
>
> victor
>
> 2016-03-15 0:19 GMT+01:00 Victor Stinner :
>> 2016-02-12 14:31 GMT+01:00 M.-A. Lemburg :
>>>>> If your program has bugs, you can use a debug build of Python 3.5 to
>>>>> detect misusage of the API.
>>>
>>> Yes, but people don't necessarily do this, e.g. I have
>>> for a very long time ignored debug builds completely
>>> and when I started to try them, I found that some of the
>>> things I had been doing with e.g. free list implementations
>>> did not work in debug builds.
>>
>> I just added support for debug hooks on Python memory allocators on
>> Python compiled in *release* mode. Set the environment variable
>> PYTHONMALLOC to debug to try with Python 3.6.
>>
>> I added a check on PyObject_Malloc() debug hook to ensure that the
>> function is called with the GIL held. I opened an issue to add a
>> similar check on PyMem_Malloc():
>> https://bugs.python.org/issue26563
>>
>>
>>> Yes, but those are part of the stdlib. You'd need to check
>>> a few C extensions which are not tested as part of the stdlib,
>>> e.g. numpy, scipy, lxml, pillow, etc. (esp. ones which implement custom
>>> types in C since these will often need the memory management
>>> APIs).
>>>
>>> It may also be a good idea to check wrapper generators such
>>> as cython, swig, cffi, etc.
>>
>> I ran the test suite of numpy, lxml, Pillow and cryptography (used cffi).
>>
>> I found a bug in numpy. numpy calls PyMem_Malloc() without holding the GIL:
>> https://github.com/numpy/numpy/pull/7404
>>
>> Except for this bug, all other tests pass with PyMem_Malloc() using
>> pymalloc and all debug checks.
>>
>> Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Modify PyMem_Malloc to use pymalloc for performance

2016-04-19 Thread Victor Stinner
Ping? Is anyone still opposed to my change #26249 "Change
PyMem_Malloc to use pymalloc allocator"? If not, I think that I will
push my change.

My change only changes two lines, so it can be easily reverted before
CPython 3.6 if we detect major issues in third-party extensions. And
maybe it's better to push such a change today to get more time to play
with it, than pushing it late in the development of CPython 3.6.

The new PYTHONMALLOC=debug feature makes it quick and easy to check
the usage of the PyMem_Malloc() API, even if Python is compiled in
release mode.

I checked multiple Python extensions written in C. I only found one
bug in numpy and I sent a patch (not merged yet).

victor

2016-03-15 0:19 GMT+01:00 Victor Stinner :
> 2016-02-12 14:31 GMT+01:00 M.-A. Lemburg :
>>>> If your program has bugs, you can use a debug build of Python 3.5 to
>>>> detect misusage of the API.
>>
>> Yes, but people don't necessarily do this, e.g. I have
>> for a very long time ignored debug builds completely
>> and when I started to try them, I found that some of the
>> things I had been doing with e.g. free list implementations
>> did not work in debug builds.
>
> I just added support for debug hooks on Python memory allocators on
> Python compiled in *release* mode. Set the environment variable
> PYTHONMALLOC to debug to try with Python 3.6.
>
> I added a check on PyObject_Malloc() debug hook to ensure that the
> function is called with the GIL held. I opened an issue to add a
> similar check on PyMem_Malloc():
> https://bugs.python.org/issue26563
>
>
>> Yes, but those are part of the stdlib. You'd need to check
>> a few C extensions which are not tested as part of the stdlib,
>> e.g. numpy, scipy, lxml, pillow, etc. (esp. ones which implement custom
>> types in C since these will often need the memory management
>> APIs).
>>
>> It may also be a good idea to check wrapper generators such
>> as cython, swig, cffi, etc.
>
> I ran the test suite of numpy, lxml, Pillow and cryptography (used cffi).
>
> I found a bug in numpy. numpy calls PyMem_Malloc() without holding the GIL:
> https://github.com/numpy/numpy/pull/7404
>
> Except for this bug, all other tests pass with PyMem_Malloc() using
> pymalloc and all debug checks.
>
> Victor


Re: [Python-Dev] Modify PyMem_Malloc to use pymalloc for performance

2016-03-25 Thread Victor Stinner
So what do you think? Is it worth changing the PyMem_Malloc() allocator
to pymalloc for a small speedup?

Should we do something else before doing that?

Or do you expect that too many applications use PyMem_Malloc() without
holding the GIL and that their developers will not try to run them with
PYTHONMALLOC=debug?

Victor

2016-03-15 0:19 GMT+01:00 Victor Stinner :
> 2016-02-12 14:31 GMT+01:00 M.-A. Lemburg :
>>>> If your program has bugs, you can use a debug build of Python 3.5 to
>>>> detect misusage of the API.
>>
>> Yes, but people don't necessarily do this, e.g. I have
>> for a very long time ignored debug builds completely
>> and when I started to try them, I found that some of the
>> things I had been doing with e.g. free list implementations
>> did not work in debug builds.
>
> I just added support for debug hooks on Python memory allocators on
> Python compiled in *release* mode. Set the environment variable
> PYTHONMALLOC to debug to try with Python 3.6.
>
> I added a check on PyObject_Malloc() debug hook to ensure that the
> function is called with the GIL held. I opened an issue to add a
> similar check on PyMem_Malloc():
> https://bugs.python.org/issue26563
>
>
>> Yes, but those are part of the stdlib. You'd need to check
>> a few C extensions which are not tested as part of the stdlib,
>> e.g. numpy, scipy, lxml, pillow, etc. (esp. ones which implement custom
>> types in C since these will often need the memory management
>> APIs).
>>
>> It may also be a good idea to check wrapper generators such
>> as cython, swig, cffi, etc.
>
> I ran the test suite of numpy, lxml, Pillow and cryptography (used cffi).
>
> I found a bug in numpy. numpy calls PyMem_Malloc() without holding the GIL:
> https://github.com/numpy/numpy/pull/7404
>
> Except for this bug, all other tests pass with PyMem_Malloc() using
> pymalloc and all debug checks.
>
> Victor


Re: [Python-Dev] Modify PyMem_Malloc to use pymalloc for performance

2016-03-14 Thread Victor Stinner
2016-02-12 14:31 GMT+01:00 M.-A. Lemburg :
>>> If your program has bugs, you can use a debug build of Python 3.5 to
>>> detect misusage of the API.
>
> Yes, but people don't necessarily do this, e.g. I have
> for a very long time ignored debug builds completely
> and when I started to try them, I found that some of the
> things I had been doing with e.g. free list implementations
> did not work in debug builds.

I just added support for debug hooks on Python memory allocators on
Python compiled in *release* mode. Set the environment variable
PYTHONMALLOC to debug to try with Python 3.6.

I added a check to the PyObject_Malloc() debug hook to ensure that the
function is called with the GIL held. I opened an issue to add a
similar check to PyMem_Malloc():
https://bugs.python.org/issue26563


> Yes, but those are part of the stdlib. You'd need to check
> a few C extensions which are not tested as part of the stdlib,
> e.g. numpy, scipy, lxml, pillow, etc. (esp. ones which implement custom
> types in C since these will often need the memory management
> APIs).
>
> It may also be a good idea to check wrapper generators such
> as cython, swig, cffi, etc.

I ran the test suites of numpy, lxml, Pillow and cryptography (which uses cffi).

I found a bug in numpy. numpy calls PyMem_Malloc() without holding the GIL:
https://github.com/numpy/numpy/pull/7404

Except for this bug, all other tests pass with PyMem_Malloc() using
pymalloc and all debug checks.

Victor


Re: [Python-Dev] Modify PyMem_Malloc to use pymalloc for performance

2016-03-14 Thread Victor Stinner
2016-03-09 18:54 GMT+01:00 Brett Cannon :
>>> https://docs.python.org/dev/c-api/memory.html#c.PyMem_SetupDebugHooks
>>
>> The main advantage of this variable is that you don't have to
>> recompile Python in debug mode to benefit of these checks.
>
> I just wanted to say this all sounds awesome! Thanks for all the hard work
> on making our memory management story easier to work with, Victor.

You're welcome. I pushed my patch adding the PYTHONMALLOC environment variable:
https://docs.python.org/dev/whatsnew/3.6.html#pythonmalloc-environment-variable

Please test PYTHONMALLOC=debug and PYTHONMALLOC=malloc with your
favorite application.

I also adjusted some code (like the code handling the PYTHONMALLOCSTATS env
var) so that debug checks can be used in all cases. For example, debug hooks are
now also installed by default when Python is configured in debug mode
without pymalloc support.

Victor


Re: [Python-Dev] Modify PyMem_Malloc to use pymalloc for performance

2016-03-09 Thread Brett Cannon
On Wed, 9 Mar 2016 at 06:57 Victor Stinner  wrote:

> 2016-02-08 15:18 GMT+01:00 Victor Stinner :
> >> Perhaps if you add some guards somewhere :-)
> >
> > We have runtime checks but only implemented in debug mode for efficiency.
> >
> > By the way, I proposed once to add an environment variable to allow to
> > enable these checks without having to recompile Python.  Since the PEP
> > 445, it became easy to implement this. What do you think?
> >
> https://www.python.org/dev/peps/pep-0445/#add-a-new-pydebugmalloc-environment-variable
>
> Ok, I wrote a patch to implement a new PYTHONMALLOC environment variable:
>
>    http://bugs.python.org/issue26516
>
> PYTHONMALLOC=debug installs debug hooks to:
>
> * detect API violations, ex: PyObject_Free() called on a buffer
> allocated by PyMem_Malloc()
> * detect write before the start of the buffer (buffer underflow)
> * detect write after the end of the buffer (buffer overflow)
>
> https://docs.python.org/dev/c-api/memory.html#c.PyMem_SetupDebugHooks
>
> The main advantage of this variable is that you don't have to
> recompile Python in debug mode to benefit of these checks.
>

I just wanted to say this all sounds awesome! Thanks for all the hard work
on making our memory management story easier to work with, Victor.

-Brett


>
> Recompiling Python in debug mode requires to recompile *all*
> extensions modules since the debug ABI is incompatible. When I played
> with tracemalloc on Python 2 ( http://pytracemalloc.readthedocs.org/
> ), I had such issues, it was very annoying with non-trivial extension
> modules like PyQt or PyGTK. With PYTHONMALLOC, you don't have to
> recompile extension modules anymore!
>
>
> With tracemalloc and PYTHONMALLOC=debug, we will have a complete tool
> suite to "debug memory"!
>
> My motivation for PYTHONMALLOC=debug is to detect API violations to
> prepare my change on PyMem_Malloc() allocator (
> http://bugs.python.org/issue26249 ), but also to help users to detect
> bugs.
>
> It's common that users report a bug: "Python crashed", but have no
> idea of the responsible of the crash. I hope that detection of buffer
> underflow & overflow will help them to detect bugs in their own
> extension modules.
>
>
> Moreover, I added PYTHONMALLOC=malloc to ease the use of external
> memory debugger on Python. By default, Python uses pymalloc allocator
> for PyObject_Malloc() which raises a lot of false positive in
> Valgrind. We even have a configuration (--with-valgrind) and a
> Valgrind suppression file to be able to skip these false alarms in
> Valgrind. IMHO PYTHONMALLOC=malloc is a simpler option to use Valgrind
> (or other tools).
>
> Victor


Re: [Python-Dev] Modify PyMem_Malloc to use pymalloc for performance

2016-03-09 Thread Victor Stinner
2016-02-08 15:18 GMT+01:00 Victor Stinner :
>> Perhaps if you add some guards somewhere :-)
>
> We have runtime checks but only implemented in debug mode for efficiency.
>
> By the way, I proposed once to add an environment variable to allow to
> enable these checks without having to recompile Python.  Since the PEP
> 445, it became easy to implement this. What do you think?
> https://www.python.org/dev/peps/pep-0445/#add-a-new-pydebugmalloc-environment-variable

Ok, I wrote a patch to implement a new PYTHONMALLOC environment variable:

   http://bugs.python.org/issue26516

PYTHONMALLOC=debug installs debug hooks to:

* detect API violations, ex: PyObject_Free() called on a buffer
allocated by PyMem_Malloc()
* detect write before the start of the buffer (buffer underflow)
* detect write after the end of the buffer (buffer overflow)

https://docs.python.org/dev/c-api/memory.html#c.PyMem_SetupDebugHooks
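
To make the first item of the list above concrete, here is a sketch that provokes exactly that API violation in a child process: a block from PyMem_Malloc() is handed to PyObject_Free(). It assumes CPython 3.6+ with ctypes available; the child is expected to die with a fatal error, so the parent only looks at the exit status. The name CHILD is mine.

```python
import os
import subprocess
import sys
import textwrap

# Code executed in the child interpreter. ctypes.pythonapi keeps the
# GIL held, so the only mistake here is the mismatched free function.
CHILD = textwrap.dedent("""
    import ctypes
    api = ctypes.pythonapi
    api.PyMem_Malloc.restype = ctypes.c_void_p
    api.PyMem_Malloc.argtypes = [ctypes.c_size_t]
    api.PyObject_Free.restype = None
    api.PyObject_Free.argtypes = [ctypes.c_void_p]
    block = api.PyMem_Malloc(16)
    api.PyObject_Free(block)   # wrong family: should be PyMem_Free()
    print("not reached when the debug hooks are installed")
""")

env = dict(os.environ, PYTHONMALLOC="debug")
result = subprocess.run([sys.executable, "-c", CHILD], env=env,
                        capture_output=True, text=True, errors="replace")
print("exit status:", result.returncode)
print("\n".join(result.stderr.splitlines()[:3]))
```

The exact wording of the error report differs between Python versions, so the sketch deliberately does not depend on it.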

The main advantage of this variable is that you don't have to
recompile Python in debug mode to benefit from these checks.

Recompiling Python in debug mode requires recompiling *all*
extension modules since the debug ABI is incompatible. When I played
with tracemalloc on Python 2 ( http://pytracemalloc.readthedocs.org/
), I had such issues, it was very annoying with non-trivial extension
modules like PyQt or PyGTK. With PYTHONMALLOC, you don't have to
recompile extension modules anymore!


With tracemalloc and PYTHONMALLOC=debug, we will have a complete tool
suite to "debug memory"!

My motivation for PYTHONMALLOC=debug is to detect API violations to
prepare for my change to the PyMem_Malloc() allocator (
http://bugs.python.org/issue26249 ), but also to help users detect
bugs.

It's common for users to report a bug like "Python crashed" but have no
idea what is responsible for the crash. I hope that detecting buffer
underflows & overflows will help them find bugs in their own
extension modules.


Moreover, I added PYTHONMALLOC=malloc to ease the use of external
memory debuggers on Python. By default, Python uses the pymalloc allocator
for PyObject_Malloc(), which raises a lot of false positives in
Valgrind. We even have a configure option (--with-valgrind) and a
Valgrind suppression file to be able to skip these false alarms in
Valgrind. IMHO PYTHONMALLOC=malloc is a simpler way to use Valgrind
(or other tools).

Victor


Re: [Python-Dev] Modify PyMem_Malloc to use pymalloc for performance

2016-02-12 Thread Victor Stinner
Hi,

2016-02-12 14:31 GMT+01:00 M.-A. Lemburg :
> Sorry, your email must have gotten lost in my inbox.

no problemo


> Yes, but those are part of the stdlib. You'd need to check
> a few C extensions which are not tested as part of the stdlib,
> e.g. numpy, scipy, lxml, pillow, etc. (esp. ones which implement custom
> types in C since these will often need the memory management
> APIs).
>
> It may also be a good idea to check wrapper generators such
> as cython, swig, cffi, etc.

Ok, I will try my patch on some of them. Thanks for the pointers.


> I suppose such a flag would create a noticeable runtime
> performance hit, since the compiler would no longer be
> able to inline the PyMem_*() APIs if you redirect those
> APIs to other sets at runtime.

Hum, I think that you missed PEP 445. The overhead of this PEP was
discussed and considered negligible enough to implement the PEP:
https://www.python.org/dev/peps/pep-0445/#performances

With PEP 445, there is no overhead to enable debug hooks at
runtime (except for the overhead of the debug checks themselves ;-)).

PyMem_Malloc now calls through a function pointer:
https://hg.python.org/cpython/file/37bacf3fa1f5/Objects/obmalloc.c#l319

Same for PyObject_Malloc:
https://hg.python.org/cpython/file/37bacf3fa1f5/Objects/obmalloc.c#l380
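
To illustrate the idea only (this is a toy model in Python, not the C code in obmalloc.c), the sketch below keeps the allocator as a table of callables and lets debug hooks wrap whatever is currently installed. AllocatorTable, mem_malloc, mem_free and install_debug_hooks are hypothetical names, and the only check done is "free of a block that is not live".

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AllocatorTable:
    """Toy stand-in for a struct of allocator function pointers."""
    malloc: Callable[[int], bytearray]
    free: Callable[[bytearray], None]


def _system_malloc(size: int) -> bytearray:
    return bytearray(size)


def _system_free(block: bytearray) -> None:
    block.clear()


# The public entry points always dispatch through this table, so
# installing hooks later adds no cost to the un-hooked case.
_current = AllocatorTable(_system_malloc, _system_free)


def mem_malloc(size: int) -> bytearray:
    return _current.malloc(size)


def mem_free(block: bytearray) -> None:
    _current.free(block)


def install_debug_hooks() -> None:
    """Wrap the currently installed allocator with a simple check."""
    inner = AllocatorTable(_current.malloc, _current.free)
    live = set()

    def debug_malloc(size: int) -> bytearray:
        block = inner.malloc(size)
        live.add(id(block))
        return block

    def debug_free(block: bytearray) -> None:
        if id(block) not in live:
            raise RuntimeError("free() of a block that is not live")
        live.remove(id(block))
        inner.free(block)

    _current.malloc = debug_malloc
    _current.free = debug_free


install_debug_hooks()
buf = mem_malloc(16)
mem_free(buf)
try:
    mem_free(buf)              # second free of the same block
    double_free_detected = False
except RuntimeError:
    double_free_detected = True
print("double free detected:", double_free_detected)
```

The point of the design is that the choice between plain and checked allocators is made once, when the table is filled in, and not on every call.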


> I also don't see much point in carrying around such
> baggage in production builds of Python, since you'd most
> likely only want to use the tools to debug C extensions during
> their development.

I propose adding an environment variable because it's rare that a
debug build is installed on the system. Usually, using a debug build
requires recompiling all C extensions, which is not really...
convenient...

With such an env var, it would be trivial to quickly check whether the
Python memory allocators are used correctly.


> Runtime performance, difference in memory consumption (arenas
> cannot be freed if there are still small chunks allocated),
> memory locality. I'm no expert in this, so can't really
> comment much.

"arenas cannot be freed if there are still small chunks allocated"
yeah, this is called memory fragmentation.

There is a big difference between libc malloc() and pymalloc for small
allocations: pymalloc is able to free an arena using munmap(), which
immediately releases the memory to the system, whereas most
implementations of malloc() use a single contiguous memory block which
is only shrunk when all memory "at the top" is free. So it's the
same fragmentation issue that you described, except that it uses a
single arena which has an arbitrary size (between 1 MB and 10 GB,
there is no limit), whereas pymalloc uses small arenas of 256 KB.

In short, I expect less fragmentation with pymalloc.
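
A back-of-the-envelope model of that argument, with numbers I picked for illustration: 1 MB of small blocks is allocated, then everything is freed except the very last 64-byte block. The two functions are hypothetical helpers and ignore all real allocator bookkeeping; the heap size is assumed to be a multiple of the arena size.

```python
ARENA_SIZE = 256 * 1024  # pymalloc arena size quoted above


def releasable_with_arenas(live_offsets, heap_size):
    """Bytes returned to the OS if every completely empty arena is unmapped."""
    total_arenas = heap_size // ARENA_SIZE
    busy_arenas = {offset // ARENA_SIZE for offset in live_offsets}
    return (total_arenas - len(busy_arenas)) * ARENA_SIZE


def releasable_with_single_heap(live_offsets, heap_size, block_size):
    """Bytes returned if only the free space at the top can be trimmed."""
    if not live_offsets:
        return heap_size
    top = max(live_offsets) + block_size
    return heap_size - top


BLOCK = 64
HEAP = 4 * ARENA_SIZE      # 1 MB worth of small blocks
live = [HEAP - BLOCK]      # only the last block is still in use

arena_bytes = releasable_with_arenas(live, HEAP)
heap_bytes = releasable_with_single_heap(live, HEAP, BLOCK)
print(arena_bytes)   # 786432: three of the four arenas are empty
print(heap_bytes)    # 0: the live block sits at the top of the heap
```

With the surviving block at the bottom instead of the top, both models could release almost everything, so this is the worst case for the single-heap layout rather than a general measurement.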

"memory locality": I have no idea about that. I guess that it can be seen
in benchmarks. pymalloc is designed for objects with a short lifetime.

Victor


Re: [Python-Dev] Modify PyMem_Malloc to use pymalloc for performance

2016-02-12 Thread M.-A. Lemburg
On 12.02.2016 12:18, Victor Stinner wrote:
> ping?

Sorry, your email must have gotten lost in my inbox.

> 2016-02-08 15:18 GMT+01:00 Victor Stinner :
>> 2016-02-04 15:05 GMT+01:00 M.-A. Lemburg :
>>> Sometimes, yes, but we also do allocations for e.g.
>>> parsing values in Python argument tuples (e.g. using
>>> "es" or "et"):
>>>
>>> https://docs.python.org/3.6/c-api/arg.html
>>>
>>> We do document to use PyMem_Free() on those; not sure whether
>>> everyone does this though.
>>
>> It's well documented. If programs start to crash, they must be fixed.
>>
>> I don't propose to "break the API" for free, but to get a speedup on
>> the overall Python.
>>
>> And I don't think that we can say that it's an API change, since we
>> already stated that PyMem_Free() must be used.
>>
>> If your program has bugs, you can use a debug build of Python 3.5 to
>> detect misusage of the API.

Yes, but people don't necessarily do this, e.g. I have
for a very long time ignored debug builds completely
and when I started to try them, I found that some of the
things I had been doing with e.g. free list implementations
did not work in debug builds.

>>> The Python test suite doesn't test Python C extensions,
>>> so it's not surprising that it passes :-)
>>
>> What do you mean by "C extensions"? Which modules?
>>
>> Many modules in the stdlib have "C accelerators" and the PEP 399 now
>> *require* to test the C and Python implementations.

Yes, but those are part of the stdlib. You'd need to check
a few C extensions which are not tested as part of the stdlib,
e.g. numpy, scipy, lxml, pillow, etc. (esp. ones which implement custom
types in C since these will often need the memory management
APIs).

It may also be a good idea to check wrapper generators such
as cython, swig, cffi, etc.

>>>> Instead of teaching developers that well, in fact, PyObject_Malloc()
>>>> is unrelated to object programming, I think that it's simpler to
>>>> modify PyMem_Malloc() to reuse pymalloc ;-)
>>>
>>> Perhaps if you add some guards somewhere :-)
>>
>> We have runtime checks but only implemented in debug mode for efficiency.
>>
>> By the way, I proposed once to add an environment variable to allow to
>> enable these checks without having to recompile Python.  Since the PEP
>> 445, it became easy to implement this. What do you think?
>> https://www.python.org/dev/peps/pep-0445/#add-a-new-pydebugmalloc-environment-variable
>>
>> "This alternative was rejected because a new environment variable
>> would make Python initialization even more complex. PEP 432 tries to
>> simplify the CPython startup sequence."
>>
>> The PEP 432 looks stuck, so I don't think that we should block
>> enhancements because of this PEP. Anyway, my idea should be easy to
>> implement.

I suppose such a flag would create a noticeable runtime
performance hit, since the compiler would no longer be
able to inline the PyMem_*() APIs if you redirect those
APIs to other sets at runtime.

I also don't see much point in carrying around such
baggage in production builds of Python, since you'd most
likely only want to use the tools to debug C extensions during
their development.

>>> Seriously, this may work if C extensions use the APIs
>>> consistently, but in order to tell, we'd need to check
>>> few.
>>
>> Can you suggest me names of projects that must be tested?

See above for a list of starters :-)

It would be good to add a few more that work on text or
larger chunks of memory, since those will most likely utilize
the memory allocators more than other extensions which mostly
wrap (sets of) C variables.

Some of them may also have benchmarks, so in addition to
checking whether they work with the change, you could also
test performance.

>>> I guess the main question then is whether pymalloc is good enough
>>> for general memory allocation needs; and the answer may well be
>>> "yes".
>>
>> What do you mean by "good enough"? For the runtime performance,
>> pymalloc looks to be faster than malloc(). What are your other
>> criterias? Memory fragmentation?

Runtime performance, difference in memory consumption (arenas
cannot be freed if there are still small chunks allocated),
memory locality. I'm no expert in this, so can't really
comment much.

I suspect that libc and OS-provided allocators will have
advantages as well, but since pymalloc redirects to them for
all larger memory chunks, it's probably an overall win for
Python C extensions (and Python itself).

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Modify PyMem_Malloc to use pymalloc for performance

2016-02-12 Thread Victor Stinner
ping?

2016-02-08 15:18 GMT+01:00 Victor Stinner :
> 2016-02-04 15:05 GMT+01:00 M.-A. Lemburg :
>> Sometimes, yes, but we also do allocations for e.g.
>> parsing values in Python argument tuples (e.g. using
>> "es" or "et"):
>>
>> https://docs.python.org/3.6/c-api/arg.html
>>
>> We do document to use PyMem_Free() on those; not sure whether
>> everyone does this though.
>
> It's well documented. If programs start to crash, they must be fixed.
>
> I don't propose to "break the API" for free, but to get a speedup on
> the overall Python.
>
> And I don't think that we can say that it's an API change, since we
> already stated that PyMem_Free() must be used.
>
> If your program has bugs, you can use a debug build of Python 3.5 to
> detect misusage of the API.
>
>
>> The Python test suite doesn't test Python C extensions,
>> so it's not surprising that it passes :-)
>
> What do you mean by "C extensions"? Which modules?
>
> Many modules in the stdlib have "C accelerators" and the PEP 399 now
> *require* to test the C and Python implementations.
>
>
>
>>> Instead of teaching developers that well, in fact, PyObject_Malloc()
>>> is unrelated to object programming, I think that it's simpler to
>>> modify PyMem_Malloc() to reuse pymalloc ;-)
>>
>> Perhaps if you add some guards somewhere :-)
>
> We have runtime checks but only implemented in debug mode for efficiency.
>
> By the way, I proposed once to add an environment variable to allow to
> enable these checks without having to recompile Python.  Since the PEP
> 445, it became easy to implement this. What do you think?
> https://www.python.org/dev/peps/pep-0445/#add-a-new-pydebugmalloc-environment-variable
>
> "This alternative was rejected because a new environment variable
> would make Python initialization even more complex. PEP 432 tries to
> simplify the CPython startup sequence."
>
> The PEP 432 looks stuck, so I don't think that we should block
> enhancements because of this PEP. Anyway, my idea should be easy to
> implement.
>
>
>> Seriously, this may work if C extensions use the APIs
>> consistently, but in order to tell, we'd need to check
>> few.
>
> Can you suggest me names of projects that must be tested?
>
>
>> I guess the main question then is whether pymalloc is good enough
>> for general memory allocation needs; and the answer may well be
>> "yes".
>
> What do you mean by "good enough"? For the runtime performance,
> pymalloc looks to be faster than malloc(). What are your other
> criterias? Memory fragmentation?
>
>
> Victor


Re: [Python-Dev] Modify PyMem_Malloc to use pymalloc for performance

2016-02-08 Thread Victor Stinner
2016-02-07 9:22 GMT+01:00 Stefan Behnel :
> Note that the PyObject_Malloc() functions have never been documented.

Yeah, there is an old bug to track this:
http://bugs.python.org/issue20064

> And, for example, the "what's new in 2.5" document says:
>
> """
> Python’s API has many different functions for allocating memory that are
> grouped into families. For example, PyMem_Malloc(), PyMem_Realloc(), and
> PyMem_Free() are one family that allocates raw memory, while
> PyObject_Malloc(), PyObject_Realloc(), and PyObject_Free() are another
> family that’s supposed to be used for creating Python objects.
> """
>
> I don't think there are many extensions out there in which *object* memory
> gets allocated manually, which implicitly puts a pretty clear "don't use"
> marker on these functions.

Should I understand that it's another good reason to make
PyMem_Malloc() faster for everyone?

Victor


Re: [Python-Dev] Modify PyMem_Malloc to use pymalloc for performance

2016-02-08 Thread Victor Stinner
2016-02-04 15:05 GMT+01:00 M.-A. Lemburg :
> Sometimes, yes, but we also do allocations for e.g.
> parsing values in Python argument tuples (e.g. using
> "es" or "et"):
>
> https://docs.python.org/3.6/c-api/arg.html
>
> We do document to use PyMem_Free() on those; not sure whether
> everyone does this though.

It's well documented. If programs start to crash, they must be fixed.

I don't propose to "break the API" for free, but to get a speedup for
Python overall.

And I don't think that we can say that it's an API change, since we
already stated that PyMem_Free() must be used.

If your program has bugs, you can use a debug build of Python 3.5 to
detect misuse of the API.


> The Python test suite doesn't test Python C extensions,
> so it's not surprising that it passes :-)

What do you mean by "C extensions"? Which modules?

Many modules in the stdlib have "C accelerators" and PEP 399 now
*requires* testing both the C and Python implementations.



>> Instead of teaching developers that well, in fact, PyObject_Malloc()
>> is unrelated to object programming, I think that it's simpler to
>> modify PyMem_Malloc() to reuse pymalloc ;-)
>
> Perhaps if you add some guards somewhere :-)

We have runtime checks but only implemented in debug mode for efficiency.

By the way, I once proposed adding an environment variable to
enable these checks without having to recompile Python.  Since PEP
445, it has become easy to implement this. What do you think?
https://www.python.org/dev/peps/pep-0445/#add-a-new-pydebugmalloc-environment-variable

"This alternative was rejected because a new environment variable
would make Python initialization even more complex. PEP 432 tries to
simplify the CPython startup sequence."

PEP 432 looks stuck, so I don't think that we should block
enhancements because of this PEP. Anyway, my idea should be easy to
implement.


> Seriously, this may work if C extensions use the APIs
> consistently, but in order to tell, we'd need to check
> few.

Can you suggest names of projects that should be tested?


> I guess the main question then is whether pymalloc is good enough
> for general memory allocation needs; and the answer may well be
> "yes".

What do you mean by "good enough"? For runtime performance,
pymalloc looks to be faster than malloc(). What are your other
criteria? Memory fragmentation?


Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Modify PyMem_Malloc to use pymalloc for performance

2016-02-07 Thread Stefan Behnel
M.-A. Lemburg schrieb am 04.02.2016 um 13:54:
> On 04.02.2016 13:29, Victor Stinner wrote:
>> But why were PyObject_Malloc() & PyObject_Free() not used in the
>> first place?
> 
> Good question. I guess developers simply thought of PyObject_Malloc()
> being for PyObjects, not arbitrary memory buffers, most likely
> because pymalloc was advertised as allocator for Python objects,
> not random chunks of memory.

Note that the PyObject_Malloc() functions have never been documented.
(Well, there are references regarding their mere existence in the docs, but
nothing more than that.)

https://docs.python.org/3.6/search.html?q=pyobject_malloc&check_keywords=yes&area=default

And, for example, the "what's new in 2.5" document says:

"""
Python’s API has many different functions for allocating memory that are
grouped into families. For example, PyMem_Malloc(), PyMem_Realloc(), and
PyMem_Free() are one family that allocates raw memory, while
PyObject_Malloc(), PyObject_Realloc(), and PyObject_Free() are another
family that’s supposed to be used for creating Python objects.
"""

I don't think there are many extensions out there in which *object* memory
gets allocated manually, which implicitly puts a pretty clear "don't use"
marker on these functions.

Stefan




Re: [Python-Dev] Modify PyMem_Malloc to use pymalloc for performance

2016-02-04 Thread M.-A. Lemburg
On 04.02.2016 14:25, Victor Stinner wrote:
> Thanks for your feedback, you are asking good questions :-)
> 
> 2016-02-04 13:54 GMT+01:00 M.-A. Lemburg :
>>> There are 536 calls to the functions PyMem_Malloc(), PyMem_Realloc()
>>> and PyMem_Free().
>>>
>>> I would prefer to modify a single place rather than having to
>>> replace 536 calls :-/
>>
>> You have a point there, but I don't think it'll work out
>> that easily, since we are using such calls to e.g. pass
>> dynamically allocated buffers to code in extensions (which then
>> have to free the buffers again).
> 
> Ah, interesting. But I'm not sure that we delegate the responsibility
> of freeing the memory to external libraries. Usually, it's more the
> opposite: a library gives us an allocated memory block, and we have to
> free it. No?

Sometimes, yes, but we also do allocations for e.g.
parsing values in Python argument tuples (e.g. using
"es" or "et"):

https://docs.python.org/3.6/c-api/arg.html

We do document to use PyMem_Free() on those; not sure whether
everyone does this though.

> I checked whether we call malloc() directly to pass the buffer to a
> library, but I failed to find such a case.
>
> Again, in debug mode, calling free() on a memory block allocated by
> PyMem_Malloc() will likely crash. Since we run the Python test suite
> with a Python compiled in debug mode, we would already have detected
> such a bug, no?

The Python test suite doesn't test Python C extensions,
so it's not surprising that it passes :-)

> See also my old issue http://bugs.python.org/issue18203 which replaced
> almost all direct calls to malloc() with PyMem_Malloc() or
> PyMem_RawMalloc().
> 
>> Good question. I guess developers simply thought of PyObject_Malloc()
>> being for PyObjects,
> 
> Yeah, I also understood that, but in practice, it looks like
> PyMem_Malloc() is slower than PyObject_Malloc(), so using it makes the
> code less efficient than it can be.
> 
> Instead of teaching developers that well, in fact, PyObject_Malloc()
> is unrelated to object programming, I think that it's simpler to
> modify PyMem_Malloc() to reuse pymalloc ;-)

Perhaps if you add some guards somewhere :-)

Seriously, this may work if C extensions use the APIs
consistently, but in order to tell, we'd need to check
a few. I know that I switched over all mx Extensions to
use PyObject_*() instead of PyMem_*() or native malloc()
several years ago and have not run into any issues.

I guess the main question then is whether pymalloc is good enough
for general memory allocation needs; and the answer may well be
"yes".

BTW: Tuning pymalloc for commonly used object sizes is
another area where Python could gain better performance,
i.e. reserve more / pre-allocate space for often used block
sizes. pymalloc will also only work well for small blocks
(up to 512 bytes). Everything else is routed to the
system malloc().
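
The routing rule described here can be written down as a toy model.
This is only a sketch, not pymalloc's actual code: the 512-byte
threshold is the figure quoted above, while the 8-byte size-class step
and the function name route_request are assumptions made for the
illustration.

```python
SMALL_REQUEST_THRESHOLD = 512  # bytes; the limit quoted in the thread
SIZE_CLASS_STEP = 8            # assumption: granularity of the size classes


def route_request(nbytes):
    """Return (allocator_name, reserved_bytes) for a request of nbytes."""
    if nbytes < 1:
        raise ValueError("this toy model only covers positive sizes")
    if nbytes <= SMALL_REQUEST_THRESHOLD:
        # Small request: round up to the next size class.
        reserved = -(-nbytes // SIZE_CLASS_STEP) * SIZE_CLASS_STEP
        return ("pymalloc", reserved)
    # Everything larger is handed to the system allocator unchanged.
    return ("malloc", nbytes)


for size in (1, 100, 512, 513):
    print(size, route_request(size))
```

Tuning for "often used block sizes" would then amount to changing how
much space is kept ready per size class, which this model does not
attempt to show.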

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Feb 04 2016)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> Python Database Interfaces ...   http://products.egenix.com/
>>> Plone/Zope Database Interfaces ...   http://zope.egenix.com/


::: We implement business ideas - efficiently in both time and costs :::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
   http://www.egenix.com/company/contact/
  http://www.malemburg.com/



Re: [Python-Dev] Modify PyMem_Malloc to use pymalloc for performance

2016-02-04 Thread Victor Stinner
Thanks for your feedback, you are asking good questions :-)

2016-02-04 13:54 GMT+01:00 M.-A. Lemburg :
>> There are 536 calls to the functions PyMem_Malloc(), PyMem_Realloc()
>> and PyMem_Free().
>>
>> I would prefer to modify a single place rather than having to
>> replace 536 calls :-/
>
> You have a point there, but I don't think it'll work out
> that easily, since we are using such calls to e.g. pass
> dynamically allocated buffers to code in extensions (which then
> have to free the buffers again).

Ah, interesting. But I'm not sure that we delegate the responsibility
of freeing the memory to external libraries. Usually, it's more the
opposite: a library gives us an allocated memory block, and we have to
free it. No?

I checked whether we call malloc() directly to pass the buffer to a
library, but I failed to find such a case.

Again, in debug mode, calling free() on a memory block allocated by
PyMem_Malloc() will likely crash. Since we run the Python test suite
with a Python compiled in debug mode, we would already have detected
such a bug, no?

See also my old issue http://bugs.python.org/issue18203 which replaced
almost all direct calls to malloc() with PyMem_Malloc() or
PyMem_RawMalloc().


> Good question. I guess developers simply thought of PyObject_Malloc()
> being for PyObjects,

Yeah, I also understood that, but in practice, it looks like
PyMem_Malloc() is slower than PyObject_Malloc(), so using it makes the
code less efficient than it can be.

Instead of teaching developers that well, in fact, PyObject_Malloc()
is unrelated to object programming, I think that it's simpler to
modify PyMem_Malloc() to reuse pymalloc ;-)

Victor


Re: [Python-Dev] Modify PyMem_Malloc to use pymalloc for performance

2016-02-04 Thread M.-A. Lemburg
On 04.02.2016 13:29, Victor Stinner wrote:
> Hi,
> 
> 2016-02-04 11:17 GMT+01:00 M.-A. Lemburg :
>>> Do you see any drawback of using pymalloc for PyMem_Malloc()?
>>
>> Yes: You cannot free memory allocated using pymalloc with the
>> standard C lib free().
> 
> That's not completely new.
> 
> If Python is compiled in debug mode, you get a fatal error with a huge
> error message if you free the memory allocated by PyMem_Malloc() using
> PyObject_Free() or PyMem_RawFree().
> 
> But yes, technically it's possible to use free() when Python is *not*
> compiled in debug mode.

Debug mode is a completely different beast ;-)

>> It would be better to go through the list of PyMem_*() calls
>> in Python and replace them with PyObject_*() calls, where
>> possible.
> 
> There are 536 calls to the functions PyMem_Malloc(), PyMem_Realloc()
> and PyMem_Free().
> 
> I would prefer to modify a single place rather than having to
> replace 536 calls :-/

You have a point there, but I don't think it'll work out
that easily, since we are using such calls to e.g. pass
dynamically allocated buffers to code in extensions (which then
have to free the buffers again).

>>> Does anyone recall the rationale for having two families of memory allocators?
>>
>> The PyMem_*() APIs were needed to have a cross-platform malloc()
>> implementation which returns standard C lib free()able memory,
>> but also behaves well when passing 0 as size.
> 
> Yeah, PyMem_Malloc() & PyMem_Free() help to have a portable behaviour.
> But why were PyObject_Malloc() & PyObject_Free() not used in the
> first place?

Good question. I guess developers simply thought of PyObject_Malloc()
being for PyObjects, not arbitrary memory buffers, most likely
because pymalloc was advertised as allocator for Python objects,
not random chunks of memory.

Also: PyObject_*() APIs were first introduced with pymalloc, and
no one was really interested in going through all the calls to
PyMem_*() APIs and converting those to use the new pymalloc at the
time.

All this happened between Python 1.5.2 and 2.0.

One of the reasons probably also was that pymalloc originally
did not return memory back to the system malloc(). This was
changed only some years ago.

> An explanation can be that PyMem_Malloc() can be called without the
> GIL held. But it wasn't true before Python 3.4, since PyMem_Malloc()
> called (indirectly) PyObject_Malloc() when Python was compiled in
> debug mode, and PyObject_Malloc() requires the GIL to be held.
> 
> When I wrote PEP 445, there was a discussion about the GIL. It was
> proposed to allow calling PyMem_xxx() without the GIL:
> https://www.python.org/dev/peps/pep-0445/#gil-free-pymem-malloc
> 
> This option was rejected.

AFAIR, the GIL was not really part of the consideration at the time.
We used pymalloc for PyObject allocation, that's all.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Modify PyMem_Malloc to use pymalloc for performance

2016-02-04 Thread Victor Stinner
Hi,

2016-02-04 11:17 GMT+01:00 M.-A. Lemburg :
>> Do you see any drawback of using pymalloc for PyMem_Malloc()?
>
> Yes: You cannot free memory allocated using pymalloc with the
> standard C lib free().

That's not completely new.

If Python is compiled in debug mode, you get a fatal error with a huge
error message if you free the memory allocated by PyMem_Malloc() using
PyObject_Free() or PyMem_RawFree().

But yes, technically it's possible to use free() when Python is *not*
compiled in debug mode.


> It would be better to go through the list of PyMem_*() calls
> in Python and replace them with PyObject_*() calls, where
> possible.

There are 536 calls to the functions PyMem_Malloc(), PyMem_Realloc()
and PyMem_Free().

I would prefer to modify a single place rather than having to replace
536 calls :-/


>> Does anyone recall the rationale for having two families of memory allocators?
>
> The PyMem_*() APIs were needed to have a cross-platform malloc()
> implementation which returns standard C lib free()able memory,
> but also behaves well when passing 0 as size.

Yeah, PyMem_Malloc() & PyMem_Free() help to have a portable behaviour.
But why were PyObject_Malloc() & PyObject_Free() not used in the
first place?

An explanation can be that PyMem_Malloc() can be called without the
GIL held. But it wasn't true before Python 3.4, since PyMem_Malloc()
called (indirectly) PyObject_Malloc() when Python was compiled in
debug mode, and PyObject_Malloc() requires the GIL to be held.

When I wrote PEP 445, there was a discussion about the GIL. It was
proposed to allow calling PyMem_xxx() without the GIL:
https://www.python.org/dev/peps/pep-0445/#gil-free-pymem-malloc

This option was rejected.

Victor


Re: [Python-Dev] Modify PyMem_Malloc to use pymalloc for performance

2016-02-04 Thread M.-A. Lemburg
On 03.02.2016 22:03, Victor Stinner wrote:
> Hi,
> 
> There is an old discussion about the performance of the PyMem_Malloc()
> memory allocator. CPython puts a lot of stress on memory allocators.
> The last time I gathered statistics was for PEP 454:
> "For example, the Python test suites calls malloc(), realloc() or
> free() 270,000 times per second in average."
> https://www.python.org/dev/peps/pep-0454/#log-calls-to-the-memory-allocator
> 
> I proposed a simple change: modify PyMem_Malloc() to use the pymalloc
> allocator, which is faster for allocations smaller than 512 bytes, and
> falls back to malloc() (which is the current internal allocator of
> PyMem_Malloc()).
> 
> This tiny change makes Python up to 6% faster on some specific (macro)
> benchmarks, and it doesn't seem to make Python slower on any
> benchmark:
> http://bugs.python.org/issue26249#msg259445
> 
> Do you see any drawback of using pymalloc for PyMem_Malloc()?

Yes: You cannot free memory allocated using pymalloc with the
standard C lib free().

It would be better to go through the list of PyMem_*() calls
in Python and replace them with PyObject_*() calls, where
possible.

> Does anyone recall the rationale for having two families of memory allocators?

The PyMem_*() APIs were needed to have a cross-platform malloc()
implementation which returns standard C lib free()able memory,
but also behaves well when passing 0 as size.
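
The zero-size point can be checked from Python itself. Here is a small
sketch, assuming a CPython build where ctypes.pythonapi exposes the C
API (calls made through it hold the GIL). The C standard lets malloc(0)
return either NULL or a unique pointer, whereas PyMem_Malloc(0) is
documented to return a distinct non-NULL pointer when possible. The
variable names are chosen for this illustration only.

```python
import ctypes

# Declare the C signatures so pointers are not truncated to a C int.
pymem_malloc = ctypes.pythonapi.PyMem_Malloc
pymem_malloc.argtypes = [ctypes.c_size_t]
pymem_malloc.restype = ctypes.c_void_p

pymem_free = ctypes.pythonapi.PyMem_Free
pymem_free.argtypes = [ctypes.c_void_p]
pymem_free.restype = None

ptr = pymem_malloc(0)            # zero-byte request
zero_size_ok = ptr is not None   # ctypes maps a NULL c_void_p to None
pymem_free(ptr)                  # release with the matching family
print("PyMem_Malloc(0) returned a usable pointer:", zero_size_ok)
```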

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Modify PyMem_Malloc to use pymalloc for performance

2016-02-03 Thread Victor Stinner
> There is an old discussion about the performance of PyMem_Malloc() memory 
> allocator.

Oops, I forgot to mention that my patch is a follow-up to a previous
patch showing a nice speedup on dict:
http://bugs.python.org/issue23601
(but I said it in my issue ;-))

Well, see http://bugs.python.org/issue26249 for the longer context.


2016-02-03 22:03 GMT+01:00 Victor Stinner :
> Does anyone recall the rationale for having two families of memory allocators?

I asked Mercurial, and I found the change adding PyMem_Malloc():
---
branch:      legacy-trunk
user:        Guido van Rossum
date:        Tue Aug 05 01:59:22 1997 +
files:       Include/mymalloc.h
description:
Added Py_Malloc and friends as well as PyMem_Malloc and friends.
---

As expected, it's old, as is the change adding PyObject_Malloc():
---
changeset:   12576:1c7c2dd1beb1
branch:      legacy-trunk
user:        Guido van Rossum
date:        Wed May 03 23:44:39 2000 +
files:       Include/mymalloc.h Include/objimpl.h
Modules/_cursesmodule.c Modules/_sre.c Modules/_tkinter.c
Modules/almodule.c Modules/arraymodule.c Modules/bsddbmodule.
description:
Vladimir Marangozov's long-awaited malloc restructuring.
For more comments, read the patc...@python.org archives.
For documentation read the comments in mymalloc.h and objimpl.h.

(This is not exactly what Vladimir posted to the patches list; I've
made a few changes, and Vladimir sent me a fix in private email for a
problem that only occurs in debug mode.  I'm also holding back on his
change to main.c, which seems unnecessary to me.)
---

Victor