[issue36611] Debug memory allocators: remove useless "serialno" field to reduce memory footprint

2019-04-12 Thread STINNER Victor


STINNER Victor  added the comment:

We decided to only disable the code by default, but the code stays until we are 
sure that nobody uses it.

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36611] Debug memory allocators: remove useless "serialno" field to reduce memory footprint

2019-04-12 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset e8f9acf03484c6c3f163f04a76321419369c28aa by Victor Stinner in 
branch 'master':
bpo-36611: Disable serialno field of debug memory allocators (#12796)
https://github.com/python/cpython/commit/e8f9acf03484c6c3f163f04a76321419369c28aa


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36611] Debug memory allocators: remove useless "serialno" field to reduce memory footprint

2019-04-12 Thread miss-islington


miss-islington  added the comment:


New changeset 7182e653fb5c6f78f05892b6ed302fc8db8978d3 by Miss Islington (bot) 
in branch '3.7':
bpo-36611: Fix test_sys.test_getallocatedblocks() (GH-12797)
https://github.com/python/cpython/commit/7182e653fb5c6f78f05892b6ed302fc8db8978d3


--
nosy: +miss-islington

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36611] Debug memory allocators: remove useless "serialno" field to reduce memory footprint

2019-04-12 Thread miss-islington


Change by miss-islington :


--
pull_requests: +12729

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36611] Debug memory allocators: remove useless "serialno" field to reduce memory footprint

2019-04-12 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset 9b8314cfe29ca532fc335277f6c36b72e6132922 by Victor Stinner in 
branch 'master':
bpo-36611: Fix test_sys.test_getallocatedblocks() (GH-12797)
https://github.com/python/cpython/commit/9b8314cfe29ca532fc335277f6c36b72e6132922


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36611] Debug memory allocators: remove useless "serialno" field to reduce memory footprint

2019-04-12 Thread STINNER Victor


STINNER Victor  added the comment:

The serialno has been added at the same time than the whole debug hooks on 
Python memory allocated by Tim Peters in 2002, 17 years ago:

commit ddea208be9e2a8fa281e25ebbc890378dd2aa286
Author: Tim Peters 
Date:   Sat Mar 23 10:03:50 2002 +

Give Python a debug-mode pymalloc, much as sketched on Python-Dev.

When WITH_PYMALLOC is defined, define PYMALLOC_DEBUG to enable the debug
allocator.  This can be done independent of build type (release or debug).
A debug build automatically defines PYMALLOC_DEBUG when pymalloc is
enabled.  It's a detected error to define PYMALLOC_DEBUG when pymalloc
isn't enabled.

Two debugging entry points defined only under PYMALLOC_DEBUG:

+ _PyMalloc_DebugCheckAddress(const void *p) can be used (e.g., from gdb)
  to sanity-check a memory block obtained from pymalloc.  It sprays
  info to stderr (see next) and dies via Py_FatalError if the block is
  detectably damaged.

+ _PyMalloc_DebugDumpAddress(const void *p) can be used to spray info
  about a debug memory block to stderr.

A tiny start at implementing "API family" checks isn't good for
anything yet.

_PyMalloc_DebugRealloc() has been optimized to do little when the new
size is <= old size.  However, if the new size is larger, it really
can't call the underlying realloc() routine without either violating its
contract, or knowing something non-trivial about how the underlying
realloc() works.  A memcpy is always done in this case.

This was a disaster for (and only) one of the std tests:  test_bufio
creates single text file lines up to a million characters long.  On
Windows, fileobject.c's get_line() uses the horridly funky
getline_via_fgets(), which keeps growing and growing a string object
hoping to find a newline.  It grew the string object 1000 bytes each
time, so for a million-character string it took approximately forever
(I gave up after a few minutes).

So, also:

fileobject.c, getline_via_fgets():  When a single line is outrageously
long, grow the string object at a mildly exponential rate, instead of
just 1000 bytes at a time.

That's enough so that a debug-build test_bufio finishes in about 5 seconds
on my Win98SE box.  I'm curious to try this on Win2K, because it has very
different memory behavior than Win9X, and test_bufio always took a factor
of 10 longer to complete on Win2K.  It *could* be that the endless
reallocs were simply killing it on Win2K even in the release build.

--
nosy: +tim.peters

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36611] Debug memory allocators: remove useless "serialno" field to reduce memory footprint

2019-04-12 Thread STINNER Victor


STINNER Victor  added the comment:

This issue is related to the following thread on python-dev which discuss 
disabling Py_TRACE_REFS by default, bpo-36465, to reduce the memory footprint 
in debug mode:
https://mail.python.org/pipermail/python-dev/2019-April/157015.html

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36611] Debug memory allocators: remove useless "serialno" field to reduce memory footprint

2019-04-12 Thread Inada Naoki


Inada Naoki  added the comment:

I never used the serialno too.

--
nosy: +inada.naoki

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36611] Debug memory allocators: remove useless "serialno" field to reduce memory footprint

2019-04-12 Thread STINNER Victor


STINNER Victor  added the comment:

While testing my changes, I found a bug in test_sys:

./python -X tracemalloc -m test test_sys -v -m test_getallocatedblocks

==
ERROR: test_getallocatedblocks (test.test_sys.SysModuleTest)
--
Traceback (most recent call last):
  File "/home/vstinner/prog/python/master/Lib/test/test_sys.py", line 770, in 
test_getallocatedblocks
alloc_name = _testcapi.pymem_getallocatorsname()
RuntimeError: cannot get allocators name

Attached PR 12797 fix it.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36611] Debug memory allocators: remove useless "serialno" field to reduce memory footprint

2019-04-12 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +12724

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36611] Debug memory allocators: remove useless "serialno" field to reduce memory footprint

2019-04-12 Thread STINNER Victor


STINNER Victor  added the comment:

I wrote 2 pull requests for the two options:

* PR 12795 removes serialno field
* PR 12796 adds PYMEM_DEBUG_SERIALNO define which is unset by default

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36611] Debug memory allocators: remove useless "serialno" field to reduce memory footprint

2019-04-12 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +12723

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36611] Debug memory allocators: remove useless "serialno" field to reduce memory footprint

2019-04-12 Thread STINNER Victor


Change by STINNER Victor :


--
keywords: +patch
pull_requests: +12722
stage:  -> patch review

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36611] Debug memory allocators: remove useless "serialno" field to reduce memory footprint

2019-04-12 Thread STINNER Victor


New submission from STINNER Victor :

When PYTHONMALLOC=debug environment variable or -X dev command line option is 
used, Python installs debug hooks on memory allocators which add 2 size_t 
before and 2 size_t after each memory block: it adds 32 bytes to every memory 
allocation.

I'm debugging crashes and memory leaks in CPython for 10 years, and I simply 
never had to use "serialno". So I simply propose attached pull request to 
remove it to reduce the memory footprint: I measured a reduction around -5% 
(ex: 1.2 MiB on 33.0 MiB when running test_asyncio). A smaller memory footprint 
allows to use this feature on devices with small memory, like embedded devices.

The change also fix race condition in debug memory allocators: bpo-31473, 
"Debug hooks on memory allocators are not thread safe (serialno variable)".

Using tracemalloc, it is already possible (since Python 3.6) to find where a 
memory block has been allocated, and so decide where to put a breakpoint when 
debugging.

If someone cares about the "serialno" field, maybe we can keep code using a 
compilation flag, like a C #define.

"serialno" is documented as: "an excellent way to set a breakpoint on the next 
run, to capture the instant at which this block was passed out." But again, I 
never used it...

--

Some examples of the *peak* memory usage without => with the change:

* -c pass: 2321.8 kB => 2437.1 kB (-115.3 kiB, -5%)
* -m test test_os test_sys: 14252.3 kB => 13598.6 kB (-653.7 kiB, -5%)
* -m test test_asyncio: 34194.2 kB => 32963.1 kB (-1231.1 kiB, -4%)

Command used to measure the memory consumption:

$ ./python -i -X tracemalloc -c pass
>>> import tracemalloc; print("%.1f kB" % (tracemalloc.get_traced_memory()[1] / 
>>> 1024.))

With the patch:

diff --git a/Modules/_tracemalloc.c b/Modules/_tracemalloc.c
index c5d5671032..e010c2ef84 100644
--- a/Modules/_tracemalloc.c
+++ b/Modules/_tracemalloc.c
@@ -582,6 +582,8 @@ tracemalloc_add_trace(unsigned int domain, uintptr_t ptr,
 _Py_hashtable_entry_t* entry;
 int res;
 
+size += 4 * sizeof(size_t);
+
 assert(_Py_tracemalloc_config.tracing);
 
 traceback = traceback_new();

Replace 4 with 3 to measure memory used with the change.

--

Since Python 3.6, when the debug memory allocator detects a bug (ex: buffer 
overflow), it now also displays the Python traceback where the memory block has 
been allocated if tracemalloc is tracing Python memory allocations.

Example with buffer_overflow.py:
---
import _testcapi

def func():
_testcapi.pymem_buffer_overflow()

def main():
func()

if __name__ == "__main__":
main()
---

Output:
---
$ ./python -X tracemalloc=10 -X dev bug.py

Debug memory block at address p=0x7f45e85c3270: API 'm'
16 bytes originally requested
The 7 pad bytes at p-7 are FORBIDDENBYTE, as expected.
The 8 pad bytes at tail=0x7f45e85c3280 are not all FORBIDDENBYTE (0xfd):
at tail+0: 0x78 *** OUCH
at tail+1: 0xfd
at tail+2: 0xfd
at tail+3: 0xfd
at tail+4: 0xfd
at tail+5: 0xfd
at tail+6: 0xfd
at tail+7: 0xfd
Data at p: cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd

Memory block allocated at (most recent call first):
  File "bug.py", line 4
  File "bug.py", line 7
  File "bug.py", line 10

Fatal Python error: bad trailing pad byte

Current thread 0x7f45f5660740 (most recent call first):
  File "bug.py", line 4 in func
  File "bug.py", line 7 in main
  File "bug.py", line 10 in 
Aborted (core dumped)
---

The interesting part is "Memory block allocated at (most recent call first):". 
Traceback reconstructed manually:
---
Memory block allocated at (most recent call first):
  File "bug.py", line 4
_testcapi.pymem_buffer_overflow()
  File "bug.py", line 7
func()
  File "bug.py", line 10
main()
---

You can see exactly where the memory block has been allocated.

Note: Internally, the _PyTraceMalloc_GetTraceback() function is used to get the 
traceback where a memory block has been allocated.

--

Extract of _PyMem_DebugRawAlloc() in Objects/obmalloc.c:

/* Let S = sizeof(size_t).  The debug malloc asks for 4*S extra bytes and
   fills them with useful stuff, here calling the underlying malloc's result p:

p[0: S]
Number of bytes originally asked for.  This is a size_t, big-endian (easier
to read in a memory dump).
p[S]
API ID.  See PEP 445.  This is a character, but seems undocumented.
p[S+1: 2*S]
Copies of FORBIDDENBYTE.  Used to catch under- writes and reads.
p[2*S: 2*S+n]
The requested memory, filled with copies of CLEANBYTE.
Used to catch reference to uninitialized memory.
[2*S] is returned.  Note that this is 8-byte aligned if pymalloc
handled the request itself.
p[2*S+n: 2*S+n+S]
Copies of FORBIDDENBYTE.  Used to catch over- writes and reads.
p[2*S+n+S: 2*S+n+2*S]
A serial number, incremented by 1 on each call to _PyMem_DebugMalloc
and _PyMem_DebugRealloc.
This is a big-endian size_t.
If "bad