Hi,

On Wed, May 13, 2020 at 02:04, Giampaolo Rodola' <g.rod...@gmail.com> wrote:
> I would like to discuss a proposal regarding one aspect which AFAIK is 
> currently missing from cPython's test suite: the ability to detect memory 
> leaks of functions implemented in the C extension modules.

test.regrtest can be used to detect 3 kinds of leaks:

* reference leaks: use sys.gettotalrefcount()
* memory block leaks: use sys.getallocatedblocks()
* file descriptor leaks: use test.support.fd_count()

See Lib/test/libregrtest/refleak.py.
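
As a rough illustration of those three counters (a sketch, not the
regrtest code): read_counters() is a hypothetical helper name, the
reference counter only exists on debug builds, and counting entries in
/proc/self/fd is an assumption that only holds on Linux-like systems.

```python
import os
import sys


def read_counters():
    """Return the leak-related counters available on this interpreter."""
    counters = {}

    # sys.gettotalrefcount() only exists on debug builds of CPython.
    get_refs = getattr(sys, "gettotalrefcount", None)
    if get_refs is not None:
        counters["refs"] = get_refs()

    # sys.getallocatedblocks() is specific to CPython.
    get_blocks = getattr(sys, "getallocatedblocks", None)
    if get_blocks is not None:
        counters["blocks"] = get_blocks()

    # Assumption: open file descriptors are listed in /proc/self/fd.
    # os.listdir() opens one descriptor itself, so subtract it.
    fd_dir = "/proc/self/fd"
    if os.path.isdir(fd_dir):
        counters["fds"] = len(os.listdir(fd_dir)) - 1

    return counters
```

On a release build the "refs" key is simply missing, which is one
reason why refleak checks need a debug build.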

> Detecting a memory leak is no easy task, and that's because the process 
> memory fluctuates. Sometimes it may increase (or even decrease!) even if 
> there's no leak, I suppose because of how the OS handles memory, the Python's 
> garbage collector, the fact that RSS is an approximation, and who knows what 
> else. In order to compensate fluctuations I did the following: in case of 
> failure (mem > 0 after calling fun() N times) I retry the test for up to 5 
> times, increasing N (repetitions) each time, so I consider the test a failure 
> only if the memory keeps increasing across all runs. So for instance, here's 
> a legitimate failure:
>
>     
> psutil.tests.test_memory_leaks.TestModuleFunctionsLeaks.test_disk_partitions 
> ...
>     Run #1: extra-mem=696.0K, per-call=3.5K, calls=200
>     Run #2: extra-mem=1.4M, per-call=3.5K, calls=400
>     Run #3: extra-mem=2.1M, per-call=3.5K, calls=600
>     Run #4: extra-mem=2.7M, per-call=3.5K, calls=800
>     Run #5: extra-mem=3.4M, per-call=3.5K, calls=1000
>     FAIL

regrtest usually uses 3 test runs to "warm up" Python, that is, to fill
caches. Then it runs the test 3 more times and checks for differences.
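
A minimal sketch of that warmup-then-measure loop, assuming a
caller-supplied counter function. measure_deltas(), read_counter and
the default of 3 + 3 runs are illustrative names and values, not the
regrtest implementation:

```python
def measure_deltas(func, read_counter, warmups=3, runs=3):
    """Run func() several times and return the counter delta of each run."""
    # Warmup runs: let caches fill up, ignore the counter changes.
    for _ in range(warmups):
        func()

    deltas = []
    before = read_counter()
    for _ in range(runs):
        func()
        after = read_counter()
        deltas.append(after - before)
        before = after
    return deltas


# Demo with a deterministic fake "leak": each call keeps one more object.
leaked = []


def leaky():
    leaked.append(object())


demo_deltas = measure_deltas(leaky, lambda: len(leaked))
print(demo_deltas)  # [1, 1, 1]
```

With a real counter such as sys.getallocatedblocks, the deltas are
noisier than in this demo, which is why the warmup runs matter.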

For references and memory blocks, it only considers that there is a
leak if each test run increased the counter difference by at least
one.

For file descriptors, it considers that there is a leak if any test
run changed a counter.
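
Those two rules can be sketched as predicates over the per-run
deltas. The function names are made up for this example, and treating
an empty list as "no leak" is an assumption of the sketch:

```python
def refs_or_blocks_leaked(deltas):
    """Leak only if every measured run increased the counter by >= 1."""
    # An empty list is treated as "no leak" (assumption of this sketch).
    return bool(deltas) and all(delta >= 1 for delta in deltas)


def fds_leaked(deltas):
    """Leak as soon as any measured run changed the counter."""
    return any(delta != 0 for delta in deltas)


print(refs_or_blocks_leaked([1, 2, 1]))  # True
print(refs_or_blocks_leaked([0, 1, 0]))  # False
print(fds_leaked([0, 1, 0]))             # True
```

The stricter rule for references and memory blocks filters out a
single noisy run, but a sequence like [1, 2, 1] is still reported.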

Before reading the counters, regrtest tries to clear "all" the caches
that it knows of in the stdlib. Examples: path importer cache, re
module cache, type method cache, etc. See the dash_R_cleanup() function
in test.libregrtest.refleak.
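
To give an idea of what such a cleanup step looks like, here is a much
smaller sketch that only uses public APIs. clear_known_caches() is a
hypothetical name, it covers only a few of the caches, and it leaves
out the type method cache:

```python
import gc
import importlib
import re
import sys


def clear_known_caches():
    """Clear a few stdlib caches, then run a full garbage collection."""
    # Compiled regular expression cache of the re module.
    re.purge()

    # Path importer cache, plus the caches of the registered finders.
    sys.path_importer_cache.clear()
    importlib.invalidate_caches()

    # Collect garbage so unreachable objects do not skew the counters.
    return gc.collect()
```

The return value is the number of unreachable objects found by
gc.collect(). Calling this right before reading each counter makes the
deltas less sensitive to caches that grow on first use.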

Sadly, there are still a few false alarms from time to time, like:

"test_functools leaked [1, 2, 1] memory blocks, sum=4"
https://bugs.python.org/issue36560


> This is the best I could come up with as a simple leak detection mechanism to 
> integrate with CI services, and keep more advanced tools like Valgrind out of 
> the picture (I just wanted to know if there's a leak, not to debug the leak 
> itself). In addition, since psutil is able to get the number of fds (UNIX) 
> and handles (Windows) opened by a process, I also run a separate set of tests 
> to make sure I didn't forget to call close(2) or CloseHandle() in C.

I tried to modify regrtest to check for leaks of Windows handles:
https://bugs.python.org/issue18174

But many stdlib modules leak handles in various cases. I gave up when
I failed to fix a race condition in multiprocessing:
https://bugs.python.org/issue33966

The parent expects the child process to "steal" a handle, but
sometimes the child process is killed before it steals the handle...

--

I also tried to check for PyMem_RawMalloc() memory leaks, but it made
regrtest completely unreliable:
https://bugs.python.org/issue26850

My understanding was that CPython has many internal caches and that
regrtest fails to clear them all, but it may have been something else.
I never investigated.


> Would something like this make sense to have in cPython? Here's a quick PoC I 
> put together just to show how this thing would look like in practice:
> https://github.com/giampaolo/cpython/pull/2/files
> A proper work in terms of API coverage would result being quite huge (test 
> all C modules), and ideally should also include cases where functions raise 
> an exception when being fed with an improper input. The biggest stopper here 
> is, of course, psutil, since it's a third party dep, but before getting to 
> that I wanted to see how this idea is perceived in general.

regrtest has many features, but sadly it's not an official API. It
would be nice if someone could try to move some of its features into
unittest. Sadly, the regrtest refleak feature also relies a lot on
CPython internals.

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/X5COX25VKMH455Y7OBHJA2FFJOGNOLKC/
Code of Conduct: http://python.org/psf/codeofconduct/