Olivier Grisel <olivier.gri...@ensta.org> added the comment:

I wrote a script to monitor peak memory usage when dumping 2 GB of data with Python master (both the C pickler and the Python pickler):
```
(py37) ogrisel@ici:~/code/cpython$ python ~/tmp/large_pickle_dump.py
Allocating source data...
=> peak memory usage: 2.014 GB
Dumping to disk...
done in 5.141s
=> peak memory usage: 4.014 GB
(py37) ogrisel@ici:~/code/cpython$ python ~/tmp/large_pickle_dump.py --use-pypickle
Allocating source data...
=> peak memory usage: 2.014 GB
Dumping to disk...
done in 5.046s
=> peak memory usage: 5.955 GB
```

This is using protocol 4. Note that the C pickler makes only 1 useless memory copy, whereas the Python pickler makes 2 (one for the concatenation and the other because of the framing mechanism of protocol 4).

Here is the output with the Python pickler fixed in python/cpython#4353:

```
(py37) ogrisel@ici:~/code/cpython$ python ~/tmp/large_pickle_dump.py --use-pypickle
Allocating source data...
=> peak memory usage: 2.014 GB
Dumping to disk...
done in 6.138s
=> peak memory usage: 2.014 GB
```

Basically, the 2 spurious memory copies of the Python pickler with protocol 4 are gone.

Here is the script: https://gist.github.com/ogrisel/0e7b3282c84ae4a581f3b9ec1d84b45a

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31993>
_______________________________________
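For readers who want to try a comparison like the one in the comment without fetching the gist, here is a minimal sketch. It is not the script linked above. It assumes `tracemalloc` as the measurement tool, which only sees allocations made through Python's allocators after tracing starts, so it reports the extra memory allocated during the dump rather than the process-wide peak shown in the comment. `CountingSink` and `peak_during_dump` are hypothetical names introduced for illustration, and `pickle._Pickler` is a private CPython name for the pure-Python pickler.

```python
import pickle
import tracemalloc


class CountingSink:
    """Hypothetical write-only sink: discards data, counts bytes written."""

    def __init__(self):
        self.n_bytes = 0

    def write(self, chunk):
        # chunk may be a bytes object or a memoryview over bytes
        self.n_bytes += len(chunk)
        return len(chunk)


def peak_during_dump(data, pickler_class, protocol=4):
    """Return (extra peak bytes traced during the dump, bytes written).

    `data` must be allocated before the call, so that it is not counted
    in the traced peak.
    """
    sink = CountingSink()
    tracemalloc.start()
    try:
        pickler_class(sink, protocol=protocol).dump(data)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak, sink.n_bytes


if __name__ == "__main__":
    # Much smaller than the 2 GB used in the comment, to keep the run cheap.
    payload = b"x" * (64 * 1024 * 1024)
    picklers = [
        ("C pickler", pickle.Pickler),
        ("Python pickler", pickle._Pickler),  # private, CPython-specific
    ]
    for name, cls in picklers:
        peak, written = peak_during_dump(payload, cls)
        print(f"{name}: wrote {written / 1e6:.1f} MB, "
              f"extra peak {peak / 1e6:.1f} MB")
```

An extra peak close to one or two times the payload size would correspond to the spurious copies described in the comment. The actual numbers depend on the interpreter version and on whether it includes the fix.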