Olivier Grisel <olivier.gri...@ensta.org> added the comment:

I wrote a script to monitor memory usage while dumping 2 GB of data with 
Python master (C pickler and Python pickler):

```
(py37) ogrisel@ici:~/code/cpython$ python ~/tmp/large_pickle_dump.py
Allocating source data...
=> peak memory usage: 2.014 GB
Dumping to disk...
done in 5.141s
=> peak memory usage: 4.014 GB
(py37) ogrisel@ici:~/code/cpython$ python ~/tmp/large_pickle_dump.py --use-pypickle
Allocating source data...
=> peak memory usage: 2.014 GB
Dumping to disk...
done in 5.046s
=> peak memory usage: 5.955 GB
```

This is using protocol 4. Note that the C pickler makes only 1 unnecessary 
memory copy, whereas the Python pickler makes 2 (one for the concatenation and 
the other because of the framing mechanism of protocol 4).
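
To illustrate the concatenation copy mentioned above, here is a minimal sketch. It is not the actual pickle.py code: `CountingSink`, `peak_during`, `write_concatenated` and `write_separately` are hypothetical names made up for this example, and the payload is only 16 MiB so that it runs quickly. It compares writing `header + payload` in one call against writing the two pieces separately, using `tracemalloc` to report the extra memory allocated during each call.

```python
import pickle
import tracemalloc


class CountingSink:
    """Hypothetical file-like object: discards data, counts bytes written."""

    def __init__(self):
        self.n_bytes = 0

    def write(self, data):
        self.n_bytes += len(data)
        return len(data)


def peak_during(func, *args):
    """Return the peak number of bytes allocated while func(*args) runs."""
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def write_concatenated(sink, header, payload):
    # header + payload builds a brand new bytes object: one full copy.
    sink.write(header + payload)


def write_separately(sink, header, payload):
    # Two writes, no temporary object holding the whole payload.
    sink.write(header)
    sink.write(payload)


# Allocated before tracing starts, so it is not counted in the peaks.
payload = bytes(16 * 1024 * 1024)
# Opcode plus 8-byte little-endian length, as used for large bytes objects.
header = pickle.BINBYTES8 + len(payload).to_bytes(8, "little")

peak_concat = peak_during(write_concatenated, CountingSink(), header, payload)
peak_separate = peak_during(write_separately, CountingSink(), header, payload)

print(f"concatenated write: {peak_concat / 2**20:.1f} MiB extra")
print(f"separate writes:    {peak_separate / 2**20:.1f} MiB extra")
```

The concatenated variant should report roughly the payload size as extra memory, while the separate writes should stay close to zero.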

Here is the output with the Python pickler fixed in python/cpython#4353:

```
(py37) ogrisel@ici:~/code/cpython$ python ~/tmp/large_pickle_dump.py --use-pypickle
Allocating source data...
=> peak memory usage: 2.014 GB
Dumping to disk...
done in 6.138s
=> peak memory usage: 2.014 GB
```


Basically the 2 spurious memory copies of the Python pickler with protocol 4 
are gone.

Here is the script: 
https://gist.github.com/ogrisel/0e7b3282c84ae4a581f3b9ec1d84b45a
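
For readers who cannot open the gist, a rough sketch of this kind of measurement follows. It is not the linked script, only an approximation under stated assumptions: `dump_with_peak` is a hypothetical helper, the data is 8 MiB rather than 2 GB, and `pickle._Pickler` is the private name of the pure-Python pickler in the standard library. It uses `tracemalloc`, which counts only allocations made after tracing starts, so the numbers are the extra memory used while dumping rather than the peak memory of the whole process shown in the output above.

```python
import os
import pickle
import tempfile
import tracemalloc


def dump_with_peak(obj, path, use_pypickle=False, protocol=4):
    """Pickle obj to path; return the peak bytes allocated while dumping."""
    # pickle._Pickler is the pure-Python implementation (private name);
    # pickle.Pickler is the C implementation when it is available.
    pickler_class = pickle._Pickler if use_pypickle else pickle.Pickler
    tracemalloc.start()
    try:
        with open(path, "wb") as f:
            pickler_class(f, protocol=protocol).dump(obj)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


# Source data is allocated before tracing, so it is excluded from the peaks.
data = bytes(8 * 1024 * 1024)

peaks = {}
roundtrip_ok = {}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.pkl")
    for label, use_py in (("C pickler", False), ("Python pickler", True)):
        peaks[label] = dump_with_peak(data, path, use_pypickle=use_py)
        # Check that the file written can be loaded back unchanged.
        with open(path, "rb") as f:
            roundtrip_ok[label] = pickle.load(f) == data

for label, peak in peaks.items():
    print(f"{label}: {peak / 2**20:.1f} MiB extra while dumping")
```

The values printed depend on the Python version: interpreters that include the fix discussed here should report little extra memory for both picklers.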

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31993>
_______________________________________