[issue3119] pickle.py is limited by python's call stack

2016-07-26 Thread Tomas Gavenciak

Tomas Gavenciak added the comment:

The issue is still present in Python 2.7.12 and Python 3.5.2, and the 
implementation has not been changed in the master branch either.
You can test it with the attached program constructing a graph (simplified, but 
a realistic application), or with the following obviously deep construction:

import pickle, _pickle
a=()
for i in range(1000): a = (a,)
pickle.dumps(a) # or _pickle.dumps(a)

Any further comments or thoughts on the solution?

--
versions: +Python 3.6 -Python 3.1
Added file: http://bugs.python.org/file43897/graphtest.py

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue3119>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue3119] pickle.py is limited by python's call stack

2016-06-10 Thread Tomas Gavenciak

Tomas Gavenciak added the comment:

Hey all,

I would like to patch the C _pickle module to be non-recursive and help this 
patch go through. I have an idea on how to do that with a very small amount of 
changes below and I would like to get feedback and improvements before 
implementing it in a particular way or getting into details (basically to check 
if that is acceptable and whether I missed something major).

Also, what is missing in the Python patch to get it merged?

== Nonrecursive _pickle:

First, observe that all the save_* recursive calls do not rely on the return 
value of save_* except for error checking (which is propagated as -1 to the 
top). Also, almost all work in save_* is done before calling save() recursively 
(most of the work after calling save()s is decref or other minor cleanup).

I would propose a linked list of structures acting as a stack of objects and 
byte-sequences to be written (for the marks and opcodes). Imagine something 
like the following (a rough sketch, I need to check all the required info 
properly later):

struct _PickleStack {_PickleStack* next; PyObject *obj; char write_opcode; /* 
... objects to decref / cleanup after the write */ /* possibly pers_save flags 
*/ };

The pickle stack would be a part of Pickler and save() would push a value onto 
that stack instead of recursion. (The order of save() and calls within save_*() 
needs to be reversed, though.)

Most objects would push a fixed number of objects to the pickle stack, for 
lists and dicts I would make pickle stack entries for all the items at once 
(e.g. as in batch_list()). An alternative would be to store the last item and 
the open iterator, but that seems wasteful.

This should not slow things down significantly (if at all), as the only 
overhead is having own stack of small structures (likely less than stackframe 
size of save_*). If frequent de/allocation will be a problem, I would modify 
the allocation strategy later (to blocks or such).

Are there any issues with changing the working of save() internally? The 
functionality of reduce, custom Picklers, persistent ids and dispatch tables 
should stay the same. The unpickling is not recursive and needs no changes (and 
has its own custom stack impl).

Looking forward to your comments.

--
nosy: +gavento, serhiy.storchaka

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue3119>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com