> From: Nick Coghlan [mailto:ncogh...@gmail.com]
>
> On 27 June 2018 at 17:04, Fiedler Roman <roman.fied...@ait.ac.at> wrote:
> > Hello List,
> >
> > Context: we are conducting machine learning experiments that generate a
> > kind of nested decision tree. As the tree includes specific decision
> > elements (which require custom code to evaluate), we decided to store the
> > decision tree (the result of the analysis) as generated Python code. The
> > decision tree can thus be transferred to sensor nodes (detectors), which
> > then filter data according to the decision tree when executing the given
> > code.
> >
> > While tracking down a crash when executing that generated code, we arrived
> > at the following simplified reproducer, which crashes the interpreter (on
> > Python 2 and 3) while loading the code, before execution even starts:
> >
> > #!/usr/bin/python2 -BEsStt
> > A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([
> > A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A(None)])])])
> > ])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])
> >
> > The error message is:
> >
> > s_push: parser stack overflow
> > MemoryError
> >
> > Despite the machine having 16GB of RAM, the code cannot be loaded.
> > Splitting it into two lines using an intermediate variable is the current
> > workaround to still get it running after manual adaptation.
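A minimal sketch of the failure mode and of the line-splitting workaround mentioned above. The names `A`, `nested_call`, and `flattened_calls` are illustrative only (`A` is never called, since the code is only compiled), and the exact depth at which a single nested expression fails is platform- and version-dependent:

```python
def nested_call(depth):
    """Source like A([A([...A(None)...])]) nested `depth` levels deep."""
    return "A([" * (depth - 1) + "A(None)" + "])" * (depth - 1)

def flattened_calls(depth):
    """Equivalent logic as flat assignments: v0 = A(None); v1 = A([v0]); ...

    Each statement nests only one level, so the parser never has to
    recurse deeply regardless of the tree's total depth.
    """
    lines = ["v0 = A(None)"]
    for i in range(1, depth):
        lines.append("v%d = A([v%d])" % (i, i - 1))
    return "\n".join(lines)

# Moderate nesting compiles fine ...
compile(nested_call(50), "<generated>", "eval")
# ... and the flattened form compiles even at depths where a single
# nested expression would exhaust the parser stack.
compile(flattened_calls(5000), "<generated>", "exec")
```

Emitting one assignment per tree level is the generalization of the "split it into two lines" workaround: the generated file gets longer, but its parse depth stays constant.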
> This seems like it may indicate a potential problem in the pgen2
> parser generator, since the compilation is failing at the original
> parse step, but checking the largest version of this that CPython can
> parse on my machine gives a syntax tree of only ~77kB:
>
> >>> tree =
> parser.expr("A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A(
> [A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A(None)])])]
> )])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])")
> >>> sys.getsizeof(tree)
> 77965
>
> Attempting to print that hints more closely at the potential problem:
>
> >>> tree.tolist()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> RecursionError: maximum recursion depth exceeded while getting the
> repr of an object
>
> As far as I'm aware, the CPython parser is using the actual C stack
> for recursion, and is hence throwing MemoryError because it ran out of
> stack space to recurse into, not because it ran out of memory in
> general (RecursionError would be a more accurate exception).
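The RecursionError in Nick's session is CPython's generic recursion guard rather than anything specific to parse trees; the same failure can be reproduced with plain nested lists, assuming the default recursion limit. A small sketch:

```python
import sys

# Build a list nested well past the recursion limit. Construction is
# iterative, so it succeeds; only recursive operations on it fail.
nested = None
for _ in range(sys.getrecursionlimit() * 2):
    nested = [nested]

# repr() descends into the structure recursively and trips the guard
# (Python 3.5+ raises RecursionError; older versions, RuntimeError).
try:
    repr(nested)
    raised = False
except RecursionError:
    raised = True
```

This mirrors the `tree.tolist()` traceback above: the data structure itself is small and fits in memory comfortably; it is only the recursive traversal that hits the limit.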
That seems conclusive. Knowing the cause but fearing regressions, maybe the
code should not be changed regarding the limits (thus opening a can of
worms), but something like the following might be nice:

* Raise RecursionError('Maximum supported compile-time parser recursion
  depth of [X] exceeded, see [docuref]')
* With the python-warn-all flag, issue a warning if a file reaches half or
  75% of the limit during parsing?

> Trying your original example in PyPy (which uses a different parser
> implementation) suggests you may want to try using that as your
> execution target before resorting to switching languages entirely:
>
> >>>> tree2 =
> parser.expr("A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A(
> [A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([
> A(None)])])])])])])])])])]]))])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])
> ])")
> >>>> len(tree2.tolist())
> 5
>
> Alternatively, you could explore mimicking the way that scikit-learn
> saves its trained models (which I believe is a variation on "use
> pickle", but I've never actually gone and checked for sure).

Thank you for your very informative post; both solutions/workarounds seem
appropriate. Apart from that, the scikit-learn approach might also have the
advantage of using something more standardized, thus easing cooperation
within the scientific community. I will pass this information on to my
colleague.

LG Roman

_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
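As a footnote on the pickle-based alternative Nick mentions: a minimal sketch, assuming a hypothetical `Node` class standing in for the project's real decision-tree elements (the data is serialized instead of generating code, so the parser depth limit never comes into play):

```python
import pickle

class Node:
    """Hypothetical decision-tree node; the real project's classes differ."""
    def __init__(self, children=None):
        self.children = children if children is not None else []

def deep_tree(depth):
    # Build the tree iteratively, from the leaf upward.
    node = Node()
    for _ in range(depth - 1):
        node = Node([node])
    return node

tree = deep_tree(100)
blob = pickle.dumps(tree)       # note: pickling also recurses, so very deep
restored = pickle.loads(blob)   # trees may need sys.setrecursionlimit()

# Walk the restored tree to confirm the full depth survived the round trip.
depth = 0
node = restored
while node is not None:
    depth += 1
    node = node.children[0] if node.children else None
```

Two caveats: pickle itself recurses per nesting level, so extremely deep trees can still require raising the recursion limit; and unpickling can execute arbitrary code, so sensor nodes should only load trees from a trusted source.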