> From: Guido van Rossum [mailto:gu...@python.org]
>
> I consider this a bug -- a violation of Python's (informal) promise to the
> user that when CPython segfaults it is not the user's fault.

Strictly speaking it is not a segfault, just a parser exception that cannot
be caught (at least I failed to catch it in a quick test). It seems that the
"except" block is only parsed after the problematic code, so any "except" in
the code itself is useless. Apart from that: even when it is caught, what can
you do? Your program partially refuses to load; the only benefit is that you
can die gracefully.

> Given typical Python usage patterns, I don't consider this an important bug,
> but maybe someone is interested in trying to fix it.

Acknowledged: I do not know of any software where this has high relevance,
but my knowledge is quite limited, so I asked the PSRT beforehand to be sure.

> As far as your application is concerned, I'm not sure that generating code
> like that is the right approach. Why don't you generate a data structure and
> a little engine that walks the data structure?

That is what I told the colleague who asked me to assist in analysing the
crash, too. I guess that the "simple generator" was just easier to write and
was thus used as a starting point. And now, by chance, a model was generated
that hits the Python limit of 50 instantiations/lists per statement, or
whatever the limit really is. So there is not much "why" to be explained, it
just happened.

Kind regards,
Roman

> On Wed, Jun 27, 2018 at 12:05 AM Fiedler Roman <roman.fied...@ait.ac.at>
> wrote:
>
> Hello List,
>
> Context: we are conducting machine learning experiments that generate some
> kind of nested decision trees. As the tree includes specific decision
> elements (which require custom code to evaluate), we decided to store the
> decision tree (result of the analysis) as generated Python code. Thus the
> decision tree can be transferred to sensor nodes (detectors) that will then
> filter data according to the decision tree when executing the given code.
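
A minimal sketch of the "data structure plus little engine" idea suggested above, assuming the tree can be serialized as JSON and that the custom decision elements can be registered once as named predicates on the sensor node. All names here (PREDICATES, evaluate, the JSON layout, the record fields) are hypothetical and not taken from the actual project:

```python
import json

# Hypothetical predicate registry: the "custom code" for decision elements
# lives here once, instead of being emitted into generated source code.
PREDICATES = {
    "greater_than": lambda record, field, limit: record[field] > limit,
    "equals": lambda record, field, value: record[field] == value,
}


def evaluate(tree, record):
    """Walk the tree iteratively until a leaf is reached; return its label."""
    node = tree
    while "label" not in node:
        test = PREDICATES[node["test"]]
        branch = "yes" if test(record, *node["args"]) else "no"
        node = node[branch]
    return node["label"]


# A tiny example tree (assumed layout). In practice it would be produced by
# the learning step and shipped to the sensor node as JSON text.
TREE_JSON = """
{"test": "greater_than", "args": ["size", 100],
 "yes": {"test": "equals", "args": ["proto", "tcp"],
         "yes": {"label": "alert"},
         "no": {"label": "ignore"}},
 "no": {"label": "ignore"}}
"""

tree = json.loads(TREE_JSON)
print(evaluate(tree, {"size": 250, "proto": "tcp"}))  # alert
print(evaluate(tree, {"size": 250, "proto": "udp"}))  # ignore
```

With this layout the Python parser only ever sees the small engine; the tree itself is plain data, so its depth is limited by the JSON decoder rather than by the parser stack.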
>
> Tracking down a crash when executing that generated code, we came to the
> following simplified reproducer, which causes the interpreter to crash (on
> Python 2/3) when loading the code, before execution is started:
>
> #!/usr/bin/python2 -BEsStt
> A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A(None)])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])
>
> The error message is:
>
> s_push: parser stack overflow
> MemoryError
>
> Despite the machine having 16GB of RAM, the code cannot be loaded.
> Splitting it into two lines using an intermediate variable is the current
> workaround to still get it running after manual adaptation.
>
> As discussed on the Python security list, crashes when loading such
> decision trees or mathematical formulas (see bug report [1]) should not
> be a security problem. Even though it is not directly covered in the Python
> security model documentation [2], this case comes too close to "arbitrary
> code execution", where Python does not attempt to provide any protection.
> There might be only some border cases of affected software, e.g. Python
> sandbox systems like Zope/Plone or maybe even Python-based smart contract
> blockchains like Ethereum (I do not know if or where they use the default
> Python interpreter or work derived from it). But in both cases they would
> also come too close to violating the security model, thus no changes to
> Python are required from this side. Thus Python security suggested that the
> discussion should be continued on this list.
>
> Even when no security problem is involved, the crash is still quite an
> annoyance. Development of code generators can be a tedious task. It is then
> somewhat frustrating when your generated code is not accepted by the
> interpreter, even when you do not feel like getting close to some
> system-relevant limits, e.g.
> 50 elements in a line like above on a 16GB machine. You may adapt the
> generator, but as the error does not include any information about which
> limit you really violated (number of brackets, function calls, list
> definitions?), you can only experiment or look at the Python compiler code
> to figure that out. Even when you fix it, you have no guarantee that you
> will not hit some other obscure limit the next day, or that those limits
> will not change from one Python minor version to the next, causing
> regressions.
>
> Questions:
>
> * Do you deem it possible/sensible to even attempt to write a Python
> language code generator that will produce non-malicious, syntactically valid
> decision tree code/mathematical formulas and still have a sufficiently high
> probability that the Python interpreter will also run that code now and in
> the near future (regressions)?
>
> * Assuming yes to the question above, when generating code, what
> should be the maximal nesting depth a code generator can always expect to
> be compiled on Python 2.7 and on 3.5 onwards? Are there any other similar
> restrictions that need to be considered by the code generator? Or is
> generating code that way not the preferred solution anyway, i.e. should the
> code generator produce e.g. binary Python code immediately? Note: in the
> end the exact same logic will run as a Python process; it seems it is only
> about how it is loaded into the Python interpreter.
>
> * If not possible/recommended/sensible, we might generate Java bytecode
> or native x86 code instead, where the likelihood of the (virtual) CPU
> really executing code that is compliant with the language specification
> (even with CPU errata like the FDIV bug et al.) might be orders of magnitude
> higher than with the Python interpreter.
>
> Any feedback appreciated!
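
On the nesting-depth question: the intermediate-variable workaround mentioned earlier can be automated in the generator. The following is only a sketch under assumptions: the tree is taken to be a nested Python list with None as the leaf, A is a stand-in for the real constructor, and emit_flat and the _n<i> variable names are hypothetical. It emits one assignment per node, so no generated statement nests deeper than a single A([...]) call, whatever the depth of the tree:

```python
class A(object):
    """Stand-in for the real node class used by the generated code."""

    def __init__(self, children):
        self.children = children


def emit_flat(tree, result_name="result"):
    """Emit one assignment per node, children first (iterative post-order)."""
    lines = []
    # Each frame is [node, index of next child to visit, child variable names].
    stack = [[tree, 0, []]]
    while stack:
        node, index, child_names = stack[-1]
        if node is not None and index < len(node):
            stack[-1][1] += 1
            stack.append([node[index], 0, []])
            continue
        stack.pop()
        # The root gets the result name, every other node a temporary name.
        name = result_name if not stack else "_n%d" % len(lines)
        if node is None:
            lines.append("%s = A(None)" % name)
        else:
            lines.append("%s = A([%s])" % (name, ", ".join(child_names)))
        if stack:
            stack[-1][2].append(name)
    return "\n".join(lines) + "\n"


# Build a chain of 50 nodes, similar in shape to the reproducer.
tree = None
for _ in range(49):
    tree = [tree]

source = emit_flat(tree)
namespace = {"A": A}
exec(compile(source, "<generated>", "exec"), namespace)
print(len(source.splitlines()))  # 50
```

The generated file grows by one line per node instead of one bracket level per node, which trades parser stack depth for module length. Whether other limits apply to very long modules is not covered by this sketch.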
>
> Roman
>
> [1] https://bugs.python.org/issue3971
> [2] http://python-security.readthedocs.io/security.html#security-model
> _______________________________________________
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
> --
> --Guido van Rossum (python.org/~guido)