[issue34707] Python not reentrant

2018-09-17 Thread john skaller


john skaller added the comment:

eric: yes, that's relevant. Very happy it is discussed.

Contrary to some indication in the post, passing a context handle around 
everywhere is NOT a burden at all. My system does exactly that.

I would note that APIs which already require, say, an interpreter handle don't 
require any modification.

Also, I would note that legacy APIs do not have to be broken: you just keep a 
single, legacy, global variable holding the default context, and deprecate any 
functions using it.
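
A minimal sketch of that arrangement in C. Every name here (ctx, 
ctx_err_occurred, legacy_err_occurred and so on) is hypothetical and chosen 
only for illustration; none of it is the actual CPython API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical context type; not part of the CPython API. */
typedef struct ctx {
    int error_set;               /* per-context error indicator */
} ctx;

/* Re-entrant forms: the caller always passes the context explicitly. */
int ctx_err_occurred(const ctx *c) {
    assert(c != NULL);
    return c->error_set;
}

void ctx_set_error(ctx *c) {
    assert(c != NULL);
    c->error_set = 1;
}

/* The single legacy global holding the default context. */
static ctx legacy_default_ctx = {0};

/* Legacy forms: kept so old callers still compile, and marked as
 * deprecated in the documentation. They only forward to the default
 * context. */
int legacy_err_occurred(void) {
    return ctx_err_occurred(&legacy_default_ctx);
}

void legacy_set_error(void) {
    ctx_set_error(&legacy_default_ctx);
}
```

New code uses only the ctx_* functions and owns its context; old code keeps 
working through the legacy wrappers and shares the one default context.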

--

___
Python tracker 
<https://bugs.python.org/issue34707>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34707] Python not reentrant

2018-09-16 Thread john skaller


New submission from john skaller:

Executive Summary: Python is currently not properly re-entrant. This comment 
applies to the C API, and particularly to embedding. A fix is not possible in 
Python 3.x but should be scheduled for Python 4. On Linux, all binary plugins 
are broken as well.

The fault is exhibited by the need to first call Py_Initialize(). This is 
clearly wrong because the call takes no handle, so there is nowhere to put the 
initialised data except global state. The correct sequence should be to first 
create an interpreter handle, and then initialise that. Other API calls exhibit 
the same fault, for example PyErr_Occurred().
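
The handle-first sequence could look roughly like the sketch below. The 
interp type and the interp_* functions are hypothetical names invented for 
this example; they are not CPython functions, and a real interpreter would 
carry far more state.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical interpreter handle; not part of the CPython API. */
typedef struct interp {
    int initialised;
    int error_set;
} interp;

/* Step 1: create the handle. All state lives inside it. */
interp *interp_new(void) {
    return calloc(1, sizeof(interp));
}

/* Step 2: initialise that handle. Returns 0 on success, -1 if this
 * particular handle was already initialised. */
int interp_init(interp *it) {
    assert(it != NULL);
    if (it->initialised)
        return -1;
    it->initialised = 1;
    return 0;
}

/* Error queries take the handle too, so errors never leak between
 * interpreters. */
void interp_set_error(interp *it) {
    assert(it != NULL);
    it->error_set = 1;
}

int interp_err_occurred(const interp *it) {
    assert(it != NULL);
    return it->error_set;
}

void interp_free(interp *it) {
    free(it);
}
```

Two callers that each create and initialise their own handle never see each 
other's state, however many times the library containing them is loaded.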

Use of thread local storage is NOT enough.

A general embedding scenario is this: a thunk program is used to dynamically 
load a shared library and execute a function in it. That function may load 
other shared libraries. Note carefully there is no global data, the libraries 
are pure code. [This is not an imagined scenario, my whole system works this 
way]

The same library may be loaded several times. For example, A can load B and C, 
and both B and C can load D. Proper visibility control means A cannot see any 
symbols of D.

In this scenario if D wishes to run a Python interpreter, it must call 
Py_Initialize(), and it will be called twice, since D is called twice, once 
from B, and once from C. Indeed, if the top level spawns multiple threads, it 
can be called many more times than that.

Remember the libraries are pure code and fully reentrant. There is no way to 
record if a function has been called already.

In order for Python to be fully re-entrant there is a simple test: if the C 
code of the Python library contains ANY global variables at all then Python is 
wrong. Global variables INCLUDE thread local storage. ALL data and ALL 
functions must hang off a handle so that all functionality and behaviour is 
fully isolated to each handle.

Exceptions to the rule: poorly designed OSes such as Unix have some 
non-reentrant features. The worst of these in Unix is signal handling. It is 
not possible to handle signals without a global variable to communicate between 
the signal handler and the application. The right way to do this would have 
been to use a polling service to detect the signal. In any case, systems like 
Python sometimes have to work with badly designed APIs, and these special cases 
therefore form legitimate exceptions to the requirement that the API be 
re-entrant. My recommendation is to provide a cheat API which looks re-entrant 
but actually isn't, because of necessity it delegates to a hidden lower level 
which isn't. YMMV: how to handle bad underlying APIs should be open for 
discussion.
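
One possible shape for such a cheat API, as a sketch only. The sig_source 
type and its functions are hypothetical; the only real calls are the standard 
C signal() and the sig_atomic_t type. The hidden global is exactly the 
non-reentrant lower level, and the check-then-clear in the poll function is 
not race-free, so this shows the layering and nothing more.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* The unavoidable hidden global: written by the handler, read by the
 * poller. This is the non-reentrant lower level. */
static volatile sig_atomic_t pending_signal = 0;

static void on_signal(int signo) {
    pending_signal = signo;
}

/* Hypothetical handle that makes the API look re-entrant. */
typedef struct sig_source {
    int signo;
} sig_source;

/* Installs the handler for signo. Returns 0 on success, -1 on failure. */
int sig_source_open(sig_source *s, int signo) {
    assert(s != NULL);
    s->signo = signo;
    return signal(signo, on_signal) == SIG_ERR ? -1 : 0;
}

/* Polling service: returns 1 and clears the flag if the signal has
 * arrived since the last poll, otherwise returns 0. */
int sig_source_poll(sig_source *s) {
    assert(s != NULL);
    if (pending_signal == s->signo) {
        pending_signal = 0;
        return 1;
    }
    return 0;
}
```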

Other consequences: on Linux at least, ALL plugin extensions are built 
incorrectly. The correct way to build a plugin requires explicitly linking 
against the Python library, so that symbols in the Python API can be found. 
These symbols must NOT be looked up in the application because this is, quite 
simply, not possible if the application does not include those symbols. In my 
scenario, the top level application is three lines of C that does nothing other 
than load a library and run a fixed function in it. And that library has no 
idea that one of the libraries IT loads may call another library which happens 
to want to run some Python code. Indeed my system can *generate* Python 
modules, and compile and link them against the Python library, but it cannot 
load any existing plugins on Linux, because those plugins were incorrectly 
built and do not link to the Python library as they should. They expect to find 
symbols magically provided in the symbol table, but those symbols are not there.

On OSX, however, it works. That is because on OSX, a --framework is used to 
contain the Python library and all plugins HAVE to be linked against the 
framework. I expect the Windows builds to work too, for the same reason (but 
I'm not sure).

This issue is related to the lack of re-entrancy because the same principle is 
broken in both cases. If you need a service, you must ask for it, and when you 
get it, it is exclusively yours.

--
components: Interpreter Core
messages: 325508
nosy: skaller
priority: normal
severity: normal
status: open
title: Python not reentrant
type: behavior
versions: Python 3.8
