New submission from Thomas Wouters <tho...@python.org>:

This is a continuation, of sorts, of issue16421; adding most of that issue's 
audience to the nosy list.

When importing the same extension module under multiple names that share the 
same basename, Python 3 will call the extension module's init function multiple 
times. With extension modules that do not support re-initialisation, this 
causes them to trample all over their own state. In the case of numpy, this 
corrupts CPython internal data structures, like builtin types.

Simple reproducer:
% python3.6 -m venv numpy-3.6
% numpy-3.6/bin/python -m pip install numpy
% PYTHONPATH=./numpy-3.6/lib/python3.6/site-packages/numpy/core/ \
    ./numpy-3.6/bin/python -c "import numpy.core.multiarray, multiarray; u'' < 1"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
Segmentation fault

(The corruption happens because PyInit_multiarray initialises subclasses of 
builtin types, which causes them to share some data (e.g. tp_as_number) with 
the base class: 
https://github.com/python/cpython/blob/master/Objects/typeobject.c#L5277. 
Calling it a second time then copies data from a different class into that 
shared data, corrupting the base class: 
https://github.com/python/cpython/blob/master/Objects/typeobject.c#L4950. The 
Py_TPFLAGS_READY flag is supposed to protect against this, but 
PyInit_multiarray resets the tp_flags value. I ran into this because we have 
code that vendors numpy and imports it in two different ways.)

The specific case of numpy is somewhat convoluted and exacerbated by dubious 
design choices in numpy, but it is not hard to show that calling an extension 
module's PyInit function twice (if the module doesn't support reinitialisation 
through PEP 3121) is bad: any C globals initialised in the PyInit function will 
be trampled on.

This was not a problem in Python 2 because the extension module cache was 
keyed purely on filename. The cache was changed in response to issue16421, but 
the intent there appears to be to call *different* PyInit functions in the 
same file. However, because PyInit function names are derived from the 
*basename* of the module, not the full module name, a different module name 
does not mean a different init function name.
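
A rough sketch of the name derivation described above, assuming ASCII module 
names only. The helper init_function_name is hypothetical and written for 
illustration; it is not CPython's actual lookup code.

```python
def init_function_name(fullname: str) -> str:
    """Return the init symbol for an ASCII module name (illustrative)."""
    # Only the last dotted component of the module name is used.
    basename = fullname.rpartition(".")[2]
    return "PyInit_" + basename


# Two different module names, one init function name.
print(init_function_name("numpy.core.multiarray"))  # PyInit_multiarray
print(init_function_name("multiarray"))             # PyInit_multiarray
```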

I think the right approach is to change the extension module cache to key on 
filename and init function name, although this is a little tricky: the init 
function name is calculated much later in the process. Alternatively, key it on 
filename and module basename, rather than full module name.
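
To make the difference between the two keying schemes concrete, here is a 
small simulation. Everything in it is an assumption made for illustration: 
load_extension, the two key functions, and the path are hypothetical, and the 
real cache lives in C inside the import machinery. Returning the same module 
object for both names is a simplification of what a real fix would need to 
decide.

```python
from collections import Counter

init_calls = Counter()


def load_extension(cache, key_func, fullname, filename):
    """Hypothetical loader: run the init function only on a cache miss."""
    key = key_func(fullname, filename)
    if key not in cache:
        init_name = "PyInit_" + fullname.rpartition(".")[2]
        init_calls[(key_func.__name__, filename, init_name)] += 1
        cache[key] = object()  # stand-in for the module object
    return cache[key]


def key_on_fullname(fullname, filename):
    # Models the behaviour described in this report.
    return (filename, fullname)


def key_on_basename(fullname, filename):
    # Models the alternative suggested above.
    return (filename, fullname.rpartition(".")[2])


path = "/site-packages/numpy/core/multiarray.so"  # illustrative path only
for key_func in (key_on_fullname, key_on_basename):
    cache = {}
    a = load_extension(cache, key_func, "numpy.core.multiarray", path)
    b = load_extension(cache, key_func, "multiarray", path)
    calls = init_calls[(key_func.__name__, path, "PyInit_multiarray")]
    print(key_func.__name__, "init calls:", calls, "same module:", a is b)
```

With the full module name in the key, the simulated init function runs twice 
for the same file; with the basename in the key, it runs once.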

----------
messages: 313064
nosy: Arfrever, amaury.forgeotdarc, asvetlov, brett.cannon, eric.snow, eudoxos, 
ncoghlan, pitrou, r.david.murray, twouters, vstinner
priority: normal
severity: normal
status: open
title: Importing the same extension module under multiple names breaks 
non-reinitialisable extension modules
type: behavior

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue32973>
_______________________________________