On Mon, Aug 6, 2018 at 8:11 PM Eddie Elizondo <eelizo...@fb.com> wrote:
>
> Background:
>
> Through the implementation of an alternate runtime I've been poking around 
> some of the class initialization routines and I found out that there was a 
> subtle bug with PyType_Ready and the header initializer PyVarObject_HEAD_INIT.
>
>
>
> Looking through the codebase, I couldn't really find any pattern of when the 
> type should be defined within PyVarObject_HEAD_INIT. Sometimes it was 
> initialized to NULL (or 0) and PyType_Type (let's ignore Py_True and Py_False 
> from now).
>
>
>
> From PyType_Ready it turns out that setting the value PyType_Type is never 
> actually needed outside of PyType_Type and PyBaseObject type. This is clear 
> from the code:
>
> if (Py_TYPE(type) == NULL && base != NULL)
>
>     Py_TYPE(type) = Py_TYPE(base);
>
>
>
> Given that any PyTypeObject's base is of type PyType_Type, setting 
> PyVarObject_HEAD_INIT(&PyType_Ready) is superfluous. Therefore, setting all 
> static PyTypeObjects to their ob_type to NULL should be a safe assumption to 
> make.
>
>
>
> Uninitialized Types:
>
> A quick s/PyVarObject_HEAD_INIT(&PyType_Type/PyVarObject_HEAD_INIT(NULL/  
> shows that some objects do need to have their ob_type set from the outset, 
> violating the previous assumption. After writing a quick script, I found that 
> out of the ~300 PyVarObject_HEAD_INIT present in CPython, only these 14 types 
> segfaulted:
>
>
>
> PyByteArrayIter_Type
>
> PyBytesIter_Type
>
> PyDictIterKey_Type
>
> PyDictIterValue_Type
>
> PyDictIterItem_Type
>
> PyClassMethod_Type
>
> PyAsyncGen_Type
>
> PyListIter_Type
>
> PyListRevIter_Type
>
> PyODictIter_Type
>
> PyLongRangeIter_Type
>
> PySetIter_Type
>
> PyTupleIter_Type
>
> PyUnicodeIter_Type
>
>
>
>
>
> Bug:
>
> It turns out that these PyTypeObjects are never initialized through 
> PyType_Ready. However, they are used as fully initialized types. It is by 
> pure chance that the work without having to call the initializer on them. 
> This though is undefined behavior. This not only might result in a weird 
> future bug which is hard to chase down but also, it affects other runtimes as 
> this behavior depends on implementation details of CPython.
>
>
>
> This is a pervasive pattern that should be removed from the codebase and 
> ideally extensions should follow as well.
>
>
>
> Solution:
>
> Here are my proposed solutions in order from less controversial to most 
> controversial. Note that all of them I already tried in a local branch and 
> are working:
>
>
>
> 1) Initialize all uninitialized types.
>
>
>
> Example: 
> https://github.com/eduardo-elizondo/cpython/commit/bc53db3cf4e5a6923b0b1afa6181305553faf173
>
> 2) Move all PyVarObject_HEAD_INIT to NULL except PyType_Type, 
> PyBaseObject_Type and the booleans.
>
>
>
> 3) Special case the initialization of PyType_Type and PyBaseObject_Type 
> within PyType_Ready to now make all calls to PyVarObject_HEAD_INIT use NULL. 
> To enable this a small change within PyType_Ready is needed to initialize 
> PyType_Type PyBaseObject:

Coincidentally, I just wrote a long-ish blog post explaining in
technical details why PyVarObject_HEAD_INIT(&PyType_Type) pretty much
cannot work, at least for extension modules (it is not a problem in
the core library), on Windows (my post was focused on Cygwin but it is
a problem for Windows in general):
http://iguananaut.net/blog/programming/windows-data-import.html

The TL;DR is that it's not possible on Windows to initialize a struct
with a pointer to some other data that's found in another DLL (i.e.
&PyType_Type), unless it happens to be a function, as a special case.
But &PyType_Type obviously is not, so thinks break.

So I'm +1 for requiring passing NULL to PyVarObject_HEAD_INIT,
requiring PyType_Ready with an explicit base type argument, and maybe
(eventually) making PyVarObject_HEAD_INIT argumentless.
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Reply via email to