Re: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)
On 10 Aug 2013 21:06, "Eli Bendersky" wrote:
>
> On Sat, Aug 10, 2013 at 5:47 PM, Nick Coghlan wrote:
>>
>> In a similar vein, Antoine recently noted that the fact the per-module
>> state isn't a real PyObject creates a variety of interesting lifecycle
>> management challenges.
>>
>> I'm not seeing an easy solution, either, except to automatically skip
>> reinitialization when the module has already been imported.
>
> This solution has problems. For example, in the case of ET it would
> preclude testing what happens when pyexpat is disabled (remember we were
> discussing this...). This is because there would be no real way to create
> new instances of such modules (they would all cache themselves in the init
> function - similarly to what ET now does in trunk, because otherwise some
> of its global-dependent crazy tests fail).

Right, it would still be broken, just in a less horrible way.

> A more radical solution would be to *really* have multiple instances of
> state per sub-interpreter. Well, they already exist -- it's
> PyState_FindModule which is the problematic one because it only remembers
> the last one. But I see that it's only being used by extension modules
> themselves, to efficiently find modules they belong to. It feels a bit
> like a hack that was made to avoid rewriting lots of code, because in
> general a module's objects *can* know which module instance they came
> from. E.g. it can be saved as a private field in classes exported by the
> module.
>
> So a more radical approach would be:
>
> PyState_FindModule can be deprecated, but still exist and be documented to
> return the state of the *last* module created in this sub-interpreter.
> stdlib extension modules that actually use this mechanism can be rewritten
> to just remember the module for real, and not rely on PyState_FindModule
> to fetch it from a global cache. I don't think this would be hard, and it
> would make the good intention of PEP 3121 more real - actual independent
> state per module instance.

Sounds promising to me. I suspect handling exported functions will prove
to be tricky, though - they may need to be redesigned to behave more like
"module methods".

> Eli

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)
On Sat, 10 Aug 2013 18:06:02 -0700
Eli Bendersky wrote:
> This solution has problems. For example, in the case of ET it would
> preclude testing what happens when pyexpat is disabled (remember we were
> discussing this...). This is because there would be no real way to create
> new instances of such modules (they would all cache themselves in the init
> function - similarly to what ET now does in trunk, because otherwise some
> of its global-dependent crazy tests fail).
>
> A more radical solution would be to *really* have multiple instances of
> state per sub-interpreter. Well, they already exist -- it's
> PyState_FindModule which is the problematic one because it only remembers
> the last one.
I'm not sure I understand your diagnosis. modules_by_index (and
PyState_FindModule) is per-interpreter so we already have a
per-interpreter state here. Something else must be interfering.
Note that module state is just a field attached to the module object
("void *md_state" in PyModuleObject). It's really the extension modules
which are per-interpreter, which is a good thing.
Regards
Antoine.
Re: [Python-Dev] Green buildbot failure.
On Sat, 10 Aug 2013 21:40:46 -0400
Terry Reedy wrote:
>
> This run recorded here shows a green test (it appears to have timed out)
> http://buildbot.python.org/all/builders/x86%20Windows7%203.x/builds/7017
> but the corresponding log for this Windows bot
> http://buildbot.python.org/all/builders/x86%20Windows7%203.x/builds/7017/steps/test/logs/stdio
> has the expected os.chown failure.

You've got the answer at the bottom:

"program finished with exit code 0"

So for some reason, the test suite crashed, but with a successful exit
code. Buildbot thinks it ran fine.

> Are such green failures intended?

Not really, no.

Regards
Antoine.
Re: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)
Hi Eli,

On Sat, 10 Aug 2013 17:12:53 -0700
Eli Bendersky wrote:
>
> Note how doing some sys.modules acrobatics and re-importing suddenly
> changes the internal state of a previously imported module. This happens
> because:
>
> 1. The first import of 'csv' (which then imports '_csv') creates
> module-specific state on the heap and associates it with the current
> sub-interpreter. The list of dialects, amongst other things, is in that
> state.
> 2. The 'del's wipe 'csv' and '_csv' from the cache.
> 3. The second import of 'csv' also creates/initializes a new '_csv' module
> because it's not in sys.modules. This *replaces* the per-sub-interpreter
> cached version of the module's state with the clean state of a new module

I would say this is pretty much expected. The converse would be a bug
IMO (but perhaps Martin disagrees). PEP 3121's stated goal is not only
subinterpreter support:

"Extension module initialization currently has a few deficiencies.
There is no cleanup for modules, the entry point name might give
naming conflicts, the entry functions don't follow the usual calling
convention, and multiple interpreters are not supported well."

Re-initializing state when importing a module anew makes extension
modules more like pure Python modules, which is a good thing.

I think the piece of interpretation you offered yesterday on IRC may be
the right explanation for the ET shenanigans:

"Maybe the bug is that ParseError is kept in per-module state, and
also exported from the module?"

PEP 3121 doesn't offer any guidelines for using its API, and its example
shows PyObject* fields in a module state.

I'm starting to think that it might be a bad use of PEP 3121. PyObjects
can, and therefore should, be stored in the extension module dict where
they will participate in normal resource management (i.e. garbage
collection). If they are in the module dict, then they shouldn't be held
alive by the module state too, otherwise the (currently tricky) lifetime
management of extension modules can produce oddities.

So, the PEP 3121 "module state" pointer (the optional opaque void*
thing) should only be used to hold non-PyObjects. PyObjects should go
to the module dict, like they do in normal Python modules. Now, the
reason our PEP 3121 extension modules abuse the module state pointer to
keep PyObjects is two-fold:

1. it's surprisingly easier (it's actually a one-liner if you don't
handle errors - a rather bad thing, but all PEP 3121 extension modules
currently don't handle a NULL return from PyState_FindModule...)

2. it protects the module from any module dict monkeypatching. It's not
important if you are using a generic API on the PyObject, but it is if
the PyObject is really a custom C type with well-defined fields.

Those two issues can be addressed if we offer an API for it. How about:

  PyObject *PyState_GetModuleAttr(struct PyModuleDef *def,
                                  const char *name,
                                  PyObject *restrict_type)

  *def* is a pointer to the module definition.
  *name* is the attribute to look up on the module dict.
  *restrict_type*, if non-NULL, is a type object the looked up attribute
  must be an instance of.

  Look up an attribute in the current interpreter's extension module
  instance for the module definition *def*.
  Returns a *new* reference (!), or NULL if an error occurred.
  An error can be:
  - no such module exists for the current interpreter (ImportError?
    RuntimeError? SystemError?)
  - no such attribute exists in the module dict (AttributeError)
  - the attribute doesn't conform to *restrict_type* (TypeError)

So code can be written like:

  PyObject *dialects = PyState_GetModuleAttr(
      &_csvmodule, "dialects", &PyDict_Type);
  if (dialects == NULL)
      return NULL;

Regards
Antoine.
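The lookup rules proposed above can be sketched in Python. This is only a toy model of the semantics described in the message, not an existing CPython API: `get_module_attr`, `fake_csv` and the choice of RuntimeError for the "no such module" case are all assumptions made for illustration (the proposal itself leaves that exception type open).

```python
import types


def get_module_attr(module, name, restrict_type=None):
    """Toy model of the proposed lookup; not a real CPython function."""
    if module is None:
        # The proposal leaves the exception type open; RuntimeError is a guess.
        raise RuntimeError("no module instance for this interpreter")
    try:
        # Look in the module dict, as the proposal describes.
        value = vars(module)[name]
    except KeyError:
        raise AttributeError(name) from None
    if restrict_type is not None and not isinstance(value, restrict_type):
        raise TypeError(
            f"{name!r} must be {restrict_type.__name__}, "
            f"got {type(value).__name__}"
        )
    return value


fake_csv = types.ModuleType("fake_csv")  # hypothetical stand-in for _csv
fake_csv.dialects = {}

dialects = get_module_attr(fake_csv, "dialects", dict)

# Monkeypatching the module dict is caught by the type restriction
# instead of reaching code that expects a dict.
fake_csv.dialects = None
try:
    get_module_attr(fake_csv, "dialects", dict)
except TypeError as exc:
    error = exc
```

The type check is what addresses the monkeypatching concern (reason 2 above): a replaced attribute produces a TypeError rather than being handed to code that assumes a particular layout.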
Re: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)
On Sun, 11 Aug 2013 12:33:16 +0200
Antoine Pitrou wrote:
> So, the PEP 3121 "module state" pointer (the optional opaque void*
> thing) should only be used to hold non-PyObjects. PyObjects should go
> to the module dict, like they do in normal Python modules. Now, the
> reason our PEP 3121 extension modules abuse the module state pointer to
> keep PyObjects is two-fold:
>
> 1. it's surprisingly easier (it's actually a one-liner if you don't
> handle errors - a rather bad thing, but all PEP 3121 extension modules
> currently don't handle a NULL return from PyState_FindModule...)
>
> 2. it protects the module from any module dict monkeypatching. It's not
> important if you are using a generic API on the PyObject, but it is if
> the PyObject is really a custom C type with well-defined fields.

I overlooked a third reason which is performance. But, those lookups are
generally not performance-critical.

Regards
Antoine.
Re: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)
On 11 August 2013 06:33, Antoine Pitrou wrote:
> So code can be written like:
>
>   PyObject *dialects = PyState_GetModuleAttr(
>       &_csvmodule, "dialects", &PyDict_Type);
>   if (dialects == NULL)
>       return NULL;

This sounds like a good near term solution to me.

Longer term, I think there may be value in providing a richer
extension module initialisation API that lets extension modules be
represented as module *subclasses* in sys.modules, since that would
get us to a position where it is possible to have *multiple* instances
of an extension module in the *same* subinterpreter by holding on to
external references after removing them from sys.modules (which is
what we do in the test suite for pure Python modules).

Enabling that also ties into the question of passing info to the
extension module about how it is being loaded (e.g. as a submodule of
a larger package), as well as allowing extension modules to cleanly
handle reload(). However, that's dependent on the ModuleSpec idea we're
currently thrashing out on import-sig (and should be able to bring to
python-dev soon), and I think getting that integrated at all will be
ambitious enough for 3.4 - using it to improve extension module
handling would then be a project for 3.5.

Cheers,
Nick.

--
Nick Coghlan | [email protected] | Brisbane, Australia
Re: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)
On Sun, 11 Aug 2013 07:04:40 -0400
Nick Coghlan wrote:
> On 11 August 2013 06:33, Antoine Pitrou wrote:
> > So code can be written like:
> >
> >   PyObject *dialects = PyState_GetModuleAttr(
> >       &_csvmodule, "dialects", &PyDict_Type);
> >   if (dialects == NULL)
> >       return NULL;
>
> This sounds like a good near term solution to me.
>
> Longer term, I think there may be value in providing a richer
> extension module initialisation API that lets extension modules be
> represented as module *subclasses* in sys.modules, since that would
> get us to a position where it is possible to have *multiple* instances
> of an extension module in the *same* subinterpreter by holding on to
> external references after removing them from sys.modules (which is
> what we do in the test suite for pure Python modules).

Either that, or add a "struct PyMemberDef *m_members" field to
PyModuleDef, to enable looking up stuff in the m_state using regular
attribute lookup.

Unfortunately, doing so would probably break the ABI. Also, allowing
for module subclasses is probably more flexible in the long term. We
just need to devise a convenience API for that (perhaps by allowing to
create both the subclass *and* instantiate it in a single call).

> However, that's dependent on the ModuleSpec idea we're
> currently thrashing out on import-sig (and should be able to bring to
> python-dev soon), and I think getting that integrated at all will be
> ambitious enough for 3.4 - using it to improve extension module
> handling would then be a project for 3.5.

Sounds reasonable.

Regards
Antoine.
Re: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)
Antoine Pitrou, 11.08.2013 12:33:
> On Sat, 10 Aug 2013 17:12:53 -0700
> Eli Bendersky wrote:
>> Note how doing some sys.modules acrobatics and re-importing suddenly
>> changes the internal state of a previously imported module. This happens
>> because:
>>
>> 1. The first import of 'csv' (which then imports '_csv') creates
>> module-specific state on the heap and associates it with the current
>> sub-interpreter. The list of dialects, amongst other things, is in that
>> state.
>> 2. The 'del's wipe 'csv' and '_csv' from the cache.
>> 3. The second import of 'csv' also creates/initializes a new '_csv' module
>> because it's not in sys.modules. This *replaces* the per-sub-interpreter
>> cached version of the module's state with the clean state of a new module
>
> I would say this is pretty much expected. The converse would be a bug
> IMO (but perhaps Martin disagrees). PEP 3121's stated goal is not only
> subinterpreter support:
>
> "Extension module initialization currently has a few deficiencies.
> There is no cleanup for modules, the entry point name might give
> naming conflicts, the entry functions don't follow the usual calling
> convention, and multiple interpreters are not supported well."
>
> Re-initializing state when importing a module anew makes extension
> modules more like pure Python modules, which is a good thing.

It's the same as defining a type or function in a loop, or inside of a
closure. The whole point of reimporting is that you get a new module.
However, it should not change the content of the old module, just create a
new one.

> So, the PEP 3121 "module state" pointer (the optional opaque void*
> thing) should only be used to hold non-PyObjects. PyObjects should go
> to the module dict, like they do in normal Python modules. Now, the
> reason our PEP 3121 extension modules abuse the module state pointer to
> keep PyObjects is two-fold:
>
> 1. it's surprisingly easier (it's actually a one-liner if you don't
> handle errors - a rather bad thing, but all PEP 3121 extension modules
> currently don't handle a NULL return from PyState_FindModule...)
>
> 2. it protects the module from any module dict monkeypatching. It's not
> important if you are using a generic API on the PyObject, but it is if
> the PyObject is really a custom C type with well-defined fields.

Yes, it's a major safety problem if you can crash the interpreter by
assigning None to a module attribute.

> Those two issues can be addressed if we offer an API for it. How about:
>
>   PyObject *PyState_GetModuleAttr(struct PyModuleDef *def,
>                                   const char *name,
>                                   PyObject *restrict_type)
>
>   *def* is a pointer to the module definition.
>   *name* is the attribute to look up on the module dict.
>   *restrict_type*, if non-NULL, is a type object the looked up attribute
>   must be an instance of.
>
>   Look up an attribute in the current interpreter's extension module
>   instance for the module definition *def*.
>   Returns a *new* reference (!), or NULL if an error occurred.
>   An error can be:
>   - no such module exists for the current interpreter (ImportError?
>     RuntimeError? SystemError?)
>   - no such attribute exists in the module dict (AttributeError)
>   - the attribute doesn't conform to *restrict_type* (TypeError)
>
> So code can be written like:
>
>   PyObject *dialects = PyState_GetModuleAttr(
>       &_csvmodule, "dialects", &PyDict_Type);
>   if (dialects == NULL)
>       return NULL;

At least for Cython it's unlikely that it'll ever use this. It's just way
too much overhead for looking up a global name. Plus, not all global names
are visible in the module dict, e.g. it's common to have types that are
only used internally to keep some kind of state. Those would still have to
live in the internal per-module state.

ISTM that this is not a proper solution for the problem, because it only
covers the simple use cases. Rather, I'd prefer making the handling of
names in the per-module instance state safer.

Essentially, with PEP 3121, modules are just one form of an extension type.
So what's wrong with giving them normal extension type fields? Functions
are essentially methods of the module, global types are just inner classes.
Both should keep the module alive (on the one side) and be tied to it (on
the other side). If you reimport a module, you'd get a new set of
everything, and the old module would just linger in the background until
the last reference to it dies.

In other words, I don't see why modules should be any special.

Stefan
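For pure Python modules, the re-import behaviour described above (a new module is created, the old one is left untouched and lingers while something still references it) can be seen with a short sketch. The stdlib `colorsys` module is used here only because it is small and pure Python; the `marker` attribute is a made-up name for illustration.

```python
import sys
import colorsys as first

# Attach some state to the first module instance.
first.marker = "set on the first instance"

# Drop the cache entry but keep our own reference to the old module.
saved = sys.modules.pop("colorsys")
import colorsys as second  # executes the module body again

same_object = first is second            # a fresh module object was created
leaked_state = hasattr(second, "marker")  # state did not carry over
old_state_intact = first.marker           # the old module is unchanged

# Put the original entry back so later imports see the first instance.
sys.modules["colorsys"] = saved
```

Here `same_object` and `leaked_state` are both false: the two instances are independent, which is the behaviour the thread would like extension modules to share.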
Re: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)
Antoine Pitrou, 11.08.2013 13:48:
> On Sun, 11 Aug 2013 07:04:40 -0400
> Nick Coghlan wrote:
>> On 11 August 2013 06:33, Antoine Pitrou wrote:
>>> So code can be written like:
>>>
>>>   PyObject *dialects = PyState_GetModuleAttr(
>>>       &_csvmodule, "dialects", &PyDict_Type);
>>>   if (dialects == NULL)
>>>       return NULL;
>>
>> This sounds like a good near term solution to me.
>>
>> Longer term, I think there may be value in providing a richer
>> extension module initialisation API that lets extension modules be
>> represented as module *subclasses* in sys.modules, since that would
>> get us to a position where it is possible to have *multiple* instances
>> of an extension module in the *same* subinterpreter by holding on to
>> external references after removing them from sys.modules (which is
>> what we do in the test suite for pure Python modules).
>
> Either that, or add a "struct PyMemberDef *m_members" field to
> PyModuleDef, to enable looking up stuff in the m_state using regular
> attribute lookup.

Hmm, yes, it's unfortunate that the module state isn't just a public part
of the object struct.

> Unfortunately, doing so would probably break the ABI. Also, allowing
> for module subclasses is probably more flexible in the long term.

+1000

> We just need to devise a convenience API for that (perhaps by allowing to
> create both the subclass *and* instantiate it in a single call).

Right. This conflicts somewhat with the simplified module creation. If the
module loader passed the readily instantiated module instance into the
module init function, then module subtypes don't fit into this scheme
anymore.

One more reason why modules shouldn't be special. Essentially, we need an
m_new() and m_init() for them. And the lifetime of the module type would
have to be linked to the (sub-)interpreter, whereas the lifetime of the
module instance would be determined by whoever uses the module and/or
decides to unload/reload it.

Stefan
Re: [Python-Dev] Green buildbot failure.
On 11/08/2013 11:00am, Antoine Pitrou wrote:
> You've got the answer at the bottom:
>
> "program finished with exit code 0"
>
> So for some reason, the test suite crashed, but with a successful exit
> code. Buildbot thinks it ran fine.

Was the test terminated because it took too long?

TerminateProcess(handle, exitcode) sometimes makes the program exit with
return code 0 instead of exitcode. At any rate, test_multiprocessing
contains this disabled test:

    # XXX sometimes get p.exitcode == 0 on Windows ...
    #self.assertEqual(p.exitcode, -signal.SIGTERM)

--
Richard
Re: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)
On Sun, 11 Aug 2013 14:16:10 +0200
Stefan Behnel wrote:
>
> > We just need to devise a convenience API for that (perhaps by allowing
> > to create both the subclass *and* instantiate it in a single call).
>
> Right. This conflicts somewhat with the simplified module creation. If the
> module loader passed the readily instantiated module instance into the
> module init function, then module subtypes don't fit into this scheme
> anymore.
>
> One more reason why modules shouldn't be special. Essentially, we need an
> m_new() and m_init() for them. And the lifetime of the module type would
> have to be linked to the (sub-)interpreter, whereas the lifetime of the
> module instance would be determined by whoever uses the module and/or
> decides to unload/reload it.

It may be simpler if the only strong reference to the module type is in
the module instance itself. Successive module initializations would get
different types, but that shouldn't be a problem in practice.

Regards
Antoine.
Re: [Python-Dev] Green buildbot failure.
http://stackoverflow.com/questions/2061735/42-passed-to-terminateprocess-sometimes-getexitcodeprocess-returns-0

--
Richard
Re: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)
Antoine Pitrou, 11.08.2013 14:32:
> On Sun, 11 Aug 2013 14:16:10 +0200
> Stefan Behnel wrote:
>>> We just need to devise a convenience API for that (perhaps by allowing
>>> to create both the subclass *and* instantiate it in a single call).
>>
>> Right. This conflicts somewhat with the simplified module creation. If the
>> module loader passed the readily instantiated module instance into the
>> module init function, then module subtypes don't fit into this scheme
>> anymore.
>>
>> One more reason why modules shouldn't be special. Essentially, we need an
>> m_new() and m_init() for them. And the lifetime of the module type would
>> have to be linked to the (sub-)interpreter, whereas the lifetime of the
>> module instance would be determined by whoever uses the module and/or
>> decides to unload/reload it.
>
> It may be simpler if the only strong reference to the module type is in
> the module instance itself. Successive module initializations would get
> different types, but that shouldn't be a problem in practice.

Agreed. Then the module instance would just be the only instance of a new
type that gets created each time the module is initialised. Even if module
subtypes were to become commonplace once they are generally supported
(because they're the easiest way to store per-module state efficiently),
module reinitialisation should be rare enough to just buy them with a new
type for each. The size of the complete module state+dict will almost
always outweigh the size of the one additional type by factors.

Stefan
Re: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)
Stefan Behnel, 11.08.2013 14:48:
> Antoine Pitrou, 11.08.2013 14:32:
>> On Sun, 11 Aug 2013 14:16:10 +0200
>> Stefan Behnel wrote:
>>>> We just need to devise a convenience API for that (perhaps by allowing
>>>> to create both the subclass *and* instantiate it in a single call).
>>>
>>> Right. This conflicts somewhat with the simplified module creation. If the
>>> module loader passed the readily instantiated module instance into the
>>> module init function, then module subtypes don't fit into this scheme
>>> anymore.
>>>
>>> One more reason why modules shouldn't be special. Essentially, we need an
>>> m_new() and m_init() for them. And the lifetime of the module type would
>>> have to be linked to the (sub-)interpreter, whereas the lifetime of the
>>> module instance would be determined by whoever uses the module and/or
>>> decides to unload/reload it.
>>
>> It may be simpler if the only strong reference to the module type is in
>> the module instance itself. Successive module initializations would get
>> different types, but that shouldn't be a problem in practice.
>
> Agreed. Then the module instance would just be the only instance of a new
> type that gets created each time the module is initialised. Even if module
> subtypes were to become commonplace once they are generally supported
> (because they're the easiest way to store per-module state efficiently),
> module reinitialisation should be rare enough to just buy them with a new
> type for each. The size of the complete module state+dict will almost
> always outweigh the size of the one additional type by factors.

BTW, this already suggests a simple module initialisation interface. The
extension module would expose a function that returns a module type, and
the loader/importer would then simply instantiate that. Nothing else is
needed.

Stefan
Re: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)
Stefan Behnel, 11.08.2013 14:53:
> Stefan Behnel, 11.08.2013 14:48:
>> Antoine Pitrou, 11.08.2013 14:32:
>>> On Sun, 11 Aug 2013 14:16:10 +0200
>>> Stefan Behnel wrote:
>>>>> We just need to devise a convenience API for that (perhaps by allowing
>>>>> to create both the subclass *and* instantiate it in a single call).
>>>>
>>>> Right. This conflicts somewhat with the simplified module creation. If
>>>> the module loader passed the readily instantiated module instance into
>>>> the module init function, then module subtypes don't fit into this
>>>> scheme anymore.
>>>>
>>>> One more reason why modules shouldn't be special. Essentially, we need
>>>> an m_new() and m_init() for them. And the lifetime of the module type
>>>> would have to be linked to the (sub-)interpreter, whereas the lifetime
>>>> of the module instance would be determined by whoever uses the module
>>>> and/or decides to unload/reload it.
>>>
>>> It may be simpler if the only strong reference to the module type is in
>>> the module instance itself. Successive module initializations would get
>>> different types, but that shouldn't be a problem in practice.
>>
>> Agreed. Then the module instance would just be the only instance of a new
>> type that gets created each time the module is initialised. Even if module
>> subtypes were to become commonplace once they are generally supported
>> (because they're the easiest way to store per-module state efficiently),
>> module reinitialisation should be rare enough to just buy them with a new
>> type for each. The size of the complete module state+dict will almost
>> always outweigh the size of the one additional type by factors.
>
> BTW, this already suggests a simple module initialisation interface. The
> extension module would expose a function that returns a module type, and
> the loader/importer would then simply instantiate that. Nothing else is
> needed.

Actually, strike the word "module type" and replace it with "type". Is
there really a reason why Python needs a module type at all? I mean, you
can stick arbitrary objects in sys.modules, so why not allow arbitrary
types to be returned by the module creation function?

Stefan
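The remark about arbitrary objects in sys.modules can be checked with a few lines of Python. This is a small sketch; the `Settings` class and the `fake_settings` name are invented for the example.

```python
import sys


class Settings:
    """An ordinary class, not a subclass of types.ModuleType."""

    def __init__(self):
        self.debug = False


# Register a plain instance under a made-up module name.
sys.modules["fake_settings"] = Settings()

# The import statement hands back whatever object is cached under that name.
import fake_settings

kind = type(fake_settings).__name__

# Clean up the cache entry again.
del sys.modules["fake_settings"]
```

After the import, `fake_settings` is the `Settings` instance itself, so attribute access such as `fake_settings.debug` goes through that class rather than through a module object.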
Re: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)
On Sun, Aug 11, 2013 at 2:58 AM, Antoine Pitrou wrote:
> On Sat, 10 Aug 2013 18:06:02 -0700
> Eli Bendersky wrote:
> > This solution has problems. For example, in the case of ET it would
> > preclude testing what happens when pyexpat is disabled (remember we were
> > discussing this...). This is because there would be no real way to create
> > new instances of such modules (they would all cache themselves in the init
> > function - similarly to what ET now does in trunk, because otherwise some
> > of its global-dependent crazy tests fail).
> >
> > A more radical solution would be to *really* have multiple instances of
> > state per sub-interpreter. Well, they already exist -- it's
> > PyState_FindModule which is the problematic one because it only remembers
> > the last one.
>
> I'm not sure I understand your diagnosis. modules_by_index (and
> PyState_FindModule) is per-interpreter so we already have a
> per-interpreter state here. Something else must be interfering.
>
>
Yes, it's per interpreter, but only one per interpreter is remembered in
state->modules_by_index. What I'm trying to say is that currently two
different instances of PyModuleObject *within the same interpreter* share
the state if they get to it through PyState_FindModule, because they share
the same PyModuleDef, and state->modules_by_index keeps only one module per
PyModuleDef.
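The sharing described above can be mimicked with a toy model in Python. None of this is CPython's real code: `ModuleDef`, `Module` and `Interpreter` are invented stand-ins for PyModuleDef, PyModuleObject and the interpreter state, and the model assumes only what this message states, namely that the table keeps one module per definition.

```python
class ModuleDef:
    """Stand-in for a static PyModuleDef; one exists per extension module."""

    def __init__(self, name, index):
        self.name = name
        self.index = index  # slot in the per-interpreter table


class Module:
    """Stand-in for PyModuleObject: each instance owns its own state."""

    def __init__(self, definition):
        self.definition = definition
        self.state = {"dialects": {}}  # plays the role of md_state


class Interpreter:
    """Stand-in for the interpreter state holding modules_by_index."""

    def __init__(self):
        self.modules_by_index = {}

    def add_module(self, module):
        # A later module with the same definition overwrites the entry.
        self.modules_by_index[module.definition.index] = module

    def find_module(self, definition):
        return self.modules_by_index.get(definition.index)


csv_def = ModuleDef("_csv", index=1)
interp = Interpreter()

first = Module(csv_def)
interp.add_module(first)
first.state["dialects"]["excel"] = object()

# Simulates a re-import after the entry was deleted from sys.modules.
second = Module(csv_def)
interp.add_module(second)

# Code belonging to `first` that looks up state through the definition
# now lands on the state of `second`.
found = interp.find_module(csv_def)
```

In this model `first` still holds its own state, but any lookup keyed on the shared definition returns `second`, whose dialect table is empty.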
> Note that module state is just a field attached to the module object
> ("void *md_state" in PyModuleObject). It's really the extension modules
> which are per-interpreter, which is a good thing.
>
Re: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)
On 11 Aug 2013 09:02, "Stefan Behnel" wrote:
>
> Stefan Behnel, 11.08.2013 14:53:
>> Stefan Behnel, 11.08.2013 14:48:
>>> Antoine Pitrou, 11.08.2013 14:32:
>>>> On Sun, 11 Aug 2013 14:16:10 +0200
>>>> Stefan Behnel wrote:
>>>>>> We just need to devise a convenience API for that (perhaps by
>>>>>> allowing to create both the subclass *and* instantiate it in a
>>>>>> single call).
>>>>>
>>>>> Right. This conflicts somewhat with the simplified module creation.
>>>>> If the module loader passed the readily instantiated module instance
>>>>> into the module init function, then module subtypes don't fit into
>>>>> this scheme anymore.
>>>>>
>>>>> One more reason why modules shouldn't be special. Essentially, we
>>>>> need an m_new() and m_init() for them. And the lifetime of the module
>>>>> type would have to be linked to the (sub-)interpreter, whereas the
>>>>> lifetime of the module instance would be determined by whoever uses
>>>>> the module and/or decides to unload/reload it.
>>>>
>>>> It may be simpler if the only strong reference to the module type is in
>>>> the module instance itself. Successive module initializations would get
>>>> different types, but that shouldn't be a problem in practice.
>>>
>>> Agreed. Then the module instance would just be the only instance of a new
>>> type that gets created each time the module is initialised. Even if module
>>> subtypes were to become commonplace once they are generally supported
>>> (because they're the easiest way to store per-module state efficiently),
>>> module reinitialisation should be rare enough to just buy them with a new
>>> type for each. The size of the complete module state+dict will almost
>>> always outweigh the size of the one additional type by factors.
>>
>> BTW, this already suggests a simple module initialisation interface. The
>> extension module would expose a function that returns a module type, and
>> the loader/importer would then simply instantiate that. Nothing else is
>> needed.
>
> Actually, strike the word "module type" and replace it with "type". Is
> there really a reason why Python needs a module type at all? I mean, you
> can stick arbitrary objects in sys.modules, so why not allow arbitrary
> types to be returned by the module creation function?

That's exactly what I have in mind, but the way extension module imports
currently work means we can't easily do it just yet. Fortunately, importlib
means we now have some hope of fixing that :)

Cheers,
Nick.

> Stefan
Re: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)
On Sun, Aug 11, 2013 at 3:33 AM, Antoine Pitrou wrote:
>
> Hi Eli,
>
> On Sat, 10 Aug 2013 17:12:53 -0700
> Eli Bendersky wrote:
> >
> > Note how doing some sys.modules acrobatics and re-importing suddenly
> > changes the internal state of a previously imported module. This happens
> > because:
> >
> > 1. The first import of 'csv' (which then imports '_csv') creates
> > module-specific state on the heap and associates it with the current
> > sub-interpreter. The list of dialects, amongst other things, is in that
> > state.
> > 2. The 'del's wipe 'csv' and '_csv' from the cache.
> > 3. The second import of 'csv' also creates/initializes a new '_csv' module
> > because it's not in sys.modules. This *replaces* the per-sub-interpreter
> > cached version of the module's state with the clean state of a new module
>
> I would say this is pretty much expected.

I'm struggling to see how it's expected. The two imported csv modules are
different (i.e. different id() of members), and yet some state is shared
between them. I think the root reason for it is that "PyModuleDef _csvmodule"
is unique per interpreter, not per module instance.

Even if dialects were not a PyObject, this would still be problematic, don't
you think? And note that here, unlike the ET.ParseError case, I don't think
the problem is exporting internal per-module state as a module attribute.

The following two are irreconcilable, IMHO:

1. Wanting to have two instances of the same module in the same interpreter.
2. Using a global shared PyModuleDef between all instances of the same module
in the same interpreter.

> The converse would be a bug
> IMO (but perhaps Martin disagrees). PEP 3121's stated goal is not only
> subinterpreter support:
>
> "Extension module initialization currently has a few deficiencies.
> There is no cleanup for modules, the entry point name might give
> naming conflicts, the entry functions don't follow the usual calling
> convention, and multiple interpreters are not supported well."
>
> Re-initializing state when importing a module anew makes extension
> modules more like pure Python modules, which is a good thing.
>
>
> I think the piece of interpretation you offered yesterday on IRC may be
> the right explanation for the ET shenanigans:
>
> "Maybe the bug is that ParseError is kept in per-module state, and
> also exported from the module?"
>
> PEP 3121 doesn't offer any guidelines for using its API, and its
> example shows PyObject* fields in a module state.
>
> I'm starting to think that it might be a bad use of PEP 3121. PyObjects
> can, and therefore should be stored in the extension module dict where
> they will participate in normal resource management (i.e. garbage
> collection). If they are in the module dict, then they shouldn't be
> held alive by the module state too, otherwise the (currently tricky)
> lifetime management of extension modules can produce oddities.
>
>
> So, the PEP 3121 "module state" pointer (the optional opaque void*
> thing) should only be used to hold non-PyObjects. PyObjects should go
> to the module dict, like they do in normal Python modules. Now, the
> reason our PEP 3121 extension modules abuse the module state pointer to
> keep PyObjects is two-fold:
>
> 1. it's surprisingly easier (it's actually a one-liner if you don't
> handle errors - a rather bad thing, but all PEP 3121 extension modules
> currently don't handle a NULL return from PyState_FindModule...)
>
> 2. it protects the module from any module dict monkeypatching. It's not
> important if you are using a generic API on the PyObject, but it is if
> the PyObject is really a custom C type with well-defined fields.
>
> Those two issues can be addressed if we offer an API for it. How about:
>
>     PyObject *PyState_GetModuleAttr(struct PyModuleDef *def,
>                                     const char *name,
>                                     PyObject *restrict_type)
>
> *def* is a pointer to the module definition.
> *name* is the attribute to look up on the module dict.
> *restrict_type*, if non-NULL, is a type object the looked up attribute
> must be an instance of.
>
> Lookup an attribute in the current interpreter's extension module
> instance for the module definition *def*.
> Returns a *new* reference (!), or NULL if an error occurred.
> An error can be:
> - no such module exists for the current interpreter (ImportError?
>   RuntimeError? SystemError?)
> - no such attribute exists in the module dict (AttributeError)
> - the attribute doesn't conform to *restrict_type* (TypeError)
>
> So code can be written like:
>
>     PyObject *dialects = PyState_GetModuleAttr(
>         &_csvmodule, "dialects", &PyDict_Type);
>     if (dialects == NULL)
>         return NULL;
>
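The lookup rules in the quoted proposal can be modelled in a few lines of Python. This is only a sketch of the semantics, not of any real C API: `get_module_attr` and the `registry` dict are hypothetical stand-ins for the proposed function and for the per-interpreter table of module instances, and ImportError for a missing module is an assumption, since the proposal leaves that exception type open.

```python
import types


def get_module_attr(registry, def_name, name, restrict_type=None):
    """Hypothetical Python model of the proposed PyState_GetModuleAttr()."""
    module = registry.get(def_name)
    if module is None:
        # Assumption: the proposal does not settle on an exception type here.
        raise ImportError("no module instance for %r" % def_name)
    try:
        value = module.__dict__[name]
    except KeyError:
        raise AttributeError(
            "module %r has no attribute %r" % (def_name, name)) from None
    if restrict_type is not None and not isinstance(value, restrict_type):
        raise TypeError(
            "%r must be an instance of %s" % (name, restrict_type.__name__))
    return value


# A fake "_csv" module whose dialect registry lives in the module dict.
fake_csv = types.ModuleType("_csv")
fake_csv.dialects = {"excel": object()}
registry = {"_csv": fake_csv}

dialects = get_module_attr(registry, "_csv", "dialects", dict)
```

The point of the type restriction is the one made in the proposal: code that relies on a specific layout can reject a monkeypatched attribute with a TypeError instead of misbehaving.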
Re: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)
On Sun, 11 Aug 2013 06:26:55 -0700
Eli Bendersky wrote:
> On Sun, Aug 11, 2013 at 3:33 AM, Antoine Pitrou wrote:
>
> >
> > Hi Eli,
> >
> > On Sat, 10 Aug 2013 17:12:53 -0700
> > Eli Bendersky wrote:
> > >
> > > Note how doing some sys.modules acrobatics and re-importing suddenly
> > > changes the internal state of a previously imported module. This happens
> > > because:
> > >
> > > 1. The first import of 'csv' (which then imports `_csv) creates
> > > module-specific state on the heap and associates it with the current
> > > sub-interpreter. The list of dialects, amongst other things, is in that
> > > state.
> > > 2. The 'del's wipe 'csv' and '_csv' from the cache.
> > > 3. The second import of 'csv' also creates/initializes a new '_csv'
> > module
> > > because it's not in sys.modules. This *replaces* the per-sub-interpreter
> > > cached version of the module's state with the clean state of a new module
> >
> > I would say this is pretty much expected.
>
> I'm struggling to see how it's expected. The two imported csv modules are
> different (i.e. different id() of members), and yet some state is shared
> between them.
There are two csv modules, but there are not two _csv modules.
Extension modules are currently immortal until the end of the
interpreter:
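>>> import gc, sys, weakref
>>> csv = __import__('csv')
>>> wcsv = weakref.ref(csv)
>>> w_csv = weakref.ref(sys.modules['_csv'])
>>> del sys.modules['csv']
>>> del sys.modules['_csv']
>>> del csv
>>> gc.collect()
50
>>> wcsv()
>>> w_csv()
<module '_csv' from '/home/antoine/cpython/default/build/lib.linux-x86_64-3.4-pydebug/_csv.cpython-34dm.so'>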
So, "sharing" a state is pretty much expected, since you are
re-initializing an existing module.
(but the module does get re-initialized, which is the point of PEP 3121)
> 1. Wanting to have two instances of the same module in the same interpreter.
It could be nice, but really, that's not a common use case. And it's
impossible for extension modules, currently.
Regards
Antoine.
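For contrast, a pure Python module really does come back as a fresh, independent object after being removed from sys.modules. The sketch below shows that; `colorsys` is picked only because it is small and has no dependencies, and the `marker` attribute is made up for the demonstration.

```python
import importlib
import sys

first = importlib.import_module("colorsys")
del sys.modules["colorsys"]
second = importlib.import_module("colorsys")

# Two distinct module objects with independent namespaces.
first.marker = "only on the first copy"
print(first is second)            # False
print(hasattr(second, "marker"))  # False
```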
[Python-Dev] redesigning the extension module initialisation protocol (was: Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others))
Nick Coghlan, 11.08.2013 15:19:
> On 11 Aug 2013 09:02, "Stefan Behnel" wrote:
>>> BTW, this already suggests a simple module initialisation interface. The
>>> extension module would expose a function that returns a module type, and
>>> the loader/importer would then simply instantiate that. Nothing else is
>>> needed.
>>
>> Actually, strike the word "module type" and replace it with "type". Is
>> there really a reason why Python needs a module type at all? I mean, you
>> can stick arbitrary objects in sys.modules, so why not allow arbitrary
>> types to be returned by the module creation function?
>
> That's exactly what I have in mind, but the way extension module imports
> currently work means we can't easily do it just yet. Fortunately, importlib
> means we now have some hope of fixing that :)

Well, what do we need? We don't need to care about existing code, as long
as the current scheme is only deprecated and not deleted. That won't happen
before Py4 anyway. New code would simply export a different symbol when
compiling for a CPython that supports it, which points to the function that
returns the type.

Then, there's already the PyType_Copy() function, which can be used to
create a heap type from a statically defined type. So extension modules can
simply define an (arbitrary) additional type in any way they see fit, copy
it to the heap, and return it.

Next, we need to define a signature for the type's __init__() method. This
can be done in a future proof way by allowing arbitrary keyword arguments
to be added, i.e. such a type must have a signature like

    def __init__(self, currently, used, pos, args, **kwargs)

and simply ignore kwargs for now.

Actually, we may get away with not passing all too many arguments here if
we allow the importer to add stuff to the type's dict in between,
specifically __file__, __path__ and friends, so that they are available
before the type gets instantiated. Not sure if this is a good idea, but it
would at least relieve the user from having to copy these things over from
some kind of context or whatever we might want to pass in.

Alternatively, we could split the instantiation up between tp_new() and
tp_init(), and let the importer set stuff on the instance dict in between
the two. But given that this context won't actually change once the shared
library is loaded, the only reason to prefer modifying the instance instead
of the type would be to avoid requiring a tp_dict for the type. Open for
discussion, I guess.

Did I forget anything? Sounds simple enough to me so far.

Stefan
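A rough Python model of the protocol described in this message may make the moving parts easier to see. Everything here is hypothetical: `create_module_type` stands in for the symbol an extension would export, `load` for the importer, and `spam_demo` with its path is an invented module. It assumes the variant where the importer sets `__file__` on the type before instantiating it.

```python
import sys


def create_module_type():
    """Stand-in for the function an extension module would export."""

    class SpamModule:
        def __init__(self, name, **kwargs):
            # Extra keyword arguments are accepted and ignored for now,
            # which keeps the signature open for later additions.
            self.__name__ = name
            self.counter = 0  # per-instance "module state"

        def bump(self):
            self.counter += 1
            return self.counter

    return SpamModule


def load(name, origin):
    """Toy importer: fetch the type, annotate it, then instantiate it."""
    module_type = create_module_type()
    module_type.__file__ = origin  # available before instantiation
    module = module_type(name, future_option=None)
    sys.modules[name] = module
    return module


a = load("spam_demo", "/hypothetical/spam_demo.so")
b = load("spam_demo", "/hypothetical/spam_demo.so")
a.bump()
```

Each call to `load` yields a new type with a single instance, so state kept on the instance is never shared between two loads of the same module.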
Re: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)
On Sun, Aug 11, 2013 at 6:40 AM, Antoine Pitrou wrote:
> On Sun, 11 Aug 2013 06:26:55 -0700
> Eli Bendersky wrote:
> > On Sun, Aug 11, 2013 at 3:33 AM, Antoine Pitrou
> wrote:
> >
> > >
> > > Hi Eli,
> > >
> > > On Sat, 10 Aug 2013 17:12:53 -0700
> > > Eli Bendersky wrote:
> > > >
> > > > Note how doing some sys.modules acrobatics and re-importing suddenly
> > > > changes the internal state of a previously imported module. This
> happens
> > > > because:
> > > >
> > > > 1. The first import of 'csv' (which then imports `_csv) creates
> > > > module-specific state on the heap and associates it with the current
> > > > sub-interpreter. The list of dialects, amongst other things, is in
> that
> > > > state.
> > > > 2. The 'del's wipe 'csv' and '_csv' from the cache.
> > > > 3. The second import of 'csv' also creates/initializes a new '_csv'
> > > module
> > > > because it's not in sys.modules. This *replaces* the
> per-sub-interpreter
> > > > cached version of the module's state with the clean state of a new
> module
> > >
> > > I would say this is pretty much expected.
> >
> > I'm struggling to see how it's expected. The two imported csv modules are
> > different (i.e. different id() of members), and yet some state is shared
> > between them.
>
> There are two csv modules, but there are not two _csv modules.
> Extension modules are currently immortal until the end of the
> interpreter:
>
> >>> csv = __import__('csv')
> >>> wcsv = weakref.ref(csv)
> >>> w_csv = weakref.ref(sys.modules['_csv'])
> >>> del sys.modules['csv']
> >>> del sys.modules['_csv']
> >>> del csv
> >>> gc.collect()
> 50
> >>> wcsv()
> >>> w_csv()
> <module '_csv' from '/home/antoine/cpython/default/build/lib.linux-x86_64-3.4-pydebug/_csv.cpython-34dm.so'>
>
>
> So, "sharing" a state is pretty much expected, since you are
> re-initializating an existing module.
> (but the module does get re-initialized, which is the point of PEP 3121)
>
Yes, you're right - this is an oversight on my part. Indeed, the
extensions dict in import.c keeps it alive once loaded, and only ever gets
cleaned up in Py_Finalize.
Eli
Re: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)
On Sun, 11 Aug 2013 08:49:56 -0700
Eli Bendersky wrote:
> On Sun, Aug 11, 2013 at 6:40 AM, Antoine Pitrou wrote:
>
> > On Sun, 11 Aug 2013 06:26:55 -0700
> > Eli Bendersky wrote:
> > > On Sun, Aug 11, 2013 at 3:33 AM, Antoine Pitrou
> > wrote:
> > >
> > > >
> > > > Hi Eli,
> > > >
> > > > On Sat, 10 Aug 2013 17:12:53 -0700
> > > > Eli Bendersky wrote:
> > > > >
> > > > > Note how doing some sys.modules acrobatics and re-importing suddenly
> > > > > changes the internal state of a previously imported module. This
> > happens
> > > > > because:
> > > > >
> > > > > 1. The first import of 'csv' (which then imports `_csv) creates
> > > > > module-specific state on the heap and associates it with the current
> > > > > sub-interpreter. The list of dialects, amongst other things, is in
> > that
> > > > > state.
> > > > > 2. The 'del's wipe 'csv' and '_csv' from the cache.
> > > > > 3. The second import of 'csv' also creates/initializes a new '_csv'
> > > > module
> > > > > because it's not in sys.modules. This *replaces* the
> > per-sub-interpreter
> > > > > cached version of the module's state with the clean state of a new
> > module
> > > >
> > > > I would say this is pretty much expected.
> > >
> > > I'm struggling to see how it's expected. The two imported csv modules are
> > > different (i.e. different id() of members), and yet some state is shared
> > > between them.
> >
> > There are two csv modules, but there are not two _csv modules.
> > Extension modules are currently immortal until the end of the
> > interpreter:
> >
> > >>> csv = __import__('csv')
> > >>> wcsv = weakref.ref(csv)
> > >>> w_csv = weakref.ref(sys.modules['_csv'])
> > >>> del sys.modules['csv']
> > >>> del sys.modules['_csv']
> > >>> del csv
> > >>> gc.collect()
> > 50
> > >>> wcsv()
> > >>> w_csv()
> > <module '_csv' from '/home/antoine/cpython/default/build/lib.linux-x86_64-3.4-pydebug/_csv.cpython-34dm.so'>
> >
> >
> > So, "sharing" a state is pretty much expected, since you are
> > re-initializating an existing module.
> > (but the module does get re-initialized, which is the point of PEP 3121)
> >
>
> Yes, you're right - this is an oversight on my behalf. Indeed, the
> extensions dict in import.c keeps it alive once loaded, and only ever gets
> cleaned up in Py_Finalize.
It's not the extensions dict in import.c, it's modules_by_index in the
interpreter state.
(otherwise it wouldn't be per-interpreter)
The extensions dict holds the module *definition* (the struct
PyModuleDef), not the module instance.
Regards
Antoine.
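The distinction drawn above can be pictured with a toy model: one shared definition per extension, plus a per-interpreter table that holds a strong reference to each module instance. This is a heavily simplified sketch; `ModuleDef` and `Interpreter` are invented classes, not CPython structures, and re-initialisation on re-import is deliberately left out.

```python
import types
import weakref


class ModuleDef:
    """Stand-in for struct PyModuleDef (hypothetical, simplified)."""

    def __init__(self, name, index):
        self.name = name
        self.index = index


class Interpreter:
    """Stand-in for an interpreter state with its own module tables."""

    def __init__(self):
        self.sys_modules = {}
        self.modules_by_index = {}

    def load(self, definition):
        module = types.ModuleType(definition.name)
        self.sys_modules[definition.name] = module
        # Strong reference that survives deletion from sys_modules.
        self.modules_by_index[definition.index] = module
        return weakref.ref(module)


# The definition is shared by every interpreter in the process.
extensions = {"_csv": ModuleDef("_csv", index=1)}

main, sub = Interpreter(), Interpreter()
ref_main = main.load(extensions["_csv"])
ref_sub = sub.load(extensions["_csv"])

del main.sys_modules["_csv"]
print(ref_main() is not None)       # True: still held by modules_by_index
print(ref_main() is not ref_sub())  # True: one instance per interpreter
```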
Re: [Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)
On Sun, Aug 11, 2013 at 8:56 AM, Antoine Pitrou wrote:
> On Sun, 11 Aug 2013 08:49:56 -0700
> Eli Bendersky wrote:
>
> > On Sun, Aug 11, 2013 at 6:40 AM, Antoine Pitrou
> wrote:
> >
> > > On Sun, 11 Aug 2013 06:26:55 -0700
> > > Eli Bendersky wrote:
> > > > On Sun, Aug 11, 2013 at 3:33 AM, Antoine Pitrou >
> > > wrote:
> > > >
> > > > >
> > > > > Hi Eli,
> > > > >
> > > > > On Sat, 10 Aug 2013 17:12:53 -0700
> > > > > Eli Bendersky wrote:
> > > > > >
> > > > > > Note how doing some sys.modules acrobatics and re-importing
> suddenly
> > > > > > changes the internal state of a previously imported module. This
> > > happens
> > > > > > because:
> > > > > >
> > > > > > 1. The first import of 'csv' (which then imports `_csv) creates
> > > > > > module-specific state on the heap and associates it with the
> current
> > > > > > sub-interpreter. The list of dialects, amongst other things, is
> in
> > > that
> > > > > > state.
> > > > > > 2. The 'del's wipe 'csv' and '_csv' from the cache.
> > > > > > 3. The second import of 'csv' also creates/initializes a new
> '_csv'
> > > > > module
> > > > > > because it's not in sys.modules. This *replaces* the
> > > per-sub-interpreter
> > > > > > cached version of the module's state with the clean state of a
> new
> > > module
> > > > >
> > > > > I would say this is pretty much expected.
> > > >
> > > > I'm struggling to see how it's expected. The two imported csv
> modules are
> > > > different (i.e. different id() of members), and yet some state is
> shared
> > > > between them.
> > >
> > > There are two csv modules, but there are not two _csv modules.
> > > Extension modules are currently immortal until the end of the
> > > interpreter:
> > >
> > > >>> csv = __import__('csv')
> > > >>> wcsv = weakref.ref(csv)
> > > >>> w_csv = weakref.ref(sys.modules['_csv'])
> > > >>> del sys.modules['csv']
> > > >>> del sys.modules['_csv']
> > > >>> del csv
> > > >>> gc.collect()
> > > 50
> > > >>> wcsv()
> > > >>> w_csv()
> > > <module '_csv' from '/home/antoine/cpython/default/build/lib.linux-x86_64-3.4-pydebug/_csv.cpython-34dm.so'>
> > >
> > >
> > > So, "sharing" a state is pretty much expected, since you are
> > > re-initializating an existing module.
> > > (but the module does get re-initialized, which is the point of PEP
> 3121)
> > >
> >
> > Yes, you're right - this is an oversight on my behalf. Indeed, the
> > extensions dict in import.c keeps it alive once loaded, and only ever
> gets
> > cleaned up in Py_Finalize.
>
> It's not the extensions dict in import.c, it's modules_by_index in the
> interpreter state.
> (otherwise it wouldn't be per-interpreter)
>
> The extensions dict holds the module *definition* (the struct
> PyModuleDef), not the module instance.
>
Thanks for the clarification.
Eli
Re: [Python-Dev] redesigning the extension module initialisation protocol (was: Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others))
On Sun, Aug 11, 2013 at 6:52 AM, Stefan Behnel wrote:
> Nick Coghlan, 11.08.2013 15:19:
> > On 11 Aug 2013 09:02, "Stefan Behnel" wrote:
> >>> BTW, this already suggests a simple module initialisation interface. The
> >>> extension module would expose a function that returns a module type, and
> >>> the loader/importer would then simply instantiate that. Nothing else is
> >>> needed.
> >>
> >> Actually, strike the word "module type" and replace it with "type". Is
> >> there really a reason why Python needs a module type at all? I mean, you
> >> can stick arbitrary objects in sys.modules, so why not allow arbitrary
> >> types to be returned by the module creation function?
> >
> > That's exactly what I have in mind, but the way extension module imports
> > currently work means we can't easily do it just yet. Fortunately, importlib
> > means we now have some hope of fixing that :)
>
> Well, what do we need? We don't need to care about existing code, as long
> as the current scheme is only deprecated and not deleted. That won't happen
> before Py4 anyway. New code would simply export a different symbol when
> compiling for a CPython that supports it, which points to the function that
> returns the type.
>
> Then, there's already the PyType_Copy() function, which can be used to
> create a heap type from a statically defined type. So extension modules can
> simply define an (arbitrary) additional type in any way they see fit, copy
> it to the heap, and return it.
>
> Next, we need to define a signature for the type's __init__() method. This
> can be done in a future proof way by allowing arbitrary keyword arguments
> to be added, i.e. such a type must have a signature like
>
>     def __init__(self, currently, used, pos, args, **kwargs)
>
> and simply ignore kwargs for now.
>
> Actually, we may get away with not passing all too many arguments here if
> we allow the importer to add stuff to the type's dict in between,
> specifically __file__, __path__ and friends, so that they are available
> before the type gets instantiated. Not sure if this is a good idea, but it
> would at least relieve the user from having to copy these things over from
> some kind of context or whatever we might want to pass in.
>
> Alternatively, we could split the instantiation up between tp_new() and
> tp_init(), and let the importer set stuff on the instance dict in between
> the two. But given that this context won't actually change once the shared
> library is loaded, the only reason to prefer modifying the instance instead
> of the type would be to avoid requiring a tp_dict for the type. Open for
> discussion, I guess.
>
> Did I forget anything? Sounds simple enough to me so far.
>

Out of curiosity - can we list actual use cases for this new design? The
previous thread, admittedly, deals with an esoteric corner case that comes
up in overly-clever tests. If we plan to seriously consider these changes -
and this appears to be worth a PEP - we need a list of actual advantages
over the current approach. It's not that a more conceptually pure design is
an insufficient reason, IMHO, but it would be interesting to hear about
other implications.

Eli
Re: [Python-Dev] redesigning the extension module initialisation protocol
Eli Bendersky, 11.08.2013 19:43:
> Out of curiosity - can we list actual use cases for this new design? The
> previous thread, admittedly, deals with an esoteric corner case that comes
> up in overly-clever tests. If we plan to seriously consider these changes -
> and this appears to be worth a PEP - we need a list of actual advantages
> over the current approach. It's not that a more conceptually pure design is
> an insufficient reason, IMHO, but it would be interesting to hear about
> other implications.

http://mail.python.org/pipermail/python-dev/2012-November/122599.html
http://bugs.python.org/issue13429
http://bugs.python.org/issue16392

Yes, it definitely needs a PEP.

Stefan
[Python-Dev] Reaping threads and subprocesses
Some tests use the following idiom:

    def test_main():
        try:
            test.support.run_unittest(...)
        finally:
            test.support.reap_children()

Other tests use the following idiom:

    def test_main():
        key = test.support.threading_setup()
        try:
            test.support.run_unittest(...)
        finally:
            test.support.threading_cleanup(*key)

or in other words:

    @test.support.reap_threads
    def test_main():
        test.support.run_unittest(...)

These tests are not discoverable. There are some ways to make them
discoverable.

1. Create unittest.TestCase subclasses or mixins that override the run()
method.

    class ThreadReaped:
        def run(self, result):
            key = test.support.threading_setup()
            try:
                return super().run(result)
            finally:
                test.support.threading_cleanup(*key)

    class ChildReaped:
        def run(self, result):
            try:
                return super().run(result)
            finally:
                test.support.reap_children()

2. Create unittest.TestCase subclasses or mixins that override the
setUpClass() and tearDownClass() methods.

    class ThreadReaped:
        @classmethod
        def setUpClass(cls):
            cls._threads = test.support.threading_setup()

        @classmethod
        def tearDownClass(cls):
            test.support.threading_cleanup(*cls._threads)

    class ChildReaped:
        @classmethod
        def tearDownClass(cls):
            test.support.reap_children()

3. Create unittest.TestCase subclasses or mixins that override the setUp()
and tearDown() methods.

    class ThreadReaped:
        def setUp(self):
            self._threads = test.support.threading_setup()

        def tearDown(self):
            test.support.threading_cleanup(*self._threads)

    class ChildReaped:
        def tearDown(self):
            test.support.reap_children()

4. Create unittest.TestCase subclasses or mixins that call addCleanup() in
the constructor.

    class ThreadReaped:
        def __init__(self, *args, **kwargs):
            # Let TestCase initialise itself before registering cleanups.
            super().__init__(*args, **kwargs)
            self.addCleanup(test.support.threading_cleanup,
                            *test.support.threading_setup())

    class ChildReaped:
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self.addCleanup(test.support.reap_children)

Of course, instead of subclassing we can use decorators that modify the test
class.

Which method is better? Do you have other suggestions?

The issue where this problem first came up:
http://bugs.python.org/issue16968.
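To show how one of these options composes with an ordinary test class, here is a self-contained sketch of option 2 (setUpClass/tearDownClass). It does not use test.support; the thread check is re-implemented with threading.enumerate() as an assumption about what the helpers do, and `WorkerTest` is an invented example class.

```python
import threading
import unittest


class ThreadReaped:
    """Mixin: flag the class if its tests leave extra threads running.

    threading.enumerate() is used here as a simplified stand-in for
    test.support.threading_setup() / threading_cleanup().
    """

    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        cls._threads_before = set(threading.enumerate())

    @classmethod
    def tearDownClass(cls):
        leaked = set(threading.enumerate()) - cls._threads_before
        for thread in leaked:
            thread.join(timeout=1.0)  # give stragglers a chance to finish
        still_alive = [t for t in leaked if t.is_alive()]
        super().tearDownClass()
        assert not still_alive, "dangling threads: %r" % still_alive


class WorkerTest(ThreadReaped, unittest.TestCase):
    def test_worker_finishes(self):
        results = []
        worker = threading.Thread(target=results.append, args=(42,))
        worker.start()
        worker.join()
        self.assertEqual(results, [42])


# Plain loader-based discovery works; no test_main() wrapper is needed.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(WorkerTest)
result = unittest.TestResult()
suite.run(result)
```

The mixin is listed before unittest.TestCase so that its class-level hooks run first and can delegate to the base class through super().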
Re: [Python-Dev] Green buildbot failure.
Richard Oudkerk writes:

> On 11/08/2013 11:00am, Antoine Pitrou wrote:
>> You've got the answer at the bottom:
>>
>>   "program finished with exit code 0"
>>
>> So for some reason, the test suite crashed, but with a successful exit
>> code. Buildbot thinks it ran fine.
>
> Was the test terminated because it took too long?

Yes, it looks like it.

This test (and one on the XP-4 buildbot in the same time frame) was
terminated by an external watchdog script that kills python_d
processes that have been running for more than 2 hours. I put the
script in place (quite a while back) as a workaround for failures that
would strand a python process, blocking future tests due to files
remaining in use. It's a last ditch, crude, sledge-hammer.

Historically, if this code ran, the buildbot had already itself timed
out, so the exit code (which I can't control) wasn't very important.

2 hours had been conservative (and a trade-off, as longer values also
risk failing more future tests) but it may need to be increased.

In this particular case it was a false alarm - the host was heavily
loaded during this time frame, which I think prolonged the test time
by an unusually large amount.

-- David
Re: [Python-Dev] Green buildbot failure.
2013/8/11 David Bolen :
>> Was the test terminated because it took too long?
>
> Yes, it looks like it.
>
> This test (and one on the XP-4 buildbot in the same time frame) was
> terminated by an external watchdog script that kills python_d
> processes that have been running for more than 2 hours. I put the
> script in place (quite a while back) as a workaround for failures that
> would strand a python process, blocking future tests due to files
> remaining in use. It's a last ditch, crude, sledge-hammer.

test.regrtest uses faulthandler.dump_traceback_later() to stop the
test after a timeout if the --timeout command line option is used.
http://docs.python.org/dev/library/faulthandler.html#faulthandler.dump_traceback_later

Do you pass this option? The timeout is not global but applies to a single
function of a test file, so you can use a shorter timeout. It also has the
advantage of dumping the traceback of all Python threads before
exiting.

I didn't try this feature recently on Windows, but it is supposed to work :-)

Victor
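As a small illustration of the faulthandler mechanism mentioned above, the sketch below arms a watchdog around one call and disarms it afterwards. `run_with_watchdog` and `quick_test` are invented names, the 5 second timeout is arbitrary, and a temporary file is used as the log only to keep the example self-contained; this is not how test.regrtest itself is wired up.

```python
import faulthandler
import tempfile
import time


def run_with_watchdog(func, timeout, log):
    """Run func(); if it exceeds timeout seconds, dump the tracebacks of
    all threads to log and terminate the process."""
    faulthandler.dump_traceback_later(timeout, exit=True, file=log)
    try:
        return func()
    finally:
        # Disarm the watchdog once the call has returned or raised.
        faulthandler.cancel_dump_traceback_later()


def quick_test():
    time.sleep(0.01)
    return "ok"


with tempfile.TemporaryFile() as log:
    outcome = run_with_watchdog(quick_test, timeout=5.0, log=log)
```

Because the timer is armed per call, a short timeout can be used for each test function instead of one large limit for the whole process.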
Re: [Python-Dev] redesigning the extension module initialisation protocol (was: Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others))
On 11 Aug 2013 09:55, "Stefan Behnel" wrote:
>
> Nick Coghlan, 11.08.2013 15:19:
> > On 11 Aug 2013 09:02, "Stefan Behnel" wrote:
> >>> BTW, this already suggests a simple module initialisation interface. The
> >>> extension module would expose a function that returns a module type, and
> >>> the loader/importer would then simply instantiate that. Nothing else is
> >>> needed.
> >>
> >> Actually, strike the word "module type" and replace it with "type". Is
> >> there really a reason why Python needs a module type at all? I mean, you
> >> can stick arbitrary objects in sys.modules, so why not allow arbitrary
> >> types to be returned by the module creation function?
> >
> > That's exactly what I have in mind, but the way extension module imports
> > currently work means we can't easily do it just yet. Fortunately, importlib
> > means we now have some hope of fixing that :)
>
> Well, what do we need? We don't need to care about existing code, as long
> as the current scheme is only deprecated and not deleted. That won't happen
> before Py4 anyway. New code would simply export a different symbol when
> compiling for a CPython that supports it, which points to the function that
> returns the type.
>
> Then, there's already the PyType_Copy() function, which can be used to
> create a heap type from a statically defined type. So extension modules can
> simply define an (arbitrary) additional type in any way they see fit, copy
> it to the heap, and return it.
>
> Next, we need to define a signature for the type's __init__() method.

We need the "ModuleSpec" object to pass here, which is what we're currently
working on in import-sig.

We're not going to define something specifically for C extensions when
other modules suffer related problems.

Cheers,
Nick.

> This
> can be done in a future proof way by allowing arbitrary keyword arguments
> to be added, i.e. such a type must have a signature like
>
>     def __init__(self, currently, used, pos, args, **kwargs)
>
> and simply ignore kwargs for now.
>
> Actually, we may get away with not passing all too many arguments here if
> we allow the importer to add stuff to the type's dict in between,
> specifically __file__, __path__ and friends, so that they are available
> before the type gets instantiated. Not sure if this is a good idea, but it
> would at least relieve the user from having to copy these things over from
> some kind of context or whatever we might want to pass in.
>
> Alternatively, we could split the instantiation up between tp_new() and
> tp_init(), and let the importer set stuff on the instance dict in between
> the two. But given that this context won't actually change once the shared
> library is loaded, the only reason to prefer modifying the instance instead
> of the type would be to avoid requiring a tp_dict for the type. Open for
> discussion, I guess.
>
> Did I forget anything? Sounds simple enough to me so far.
>
> Stefan
Re: [Python-Dev] (New) PEP 446: Make newly created file descriptors non-inheritable
Hi,

I fixed various bugs in the implementation of the (new) PEP 446:
http://hg.python.org/features/pep-446

At revision da685bd67524, the full test suite passes on:

- Fedora 18 (Linux 3.9), x86_64
- FreeBSD 9.1, x86_64
- Windows 7 SP1, x86_64
- OpenIndiana (close to Solaris 11), x86_64

Some tests are failing, but these failures are unrelated to PEP 446 (the same tests fail in the original Python):

- Windows: test_signal, failure related to faulthandler (issue already fixed in default)
- OpenIndiana: test_locale, test_uuid

Victor
Re: [Python-Dev] (New) PEP 446: Make newly created file descriptors non-inheritable
2013/8/12 Victor Stinner:
> I fixed various bugs in the implementation of the (new) PEP 446:
> http://hg.python.org/features/pep-446
>
> At revision da685bd67524, the full test suite pass on:
(...)

I also checked the usage of atomic flags. There was a minor bug on Linux, which is now fixed (I removed a useless call to fcntl() that checked whether SOCK_CLOEXEC works).

open(): On Linux, FreeBSD and Solaris 11, the O_CLOEXEC flag is used. fcntl(F_GETFD) is only called once for all file descriptors, to check if O_CLOEXEC works. On Windows, O_NOINHERIT is used.

socket.socket(): On Linux, the SOCK_CLOEXEC flag is used, so no extra syscall is required.

os.pipe(): On Linux, pipe2() is used with O_CLOEXEC.

On other platforms, os.set_inheritable() must be called to make the new file descriptors non-inheritable.

On Windows, the atomic flag WSA_FLAG_NO_HANDLE_INHERIT is not used to create a socket. I don't know Windows well enough to make such a change.

My OpenIndiana VM looks to be older than Solaris 11: the O_CLOEXEC flag is missing.

I regenerated the patch in the issue:
http://bugs.python.org/issue18571

Victor
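The visible effect of this behaviour can be checked from Python. This is a small sketch, assuming the API as it later shipped in Python 3.4 (os.get_inheritable() and os.set_inheritable()); it does not show which flag or syscall the interpreter used underneath.

```python
import os

# Descriptors created by os.pipe() start out non-inheritable.
read_fd, write_fd = os.pipe()
pipe_flags = (os.get_inheritable(read_fd), os.get_inheritable(write_fd))

# A descriptor that a child process should receive must be opted in explicitly.
os.set_inheritable(write_fd, True)
write_flag_after = os.get_inheritable(write_fd)

# Files opened with open() follow the same rule.
with open(os.devnull, "rb") as f:
    file_flag = os.get_inheritable(f.fileno())

os.close(read_fd)
os.close(write_fd)
print(pipe_flags, write_flag_after, file_flag)
```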
Re: [Python-Dev] redesigning the extension module initialisation protocol
Nick Coghlan, 12.08.2013 00:41:
> On 11 Aug 2013 09:55, "Stefan Behnel" wrote:
>>>>> this already suggests a simple module initialisation interface. The
>>>>> extension module would expose a function that returns a module type,
>>>>> and the loader/importer would then simply instantiate that. Nothing
>>>>> else is needed.
>>>>
>>>> Actually, strike the word "module type" and replace it with "type".
>> [...]
>> Next, we need to define a signature for the type's __init__() method.
>
> We need the "ModuleSpec" object to pass here, which is what we're currently
> working on in import-sig.

Ok, but that's just the very final step. All the rest is C-API specific.

And for clarification: you want to let the importer create the ModuleSpec object and then pass it into the module's __init__ method? I guess it could also be passed into the type creation function then, right? Since it wouldn't harm to do that, I think it's a good idea to provide as much information to the extension module as possible, as early as we can, and that's the first time we talk to the shared library.

I've started writing up a pre-PEP that describes this protocol. I think it makes sense to keep it separate from the ModuleSpec PEP, as the latter can easily be accepted without changing anything at the C-API level, but it shouldn't happen the other way round.

Stefan
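The question of where the spec gets passed can be illustrated in pure Python. This is a rough sketch, assuming the importlib.machinery.ModuleSpec class that later shipped in Python 3.4; create_module_type, importer_load and ExtModule are hypothetical names, not part of any accepted protocol. The importer builds the spec once and hands it to both the type creation function and __init__.

```python
from importlib.machinery import ModuleSpec


def create_module_type(spec):
    """Hypothetical creation function; it sees the spec as early as possible."""

    class ExtModule:
        # Recorded at type-creation time, before any instance exists.
        created_for = spec.name

        def __init__(self, spec, **kwargs):
            # The same spec arrives again at instantiation time.
            self.__spec__ = spec
            self.__name__ = spec.name

    return ExtModule


def importer_load(name, origin):
    """Hypothetical importer: build the spec once, pass it to both steps."""
    spec = ModuleSpec(name, loader=None, origin=origin)
    module_type = create_module_type(spec)
    return module_type(spec)


mod = importer_load("ext_demo", "/hypothetical/ext_demo.so")
print(type(mod).created_for, mod.__spec__.origin)
```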
