Hi,

I just added a new sys.module_names attribute, a list (technically a
frozenset) of all stdlib module names:
https://bugs.python.org/issue42955

There are multiple use cases:

* Group stdlib imports when reformatting a Python file.
* Exclude stdlib imports when computing dependencies.
* Exclude stdlib modules when listing extension modules on crash or
fatal error, to only list 3rd party extensions (already implemented in
master, see bpo-42923 ;-)).
* Exclude stdlib modules when tracing the execution of a program using
the trace module.
* Detect typos and suggest a fix: ImportError("No module named maths.
Did you mean 'math'?",) (test the nice friendly-traceback project!).

Example:

>>> 'asyncio' in sys.module_names
True
>>> 'numpy' in sys.module_names
False

>>> len(sys.module_names)
312
>>> type(sys.module_names)
<class 'frozenset'>

>>> sorted(sys.module_names)[:10]
['__future__', '_abc', '_aix_support', '_ast', '_asyncio', '_bisect',
'_blake2', '_bootsubprocess', '_bz2', '_codecs']
>>> sorted(sys.module_names)[-10:]
['xml.dom', 'xml.etree', 'xml.parsers', 'xml.sax', 'xmlrpc', 'zipapp',
'zipfile', 'zipimport', 'zlib', 'zoneinfo']

The list is opinionated and defined by its documentation:

   A frozenset of strings containing the names of standard library
   modules.

   It is the same on all platforms. Modules which are not available on
   some platforms and modules disabled at Python build are also listed.
   All module kinds are listed: pure Python, built-in, frozen and
   extension modules. Test modules are excluded.

   For packages, only sub-packages are listed, not sub-modules. For
   example, ``concurrent`` package and ``concurrent.futures``
   sub-package are listed, but not ``concurrent.futures.base``
   sub-module.

   See also the :attr:`sys.builtin_module_names` list.

The design (especially having the same list on all platforms) comes
from the use cases listed above. For example, running isort should
produce the same output on any platform, and should not depend on
whether the Python stdlib was split into multiple packages on Linux
(which is done by most popular Linux distributions).

The list is generated by the Tools/scripts/generate_module_names.py script:
https://github.com/python/cpython/blob/master/Tools/scripts/generate_module_names.py

When you add a new module, you must run "make regen-module-names",
otherwise a pre-commit check will fail on your PR ;-) The list of
Windows extensions is currently hardcoded in the script. Contributions
to discover them automatically are welcome: since the list is short and
rarely changes, I didn't feel the need to spend time on that.

Currently (Python 3.10.0a4+), there are 312 names in sys.module_names,
stored in Python/module_names.h:
https://github.com/python/cpython/blob/master/Python/module_names.h

It was decided to include "helper" modules like "_aix_support", which
is used by sysconfig. Test modules like _testcapi, however, are
excluded to make the list shorter (it's rare to run the CPython test
suite outside Python).

There are 83 private modules, whose names start with an underscore
(this includes _abc, but also __future__). Excluding them leaves 229
public names:

>>> len([name for name in sys.module_names if not name.startswith('_')])
229

This new attribute may help to define "what is the Python stdlib" ;-)

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.