Hi, Since Steve Dower asked me to write a formal PEP for my proposal of a new C API to initialize Python, here you have!
This new PEP will be shortly rendered as HTML at: https://www.python.org/dev/peps/pep-0587/ The Rationale might be a little bit weak at this point and there are still 3 small open questions, but the whole PEP should give you a better overview of my proposal than my previous email :-) I chose to start with a new PEP rather than updating PEP 432. I didn't feel able to just "update" the PEP 432, my proposal is different. The main difference between my PEP 587 and Nick Coghlan's PEP 432 is that my PEP 587 only allows to configure Python *before* its initialization, whereas Nick wanted to give control between its "Initializing" and "Initialized" initialization phases. Moreover, my PEP only uses C types rather Nick wanted to use Python types. By setting the private PyConfig._init_main field to 1, my PEP 587 allows to stop Python initialization early to get a bare minimum working Python only with builtin types and no importlib. If I understood correctly, with my PEP, it remains possible to *extend* Python C API later to implement what Nick wants. Nick: please correct me if I'm wrong :-) Victor PEP: 587 Title: Python Initialization Configuration Author: Victor Stinner <vstin...@redhat.com> Discussions-To: python-dev@python.org Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 27-Mar-2018 Python-Version: 3.8 Abstract ======== Add a new C API to configure the Python Initialization providing finer control on the whole configuration and better error reporting. Rationale ========= Python is highly configurable but its configuration evolved organically: configuration parameters is scattered all around the code using different ways to set them (mostly global configuration variables and functions). A straightforward and reliable way to configure Python is needed. Some configuration parameters are not accessible from the C API, or not easily. The C API of Python 3.7 Initialization takes ``wchar_t*`` strings as input whereas the Python filesystem encoding is set during the initialization. Python Initialization C API =========================== This PEP proposes to add the following new structures, functions and macros. New structures (4): * ``PyConfig`` * ``PyInitError`` * ``PyPreConfig`` * ``PyWideCharList`` New functions (8): * ``Py_PreInitialize(config)`` * ``Py_PreInitializeFromArgs(config, argc, argv)`` * ``Py_PreInitializeFromWideArgs(config, argc, argv)`` * ``Py_InitializeFromConfig(config)`` * ``Py_InitializeFromArgs(config, argc, argv)`` * ``Py_InitializeFromWideArgs(config, argc, argv)`` * ``Py_UnixMain(argc, argv)`` * ``Py_ExitInitError(err)`` New macros (6): * ``Py_INIT_OK()`` * ``Py_INIT_ERR(MSG)`` * ``Py_INIT_USER_ERR(MSG)`` * ``Py_INIT_NO_MEMORY()`` * ``Py_INIT_EXIT(EXITCODE)`` * ``Py_INIT_FAILED(err)`` PyWideCharList -------------- ``PyWideCharList`` is a list of ``wchar_t*`` strings. Example to initialize a string from C static array:: static wchar_t* argv[2] = { L"-c", L"pass", }; PyWideCharList config_argv = PyWideCharList_INIT; config_argv.length = Py_ARRAY_LENGTH(argv); config_argv.items = argv; ``PyWideCharList`` structure fields: * ``length`` (``Py_ssize_t``) * ``items`` (``wchar_t**``) If *length* is non-zero, *items* must be non-NULL and all strings must be non-NULL. .. note:: The "WideChar" name comes from the existing Python C API. Example: ``PyUnicode_FromWideChar(const wchar_t *str, Py_ssize_t size)``. PyInitError ----------- ``PyInitError`` is a structure to store an error message or an exit code for the Python Initialization. Example:: PyInitError alloc(void **ptr, size_t size) { *ptr = PyMem_RawMalloc(size); if (*ptr == NULL) { return Py_INIT_NO_MEMORY(); } return Py_INIT_OK(); } int main(int argc, char **argv) { void *ptr; PyInitError err = alloc(&ptr, 16); if (Py_INIT_FAILED(err)) { Py_ExitInitError(err); } PyMem_Free(ptr); return 0; } ``PyInitError`` fields: * ``exitcode`` (``int``): if greater or equal to zero, argument passed to ``exit()`` * ``msg`` (``const char*``): error message * ``prefix`` (``const char*``): string displayed before the message, usually rendered as ``prefix: msg``. * ``user_err`` (int): if non-zero, the error is caused by the user configuration, otherwise it's an internal Python error. Macro to create an error: * ``Py_INIT_OK()``: success * ``Py_INIT_NO_MEMORY()``: memory allocation failure (out of memory) * ``Py_INIT_ERR(MSG)``: Python internal error * ``Py_INIT_USER_ERR(MSG)``: error caused by user configuration * ``Py_INIT_EXIT(STATUS)``: exit Python with the specified status Other macros and functions: * ``Py_INIT_FAILED(err)``: Is the result an error or an exit? * ``Py_ExitInitError(err)``: call ``exit(status)`` for an error created by ``Py_INIT_EXIT(status)``, call ``Py_FatalError(msg)`` for other errors. Must not be called for an error created by ``Py_INIT_OK()``. Pre-Initialization with PyPreConfig ----------------------------------- ``PyPreConfig`` structure is used to pre-initialize Python: * Set the memory allocator * Configure the LC_CTYPE locale * Set the UTF-8 mode Example using the pre-initialization to enable the UTF-8 Mode:: PyPreConfig preconfig = PyPreConfig_INIT; preconfig.utf8_mode = 1; PyInitError err = Py_PreInitialize(&preconfig); if (Py_INIT_FAILED(err)) { Py_ExitInitError(err); } /* at this point, Python will speak UTF-8 */ Py_Initialize(); /* ... use Python API here ... */ Py_Finalize(); Functions to pre-initialize Python: * ``PyInitError Py_PreInitialize(const PyPreConfig *config)`` * ``PyInitError Py_PreInitializeFromArgs( const PyPreConfig *config, int argc, char **argv)`` * ``PyInitError Py_PreInitializeFromWideArgs( const PyPreConfig *config, int argc, wchar_t **argv)`` These functions can be called with *config* set to ``NULL``. The caller is responsible to handler error using ``Py_INIT_FAILED()`` and ``Py_ExitInitError()``. ``PyPreConfig`` fields: * ``allocator``: name of the memory allocator (ex: ``"malloc"``) * ``coerce_c_locale_warn``: if non-zero, emit a warning if the C locale is coerced. * ``coerce_c_locale``: if equals to 2, coerce the C locale; if equals to 1, read the LC_CTYPE to decide if it should be coerced. * ``dev_mode``: see ``PyConfig.dev_mode`` * ``isolated``: see ``PyConfig.isolated`` * ``legacy_windows_fs_encoding`` (Windows only): if non-zero, set the Python filesystem encoding to ``"mbcs"``. * ``use_environment``: see ``PyConfig.use_environment`` * ``utf8_mode``: if non-zero, enable the UTF-8 mode The C locale coercion (PEP 538) and the UTF-8 Mode (PEP 540) are disabled by default in ``PyPreConfig``. Set ``coerce_c_locale``, ``coerce_c_locale_warn`` and ``utf8_mode`` to ``-1`` to let Python enable them depending on the user configuration. Initialization with PyConfig ---------------------------- The ``PyConfig`` structure contains all parameters to configure Python. Example of Python initialization enabling the isolated mode:: PyConfig config = PyConfig_INIT; config.isolated = 1; PyInitError err = Py_InitializeFromConfig(&config); if (Py_INIT_FAILED(err)) { Py_ExitInitError(err); } /* ... use Python API here ... */ Py_Finalize(); Functions to initialize Python: * ``PyInitError Py_InitializeFromConfig(const PyConfig *config)`` * ``PyInitError Py_InitializeFromArgs(const PyConfig *config, int argc, char **argv)`` * ``PyInitError Py_InitializeFromWideArgs(const PyConfig *config, int argc, wchar_t **argv)`` These functions can be called with *config* set to ``NULL``. The caller is responsible to handler error using ``Py_INIT_FAILED()`` and ``Py_ExitInitError()``. PyConfig fields: * ``argv``: ``sys.argv`` * ``base_exec_prefix``: ``sys.base_exec_prefix`` * ``base_prefix``: ``sys.base_prefix`` * ``buffered_stdio``: if equals to 0, enable unbuffered mode, make stdout and stderr streams to be unbuffered. * ``bytes_warning``: if equals to 1, issue a warning when comparing ``bytes`` or ``bytearray`` with ``str``, or comparing ``bytes`` with ``int``. If equal or greater to 2, raise a ``BytesWarning`` exception. * ``dll_path`` (Windows only): Windows DLL path * ``dump_refs``: if non-zero, display all objects still alive at exit * ``exec_prefix``: ``sys.exec_prefix`` * ``executable``: ``sys.executable`` * ``faulthandler``: if non-zero, call ``faulthandler.enable()`` * ``filesystem_encoding``: Filesystem encoding, ``sys.getfilesystemencoding()`` * ``filesystem_errors``: Filesystem encoding errors, ``sys.getfilesystemencodeerrors()`` * ``hash_seed``, ``use_hash_seed``: randomized hash function seed * ``home``: Python home * ``import_time``: if non-zero, profile import time * ``inspect``: enter interactive mode after executing a script or a command * ``install_signal_handlers``: install signal handlers? * ``interactive``: interactive mode * ``legacy_windows_stdio`` (Windows only): if non-zero, use ``io.FileIO`` instead of ``WindowsConsoleIO`` for ``sys.stdin``, ``sys.stdout`` and ``sys.stderr``. * ``malloc_stats``: if non-zero, dump memory allocation statistics at exit * ``module_search_path_env``: ``PYTHONPATH`` environment variale value * ``module_search_paths``, ``use_module_search_paths``: ``sys.path`` * ``optimization_level``: compilation optimization level * ``parser_debug``: if non-zero, turn on parser debugging output (for expert only, depending on compilation options). * ``prefix``: ``sys.prefix`` * ``program_name``: Program name * ``program``: ``argv[0]`` or an empty string * ``pycache_prefix``: ``.pyc`` cache prefix * ``quiet``: quiet mode (ex: don't display the copyright and version messages even in interactive mode) * ``run_command``: ``-c COMMAND`` argument * ``run_filename``: ``python3 SCRIPT`` argument * ``run_module``: ``python3 -m MODULE`` argument * ``show_alloc_count``: show allocation counts at exit * ``show_ref_count``: show total reference count at exit * ``site_import``: import the ``site`` module at startup? * ``skip_source_first_line``: skip the first line of the source * ``stdio_encoding``, ``stdio_errors``: encoding and encoding errors of ``sys.stdin``, ``sys.stdout`` and ``sys.stderr`` * ``tracemalloc``: if non-zero, call ``tracemalloc.start(value)`` * ``user_site_directory``: if non-zero, add user site directory to ``sys.path`` * ``verbose``: if non-zero, enable verbose mode * ``warnoptions``: options of the ``warnings`` module to build filters * ``write_bytecode``: if non-zero, write ``.pyc`` files * ``xoptions``: ``sys._xoptions`` There are also private fields which are for internal-usage only: * ``_check_hash_pycs_mode`` * ``_frozen`` * ``_init_main`` * ``_install_importlib`` New Py_UnixMain() function -------------------------- Python 3.7 provides a high-level ``Py_Main()`` function which requires to pass command line arguments as ``wchar_t*`` strings. It is non-trivial to use the correct encoding to decode bytes. Python has its own set of issues with C locale coercion and UTF-8 Mode. This PEP adds a new ``Py_UnixMain()`` function which takes command line arguments as bytes:: int Py_UnixMain(int argc, char **argv) Memory allocations and Py_DecodeLocale() ---------------------------------------- New pre-initialization and initialization APIs use constant ``PyPreConfig`` or ``PyConfig`` structures. If memory is allocated dynamically, the caller is responsible to release it. Using static strings is just fine. Python memory allocation functions like ``PyMem_RawMalloc()`` must not be used before Python pre-initialization. Using ``malloc()`` and ``free()`` is always safe. ``Py_DecodeLocale()`` must only be used after the pre-initialization. XXX Open Questions ================== This PEP is still a draft with open questions which should be answered: * Do we need to add an API for import ``inittab``? * What about the stable ABI? Should we add a version into ``PyPreConfig`` and ``PyConfig`` structures somehow? The Windows API is known for its ABI stability and it stores the structure size into the structure directly. Do the same? * The PEP 432 stores ``PYTHONCASEOK`` into the config. Do we need to add something for that into ``PyConfig``? How would it be exposed at the Python level for ``importlib``? Passed as an argument to ``importlib._bootstrap._setup()`` maybe? Backwards Compatibility ======================= This PEP only adds a new API: it leaves the existing API unchanged and has no impact on the backwards compatibility. Alternative: PEP 432 ==================== This PEP is inspired by Nick Coghlan's PEP 432 with a main difference: it only allows to configure Python before its initialization. The PEP 432 uses three initialization phases: Pre-Initialization, Initializing, Initialized. It is possible to configure Python between Initializing and Initialized phases using Python objects. This PEP only uses C types like ``int`` and ``wchar_t*`` (and ``PyWideCharList`` structure). All parameters must be configured at once before the Python initialization using the ``PyConfig`` structure. Annex: Python Configuration =========================== Priority and Rules ------------------ Priority of configuration parameters, highest to lowest: * ``PyConfig`` * ``PyPreConfig`` * Configuration files * Command line options * Environment variables * Global configuration variables Priority of warning options, highest to lowest: * ``PyConfig.warnoptions`` * ``PyConfig.dev_mode`` (add ``"default"``) * ``PYTHONWARNINGS`` environment variables * ``-W WARNOPTION`` command line argument * ``PyConfig.bytes_warning`` (add ``"error::BytesWarning"`` if greater than 1, or add ``"default::BytesWarning``) Rules on ``PyConfig`` and ``PyPreConfig`` parameters: * If ``isolated`` is non-zero, ``use_environment`` and ``user_site_directory`` are set to 0 * If ``legacy_windows_fs_encoding`` is non-zero, ``utf8_mode`` is set to 0 * If ``dev_mode`` is non-zero, ``allocator`` is set to ``"debug"``, ``faulthandler`` is set to 1, and ``"default"`` filter is added to ``warnoptions``. But ``PYTHONMALLOC`` has the priority over ``dev_mode`` to set the memory allocator. Configuration Files ------------------- Python configuration files: * ``pyvenv.cfg`` * ``python._pth`` (Windows only) * ``pybuilddir.txt`` (Unix only) Global Configuration Variables ------------------------------ Global configuration variables mapped to ``PyPreConfig`` fields: ======================================== ================================ Variable Field ======================================== ================================ ``Py_LegacyWindowsFSEncodingFlag`` ``legacy_windows_fs_encoding`` ``Py_LegacyWindowsFSEncodingFlag`` ``legacy_windows_fs_encoding`` ``Py_UTF8Mode`` ``utf8_mode`` ``Py_UTF8Mode`` ``utf8_mode`` ======================================== ================================ Global configuration variables mapped to ``PyConfig`` fields: ======================================== ================================ Variable Field ======================================== ================================ ``Py_BytesWarningFlag`` ``bytes_warning`` ``Py_DebugFlag`` ``parser_debug`` ``Py_DontWriteBytecodeFlag`` ``write_bytecode`` ``Py_FileSystemDefaultEncodeErrors`` ``filesystem_errors`` ``Py_FileSystemDefaultEncoding`` ``filesystem_encoding`` ``Py_FrozenFlag`` ``_frozen`` ``Py_HasFileSystemDefaultEncoding`` ``filesystem_encoding`` ``Py_HashRandomizationFlag`` ``use_hash_seed``, ``hash_seed`` ``Py_IgnoreEnvironmentFlag`` ``use_environment`` ``Py_InspectFlag`` ``inspect`` ``Py_InteractiveFlag`` ``interactive`` ``Py_IsolatedFlag`` ``isolated`` ``Py_LegacyWindowsStdioFlag`` ``legacy_windows_stdio`` ``Py_NoSiteFlag`` ``site_import`` ``Py_NoUserSiteDirectory`` ``user_site_directory`` ``Py_OptimizeFlag`` ``optimization_level`` ``Py_QuietFlag`` ``quiet`` ``Py_UnbufferedStdioFlag`` ``buffered_stdio`` ``Py_VerboseFlag`` ``verbose`` ``_Py_HasFileSystemDefaultEncodeErrors`` ``filesystem_errors`` ``Py_BytesWarningFlag`` ``bytes_warning`` ``Py_DebugFlag`` ``parser_debug`` ``Py_DontWriteBytecodeFlag`` ``write_bytecode`` ``Py_FileSystemDefaultEncodeErrors`` ``filesystem_errors`` ``Py_FileSystemDefaultEncoding`` ``filesystem_encoding`` ``Py_FrozenFlag`` ``_frozen`` ``Py_HasFileSystemDefaultEncoding`` ``filesystem_encoding`` ``Py_HashRandomizationFlag`` ``use_hash_seed``, ``hash_seed`` ``Py_IgnoreEnvironmentFlag`` ``use_environment`` ``Py_InspectFlag`` ``inspect`` ``Py_InteractiveFlag`` ``interactive`` ``Py_IsolatedFlag`` ``isolated`` ``Py_LegacyWindowsStdioFlag`` ``legacy_windows_stdio`` ``Py_NoSiteFlag`` ``site_import`` ``Py_NoUserSiteDirectory`` ``user_site_directory`` ``Py_OptimizeFlag`` ``optimization_level`` ``Py_QuietFlag`` ``quiet`` ``Py_UnbufferedStdioFlag`` ``buffered_stdio`` ``Py_VerboseFlag`` ``verbose`` ``_Py_HasFileSystemDefaultEncodeErrors`` ``filesystem_errors`` ======================================== ================================ ``Py_LegacyWindowsFSEncodingFlag`` and ``Py_LegacyWindowsStdioFlag`` are only available on Windows. Command Line Arguments ---------------------- Usage:: python3 [options] python3 [options] -c COMMAND python3 [options] -m MODULE python3 [options] SCRIPT Command line options mapped to pseudo-action on ``PyConfig`` fields: ================================ ================================ Option ``PyConfig`` field ================================ ================================ ``-b`` ``bytes_warning++`` ``-B`` ``write_bytecode = 0`` ``-c COMMAND`` ``run_module = COMMAND`` ``--check-hash-based-pycs=MODE`` ``_check_hash_pycs_mode = MODE`` ``-d`` ``parser_debug++`` ``-E`` ``use_environment = 0`` ``-i`` ``inspect++`` and ``interactive++`` ``-I`` ``isolated = 1`` ``-m MODULE`` ``run_module = MODULE`` ``-O`` ``optimization_level++`` ``-q`` ``quiet++`` ``-R`` ``use_hash_seed = 0`` ``-s`` ``user_site_directory = 0`` ``-S`` ``site_import`` ``-t`` ignored (kept for backwards compatibility) ``-u`` ``buffered_stdio = 0`` ``-v`` ``verbose++`` ``-W WARNING`` add ``WARNING`` to ``warnoptions`` ``-x`` ``skip_source_first_line = 1`` ``-X XOPTION`` add ``XOPTION`` to ``xoptions`` ================================ ================================ ``-h``, ``-?`` and ``-V`` options are handled outside ``PyConfig``. Environment Variables --------------------- Environment variables mapped to ``PyPreConfig`` fields: ================================= ============================================= Variable ``PyPreConfig`` field ================================= ============================================= ``PYTHONCOERCECLOCALE`` ``coerce_c_locale``, ``coerce_c_locale_warn`` ``PYTHONDEVMODE`` ``dev_mode`` ``PYTHONLEGACYWINDOWSFSENCODING`` ``legacy_windows_fs_encoding`` ``PYTHONMALLOC`` ``allocator`` ``PYTHONUTF8`` ``utf8_mode`` ================================= ============================================= Environment variables mapped to ``PyConfig`` fields: ================================= ==================================== Variable ``PyConfig`` field ================================= ==================================== ``PYTHONDEBUG`` ``parser_debug`` ``PYTHONDEVMODE`` ``dev_mode`` ``PYTHONDONTWRITEBYTECODE`` ``write_bytecode`` ``PYTHONDUMPREFS`` ``dump_refs`` ``PYTHONEXECUTABLE`` ``program_name`` ``PYTHONFAULTHANDLER`` ``faulthandler`` ``PYTHONHASHSEED`` ``use_hash_seed``, ``hash_seed`` ``PYTHONHOME`` ``home`` ``PYTHONINSPECT`` ``inspect`` ``PYTHONIOENCODING`` ``stdio_encoding``, ``stdio_errors`` ``PYTHONLEGACYWINDOWSSTDIO`` ``legacy_windows_stdio`` ``PYTHONMALLOCSTATS`` ``malloc_stats`` ``PYTHONNOUSERSITE`` ``user_site_directory`` ``PYTHONOPTIMIZE`` ``optimization_level`` ``PYTHONPATH`` ``module_search_path_env`` ``PYTHONPROFILEIMPORTTIME`` ``import_time`` ``PYTHONPYCACHEPREFIX,`` ``pycache_prefix`` ``PYTHONTRACEMALLOC`` ``tracemalloc`` ``PYTHONUNBUFFERED`` ``buffered_stdio`` ``PYTHONVERBOSE`` ``verbose`` ``PYTHONWARNINGS`` ``warnoptions`` ================================= ==================================== ``PYTHONLEGACYWINDOWSFSENCODING`` and ``PYTHONLEGACYWINDOWSSTDIO`` are specific to Windows. ``PYTHONDEVMODE`` is mapped to ``PyPreConfig.dev_mode`` and ``PyConfig.dev_mode``. Annex: Python 3.7 API ===================== Python 3.7 has 4 functions in its C API to initialize and finalize Python: * ``Py_Initialize()``, ``Py_InitializeEx()``: initialize Python * ``Py_Finalize()``, ``Py_FinalizeEx()``: finalize Python Python can be configured using scattered global configuration variables (like ``Py_IgnoreEnvironmentFlag``) and using the following functions: * ``PyImport_AppendInittab()`` * ``PyImport_ExtendInittab()`` * ``PyMem_SetAllocator()`` * ``PyMem_SetupDebugHooks()`` * ``PyObject_SetArenaAllocator()`` * ``Py_SetPath()`` * ``Py_SetProgramName()`` * ``Py_SetPythonHome()`` * ``Py_SetStandardStreamEncoding()`` * ``PySys_AddWarnOption()`` * ``PySys_AddXOption()`` * ``PySys_ResetWarnOptions()`` There is also a high-level ``Py_Main()`` function. Copyright ========= This document has been placed in the public domain. _______________________________________________ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com