[Python-ideas] Re: Improving sys.executable for embedded Python scenarios

Gregory P. Smith Sat, 01 May 2021 18:53:30 -0700

On Sat, May 1, 2021 at 10:49 AM Gregory Szorc <gregory.sz...@gmail.com>
wrote:


> The way it works today, if you have an application embedding Python, your
> sys.argv[0] is (likely) your main executable and sys.executable is probably
> None or the empty string (per the stdlib docs which say not to set
> sys.executable if there isn't a path to a known `python` executable).
>
> Unfortunately, since sys.executable is a str, the executable it points to
> must behave as `python` does. This means that your application embedding
> and distributing its own Python must provide a `python` or `python`-like
> standalone executable and use it for sys.executable and this executable
> must be independent from your main application because the run-time
> behavior is different. (Yes, you can employ symlink hacks and your
> executable can sniff argv[0] and dispatch to your app or `python`
> accordingly. But symlinks aren't reliable on Windows and this still
> requires multiple files/executables.) **This limitation effectively
> prevents the existence of single-file application binaries that also want to
> expose a full `python`-like environment, as there's no standard way to
> advertise a mechanism to invoke `python` that isn't a standalone executable
> with no arguments.**
>

Minor nit: I wouldn't use the words "must behave" above. We have been
running with sys.executable = None at work for the past five years. The
issues we run into are predominantly in unit tests that try to launch an
interpreter via a subprocess of sys.executable, and the bulk of those are in
CPython's own test suite (which I vote "doesn't really count").

Regardless, not needing to tweak even those would be a convenience, and it
could open doors for more application frameworks that make such environment
assumptions and are thus hard to distribute standalone. For example, it
would allow multiprocessing's spawn mode to work within standalone embedded
binaries.

> While applications embedding Python may not have an explicit `python`
> executable, they do likely have the capability to instantiate a
> `python`-like environment at run-time: they have the interpreter after all,
> they "just" need to provide a mechanism to invoke Py_RunMain() with an
> interpreter config initialized using the "python" profile.
>
> **I'd like to propose a long-term replacement to sys.executable that
> enables applications embedding Python to advertise a mechanism for invoking
> the same executable such that they get a `python` experience.**
>
> The easiest way to do this is to introduce a list[str] variant. Let's call
> it sys.python_interpreter. Here's how it would work.
>
> Say I've produced myapp.exe, a Windows application. If you run `myapp.exe
> python --`, the executable behaves like `python`. e.g. `myapp.exe python --
> -c 'print("hello, world")'` would be equivalent to `python -c
> 'print("hello, world")'`. The app would set `sys.python_interpreter =
> ["myapp.exe", "python", "--"]`. Then Python code wanting to invoke a Python
> interpreter would do something like
> `subprocess.run(sys.python_interpreter)` and automatically dispatch through
> the same executable.
>

Yep, that seems reasonable. Unfortunately, command-line arguments are a
global namespace, but choosing a unique "launch me as a standalone Python
interpreter" argument at build time, one that will never conflict with the
application's own arguments, is doable. Nobody's application wants this
specific, unique-per-build ---$(uuid) flag in argv[1], right? ;)
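
A minimal sketch of that build-time flag idea, written in Python for
readability. The flag value and the function names are hypothetical, and a
real embedding application would do this check in its native entry point
before handing the remaining arguments to Py_RunMain():

```python
# Hypothetical marker baked in at build time; a real build would generate
# its own unique value (for example from a UUID).
PYTHON_MODE_FLAG = "---python-3f2a9c1e"


def split_python_mode(argv, flag=PYTHON_MODE_FLAG):
    """Return the interpreter arguments if argv requests python mode, else None."""
    if len(argv) > 1 and argv[1] == flag:
        return argv[2:]
    return None


def dispatch(argv):
    """Decide which personality the single executable should take on."""
    python_args = split_python_mode(argv)
    if python_args is None:
        return "app"  # run the normal application
    # An embedding application would pass python_args to the interpreter here.
    return "python"
```

Under these assumptions, the application would advertise
["myapp.exe", PYTHON_MODE_FLAG] as its sys.python_interpreter value.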

There's still an API challenge to decide on here: people using
sys.executable also expect to pass flags to the python interpreter.  Do we
make an API guarantee that the final flag in sys.python_interpreter is
always a terminator that separates python flags from application flags (--
or otherwise)?
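
To make the question concrete, here is a rough sketch of how calling code
might build a command if such a guarantee existed. The helper name is
hypothetical, and the prefix is the example value from the proposal, not an
existing API:

```python
def build_python_command(prefix, interpreter_flags=(), program_args=()):
    """Append interpreter flags and program arguments to a launch prefix.

    Assumes everything after the prefix is interpreted exactly as the
    arguments to a plain `python` executable would be.
    """
    if not prefix:
        raise RuntimeError("no python-like interpreter is advertised")
    return [*prefix, *interpreter_flags, *program_args]


# Example prefix from the proposal; the trailing "--" is assumed to be the
# terminator separating application arguments from interpreter arguments.
cmd = build_python_command(
    ["myapp.exe", "python", "--"],
    interpreter_flags=["-X", "utf8"],
    program_args=["-c", "print('hello, world')"],
)
```

Without the guarantee, callers could not know whether flags such as -X are
safe to append directly after the advertised prefix.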

> For applications not wanting to expose a `python`-like capability, they
> would simply set sys.python_interpreter to None or [], just like they do
> with sys.executable today.
>

Yep. Though that should be decided when the standalone Python application is
built, so that no command line can make the binary launch as a plain
interpreter. (This isn't a security measure: anyone who can read the
standalone executable can figure out how to construct a raw interpreter
usable in their environment from it.)


> In fact, I imagine Python's initialization would automatically set
> sys.python_interpreter to [sys.executable] by default and applications
> would have to opt in to a more advanced PyConfig field to make
> sys.python_interpreter different. This would make sys.python_interpreter
> behaviorally backwards compatible, so code bases could use
> sys.python_interpreter as a modern substitute for sys.executable, if
> available, without that much risk.
>

+1
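
As an illustration of that backwards-compatible usage, a minimal sketch of
what calling code could look like. sys.python_interpreter is the attribute
proposed in this thread and does not exist in current CPython, so the getattr
lookup normally falls back to sys.executable; the helper name is hypothetical:

```python
import subprocess
import sys


def python_interpreter_command():
    """Return the argv prefix for launching a python-like interpreter, or None."""
    # Proposed attribute; absent today, so this usually yields None.
    prefix = getattr(sys, "python_interpreter", None)
    if prefix:
        return list(prefix)
    # Fall back to the existing attribute, which may be None or "".
    if sys.executable:
        return [sys.executable]
    return None


prefix = python_interpreter_command()
if prefix is not None:
    result = subprocess.run(
        [*prefix, "-c", "print('hello, world')"],
        capture_output=True,
        text=True,
        check=True,
    )
    print(result.stdout.strip())
```

Code written this way would keep working on interpreters that never gain the
new attribute.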

-gps


>
> Some applications may want more advanced mechanisms than command line
> arguments to dispatch off of. For example, maybe you want to key off an
> environment variable to activate "Python mode."  This scenario is a bit
> harder to implement, as it would require yet another advertisement on how
> to invoke `python`. If subprocess had a "builder" interface for iteratively
> constructing a process invocation, we could expose a stdlib function to
> return a builder preconfigured to invoke `python`. But since such an
> interface doesn't exist, there's not as clean a solution for cases that
> require something more advanced than additional process arguments. Maybe we
> could make sys.python_interpreter a tuple[list[str], dict[str, str]] where
> that dict is environment variables to set. Doable. But I'm unconvinced the
> complexity is warranted, especially since the application has full control
> over interpreter initialization and can set most of the settings that
> they'd want to set through environment variables (e.g. PYTHONHOME) as part
> of initializing the `python`-like environment.
>
> Yes, there will be a long tail of applications needing to adapt to the
> reality that sys.python_interpreter exists and is a list. Checks like `if
> sys.executable == sys.argv[0]` will need to become more complicated. Maybe
> we could expose a simple "am I a Python interpreter process" in the stdlib?
> (The inverse "am I not a Python interpreter executable" question could also
> benefit from stdlib standardization, as there are unofficial mechanisms
> like sys.frozen and sys._MEIPASS attempting to answer this question.)
>
> Anyway, as it stands, sys.executable just doesn't work for applications
> embedding Python that want to expose a full `python`-like environment from
> single executable distributions. I think the introduction of a new API to
> allow applications to "self-dispatch" to a Python interpreter could
> eventually lead to significant ergonomic wins for embedded Python
> applications. This would make Python a more attractive target for
> embedding, which benefits the larger Python ecosystem.
>
> Thoughts?
>
> (I rarely post here. So if this idea is actionable, please inform me of
> next steps to make it become a reality.)
> _______________________________________________
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/O66N56PB4U6AGICGBSRFD2OWA5JWMFC6/
> Code of Conduct: http://python.org/psf/codeofconduct/
>

_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/QL6FF4OXQB2DVIQHNQZPSNS4HGGXKWCH/
Code of Conduct: http://python.org/psf/codeofconduct/
