On 15.09.2021 00:16, Guido van Rossum wrote:
> Here I think we need to drop our perfectionist attitude. When I saw 
> Marc-Andre's
> proposal my first response was also "but what about threads." But really,
> os.chdir() is the culprit here, and since it's a syscall we can't fix it. If 
> we
> can live with that, we can live with the proposed os.workdir(). The docs just
> need a warning about threads.
> 
> Yes, you could create a thread-safe version of this by overriding open() -- 
> and
> several hundred other functions that refer to pathnames natively. Such a 
> project
> will inevitably have serious backwards compatibility concerns (any use of
> pathnames from C extensions will not work right) so it's just not going to be 
> done.
> 
> If we don't offer this in the stdlib, users will just implement this 
> themselves,
> poorly (for example, by not restoring the original when done).

Thanks for all the feedback. I'm just going to reply here in a single
email.

First, let me repeat the notes I had added to the proposal, since these
seem to have gotten lost in the thread. I'll some more context:

> Notes:
> - The context manager is not thread safe. There's no thread safe model
>   for the current work dir. OTOH, scripts usually don't use threads,
>   so not a big deal.

The CWD (current working directory) concept is a very old concept
where the OS manages a pointer into the file system on a per process
basis in order to handle requests for relative paths and map them
to absolute paths. https://en.wikipedia.org/wiki/Working_directory

Since this is per process and managed by the OS, the concept does
not extend to threads (possibly) running in the process.

While Python could theoretically try to emulate parts of this for
threads as well, this is not within the scope of os.workdir().

It's supposed to be a simple convenient way of saying: please
chdir here, do something and then return to the previously
active work dir.

The problem it's trying to solve is that last part... functions
forgetting to restore the previous active work dir can easily
break other parts of the script, since the CWD is used for
resolving relative paths in Python as well.

If thread-safety is seen as a big problem, we could add a concept
of managing a single CWD across all threads using lock protected
globals. Not sure whether this is worth the trouble, but perhaps
making it optional would ease some concerns. In this case, os.chdir()
would have to be adapted as well, in order to update the Python
CWD state in the same way (and possibly block or warn when os.chdir()
is called inside an active with os.workdir() block).

> - The context manager could be made more elaborate, e.g. adding optional
>   logging for debugging purposes.

We could add a logger parameter to the context manager for
this to customize the logger, or simply use the default one
from logging.

> - The same could be added to pathlib's objects as .workdir() method
>   returning a context manager.

Since people are starting to use pathlib more, this would be
a natural thing to do, as others have noted already :-)

The context manager will also accept a path-like object as
argument (simply because it passes this directly to os.chdir()
which already does).

As for having the context manager method on a Path object
return the Path object itself (suggested by David), this could
be made to work by returning a subclass of Path, which then
has the necessary extras needed for CWD processing.

I'll add some more notes:

- Why in the os module ? Because this is where you'd search for a way
  to set the CWD and indeed, we have os.chdir() and os.getcwd() in
  this module. Also see the note on pathlib above.

- There are other global settings which could benefit from such
  context managers as well, such as umask (mentioned by Cameron)
  or the locale settings, which you sometimes want to temporarily
  modify, but those are not used as often.

- Why the lowercase name ? Simply to make it fit the other os module
  APIs. Classes should normally have SnakeCase names, but since we're
  not interested in the class aspect here, but rather the context
  manager API of the resulting object, I think it's fair to stick
  with lower case. Many other context managers use a lower case name
  as well.

- Chris mentioned that library code should not be changing the
  CWD. In an ideal world, I'd agree, but there are cases where
  you don't have an option and you need to change the CWD in order
  make certain things work, e.g. you don't control the code you
  want to run, but need to make it work in a specific directory
  by changing the CWD and then passing relative paths to the
  code.

  If you use subprocesses for this, you can make this work,
  since the module forks for you and then sets the CWD in the
  forked process on Unix, but this still leaves you with a gap
  in Python, since you may well want to code in the same
  block which e.g. relies on os.abspath() working with your
  modified version of the CWD.

- Main use case ? Shell scripts written in Python instead of
  bash.

- Adding a big warning to the docs. Yes, absolutely, and os.chdir()
  should receive one in the docs as well.

- openat() et al. (mentioned by Cameron and Christian). These could
  be used to implement a thread-safe way of working with CWDs,
  essentially placing the CWD under control of Python. However,
  Python extensions and the OS would now know about this per-thread
  CWD, so you'd most likely run into problems with only part of your
  code working with these per-thread CWDs. This is definitely
  out of scope for os.workdir().

- Bike shedding on the name. "workdir" is what I use for the context
  manager, but e.g. os.cwd() would also do and is more in line with
  os.getcwd(). We could even have os.chdir() return the context
  manager, without introducing any new API in the os module.
  Still, I find "with os.workdir()" quite intuitive :-)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Sep 15 2021)
>>> Python Projects, Coaching and Support ...    https://www.egenix.com/
>>> Python Product Development ...        https://consulting.egenix.com/
________________________________________________________________________

::: We implement business ideas - efficiently in both time and costs :::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               https://www.egenix.com/company/contact/
                     https://www.malemburg.com/

_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/UWJB4J6UT7XC7XLA4L5JN55COWYO4OW3/
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to