Hi Zbyszek,

See my comment below.

2011/11/26 Zbigniew Jędrzejewski-Szmek <zbys...@in.waw.pl>:
> Hi,
> I apologize in advance for the length of this mail.
>
> sys.path
> ========
> When a script or a module is executed by invoking python with proper
> arguments, sys.path is extended. When a path to script is given, the
> directory containing the script is prepended. When '-m' or '-c' is used,
> $CWD is prepended. This is documented in
> http://docs.python.org/dev/using/cmdline.html, so far ok.
>
> sys.path and $PYTHONPATH are like $PATH -- if you can convince someone to put
> a directory under your control in any of them, you can execute code as this
> someone. Therefore, sys.path is dangerous and important. Unfortunately,
> sys.path manipulations are only described very briefly, and without any
> commentary, in the on-line documentation. python(1) manpage doesn't even
> mention them.
>
> The problem: each of the commands below is insecure:
>
>   python /tmp/script.py (when script.py is safe by itself)
>     ('/tmp' is added to sys.path, so an attacker can override any
>     module imported in /tmp/script.py by writing to /tmp/module.py)
>
>   cd /tmp && python -mtimeit -s 'import numpy' 'numpy.test()'
>     (UNIX users are accustomed to being able to safely execute
>     programs in any directory, e.g. ls, or gcc, or something.
>
>     Here '' is added to sys.path, so it is not secure to run
>     python in other-user-writable directories.)
>
>   cd /tmp/ && python -c 'import numpy; print(numpy.version.version)'
>     (The same as above, '' is added to sys.path.)
>
>   cd /tmp && python
>     (The same as above).
>
> IMHO, if this (long-lived) behaviour is necessary, it should at least be
> prominently documented. Also in the manpage.
>
> Prepending realpath(dirname(scriptname))
> ========================================
> Before adding a directory to sys.path as described above, Python actually
> runs os.path.realpath over it.
> This means that if the path to a script given
> on the commandline is actually a symlink, the directory containing the real
> file will be added to sys.path. This behaviour is not really documented (the
> documentation only says "the directory containing that file is added to the
> start of sys.path"), but since the integrity of sys.path is so important, it
> should be, IMHO.
>
> Using realpath instead of the (expected) path specified by the user breaks
> imports of non-pure-python (mixed .py and .so) modules from modules executed
> as scripts on Debian. This is because Debian installs
> architecture-independent python files in /usr/share/pyshared, and symlinks
> those files into /usr/lib/pymodules/pythonX.Y/. The architecture-dependent
> .so and python-version-dependent .pyc files are installed in
> /usr/lib/pymodules/pythonX.Y/. When a script, e.g.
> /usr/lib/pymodules/pythonX.Y/script.py, is executed, the directory
> /usr/share/pyshared is prepended to sys.path. If the script tries to import
> a module which has architecture-dependent parts (e.g. numpy) it first sees
> the incomplete module in /usr/share/pyshared and fails.
>
> This happens for example in parallel python
> (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=620551) and recently when
> packaging CellProfiler for Debian.
>
> Again, if this is on purpose, it should be documented.
>
> PEP 395 (Qualified Names for Modules)
> =====================================
>
> PEP 395 proposes another sys.path manipulation. When running a script, the
> directory tree will be walked upwards as long as there are __init__.py
> files, and then the first directory without one will be added.
>
> This is of course a fine idea, but it makes a scenario, which was previously
> safe, insecure. More precisely, when executing a script in a directory whose
> parent directory is writable by other users, the parent directory will be
> added to sys.path.
>
> So the (safe) operation of downloading an archive with a package, unzipping
> it in /tmp, changing into the created directory, checking that the script
> doesn't do anything bad, and running a script is now insecure if there is
> __init__.py in the archive root.
>
>
> I guess that it would be useful to have an option to turn off those sys.path
> manipulations.

Thanks very much for the detailed explanation. Given this, I believe I
can safely give up on CellProfiler packaging until this issue is
addressed upstream (either in CellProfiler using an indirection, or in
python).

Thanks,
--
Mathieu
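
For anyone who wants to see whether the sys.path concern quoted above applies to their own setup, here is a minimal sketch. It is an illustration only, not anything from Python itself or from the original mail: the helper name writable_by_others is made up, and treating group-writable directories as risky alongside world-writable ones is an assumption about what counts as "other-user-writable".

```python
import os
import stat
import sys


def writable_by_others(path_entries):
    """Return the entries that are directories writable by group or others.

    Hypothetical helper for illustration. An empty entry ('') is checked as
    the current working directory, which is how Python treats it on sys.path.
    """
    risky = []
    for entry in path_entries:
        directory = entry or os.getcwd()
        try:
            mode = os.stat(directory).st_mode
        except OSError:
            continue  # entry does not exist or cannot be inspected; skip it
        # Assumption: both group-writable and world-writable count as risky.
        if stat.S_ISDIR(mode) and mode & (stat.S_IWGRP | stat.S_IWOTH):
            risky.append(entry)
    return risky


if __name__ == "__main__":
    # Report every sys.path entry another user could drop a module into.
    for entry in writable_by_others(sys.path):
        print("writable by other users:", repr(entry))
```

Saving this as a script under /tmp and running it from there should list '/tmp' on a typical UNIX system, since the script's directory is prepended to sys.path as described in the quoted mail.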