Re: PYTHONPATH issue explanation

2018-03-24 Thread Chris Marusich
Hi Hartmut,

Awesome analysis!  Thank you for taking point on this.  I will offer
some feedback.  I hope it is useful.

The short version is: I think Python should let us explicitly tell it
where its system site directory is.  If Python provided such a feature,
then I think we could use it and avoid all these problems.  I think this
would be better than modifying the heuristics that Python uses for
finding its system site during start-up (although I think that is a good
back-up plan), since those heuristics are complicated and difficult to
control.  It would just be simpler if we could explicitly tell Python
where its site directory is, instead of indirectly arranging for Python
to find its site directory via its module-lookup Rube-Goldberg machine.

Hartmut Goebel  writes:

> This python interpreter does not find the site-packages in GUIX_PROFILE
> since site-packages are searched relative to "sys.base_prefix" (which is
> the same as "sys.prefix" except in virtual environments).
> "sys.base_prefix" is determined based on the executable's path (argv[0])
> by resolving all symlinks.

I am familiar with this problem.  Any time you want to deploy Python and
its libraries by building up a symlink tree, and you put Python in a
part of the file system that lives far away from the libraries
themselves, Python will punish you cruelly with this behavior.  It is no
fun at all.  :-( You always have to come up with silly hacks to work
around it, and those hacks don't work generally in every case.

Question: Why does Python insist on canonicalizing its executable path?
It always seemed to me like if Python just used the original path, these
problems would not occur.  People who use symlink trees to deploy Python
would be happy.  Perhaps I am missing some information.  What is the
intent behind Python's decision to canonicalize the executable path?
What problems occur if Python doesn't do that?

> The python interpreter assumes "site-packages" to be relative to "where
> python is installed" - called "sys.base_prefix" (which is the same as
> "sys.prefix" except in virtual environments). "sys.base_prefix" is
> determined based on the executable's path (argv[0]) by resolving all
> symlinks. For Guix this means: "sys.base_prefix" will always point to
> /gnu/store/…-python-X.Y, not to GUIX_PROFILE. Thus the site-packages
> installed into the guix profile will not be found.

Yes.  This is a problem.  As you know, this heuristic fails
spectacularly when you try to deploy Python in a symlink tree.

Question: Why does Python not supply a way to "inject" the system site
directory?  In Guix-deployed systems, we are the masters of reality.  We
control ALL the paths.  We can tell Python exactly where its "system
site" is - we can build a symlink tree of its system site in the store
and then tell Python to use that site specifically.  For example, if
Python would let us specify this path via a PYTHON_SYSTEM_SITE
environment variable, then I think it would solve many (all?) of our
problems.  Perhaps this is similar to what you are suggesting regarding
GUIX_PYTHON_X.Y_SITE_PACKAGES and GUIX_PYTHONHOME_X.Y.

> This is why we currently (mis-) use PYTHONPATH: To make the
> site-packages installed into the guix profile available.

I agree that this is a mis-use.  People do it because Python doesn't
provide any better way.  And then people find out about all its terrible
down-sides, like for example the fact that .pth files will not be
processed if they appear on the PYTHONPATH.  And then they do stuff like
hack site.py to walk the PYTHONPATH and evaluate all the .pth files,
which is gross but sort of works.  Just thinking about the pain I have
experienced with this stuff makes my blood boil.
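
To make the .pth point concrete, here is a minimal sketch (the directory
and file names are made up for illustration).  It relies only on the
standard site.addsitedir() call, which is what Python applies to real
site directories: the directory is appended to sys.path and any .pth
files in it are processed.  A directory that merely sits on PYTHONPATH
does not get that treatment.

```python
import os
import site
import sys
import tempfile

# Hypothetical layout: a "site" directory containing a .pth file that
# points at a second directory holding the actual modules.
root = tempfile.mkdtemp()
site_dir = os.path.join(root, "site-packages")
extra_dir = os.path.join(root, "extra-modules")
os.makedirs(site_dir)
os.makedirs(extra_dir)

with open(os.path.join(site_dir, "example.pth"), "w") as f:
    f.write(extra_dir + "\n")


def on_sys_path(directory):
    """Return True if DIRECTORY (compared by real path) is on sys.path."""
    wanted = os.path.realpath(directory)
    return any(os.path.realpath(p) == wanted for p in sys.path if p)


# Merely having site_dir on sys.path (which is all PYTHONPATH does)
# would not evaluate example.pth.  addsitedir() does.
site.addsitedir(site_dir)

print(on_sys_path(site_dir), on_sys_path(extra_dir))
```

This is roughly the behaviour the site.py hacks mentioned above try to
recover by hand for every PYTHONPATH entry.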

> no. 2
> suggests using a mechanism already implemented in python: Setting
> "PYTHONHOME" will make the interpreter use this as "sys.base_prefix"
> unconditionally. Again there is only one PYTHONHOME variable for all
> versions of python (designed by upstream). We could work around this
> easily (while keeping upstream compatibility) by using
> GUIX-PYTHONHOME-X.Y, to be evaluated just after PYTHONHOME.

Are there legitimate use cases where a user wants to set their own
PYTHONHOME?  If so, would our use of PYTHONHOME prevent them from doing
that?  If so, that seems bad.

In the past, I have used PYTHONUSERBASE (or maybe it was PYTHONUSERSITE,
I can't remember exactly which) to make Python find libraries in a
symlink tree.  However, because that is intended for users to use, I
don't think it's a good solution for us here.  If we co-opt these
environment variables, then users would not be able to use them.

> The drawback is: This is implemented using an environment variable,
> which might not give the expected results in all cases. E.g. running
> /gnu/store/…-profile/bin/python will not load the site-packages of that
> profile. Also there might be issues implementing virtual environments.
> (Thinking about this, I'm 

Re: PYTHONPATH issue explanation

2018-03-18 Thread 宋文武
iyzs...@member.fsf.org (宋文武) writes:

> [...]
>
> I'd like to do more tests with the GUIX_PYTHON_X_Y_SITE_PACKAGES option
> (patch sent), hope it works :-)

Hello, I have written a shell script to run some tests, and it looks good to me!


Here is the updated 'GUIX_PYTHON_X_Y_SITE_PACKAGES' patch, targeting
'core-updates' at commit 171a117c (you also have to comment out the
"manual-database" profile hook in "guix/profiles.scm", as it's broken in
that commit):

>From d807306d02aab0a84de4fa3ff457a5b97ac15520 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E5=AE=8B=E6=96=87=E6=AD=A6?= 
Date: Sat, 17 Mar 2018 18:46:55 +0800
Subject: [PATCH] gnu: python-2.7, python-3.6: Honor
 'GUIX_PYTHON_X_Y_SITE_PACKAGES'.

This replaces the use of 'PYTHONPATH' as search path specification, as
suggested by Hartmut Goebel .  See
 for
details.

* gnu/packages/python.scm (python-guix-search-path-specification)
(python-guix-sitecustomize.py): New procedures.
(python-2.7, python-3.6):
[native-search-paths]: Use 'python-guix-search-path-specification'.
[arguments]: Add 'install-sitecustomize.py' phase.
---
 gnu/packages/python.scm | 65 +++--
 1 file changed, 57 insertions(+), 8 deletions(-)

diff --git a/gnu/packages/python.scm b/gnu/packages/python.scm
index f3a75c30e..45de8c527 100644
--- a/gnu/packages/python.scm
+++ b/gnu/packages/python.scm
@@ -136,6 +136,41 @@
   #:use-module (guix build-system trivial)
   #:use-module (srfi srfi-1))
 
+(define (python-guix-search-path-specification version)
+  "Return the search path specification for python VERSION."
+  (let* ((major.minor (version-major+minor version))
+         (variable    (string-append
+                       "GUIX_PYTHON_"
+                       (string-replace-substring major.minor "." "_")
+                       "_SITE_PACKAGES"))
+         (files       (list (string-append
+                             "lib/python" major.minor "/site-packages"))))
+    (search-path-specification
+     (variable variable)
+     (files files))))
+
+(define (python-guix-sitecustomize.py version)
+  "Return the content of @file{sitecustomize.py} for python VERSION."
+  (let* ((major.minor (version-major+minor version))
+         (variable    (string-append
+                       "GUIX_PYTHON_"
+                       (string-replace-substring major.minor "." "_")
+                       "_SITE_PACKAGES")))
+    (format #f "# Append module search paths for guix packages to sys.path.
+import os
+import site
+
+SITE_PACKAGES = os.environ.get('~a')
+
+if SITE_PACKAGES is None:
+    SITE_PACKAGES = []
+else:
+    SITE_PACKAGES = SITE_PACKAGES.split(os.pathsep)
+
+for i in SITE_PACKAGES:
+    site.addsitedir(i)
+" variable)))
+
 (define-public python-2.7
   (package
 (name "python2")
@@ -304,6 +339,16 @@
  "/site-packages")))
(install-file tkinter.so target)
(delete-file tkinter.so)
+                #t)))
+          (add-after 'install 'install-sitecustomize.py
+            (lambda* (#:key outputs #:allow-other-keys)
+              (let* ((out (assoc-ref outputs "out"))
+                     (sitedir (car (find-files out "^site-packages$"
+                                               #:directories? #t))))
+                (with-output-to-file
+                    (string-append sitedir "/sitecustomize.py")
+                  (lambda ()
+                    (display ,(python-guix-sitecustomize.py version))))
                 #t))))))
 (inputs
  `(("bzip2" ,bzip2)
@@ -318,9 +363,7 @@
 (native-inputs
  `(("pkg-config" ,pkg-config)))
 (native-search-paths
-     (list (search-path-specification
-            (variable "PYTHONPATH")
-            (files '("lib/python2.7/site-packages")))))
+     (list (python-guix-search-path-specification version)))
     (home-page "https://www.python.org")
 (synopsis "High-level, dynamically-typed programming language")
 (description
@@ -428,13 +471,19 @@ data types.")
  ,file)))
   (find-files out "\\.py$")))
   (list '() '("-O") '("-OO")))
+                 #t)))
+           (replace 'install-sitecustomize.py
+             (lambda* (#:key outputs #:allow-other-keys)
+               (let* ((out (assoc-ref outputs "out"))
+                      (sitedir (car (find-files out "^site-packages$"
+                                                #:directories? #t))))
+                 (with-output-to-file
+                     (string-append sitedir "/sitecustomize.py")
+                   (lambda ()
+                     (display ,(python-guix-sitecustomize.py version))))
                  #t)))))))
 (native-search-paths
- (list (search-path-specification
-(variable "PYTHONPATH")
-(files (list 

Re: PYTHONPATH issue explanation

2018-03-17 Thread 宋文武
Hartmut Goebel  writes:

> Hi,
>
> I agree with Ricardo: We first should agree on what we want to
> implement.

Okay.

>
> I created a pad at [1] for collecting all test-cases and the expected
> results. Please add your test-cases there. Thanks!
>
> [1] https://semestriel.framapad.org/p/guix-python-site-packages-test-cases

I have appended some text.  Is it available to everyone in real time?
I'm not sure how it works...

>
> Am 17.03.2018 um 02:41 schrieb 宋文武:
>
>> - "GUIX_PYTHON_X_Y_SITE_PACKAGES" […] is necessary for the "build" 
>> environment.
> For the build environment we could easily work around using PYTHONPATH.
> Since the build-system is clearly defined and does not interfere with
> any user-definitions, this is safe to do.

Yes, but if "GUIX_PYTHON_X_Y_SITE_PACKAGES" does work (I hope so) on
the "profile" side, it's better to replace PYTHONPATH for consistency.

>
>> - Avoid any environment variable for the "profile" environment.
>>
>>   We have a union "profile" for all the python packages, so environment
>>   variables can be totally avoided with the help of "venv".
> […]
>>  We only need to make the "profile"
>>   a "venv" for python.  For python3, a simple "pyvenv.cfg" file is
>>   enough, for python2 I guess we have to make a union or copy files like
>>   what "virtualenv" does.
>
> This would be a very elegant solution. Unfortunately this does not work
> as shown in part 2 of my analysis, esp. point 4a.

A workaround for the broken case may be to tell the user to create a
"sitecustomize.py" in the created venv, and to add the search paths of
the profile themselves.


I'd like to do more tests with the GUIX_PYTHON_X_Y_SITE_PACKAGES option
(patch sent), hope it works :-)



Re: PYTHONPATH issue explanation

2018-03-17 Thread Hartmut Goebel
Am 17.03.2018 um 11:07 schrieb Ricardo Wurmus:
>
>> - sys.prefix and sys.exec_prefix would still point to the store, not to
>>   the profile.  This might break Python applications expecting
>>   site-packages to be below sys.prefix.
> Is this an actual problem?  Do you know of applications that make this
> assumption?  If so, is this unfixable?

I'm not aware of any actual problem.

> I’m not too hopeful about this variant, but I’m rather ignorant about
> venvs.  My main concern is about whether it will still be possible for
> users to create venvs from a subset of their installed packages when we
> generate a pyvenv.cfg by default.

venvs never contain a "subset of installed packages". They either
include all system site-packages or none of them.

But as I've already written, generating a pyvenv.cfg for this case will
not work as we need it.
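
The "all or none" behaviour can be seen with the standard venv module.
The following is only a sketch: it creates two throwaway environments
(without pip) and reads back the include-system-site-packages switch
that ends up in each pyvenv.cfg.  The helper names are made up for this
example.

```python
import os
import tempfile
import venv


def read_pyvenv_cfg(env_dir):
    """Parse the simple 'key = value' lines of ENV_DIR/pyvenv.cfg."""
    settings = {}
    with open(os.path.join(env_dir, "pyvenv.cfg"), encoding="utf-8") as f:
        for line in f:
            key, sep, value = line.partition("=")
            if sep:
                settings[key.strip().lower()] = value.strip()
    return settings


def make_env(system_site_packages):
    """Create a throwaway venv (without pip) and return its settings."""
    env_dir = tempfile.mkdtemp()
    builder = venv.EnvBuilder(system_site_packages=system_site_packages,
                              with_pip=False)
    builder.create(env_dir)
    return read_pyvenv_cfg(env_dir)


isolated = make_env(False)
inheriting = make_env(True)
print(isolated["include-system-site-packages"])
print(inheriting["include-system-site-packages"])
```

The switch is a single boolean per environment; there is no setting in
pyvenv.cfg for selecting individual system packages.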

-- 
Regards
Hartmut Goebel

| Hartmut Goebel  | h.goe...@crazy-compilers.com   |
| www.crazy-compilers.com | compilers which you thought are impossible |



Re: PYTHONPATH issue explanation

2018-03-17 Thread Hartmut Goebel
Am 17.03.2018 um 11:07 schrieb Ricardo Wurmus:
> What I don’t like about this solution is that PYTHONHOME can only hold a
> single directory, so composing profiles (that use the same Python
> variant) would no longer work.  I prefer the

What exactly do you mean by "composing profiles"? This fails:

guix environment --ad-hoc python
echo $GUIX_ENVIRONMENT
# /gnu/store/0d8vp2h…-profile
echo $PYTHONPATH
# /gnu/store/0d8vp2h…-profile/lib/python3.5/site-packages
guix environment --ad-hoc python-simplejson
echo $GUIX_ENVIRONMENT
# /gnu/store/5xgfisg…-profile
echo $PYTHONPATH
# /gnu/store/0d8vp2h…-profile/lib/python3.5/site-packages
python3 -s -c 'import simplejson'
# import error

('-s' avoids leaking packages from $HOME/.local/…)

-- 
Regards
Hartmut Goebel





Re: PYTHONPATH issue explanation

2018-03-17 Thread Hartmut Goebel
Hi,

I agree with Ricardo: We first should agree on what we want to implement.

I created a pad at [1] for collecting all test-cases and the expected
results. Please add your test-cases there. Thanks!

[1] https://semestriel.framapad.org/p/guix-python-site-packages-test-cases

Am 17.03.2018 um 02:41 schrieb 宋文武:

> - "GUIX_PYTHON_X_Y_SITE_PACKAGES" […] is necessary for the "build" 
> environment.
For the build environment we could easily work around using PYTHONPATH.
Since the build-system is clearly defined and does not interfere with
any user-definitions, this is safe to do.

> - Avoid any environment variable for the "profile" environment.
>
>   We have a union "profile" for all the python packages, so environment
>   variables can be totally avoided with the help of "venv".
[…]
>  We only need to make the "profile"
>   a "venv" for python.  For python3, a simple "pyvenv.cfg" file is
>   enough, for python2 I guess we have to make a union or copy files like
>   what "virtualenv" does.

This would be a very elegant solution. Unfortunately this does not work
as shown in part 2 of my analysis, esp. point 4a.

>   > We could avoid GUIX-PYTHONHOME[23] if we stop resolving the symlinks
>   > at the correct point in iteration.
>
>   This is exactly what "venv" does! 

Unfortunately venv works quite differently: system site-packages are
always taken from sys.base_exec_prefix. See part 3 of my analysis, esp.
the "pyvenv.cfg" section.

> I plan to implement option 1 by adding a "sitecustomize.py" (better
> than modify "site.py") into the python packages, and modify
> "search-path-specification" to use "GUIX_PYTHON_X_Y_SITE_PACKAGES".

When implementing this in sitecustomize.py, you will end up
re-implementing the complete venv mechanism.

When going the GUIX_PYTHON_X_Y_SITE_PACKAGES route, we should look where
the best place will be: Maybe site.PREFIXES, maybe
site.getsitepackages(), maybe site.venv().

-- 
Regards
Hartmut Goebel





Re: PYTHONPATH issue explanation

2018-03-17 Thread Ricardo Wurmus

宋文武  writes:

> Option 2, "GUIX_PYTHONHOME_X_Y" can not be used in the build-system
> unless we make a union of python inputs

For texlive we create a temporary union in the build system, so this
shouldn’t be an unsurmountable obstacle.

What I don’t like about this solution is that PYTHONHOME can only hold a
single directory, so composing profiles (that use the same Python
variant) would no longer work.  I prefer the
GUIX_PYTHON_X_Y_SITE_PACKAGES solution, because it is an actual search
path.
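
To illustrate what "an actual search path" buys us, here is a small
sketch.  The profile locations are hypothetical and the helper names
are invented for this example; the only point is that several profiles
can contribute to one PATH-like value, which a single-directory
variable such as PYTHONHOME cannot express.

```python
import os


def site_packages_variable(major, minor):
    """Name of the per-version variable discussed in this thread."""
    return "GUIX_PYTHON_{}_{}_SITE_PACKAGES".format(major, minor)


def compose_profiles(profiles, major, minor):
    """Join the site-packages directories of several profiles into one
    PATH-like value, in the order the profiles are given."""
    subdir = os.path.join("lib", "python{}.{}".format(major, minor),
                          "site-packages")
    return os.pathsep.join(os.path.join(p, subdir) for p in profiles)


# Hypothetical profile locations, for illustration only.
profiles = ["/tmp/profile-a", "/tmp/profile-b"]
name = site_packages_variable(3, 6)
value = compose_profiles(profiles, 3, 6)
print(name + "=" + value)
```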

> - "GUIX_PYTHON_X_Y_SITE_PACKAGES" (X.Y is not a valid env identifier
>   in bash) is necessary for the "build" environment.
>
>   We don't make a union of all the inputs in the "build" environment, so
>   a PATH (contains multiples directories) like env have to be used to
>   let python find all its "site-packages" from inputs.

I think this might be a good solution as it is a drop-in replacement for
our current use of PYTHONPATH.

Hartmut wrote this:

> - sys.prefix and sys.exec_prefix would still point to the store, not to
>   the profile.  This might break Python applications expecting
>   site-packages to be below sys.prefix.

Is this an actual problem?  Do you know of applications that make this
assumption?  If so, is this unfixable?

>   We have a union "profile" for all the python packages, so environment
>   variables can be totally avoided with the help of "venv".
>
>   > We could avoid GUIX-PYTHONHOME[23] if we stop resolving the symlinks
>   > at the correct point in iteration.
>
>   This is exactly what "venv" does!  We only need to make the "profile"
>   a "venv" for python.  For python3, a simple "pyvenv.cfg" file is
>   enough, for python2 I guess we have to make a union or copy files like
>   what "virtualenv" does.

I’m not too hopeful about this variant, but I’m rather ignorant about
venvs.  My main concern is about whether it will still be possible for
users to create venvs from a subset of their installed packages when we
generate a pyvenv.cfg by default.

> I plan to implement option 1 by adding a "sitecustomize.py" (better
> than modify "site.py") into the python packages, and modify
> "search-path-specification" to use "GUIX_PYTHON_X_Y_SITE_PACKAGES".
>
> How's that sound?

This sounds good to me.  Thank you and thanks again, Hartmut, for laying
out our options and analysing their advantages and drawbacks!

Before working on this, though, I would like to have the above question
answered to avoid wasting your time.

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net





Re: PYTHONPATH issue explanation

2018-03-16 Thread 宋文武
Hartmut Goebel  writes:

Hello,

> Hi,
>
> The ongoing discussion around Python shows that my explanation was
> not good enough. I'll try to summarize and give more background.

Thanks for the explanations and this one!

So now I understand it better and have some ideas...

>
> With regard to Python, guix currently has a major issue, which my
> proposals are addressing. There are other issues (like naming the
> executables, the "wrapper" script", etc.) which are not addressed
> here.
> [...]

Okay, the "major issue" is that we're using "PYTHONPATH", which will add
entries into "sys.path" before builtin ones.  It's semantically wrong
and may cause (or may already have caused?) issues.


> Part 3 of my analysis lists three solutions for this, where only number
> 2 and 3 are "good choices".

Option 2, "GUIX_PYTHONHOME_X_Y", cannot be used in the build-system
unless we make a union of python inputs, so I think we should go for 1
and optionally (later) add 3 too:

- "GUIX_PYTHON_X_Y_SITE_PACKAGES" (X.Y is not a valid env identifier
  in bash) is necessary for the "build" environment.

  We don't make a union of all the inputs in the "build" environment, so
  a PATH-like environment variable (one that can hold multiple
  directories) has to be used to let python find all its
  "site-packages" from inputs.

  > Drawbacks: This might break Python applications expecting
  > site-packages to be below sys.prefix.

  We have a patch named "python-2.7-site-prefixes.patch" which seems to
  handle this, maybe we should do it for python3 too?


- Avoid any environment variable for the "profile" environment.

  We have a union "profile" for all the python packages, so environment
  variables can be totally avoided with the help of "venv".

  > We could avoid GUIX-PYTHONHOME[23] if we stop resolving the symlinks
  > at the correct point in iteration.

  This is exactly what "venv" does!  We only need to make the "profile"
  a "venv" for python.  For python3, a simple "pyvenv.cfg" file is
  enough, for python2 I guess we have to make a union or copy files like
  what "virtualenv" does.


I plan to implement option 1 by adding a "sitecustomize.py" (better
than modifying "site.py") into the python packages, and modifying
"search-path-specification" to use "GUIX_PYTHON_X_Y_SITE_PACKAGES".
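
As a rough sketch of what such a "sitecustomize.py" could do (the
function names are invented here, and the variable name is derived from
the running interpreter's version rather than being fixed at build
time), assuming the variable holds an os.pathsep-separated list of
directories:

```python
import os
import site
import sys
import tempfile


def variable_name(version_info=sys.version_info):
    """E.g. 'GUIX_PYTHON_3_6_SITE_PACKAGES' for Python 3.6."""
    return "GUIX_PYTHON_{}_{}_SITE_PACKAGES".format(version_info[0],
                                                    version_info[1])


def add_guix_site_dirs(environ=os.environ):
    """Append every directory listed in the variable to sys.path,
    processing .pth files, and return the directories handled."""
    value = environ.get(variable_name())
    directories = value.split(os.pathsep) if value else []
    for directory in directories:
        site.addsitedir(directory)
    return directories


# Usage with two throwaway directories standing in for profiles.
dirs = [tempfile.mkdtemp(), tempfile.mkdtemp()]
handled = add_guix_site_dirs({variable_name(): os.pathsep.join(dirs)})
print(handled == dirs)
```

Because site.addsitedir() appends to sys.path, the directories end up
behind the standard library rather than in front of it.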

How does that sound?



Re: PYTHONPATH issue explanation

2018-03-15 Thread Hartmut Goebel
Hi,

The ongoing discussion around Python shows that my explanation was
not good enough. I'll try to summarize and give more background.

With regard to Python, guix currently has a major issue, which my
proposals are addressing. There are other issues (like naming the
executables, the "wrapper" script, etc.) which are not addressed here.

When installing Python and some Python packages (e.g. python-simplejson)
in guix, the python interpreter will be linked to
GUIX_PROFILE/bin/pythonX.Y and the packages' files are linked into
GUIX_PROFILE/lib/pythonX.Y/site-packages/…, which is perfectly okay.

This python interpreter does not find the site-packages in GUIX_PROFILE
since site-packages are searched relative to "sys.base_prefix" (which is
the same as "sys.prefix" except in virtual environments).
"sys.base_prefix" is determined based on the executable's path (argv[0])
by resolving all symlinks.

The python interpreter assumes "site-packages" to be relative to "where
python is installed" - called "sys.base_prefix" (which is the same as
"sys.prefix" except in virtual environments). "sys.base_prefix" is
determined based on the executable's path (argv[0]) by resolving all
symlinks. For Guix this means: "sys.base_prefix" will always point to
/gnu/store/…-python-X.Y, not to GUIX_PROFILE. Thus the site-packages
installed into the guix profile will not be found.
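
The effect can be reproduced without Guix.  The sketch below builds a
throwaway directory tree in which "store" and "profile" are
hypothetical stand-ins for /gnu/store/…-python-X.Y and GUIX_PROFILE,
and it simplifies the prefix computation to "two levels above the
executable".  It only illustrates the consequence of resolving
symlinks first; it is not the interpreter's actual start-up code.

```python
import os
import tempfile

root = tempfile.mkdtemp()

# Hypothetical stand-ins for /gnu/store/…-python-X.Y and a profile.
store = os.path.join(root, "store", "python")
profile = os.path.join(root, "profile")
os.makedirs(os.path.join(store, "bin"))
os.makedirs(os.path.join(profile, "bin"))

# The "interpreter" is an empty file; only its location matters here.
real_exe = os.path.join(store, "bin", "python3")
open(real_exe, "w").close()

# The profile only holds a symlink to the interpreter in the store.
linked_exe = os.path.join(profile, "bin", "python3")
os.symlink(real_exe, linked_exe)


def prefix_of(executable):
    """Prefix is taken to be two levels above the executable."""
    return os.path.dirname(os.path.dirname(executable))


as_invoked = prefix_of(linked_exe)
as_resolved = prefix_of(os.path.realpath(linked_exe))

print("invoked :", as_invoked)
print("resolved:", as_resolved)
```

The resolved prefix is the store directory, so a site-packages
directory that exists only under the profile is never looked at.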

This is why we currently (mis-) use PYTHONPATH: To make the
site-packages installed into the guix profile available.

Using PYTHONPATH for this causes woes, since there is only one
PYTHONPATH variable for all versions of python. This is designed by
upstream.

Additionally: When using PYTHONPATH the site-packages are added to the
search path ("sys.path") *in front* of the python standard library,
while they are expected to be added *behind*.
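
The ordering is easy to check.  This sketch starts a child interpreter
with PYTHONPATH pointing at an empty throwaway directory and asks it
where that directory and the standard library directory ended up in
sys.path.  It assumes the standard library is available as a plain
directory (the one containing os.py).

```python
import json
import os
import subprocess
import sys
import tempfile

extra = tempfile.mkdtemp()

# Child program: report where the PYTHONPATH entry and the standard
# library directory ended up in sys.path.
child = r"""
import json, os, sys
real = [os.path.realpath(p) for p in sys.path]
stdlib = os.path.realpath(os.path.dirname(os.__file__))
extra = os.path.realpath(sys.argv[1])
print(json.dumps({"extra": real.index(extra), "stdlib": real.index(stdlib)}))
"""

env = dict(os.environ, PYTHONPATH=extra)
result = subprocess.run([sys.executable, "-c", child, extra],
                        env=env, stdout=subprocess.PIPE,
                        universal_newlines=True, check=True)
positions = json.loads(result.stdout)
print(positions)
```

A smaller index means the entry is searched earlier, so a module in the
PYTHONPATH directory would shadow a standard library module of the same
name.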

Part 3 of my analysis lists three solutions for this, where only number
2 and 3 are "good choices".

no. 2
suggests using a mechanism already implemented in python: Setting
"PYTHONHOME" will make the interpreter use this as "sys.base_prefix"
unconditionally. Again there is only one PYTHONHOME variable for all
versions of python (designed by upstream). We could work around this
easily (while keeping upstream compatibility) by using
GUIX-PYTHONHOME-X.Y, to be evaluated just after PYTHONHOME.

This would be easy to implement using Guix's "search-path" capabilities
and a small patch to the python interpreter.

The drawback is: This is implemented using an environment variable,
which might not give the expected results in all cases. E.g. running
/gnu/store/…-profile/bin/python will not load the site-packages of that
profile. Also there might be issues implementing virtual environments.
(Thinking about this, I'm quite sure there will. Ouch!)

no. 3
suggests changing the way the python interpreter is resolving symlinks
when searching for "sys.base_prefix". The idea is to stop "at the profile".

The hard part of this is to determine "at the profile". Also this needs
a larger patch. But if we manage to implement this, it would be perfect.
I could contribute a draft for this implemented in Python. The
C-implementation needs to be done by some C programmer.
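
To give an idea of the shape such a draft might take, here is a sketch
only.  The heuristic for "at the profile" is an assumption made for
this example (a prefix counts as usable if it contains
lib/pythonX.Y/site-packages), and the function names are invented.  It
follows the executable's own symlink chain one hop at a time and does
not handle symlinked parent directories.

```python
import os
import tempfile


def looks_like_prefix(prefix, version):
    """Assumed heuristic: a prefix has lib/pythonX.Y/site-packages."""
    return os.path.isdir(os.path.join(prefix, "lib", "python" + version,
                                      "site-packages"))


def find_prefix(executable, version, max_hops=40):
    """Follow the executable's symlink chain one hop at a time and
    return the first prefix that passes looks_like_prefix()."""
    path = os.path.abspath(executable)
    for _ in range(max_hops):
        prefix = os.path.dirname(os.path.dirname(path))
        if looks_like_prefix(prefix, version):
            return prefix
        if not os.path.islink(path):
            break
        target = os.readlink(path)
        # Relative link targets are relative to the link's directory.
        path = os.path.normpath(os.path.join(os.path.dirname(path), target))
    return None


# Throwaway tree: a "store" item and a "profile" that links into it.
root = tempfile.mkdtemp()
store = os.path.join(root, "store", "python")
profile = os.path.join(root, "profile")
for base in (store, profile):
    os.makedirs(os.path.join(base, "bin"))
    os.makedirs(os.path.join(base, "lib", "python3.6", "site-packages"))
open(os.path.join(store, "bin", "python3"), "w").close()
os.symlink(os.path.join(store, "bin", "python3"),
           os.path.join(profile, "bin", "python3"))

found = find_prefix(os.path.join(profile, "bin", "python3"), "3.6")
print(found)
```

Started via the profile's symlink, the search stops at the profile;
started via the store path directly, it returns the store item.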

Which way should we go?

-- 
Regards
Hartmut Goebel
