Hi Mike,
On 22/04/2020 02:32, Mike Kelsey wrote:
Hello, again. I am working on creating a module for one of my experiment's
internal Python packages. Normally, we check the package out of our GitBlit
repository and install it using |pip install --user|. I wrote a .eb file
using 'PythonPackage', and specified the appropriate git_config options to
check everything out. But the build fails during the final sanity check,
because it doesn't find the dependencies:
== FAILED: Installation ended unsuccessfully (build directory:
/scratch/group/mitchcomp/eb/tmp/build/CDMSDataCatalog/0.9.2/system-system-Python-3.6.6):
build failed (first 300 chars): cmd "pip check" exited with exit code 1 and
output:
cdmsdatacatalog 0.9.2 requires datacat, which is not installed.
cdmsdatacatalog 0.9.2 requires tqdm, which is not installed.
This package contains the required setup.py, which itself has the argument,
install_requires=['datacat @
git+https://github.com/slaclab/datacat.git#subdirectory=client/python',
'requests',
'tqdm'],
I thought the |pip install| action took care of parsing this list, doing all
the requisite downloads internally, and leaving you with a fully functional
package with all of its dependencies satisfied.
Do we have to copy and adapt that list into EasyBuild language? If so, how
does one specify the Python-specific suffixes on the Git URL? And how does
one then propagate all of that to PIP so it knows what to find? The
exts_list structure does not support this (in particular, it doesn't support
git_config, which I'm working on myself).
EasyBuild instructs pip not to download and install missing dependencies
automatically (which pip does by default), because that often leads to
installations you can't reproduce later using the same easyconfig file.
The "pip check" is something we have only fairly recently started
requiring for easyconfigs in our central repository, and we do so because
we have learned the hard way that stuff breaks if we don't ensure all
required Python packages are specified in the easyconfig file (see
https://github.com/easybuilders/easybuild-easyconfigs/issues/10462 for a
recent case of broken easyconfigs for exactly this reason).
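
To turn a failing "pip check" into a to-do list for the easyconfig, something along these lines could help. It is only a rough sketch: the function name and regular expression are made up for illustration, and it only understands the "requires X, which is not installed" lines like the ones in your build log.

```python
import re

# Matches lines such as:
#   cdmsdatacatalog 0.9.2 requires datacat, which is not installed.
MISSING_RE = re.compile(
    r"^(?P<pkg>\S+) (?P<version>\S+) requires (?P<dep>[^,\s]+), "
    r"which is not installed\.$"
)


def missing_dependencies(pip_check_output):
    """Return {package: [missing dependency, ...]} parsed from 'pip check' output."""
    missing = {}
    for line in pip_check_output.splitlines():
        match = MISSING_RE.match(line.strip())
        if match:
            missing.setdefault(match.group("pkg"), []).append(match.group("dep"))
    return missing


# The two lines reported in the failed build
sample = """\
cdmsdatacatalog 0.9.2 requires datacat, which is not installed.
cdmsdatacatalog 0.9.2 requires tqdm, which is not installed.
"""
print(missing_dependencies(sample))
```

Every name it reports needs to be provided by the easyconfig itself or by one of its dependencies.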
For https://github.com/slaclab/datacat.git, you could either use
git_config (thanks a lot for your work on making that supported for
extensions as well!), or you could just download a source tarball
provided by GitHub
(https://github.com/slaclab/datacat/archive/stable.tar.gz for example),
and use that. The 'subdirectory' part probably corresponds to the
'start_dir' easyconfig parameter?
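
As a rough, untested sketch of the git_config route for a standalone datacat easyconfig: the version and tag below are placeholders I made up, and using 'start_dir' for the 'subdirectory' part is my assumption. This is only a fragment (toolchain, homepage, description and dependencies are left out).

```python
# Easyconfig fragment (easyconfig files use Python syntax)
easyblock = 'PythonPackage'

name = 'datacat'
version = '0.0.0'  # placeholder, replace with the actual version

sources = [{
    # name of the tarball that EasyBuild creates from the git clone
    'filename': '%(name)s-%(version)s.tar.gz',
    'git_config': {
        'url': 'https://github.com/slaclab',  # parent URL, without the repository name
        'repo_name': 'datacat',
        'tag': 'v%(version)s',  # hypothetical tag name; 'commit' can be used instead
    },
}]

# assumption: takes the role of '#subdirectory=client/python' in the pip URL
start_dir = 'client/python'
```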
pip will check for already installed Python packages via $PYTHONPATH, so
you don't need to do anything special for "pip check" to be happy, other
than listing all required Python packages (that are not already included
with the Python installation you're using, or provided by another
dependency like SciPy-bundle) with a specific version in the easyconfig
file using PythonBundle.
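
A minimal sketch of what that could look like with PythonBundle, using the GitHub tarball for datacat. The tqdm and datacat versions are illustrative placeholders, 'start_dir' in the extension options is my assumption for the subdirectory, and I left out the toolchain, the Python dependency, checksums and the source location for your own package, since those depend on your setup.

```python
# Easyconfig fragment (easyconfig files use Python syntax)
easyblock = 'PythonBundle'

name = 'CDMSDataCatalog'
version = '0.9.2'

use_pip = True
sanity_pip_check = True  # run 'pip check' during the sanity check

# Extensions are installed in the order listed,
# so dependencies must come before the packages that need them.
exts_list = [
    ('tqdm', '4.45.0', {  # illustrative version
        'source_urls': ['https://pypi.python.org/packages/source/t/tqdm'],
    }),
    ('datacat', '0.0.0', {  # placeholder version
        'source_urls': ['https://github.com/slaclab/datacat/archive/'],
        'source_tmpl': 'stable.tar.gz',
        'start_dir': 'client/python',  # assumption: the 'subdirectory' part
    }),
    ('cdmsdatacatalog', version, {
        # sources for the internal package go here
    }),
]
```

'requests' is not listed here because "pip check" did not complain about it, so it is presumably already provided by your Python installation.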
There's a script floating around (that we should integrate in
EasyBuild itself) that facilitates this a bit, see
https://gist.github.com/boegel/fd9a636d652aa5c8e57778088e9c0a21 (and
an improved version at
https://gist.github.com/Flamefire/49426e502cd8983757bd01a08a10ae0d).
This reminds me that I should get back to writing up documentation on
writing easyconfig files for (bundles of) Python packages...
regards,
Kenneth
-- Mike Kelsey