Ludovic Courtès writes:
> I’m trying to gather a “wish list” of things to be done to facilitate
> the use of Guix on clusters and for high-performance computing (HPC).
> Ricardo and I wrote about the advantages, shortcomings, and perspectives
> I know that Pjotr, Roel, Ben, Eric and maybe others also have experience
> and ideas on what should be done (and maybe even code? :-)).
> So I’ve come up with an initial list of work items going from the
> immediate needs to crazy ideas (batch scheduler integration!) that
> hopefully make sense to cluster/HPC people. I’d be happy to get
> feedback, suggestions, etc. from whoever is interested!
> (The reason I’m asking is that I’m considering submitting a proposal at
> Inria to work on some of these things.)
> TIA! :-)
Here are some aspects I think we need:
* Network-aware guix-daemon
From a user's point of view it would be cool to have a network-aware
guix-daemon. In our cluster, the store lives on shared storage, but
manipulating the store through guix-daemon is currently limited to a
single node (and a single request per profile). Having `guix' talk
with `guix-daemon' over a network would allow users to install
software from any node, instead of from one specific node.
* Profile management
The abstraction of profiles is an awesome feature of functional package
management (FPM), but a user interface for working with profiles is
missing. We could do better here, for example with commands such as:
Switch the default profile
(and prepend values of environment variables to the current values):
$ guix profile --switch=/path/to/shared/profile
Reset to the default profile (and restore the environment variables to
their values without the profile we just unset):
$ guix profile --reset
Create an isolated environment based on a profile:
$ guix environment --profile=/path/to/profile --pure --ad-hoc
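
The switch/reset semantics proposed above can be sketched in a few
lines. This is only an illustration in Python, not Guix code: the names
`switch_profile', `reset_profile' and `PROFILE_SUBDIRS' are hypothetical,
and the two variables listed stand in for the many search paths a real
profile defines.

```python
# Hypothetical mapping from an environment variable to the sub-directory
# a profile contributes to it; a real profile defines more search paths.
PROFILE_SUBDIRS = {"PATH": "bin", "LIBRARY_PATH": "lib"}


def switch_profile(env, profile):
    """Return a copy of env with the profile's directories prepended."""
    new_env = dict(env)
    for variable, subdir in PROFILE_SUBDIRS.items():
        entry = f"{profile}/{subdir}"
        current = new_env.get(variable, "")
        # Prepend, so the profile takes precedence over existing values.
        new_env[variable] = entry + (":" + current if current else "")
    return new_env


def reset_profile(env, profile):
    """Return a copy of env with the profile's directories removed."""
    new_env = dict(env)
    for variable, subdir in PROFILE_SUBDIRS.items():
        entry = f"{profile}/{subdir}"
        kept = [p for p in new_env.get(variable, "").split(":")
                if p and p != entry]
        if kept:
            new_env[variable] = ":".join(kept)
        else:
            # The variable only existed because of the profile.
            new_env.pop(variable, None)
    return new_env
```

Both functions return a new mapping instead of modifying the one passed
in, so a reset gives back exactly the values from before the switch.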
* Workflow management/execution
Add automatic program execution with its own vocabulary. I think
"workflow management" boils down to execution of a G-exp, but the
results do not necessarily need to be stored in the store (because the
data it works on is probably managed by an external data management
system). A powerful feature of GNU Guix is its domain-specific
language for describing software packages. We could add
domain-specific parts for workflow management (a `workflow' data type
and a `task' or `process' data type get us there more or less).
With workflow management we are only interested in the "build
function", not the "source code" or the "build output".
You are probably aware that I worked on this for some time, so I could
share the data types I have and the execution engine parts I have.
The HPC-specific part of this is the compatibility with existing job
scheduling systems and data management systems.
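
To make the idea of a `workflow' and a `process' data type concrete,
here is a minimal sketch. It is written in Python rather than Guile, it
is not the set of data types mentioned above, and every name in it
(`Process', `Workflow', `execute') is an assumption made for
illustration. Each process carries only its "build function", and the
results stay in memory instead of going to the store.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import Callable


@dataclass(frozen=True)
class Process:
    """One step of a workflow; only the "build function" matters."""
    name: str
    run: Callable[[], object]


@dataclass
class Workflow:
    """A set of processes plus the dependencies between them."""
    name: str
    processes: list
    # Maps a process name to the names of the processes it depends on.
    dependencies: dict = field(default_factory=dict)


def execute(workflow):
    """Run each process after its dependencies; return results by name."""
    by_name = {p.name: p for p in workflow.processes}
    graph = {name: workflow.dependencies.get(name, ()) for name in by_name}
    results = {}
    # static_order() yields every node after all of its predecessors.
    for name in TopologicalSorter(graph).static_order():
        results[name] = by_name[name].run()
    return results
```

A scheduler back end would replace the direct call to `run' with a job
submission; the ordering logic would stay the same.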
* Document why the Guix daemon needs superuser privileges
Probably an infamous point by now. By design, the Linux kernel keeps
control over all processes. With GNU Guix, we need some control over
the environment in which a process runs (disable network access,
change the user that executes a process), and the environment in which
the output lives (chown, chmod, to allow multiple users to use the
build output). Instead of hitting the wall of "we are not going to
run this thing with root privileges", we could present our sysadmins
with a document that explains the reasons, the design decisions, and
the actual code that needs superuser privileges.
This is something I am working on as well, but help is always welcome.