I’m trying to gather a “wish list” of things to be done to facilitate
the use of Guix on clusters and for high-performance computing (HPC).

Ricardo and I wrote about the advantages, shortcomings, and perspectives
I know that Pjotr, Roel, Ben, Eric and maybe others also have experience
and ideas on what should be done (and maybe even code? :-)).

So I’ve come up with an initial list of work items going from the
immediate needs to crazy ideas (batch scheduler integration!) that
hopefully make sense to cluster/HPC people. I’d be happy to get
feedback, suggestions, etc. from whoever is interested!

(The reason I’m asking is that I’m considering submitting a proposal at
Inria to work on some of these things.)

- non-root usage
  + file system virtualization needed
    * map ~/.local/gnu/store to /gnu/store
    * user namespaces?
    * [[https://github.com/proot-me/PRoot/][PRoot]]? but performance problems?
    * common interface, like “guix enter” spawns a shell where
      /gnu/store is available
  + daemon functionality as a library
    * client no longer connects to the daemon, does everything
      locally, including direct store accesses
    * can use substitutes
  + or plain ‘guix-daemon --disable-root’?
  + see [[http://lists.gnu.org/archive/html/help-guix/2016-06/msg00079.html][discussion with Ben Woodcroft and Roel]]
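To make the store-mapping idea concrete, here is a minimal sketch of what a “guix enter”-style wrapper might do with PRoot. It only builds the command line and does not run it. The helper name =proot_argv= and the default shell are hypothetical; the sketch assumes PRoot's =-b host_path:guest_location= binding syntax and that PRoot can bind onto a guest path that does not exist on the host.

```python
import os
import shlex

# Location where store items must appear inside the virtualized view.
GUEST_STORE = "/gnu/store"


def proot_argv(host_store, command=("/bin/sh",)):
    """Return an argv list running COMMAND under PRoot with HOST_STORE
    bound at /gnu/store.  Nothing is executed here."""
    host_store = os.path.abspath(os.path.expanduser(host_store))
    # '-b HOST:GUEST' asks PRoot to make HOST visible at GUEST.
    return ["proot", "-b", f"{host_store}:{GUEST_STORE}", *command]


if __name__ == "__main__":
    # Print the command a wrapper would exec for the per-user store.
    print(shlex.join(proot_argv("~/.local/gnu/store")))
```

A user-namespace variant would replace the PRoot invocation with an unprivileged mount namespace; whether that avoids PRoot's performance overhead in practice is exactly the open question listed above.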
- central daemon usage (like at MDC, but improved)
  + describe/define appropriate setup, like:
    * daemon runs on front-end node
    * clients can connect to daemon from compute nodes, and perform
    * use of distributed file systems: anything to pay attention to?
    * how should the front-end offload to compute nodes?
  + technical issues
    * daemon needs to be able to listen for connections elsewhere
    * client needs to be able to [[http://debbugs.gnu.org/cgi/bugreport.cgi?bug=20381][connect remotely]] instead of using
    * how do we share localstatedir? how do we share /gnu/store?
    * how do we share the profile directory?
  + admin/social issues
    * daemon runs as root
    * daemon needs Internet access
    * Ricardo mentions lack of nscd and problems caused by the use of
      NSS plugins like [[https://fedoraproject.org/wiki/Features/SSSD][SSSD]] in this context
  + batch scheduler integration?
    * allow users to offload right from their machine to the cluster?
- package variants, experimentation
  + for experiments, as in Section 4.2 of [[https://hal.inria.fr/hal-01161771/en][the RepPar paper]]
    * in the meantime we added [[https://www.gnu.org/software/guix/manual/html_node/Package-Transformation-Options.html][--with-input et al.]]; need more?
  + for [[https://lists.gnu.org/archive/html/guix-devel/2016-10/msg00005.html][CPU-specific optimizations]]
  + somehow support -mtune=native (and even profile-guided
  + simplify the API to switch compilers, libcs, etc.
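As an illustration of how the existing transformation options can already drive simple experiments, the sketch below generates one =guix build= command per replacement of a given input, using the documented =--with-input=PACKAGE=REPLACEMENT= syntax. The helper name and the package names in the usage example are hypothetical placeholders, not a recommendation of particular packages.

```python
import shlex


def with_input_variants(package, original, replacements):
    """Return one 'guix build' argv per element of REPLACEMENTS, each
    rebuilding PACKAGE with ORIGINAL swapped for that replacement."""
    return [
        ["guix", "build", package, f"--with-input={original}={replacement}"]
        for replacement in replacements
    ]


if __name__ == "__main__":
    # Hypothetical package names, for illustration only.
    for argv in with_input_variants("my-solver", "openmpi", ["mpich", "mvapich2"]):
        print(shlex.join(argv))
```

Options such as a compiler or libc switch do not fit this one-input-at-a-time pattern, which is one reason a simpler API for those cases is on the list.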
- workflow, reproducible science
  + implement [[http://debbugs.gnu.org/cgi/bugreport.cgi?bug=22629][channels]]
  + provide a way to see which Guix commit is used, like “guix channel
  + simple ways to [[https://lists.gnu.org/archive/html/guix-devel/2016-10/msg00701.html][test the dependents of a package]] (see also
    discussion between E. Agullo & A. Enge)
    * new transformation options: --with-graft, --with-source
  + support [[https://lists.gnu.org/archive/html/guix-devel/2016-05/msg00380.html][workflows and pipelines]]?
  + add [[https://github.com/galaxyproject/galaxy/issues/2778][Guix support in Galaxy]]?
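Until there is a built-in way to see which Guix commit is in use, a stopgap for people running Guix from a Git checkout might look like the sketch below. It assumes such a checkout exists; the function names are hypothetical, and only the standard =git -C DIR rev-parse HEAD= invocation is relied upon.

```python
import subprocess


def rev_parse_argv(checkout):
    """Return the git command that prints the commit checked out in CHECKOUT."""
    return ["git", "-C", str(checkout), "rev-parse", "HEAD"]


def guix_commit(checkout):
    """Run git in CHECKOUT and return the current commit ID as a string.
    Raises subprocess.CalledProcessError if CHECKOUT is not a Git repository."""
    result = subprocess.run(
        rev_parse_argv(checkout), check=True, capture_output=True, text=True
    )
    return result.stdout.strip()
```

Storing the returned commit ID next to experimental results gives a minimal provenance record; it does not capture local uncommitted changes.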