Re: Guix on clusters and in HPC

2016-11-21 Thread Ludovic Courtès
Hi Ben,

Ben Woodcroft  skribis:

> I hope the proposal is/was working out.

Making progress; it’s due for the end of the year.  Until then you’re
welcome to make suggestions.  :-)

> On 03/11/16 23:44, Ludovic Courtès wrote:
>> Hi!
>>
>> Ben Woodcroft  skribis:
>>
>>> I'm a little late here, but please do all of the things on that list :)
>> :-)
>>
>>> With this suggestion:
>>>
>>>  + for 
>>> [[https://lists.gnu.org/archive/html/guix-devel/2016-10/msg5.html][CPU-specific
>>>  optimizations]]
>>>  + somehow support -mtune=native (and even profile-guided
>>>optimizations?)
>>>
>>> I'm not sure if you already thought of this, but an important use case is 
>>> that of pipelines, where we may want to optimise not just the package being 
>>> built, but instead one (or more) of its dependencies. So your suggestion of 
>>> this syntax:
>>>
>>>guix package --tune=haswell -i diamond
>>>
>>> requires some extensions, maybe something like this, where bamm can be used 
>>> as a pipeline that uses bwa and samtools:
>>>
>>>guix package -i bamm --tune=haswell bwa samtools
>>>
>>> and to optimise the C in bamm itself too:
>>>
>>>guix package -i bamm --tune=haswell bwa samtools bamm
>> So you’re saying that --tune should apply recursively, right?
> Sort of. The difficulty with applying it fully recursively is that it
> might greatly increase the maintenance burden of using --tune for
> little performance improvement, since all inputs would have to be
> tuned, not just those that substantively affect performance. This is
> all conjecture though, I'm not sure how many packages will fail to
> build when tuned.

Good question.  It’s also unclear whether tuning should be attempted
recursively on all packages (it wouldn’t make sense to rebuild glibc or
GCC, for instance), or if packages should somehow declare that they are
candidates.

Thanks,
Ludo’.



Re: Guix on clusters and in HPC

2016-11-18 Thread Ben Woodcroft

Hi Ludo,

I hope the proposal is/was working out.


On 03/11/16 23:44, Ludovic Courtès wrote:

Hi!

Ben Woodcroft  skribis:


I'm a little late here, but please do all of the things on that list :)

:-)


With this suggestion:

 + for 
[[https://lists.gnu.org/archive/html/guix-devel/2016-10/msg5.html][CPU-specific
 optimizations]]
 + somehow support -mtune=native (and even profile-guided
   optimizations?)

I'm not sure if you already thought of this, but an important use case is that 
of pipelines, where we may want to optimise not just the package being built, 
but instead one (or more) of its dependencies. So your suggestion of this 
syntax:

   guix package --tune=haswell -i diamond

requires some extensions, maybe something like this, where bamm can be used as 
a pipeline that uses bwa and samtools:

   guix package -i bamm --tune=haswell bwa samtools

and to optimise the C in bamm itself too:

   guix package -i bamm --tune=haswell bwa samtools bamm

So you’re saying that --tune should apply recursively, right?
Sort of. The difficulty with applying it fully recursively is that it 
might greatly increase the maintenance burden of using --tune for little 
performance improvement, since all inputs would have to be tuned, not 
just those that substantively affect performance. This is all conjecture 
though, I'm not sure how many packages will fail to build when tuned.





On 01/11/16 17:15, Ricardo Wurmus wrote:

[...]


I strongly encourage users to do two things:

- use manifests
- record the current version of Guix and our local package repository
when instantiating a manifest.  It only takes these two pieces of
information to reproduce a software environment

Is it possible to help automate this process somehow e.g. by checking
if packages in GUIX_PACKAGE_PATH are within git repositories and
reporting their statuses?

It would be nice.

As you note, there’s a design question that needs to be discussed.  On
one hand, Guix doesn’t need to know and care about how things in
$GUIX_PACKAGE_PATH were obtained, etc.  On the other hand, if Guix would
manage such external repos by itself, it would be able to give more
precise information on what’s being used and to provide more featureful
tools.

This is related to the idea of “channels” that we’ve been discussing
notably with Pjotr.


Right, that is probably a more general solution.
ben



Re: Guix on clusters and in HPC

2016-11-08 Thread Ludovic Courtès
Pjotr Prins  skribis:

> Wrote down a way to distribute software using containers and tar ;)
>
>   https://github.com/pjotrp/guix-notes/blob/master/DISTRIBUTE.org

Pretty cool indeed!

Recently I thought we could extract the ‘self-contained-tarball’
function you quoted and make it a bit more generic so that we can use it
for a ‘guix pack’ command:

  $ guix pack guix   # same as ‘make guix-binary.x86_64.tar.gz’
  /gnu/store/…-guix.tar.gz

  $ guix pack emacs
  /gnu/store/…-emacs.tar.gz

From there, we might even be able to do something like:

  $ guix pack --format=docker --entry-point=emacs emacs

(See .)
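
A pack like that could then be unpacked on a machine that doesn’t run
Guix at all.  This is only a sketch, assuming the tarball is laid out
like today’s binary tarball (/gnu plus /var/guix, with a profile
symlink under /var/guix/profiles):

  # On the target machine, as root:
  tar --warning=no-timestamp -xf emacs.tar.gz -C /
  /var/guix/profiles/per-user/root/guix-profile/bin/emacs --version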

Ludo’.



Re: Guix on clusters and in HPC

2016-11-05 Thread Roel Janssen

Pjotr Prins writes:

> Wrote down a way to distribute software using containers and tar ;)
>
>   https://github.com/pjotrp/guix-notes/blob/master/DISTRIBUTE.org
>

Wow, awesome stuff!  I'm going to play around with this.

Kind regards,
Roel Janssen




Re: Guix on clusters and in HPC

2016-11-04 Thread Chris Marusich
Pjotr Prins  writes:

> Wrote down a way to distribute software using containers and tar ;)
>
>   https://github.com/pjotrp/guix-notes/blob/master/DISTRIBUTE.org

Neat trick!  Thanks for sharing.  I see that this relies on undocumented
behavior, which is the fact that each store directory in the
environment's closure gets bind-mounted read-only, and practically no
other files are visible in the container.  I had to peek inside
guix/scripts/environment.scm to figure that out.  Fun stuff :)
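
For anyone curious, it’s easy to see from the command line (a sketch;
the package is just an example):

  # Only the closure’s store items show up, bind-mounted read-only:
  guix environment --container --ad-hoc coreutils -- ls /gnu/store
  # ...and essentially nothing of the host is visible at the root:
  guix environment --container --ad-hoc coreutils -- ls /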

Should the bind-mount behavior when creating containers for an
environment also be documented in the manual, or was there a reason why
we didn't mention it there?

-- 
Chris




Re: Guix on clusters and in HPC

2016-11-04 Thread Pjotr Prins
Wrote down a way to distribute software using containers and tar ;)

  https://github.com/pjotrp/guix-notes/blob/master/DISTRIBUTE.org
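
For those who don’t want to click through, the gist is roughly this (a
sketch only; package names and paths are examples, see DISTRIBUTE.org
for the real recipe):

  # On a host running guix-daemon: build a profile, tar up its closure.
  guix package -p /tmp/pipeline -i bwa samtools
  profile=$(readlink -f /tmp/pipeline)
  tar -cf pipeline-store.tar $(guix gc --requisites "$profile")
  # In the Dockerfile the paths keep their /gnu/store names:
  #   ADD pipeline-store.tar /
  #   ENV PATH /gnu/store/…-profile/bin:$PATH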

On Wed, Nov 02, 2016 at 04:03:25PM +, Pjotr Prins wrote:
> On Wed, Nov 02, 2016 at 09:25:42AM +1000, Ben Woodcroft wrote:
> > guix pull: error: build failed: cloning builder process: Operation not
> > permitted
> 
> You can set the permissions to run the daemon. Bruno did some work
> there:
> 
>   https://hub.docker.com/r/bmpvieira/guix/
> 
> > That seems to suggest that we cannot run the daemon inside a docker
> > container, so I suppose we'd have to fall back on copying files from a store
> > built outside docker-land, right?
> 
> I think that is the preferred solution. No special privileges needed.
> 
> Pj.
> 



Re: Guix on clusters and in HPC

2016-11-03 Thread Ludovic Courtès
Ben Woodcroft  skribis:

> Has anyone ever managed to get Guix to work inside docker? I attempted
> it as I intend on submitting some applications to kbase[0,1], where
> developers submit docker files to run their applications within the
> "narrative" interface i.e. web-facing interfaces to bioinformatic
> tools. I failed I think because of docker's single-process
> restriction. Using the attached (straightforward) dockerfile it fails
> at this step:
>
> RUN echo "nohup ~root/.guix-profile/bin/guix-daemon
> --build-users-group=guixbuild &" > /tmp/daemon-script.sh
> RUN bash /tmp/daemon-script.sh; guix pull
> ...
> guix pull: error: build failed: cloning builder process: Operation not
> permitted

That means that the clone(2) call in nix/libstore/build.cc failed, most
likely because one of the CLONE_NEW* flags isn’t supported by the kernel
you’re running.

What version of Linux is it?

Thanks,
Ludo’.



Re: Guix on clusters and in HPC

2016-11-03 Thread Ludovic Courtès
Hi!

Ben Woodcroft  skribis:

> I'm a little late here, but please do all of the things on that list :)

:-)

> With this suggestion:
>
> + for 
> [[https://lists.gnu.org/archive/html/guix-devel/2016-10/msg5.html][CPU-specific
>  optimizations]]
> + somehow support -mtune=native (and even profile-guided
>   optimizations?)
>
> I'm not sure if you already thought of this, but an important use case is 
> that of pipelines, where we may want to optimise not just the package being 
> built, but instead one (or more) of its dependencies. So your suggestion of 
> this syntax:
>
>   guix package --tune=haswell -i diamond
>
> requires some extensions, maybe something like this, where bamm can be used 
> as a pipeline that uses bwa and samtools:
>
>   guix package -i bamm --tune=haswell bwa samtools
>
> and to optimise the C in bamm itself too:
>
>   guix package -i bamm --tune=haswell bwa samtools bamm

So you’re saying that --tune should apply recursively, right?

> On 01/11/16 17:15, Ricardo Wurmus wrote:

[...]

>> I strongly encourage users to do two things:
>>
>> - use manifests
>> - record the current version of Guix and our local package repository
>>when instantiating a manifest.  It only takes these two pieces of
>>information to reproduce a software environment
> Is it possible to help automate this process somehow e.g. by checking
> if packages in GUIX_PACKAGE_PATH are within git repositories and
> reporting their statuses?

It would be nice.

As you note, there’s a design question that needs to be discussed.  On
one hand, Guix doesn’t need to know and care about how things in
$GUIX_PACKAGE_PATH were obtained, etc.  On the other hand, if Guix would
manage such external repos by itself, it would be able to give more
precise information on what’s being used and to provide more featureful
tools.

This is related to the idea of “channels” that we’ve been discussing
notably with Pjotr.

Ludo’.



Re: Guix on clusters and in HPC

2016-11-02 Thread Pjotr Prins
On Wed, Nov 02, 2016 at 09:25:42AM +1000, Ben Woodcroft wrote:
> guix pull: error: build failed: cloning builder process: Operation not
> permitted

You can set the permissions to run the daemon. Bruno did some work
there:

  https://hub.docker.com/r/bmpvieira/guix/

> That seems to suggest that we cannot run the daemon inside a docker
> container, so I suppose we'd have to fall back on copying files from a store
> built outside docker-land, right?

I think that is the preferred solution. No special privileges needed.

Pj.



Re: Guix on clusters and in HPC

2016-11-01 Thread Ben Woodcroft

On 26/10/16 21:51, Ludovic Courtès wrote:

Ricardo Wurmus  skribis:


Ludovic Courtès  writes:

What they suggest is to add Guix support simply by using Guix inside of
Docker…  Obviously, I’m not a fan of this because of how inelegant this
all seems.  When it comes to bringing Guix to Galaxy I think we have
cultural problems to overcome, not really technical issues.

Well, if this approach allows us to demonstrate the improvements Guix
can bring (and to sidestep the cultural differences), it may be a good
idea to try it.


Has anyone ever managed to get Guix to work inside docker? I attempted 
it as I intend on submitting some applications to kbase[0,1], where 
developers submit docker files to run their applications within the 
"narrative" interface i.e. web-facing interfaces to bioinformatic tools. 
I failed I think because of docker's single-process restriction. Using 
the attached (straightforward) dockerfile it fails at this step:


RUN echo "nohup ~root/.guix-profile/bin/guix-daemon 
--build-users-group=guixbuild &" > /tmp/daemon-script.sh

RUN bash /tmp/daemon-script.sh; guix pull
...
guix pull: error: build failed: cloning builder process: Operation not 
permitted


That seems to suggest that we cannot run the daemon inside a docker 
container, so I suppose we'd have to fall back on copying files from a 
store built outside docker-land, right?


Thanks, ben



[0]: http://kbase.us/
[1]: https://github.com/kbase/user_docs/blob/master/kbase-architecture.md
###
# Dockerfile
#
# Version:  1
# Software: GNU Guix
# Software Version: 0.11.0-ubuntu14.04
###

# Base image: Ubuntu
FROM ubuntu:14.04

RUN apt-get update
RUN apt-get -y install wget
RUN apt-get -y install build-essential

# Install Guix from binary
RUN cd /tmp && wget 
ftp://alpha.gnu.org/gnu/guix/guix-binary-0.11.0.x86_64-linux.tar.xz
RUN cd /tmp && wget 
ftp://alpha.gnu.org/gnu/guix/guix-binary-0.11.0.x86_64-linux.tar.xz.sig

RUN gpg --keyserver pgp.mit.edu --recv-keys 090B11993D9AEBB5
RUN gpg --verify /tmp/guix-binary-0.11.0.x86_64-linux.tar.xz.sig

RUN tar --warning=no-timestamp -xf /tmp/guix-binary-0.11.0.x86_64-linux.tar.xz

RUN ln -sf /var/guix/profiles/per-user/root/guix-profile ~root/.guix-profile

RUN groupadd --system guixbuild
RUN for i in `seq -w 1 10`; do useradd -g guixbuild -G guixbuild -d /var/empty 
-s `which nologin` -c "Guix build user $i" --system guixbuilder$i; done

RUN mkdir -p /usr/local/bin
RUN ln -s /var/guix/profiles/per-user/root/guix-profile/bin/guix /usr/local/bin

# Authorize hydra. Perhaps unnecessary in the future.
RUN guix archive --authorize < ~root/.guix-profile/share/guix/hydra.gnu.org.pub

# Start the daemon manually
RUN echo "nohup ~root/.guix-profile/bin/guix-daemon 
--build-users-group=guixbuild &" > /tmp/daemon-script.sh
RUN bash /tmp/daemon-script.sh; guix pull


Re: Guix on clusters and in HPC

2016-11-01 Thread Ben Woodcroft

Hi,

I'm a little late here, but please do all of the things on that list :)


With this suggestion:

+ for 
[[https://lists.gnu.org/archive/html/guix-devel/2016-10/msg5.html][CPU-specific
 optimizations]]
+ somehow support -mtune=native (and even profile-guided
  optimizations?)

I'm not sure if you already thought of this, but an important use case is that 
of pipelines, where we may want to optimise not just the package being built, 
but instead one (or more) of its dependencies. So your suggestion of this 
syntax:

  guix package --tune=haswell -i diamond

requires some extensions, maybe something like this, where bamm can be used as 
a pipeline that uses bwa and samtools:

  guix package -i bamm --tune=haswell bwa samtools

and to optimise the C in bamm itself too:

  guix package -i bamm --tune=haswell bwa samtools bamm


On 01/11/16 17:15, Ricardo Wurmus wrote:

myglc2  writes:


On 10/26/2016 at 14:08 Ricardo Wurmus writes:


At the MDC we’re using SGE and users specify their software environment
in the job script.  The software environment is a Guix profile, so the
job script usually contains a line to source the profile’s
“etc/profile”, which has the effect of setting up the required
environment variables.

Cool. How do you deal with the tendency of users' profiles to be "moving
targets?" IOW, I am wondering how one would reproduce a result at a
later date when one's profile has "changed"?

I strongly encourage users to do two things:

- use manifests
- record the current version of Guix and our local package repository
   when instantiating a manifest.  It only takes these two pieces of
   information to reproduce a software environment
Is it possible to help automate this process somehow e.g. by checking if 
packages in GUIX_PACKAGE_PATH are within git repositories and reporting 
their statuses? Or is that too much tie-in with git? Tie-in with git 
would also be useful because 'guix lint' could be used to check 
correctness of git commit messages, etc.


ben



Re: Guix on clusters and in HPC

2016-11-01 Thread Ricardo Wurmus

myglc2  writes:

> On 10/26/2016 at 14:08 Ricardo Wurmus writes:
>
>> At the MDC we’re using SGE and users specify their software environment
>> in the job script.  The software environment is a Guix profile, so the
>> job script usually contains a line to source the profile’s
>> “etc/profile”, which has the effect of setting up the required
>> environment variables.
>
> Cool. How do you deal with the tendency of users' profiles to be "moving
> targets?" IOW, I am wondering how one would reproduce a result at a
> later date when one's profile has "changed"?

I strongly encourage users to do two things:

- use manifests
- record the current version of Guix and our local package repository
  when instantiating a manifest.  It only takes these two pieces of
  information to reproduce a software environment
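
Concretely, something along these lines (a sketch; the paths assume
that Guix and the local package repository are git checkouts, and that
GUIX_PACKAGE_PATH points at a single checkout):

  # Instantiate the manifest...
  guix package --manifest=tools-manifest.scm
  # ...and record which Guix and which local repo produced it:
  git -C ~/src/guix rev-parse HEAD  > environment-versions.txt
  git -C "$GUIX_PACKAGE_PATH" rev-parse HEAD >> environment-versions.txt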

>>> Based on my experiments with Guix/Debian, GuixSD, VMs, and VM images it
>>> is not obvious to me which of these levels of abstraction is
>>> appropriate.
>>
>> FWIW we’re using Guix on top of CentOS 6.8.  The store is mounted
>> read-only on all cluster nodes.
>
> Nice. Do you attempt to "protect" your users from variations in the
> CentOS config?

I’m not sure I understand the question.  With Guix we’re not relying on
the host system software with the exception of the kernel at runtime.
Users have one profile that they use on the cluster nodes (running
CentOS) as well as on their workstations (Ubuntu or a later version of
CentOS).

When it comes to building software outside of Guix (e.g. when using “pip
install” for Python or “install.packages()” in R) there’s little I can
do.  I’m considering providing a “runtime” environment in which the
Guix toolchain and very common libraries are available, which can be
used to build software in a traditional fashion.  I’m hacking on Rstudio
server now to make it usable running inside a container where the system
toolchain is essentially swapped out for a toolchain from Guix.

This is needed because mixing binaries that are dynamically loaded at
runtime (e.g. some R modules from Guix with bindings to system
libraries) cannot possibly work due to ABI incompatibility.  This is
actually the most common problem I’m facing here, because users are used
to installing stuff on their own.  Mixing with Guix binaries only works as
long as the applications run as separate processes.
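
Such a “runtime” environment might look roughly like this (only a
sketch; package and module names are examples):

  # Build Python modules the traditional way, but against Guix’s
  # toolchain and libraries instead of the host’s:
  guix environment --ad-hoc python python-pip gcc-toolchain zlib -- \
      pip install --user pysam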

~~ Ricardo




Re: Guix on clusters and in HPC

2016-10-31 Thread myglc2
On 10/26/2016 at 14:00 Ludovic Courtès writes:

> myglc2  skribis:
>
>> The scheduler that I am most familiar with, SGE, supports the
>> proposition that compute hosts are heterogeneous and that they each have
>> a fixed software and/or hardware configuration. As a result, users need
>> to specify resources, such as SW packages &/or #CPUs &/or memory needed
>> for a given job. These requirements in turn control where a given job
>> can run. QMAKE, the integration of GNU Make with the SGE scheduler,
>> further allows a make recipe step to specify specific resources for a
>> SGE job to process the make step.
>
> I see.
>
>> While SGE is dated and can be a bear to use, it provides a useful
>> yardstick for HPC/Cluster functionality. So it is useful to consider how
>> Guix(SD) might impact this model. Presumably a defining characteristic
>> of GuixSD clusters is that the software configuration of compute hosts
>> no longer needs to be fixed and the user can "dial in" a specific SW
>> configuration for each job step.  This is in many ways a good thing. But
>> it also generates new requirements. How does one specify the SW config
>> for a given job or recipe step:
>>
>> 1) VM image?
>>
>> 2) VM?
>>
>> 3) Installed System Packages?
>>
>> 4) Installed (user) packages?
>
> The ultimate model here would be that of offloading¹: users would use
> Guix on their machine, compute the derivation they want to build
> locally, and offload the actual build to the cluster.  In turn, the
> cluster would schedule builds on the available and matching compute
> nodes.  But of course, this is quite sophisticated.
>
> ¹
> https://www.gnu.org/software/guix/manual/html_node/Daemon-Offload-Setup.html

Thanks for pointing me to this. I hadn't internalized (and probably
don't yet understand) just how cool the Offload Facility is. Sorry if my
earlier comments were pedantic or uninformed as a result :-(

Considering the Offload Facility as a SGE replacement, it would be
interesting to make a venn diagram of the SGE and Guix Offload Facility
functions and to study the usability issues of each. Having failed that
assignment, here are a few thoughts:

I guess we would see the QMAKE makefile mentioned above as being
replaced by guile recipe(s)?  Maybe we need a cheat sheet showing how to
map between the two sets of functions/concepts?

I am a little unclear about the implications of placing all analysis
results into the Store. In labs where I have worked, data sources and
destinations are typically managed by project.  What are the pros and
cons of everything in the store? What are the management/maintenance
issues? E.g., when any result can be reproducibly derived, then only the
inputs are precious, but maybe we want to "protect" from GC those
results that were computationally more expensive?

In Grid Engine, "submit hosts" are the machines that a user logs into to
gain access to the cluster. Usually there are one or two such hosts,
often used, in part, to simplify cluster access control. I guess you are
saying that, when every user machine is set up as an "offload facility,"
it becomes like a "submit host." In general, this would be "nicer" but
not sufficient. Grid Engine also provides a 'qrsh' command that allows
users to log into compute host(s), reserving the same resources as
required by a given job. This is useful when debugging a process that is
failing or prototyping a process that requires memory, CPUs, or other
resources not available on the user machine. Can the offload facility be
extended to support something like this?

> A more directly usable approach is to simply let users manage profiles
> on the cluster using ‘guix package’ or ‘guix environment’.  Then they
> can specify the right profile or the right ‘guix environment’ command in
> their jobs.

This seems quite powerful. How would one reproducibly specify "which
guix" version [to use | was used]?

Does this fit within the offload facility harness?



Re: Guix on clusters and in HPC

2016-10-31 Thread myglc2
On 10/26/2016 at 14:08 Ricardo Wurmus writes:

> At the MDC we’re using SGE and users specify their software environment
> in the job script.  The software environment is a Guix profile, so the
> job script usually contains a line to source the profile’s
> “etc/profile”, which has the effect of setting up the required
> environment variables.

Cool. How do you deal with the tendency of users' profiles to be "moving
targets?" IOW, I am wondering how one would reproduce a result at a
later date when one's profile has "changed"?

> I don’t know of anyone who uses VMs or VM images to specify software
> environments.

One rationale I can think of for VM images is to "archive" them along
with the analysis result to provide brute-force reproducibility.

An example I know of is a group whose cluster consists of VMs on VMware.
The VMs run a mix of OSes provisioned with varying levels of resource
(e.g. #CPUs, amount of memory, installed software). 

>> Based on my experiments with Guix/Debian, GuixSD, VMs, and VM images it
>> is not obvious to me which of these levels of abstraction is
>> appropriate.
>
> FWIW we’re using Guix on top of CentOS 6.8.  The store is mounted
> read-only on all cluster nodes.

Nice. Do you attempt to "protect" your users from variations in the
CentOS config?

>> The most forward-thinking group that I know discarded their cluster
>> hardware a year ago to replace it with starcluster
>> (http://star.mit.edu/cluster/). Starcluster automates the creation,
> care, and feeding of HPC clusters on AWS using the Grid Engine
>> scheduler and AMIs. The group has a full-time "starcluster jockey" who
>> manages their cluster and they seem quite happy with the approach. So
>> you may want to consider starcluster as a model when you think of
>> cluster management requirements.
>
> When using starcluster are software environments transferred to AWS on
> demand?  Does this happen on a per-job basis?  Are any of the
> instantiated machines persistent or are they discarded after use?

In the application I refer to, the cluster is kept spun up.  I am not
sure if they have built a custom Amazon VM-image (AMI) or if they start
with a "stock" AMI and configure the compute hosts during the spin up.



Re: Guix on clusters and in HPC

2016-10-26 Thread Ludovic Courtès
Hi!

Eric Bavier  skribis:

>>   - non-root usage
>
> The Singularity project advertises that it does not use a root-owned
> daemon http://singularity.lbl.gov/about#no-root-owned-daemon-processes
> but it does not in the same section explain that it uses a setuid
> helper instead: http://singularity.lbl.gov/docs-security which also 
> summarizes some of the current limitations and trade-offs of user namespaces.

Interesting, thanks for the pointers, especially the second one which
moderates the claim of the first one.

Do you know how widely Singularity is being deployed?

The build daemon used to have a small setuid helper that people could
use instead of running the whole daemon as root; it was removed in Nix
and in commit d43eb499a6c112af609118803c6cd33fbcedfa43 on our side.

The reason for the removal was that nobody was using it, and that it was
presumably unhelpful in overcoming the “non-root” use case.

I feel like it may be easier to get user namespaces enabled than to get
a setuid helper installed.  WDYT?

>>   - central daemon usage (like at MDC, but improved)
>
> For many-user systems, I think we'd need to put in place some controls
> to keep users from stepping on each others feet when it comes to interacting
> with the deamon.  E.g. One user spends a bunch of time building her
> application; before she gets a chance to use it, another user comes along
> and runs 'guix gc'.

That’s not a problem: packages in a profile are protected from GC, and
profiles generated by ‘guix environment’ are also protected for the
duration of the session.

With ‘guix build’, one has to use ‘-r’ to make sure the package won’t be
GC’d as soon as ‘guix build’ completes.
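
For example (package name and root location are illustrative):

  mkdir -p ~/gc-roots
  guix build -r ~/gc-roots/hmmer hmmer   # the symlink is a GC root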

> Can a user run 'nice 10 guix build ...' and have it DTRT?

No it won’t DTRT.

> On existing systems, the root partition may not be as large as Guix might
> like and there may not be opportunities to mount a separate partition for the
> store.  While it's nice that Guix would give users the chance to share
> package build results, often disk partitions are weighted in favor of /home
> (possibly because of the current widespread practice of users building their
> own packages in $HOME).  Until that changes, sysadmins might like some more
> powerful tools for auditing store disk usage to answer questions such as
> "which user profiles are exclusively occupying the most store space?" or even
> some way to specify expiration dates for certain profiles.

I see what you mean, though it’s again a “cultural” thing.  I can see
that the shared store would effectively allow sysadmins to save space,
but yeah.

>> + admin/social issues
>>   * daemon runs as root
>
> So, Singularity uses a setuid helper, and Shifter needs to run the Docker
> daemon.  It may be easier to convince sysadmins to run Guix's daemon
> given those other examples.  Of course, if we can do what we need to
> with even fewer privileges, that'd be great.

Good to know!

>>   * daemon needs Internet access
>
> There are many HPC sites that are air-gapped for security reasons.  Of those
> sites that I know, the ones that allow any outside software to be put on the
> machine after the initial system installation require CD/DVD installation 
> media.
> IMO for such sites, and for other users wishing to take Guix off the grid, it
> would be nice to be able to prepopulate the installation media, whether USB or
> CD, with more package outputs and/or source (e.g. like Trisquel's "Sources 
> DVD").
> Or similarly a way to "mount" some media that contains a bunch of package
> definitions for GUIX_PACKAGE_PATH as well as the corresponding source or 
> output
> for a specific Guix release.

Probably ‘guix build --sources=transitive’ and similar tools can help
here?  Then we can populate a store on a DVD or something and import it
on the machine.
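
Roughly along these lines, perhaps (a sketch; the package is an
example, and the exporting machine’s signing key must be authorized on
the target):

  # Gather the transitive sources and export them as a signed archive…
  guix build --sources=transitive samtools > source-paths.txt
  guix archive --export $(cat source-paths.txt) > samtools-sources.nar
  # …burn it to the media, then on the air-gapped machine:
  guix archive --import < samtools-sources.nar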

>>   - package variants, experimentation
>> + for experiments, as in Section 4.2 of
>> [[https://hal.inria.fr/hal-01161771/en][the RepPar paper]]
>>   * in the meantime we added
>>   
>> [[https://www.gnu.org/software/guix/manual/html_node/Package-Transformation-Options.html][--with-input
>>   et al.]]; need more?
>> + for
>> 
>> [[https://lists.gnu.org/archive/html/guix-devel/2016-10/msg5.html][CPU-specific
>> optimizations]]
>> + somehow support -mtune=native (and even profile-guided
>>   optimizations?)
>> + simplify the API to switch compilers, libcs, etc.
>
> +1 for all these
>
> Even though we intend to not specifically support proprietary compilers, some
> users may still want to explore building their packages with other compilers,
> like e.g. Clang and Rose

Yup.

>>   - workflow, reproducible science
>> + implement
>> [[http://debbugs.gnu.org/cgi/bugreport.cgi?bug=22629][channels]]
>
> Perhaps what I discussed above re installation media could fold into this.

I think it’s orthogonal.

Thanks a lot for your feedback!

Ludo’.



Re: Guix on clusters and in HPC

2016-10-26 Thread Eric Bavier
  - non-root usage
+ file system virtualization needed
  * map ~/.local/gnu/store to /gnu/store
  * user name spaces?
  * [[https://github.com/proot-me/PRoot/][PRoot]]? but performance problems?
  * common interface, like “guix enter” spawns a shell where
/gnu/store is available
+ daemon functionality as a library
  * client no longer connects to the daemon, does everything
locally, including direct store accesses
  * can use substitutes
+ or plain ’guix-daemon --disable-root’?
+ see [[http://lists.gnu.org/archive/html/help-guix/2016-06/msg00079.html][discussion with Ben Woodcroft and Roel]]
  - central daemon usage (like at MDC, but improved)
+ describe/define appropriate setup, like:
  * daemon runs on front-end node
  * clients can connect to daemon from compute nodes, and perform
any operation
  * use of distributed file systems: anything to pay attention to?
  * how should the front-end offload to compute nodes?
+ technical issues
  * daemon needs to be able to listen for connections elsewhere
  * client needs to be able to [[http://debbugs.gnu.org/cgi/bugreport.cgi?bug=20381][connect remotely]] instead of using
[[http://debbugs.gnu.org/cgi/bugreport.cgi?bug=20381#5][‘socat’ hack]]
  * how do we share localstatedir?  how do we share /gnu/store?
  * how do we share the profile directory?
+ admin/social issues
  * daemon runs as root
  * daemon needs Internet access
  * Ricardo mentions lack of nscd and problems caused by the use of
NSS plugins like [[https://fedoraproject.org/wiki/Features/SSSD][SSSD]] in this context
+ batch scheduler integration?
  * allow users to offload right from their machine to the cluster?
  - package variants, experimentation
+ for experiments, as in Section 4.2 of [[https://hal.inria.fr/hal-01161771/en][the RepPar paper]]
  * in the meantime we added [[https://www.gnu.org/software/guix/manual/html_node/Package-Transformation-Options.html][--with-input et al.]]; need more?
+ for [[https://lists.gnu.org/archive/html/guix-devel/2016-10/msg5.html][CPU-specific optimizations]]
+ somehow support -mtune=native (and even profile-guided
  optimizations?)
+ simplify the API to switch compilers, libcs, etc.
  - workflow, reproducible science
+ implement [[http://debbugs.gnu.org/cgi/bugreport.cgi?bug=22629][channels]]
+ provide a way to see which Guix commit is used, like “guix channel
  describe”
+ simple ways to [[https://lists.gnu.org/archive/html/guix-devel/2016-10/msg00701.html][test the dependents of a package]] (see also
  discussion between E. Agullo & A. Enge)
  * new transformation options: --with-graft, --with-source
recursive
+ support [[https://lists.gnu.org/archive/html/guix-devel/2016-05/msg00380.html][workflows and pipelines]]?
+ add [[https://github.com/galaxyproject/galaxy/issues/2778][Guix support in Galaxy]]?


Re: Guix on clusters and in HPC

2016-10-26 Thread Ludovic Courtès
Ricardo Wurmus  skribis:

> Ludovic Courtès  writes:
>
>> Your thoughts about the point about Galaxy?
>
> I talked to one of the Galaxy core developers at a conference and they
> told me they have implemented Docker support recently.  Essentially,
> they build software in a minimal Docker system and then extract the
> binaries such that they can be run without Docker.

OK.  So they assume the Docker daemon is running, right?  And it’s
running as root?  That’d be good for us.  ;-)

> What they suggest is to add Guix support simply by using Guix inside of
> Docker…  Obviously, I’m not a fan of this because of how inelegant this
> all seems.  When it comes to bringing Guix to Galaxy I think we have
> cultural problems to overcome, not really technical issues.

Well, if this approach allows us to demonstrate the improvements Guix
can bring (and to sidestep the cultural differences), it may be a good
idea to try it.

Thanks for your feedback,
Ludo’.



Re: Guix on clusters and in HPC

2016-10-26 Thread Ludovic Courtès
Hi,

myglc2  skribis:

> The scheduler that I am most familiar with, SGE, supports the
> proposition that compute hosts are heterogeneous and that they each have
> a fixed software and/or hardware configuration. As a result, users need
> to specify resources, such as SW packages &/or #CPUs &/or memory needed
> for a given job. These requirements in turn control where a given job
> can run. QMAKE, the integration of GNU Make with the SGE scheduler,
> further allows a make recipe step to specify specific resources for a
> SGE job to process the make step.

I see.

> While SGE is dated and can be a bear to use, it provides a useful
> yardstick for HPC/Cluster functionality. So it is useful to consider how
> Guix(SD) might impact this model. Presumably a defining characteristic
> of GuixSD clusters is that the software configuration of compute hosts
> no longer needs to be fixed and the user can "dial in" a specific SW
> configuration for each job step.  This is in many ways a good thing. But
> it also generates new requirements. How does one specify the SW config
> for a given job or recipe step:
>
> 1) VM image?
>
> 2) VM?
>
> 3) Installed System Packages?
>
> 4) Installed (user) packages?

The ultimate model here would be that of offloading¹: users would use
Guix on their machine, compute the derivation they want to build
locally, and offload the actual build to the cluster.  In turn, the
cluster would schedule builds on the available and matching compute
nodes.  But of course, this is quite sophisticated.

¹ https://www.gnu.org/software/guix/manual/html_node/Daemon-Offload-Setup.html

A more directly usable approach is to simply let users manage profiles
on the cluster using ‘guix package’ or ‘guix environment’.  Then they
can specify the right profile or the right ‘guix environment’ command in
their jobs.
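
Concretely, a job script could do something like this (a sketch;
scheduler specifics, package names, and commands are examples):

  #!/bin/sh
  # Run this step inside a throwaway environment rather than a profile.
  exec guix environment --pure --ad-hoc bwa samtools bash -- \
       bash -c 'bwa index ref.fa && samtools faidx ref.fa'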

> Based on my experiments with Guix/Debian, GuixSD, VMs, and VM images it
> is not obvious to me which of these levels of abstraction is
> appropriate. Perhaps any mix should be supported. In any case, tools to
> manage this aspect of a GuixSD cluster are needed. And they need to be
> integrated with the cluster scheduler to produce a manageable GuixSD HPC
> cluster.

Note that I’m focusing on the use of Guix on a cluster on top of
whatever ancient distro is already running, as a replacement for
home-made “modules” and such, and as opposed to running GuixSD on all
the compute nodes.

Running GuixSD on all the nodes of a cluster would certainly be valuable
from a sysadmin viewpoint, but it’s also something that’s much harder to
do in practice today.

> The most forward-thinking group that I know discarded their cluster
> hardware a year ago to replace it with starcluster
> (http://star.mit.edu/cluster/). Starcluster automates the creation,
> care, and feeding of HPC clusters on AWS using the Grid Engine
> scheduler and AMIs. The group has a full-time "starcluster jockey" who
> manages their cluster and they seem quite happy with the approach. So
> you may want to consider starcluster as a model when you think of
> cluster management requirements.

Hmm, OK.

Thanks for your feedback,
Ludo’.



Re: Guix on clusters and in HPC

2016-10-26 Thread Ricardo Wurmus

myglc2  writes:

> While SGE is dated and can be a bear to use, it provides a useful
> yardstick for HPC/Cluster functionality. So it is useful to consider how
> Guix(SD) might impact this model. Presumably a defining characteristic
> of GuixSD clusters is that the software configuration of compute hosts
> no longer needs to be fixed and the user can "dial in" a specific SW
> configuration for each job step.  This is in many ways a good thing. But
> it also generates new requirements. How does one specify the SW config
> for a given job or recipe step:
>
> 1) VM image?
>
> 2) VM?
>
> 3) Installed System Packages?
>
> 4) Installed (user) packages?

At the MDC we’re using SGE and users specify their software environment
in the job script.  The software environment is a Guix profile, so the
job script usually contains a line to source the profile’s
“etc/profile”, which has the effect of setting up the required
environment variables.
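
A typical job script therefore looks something like this (a sketch;
the profile path and the commands are examples):

  #!/bin/sh
  #$ -N align
  #$ -cwd
  # Activate the Guix profile: this sets PATH and friends.
  . "$HOME/.guix-profile/etc/profile"
  bwa mem ref.fa reads.fq > aln.sam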

I don’t know of anyone who uses VMs or VM images to specify software
environments.

> Based on my experiments with Guix/Debian, GuixSD, VMs, and VM images it
> is not obvious to me which of these levels of abstraction is
> appropriate.

FWIW we’re using Guix on top of CentOS 6.8.  The store is mounted
read-only on all cluster nodes.

> The most forward-thinking group that I know discarded their cluster
> hardware a year ago to replace it with starcluster
> (http://star.mit.edu/cluster/). Starcluster automates the creation,
> care, and feeding of HPC clusters on AWS using the Grid Engine
> scheduler and AMIs. The group has a full-time "starcluster jockey" who
> manages their cluster and they seem quite happy with the approach. So
> you may want to consider starcluster as a model when you think of
> cluster management requirements.

When using starcluster are software environments transferred to AWS on
demand?  Does this happen on a per-job basis?  Are any of the
instantiated machines persistent or are they discarded after use?

~~ Ricardo




Re: Guix on clusters and in HPC

2016-10-24 Thread myglc2
On 10/18/2016 at 16:20 Ludovic Courtès writes:

> Hello,
>
> I’m trying to gather a “wish list” of things to be done to facilitate
> the use of Guix on clusters and for high-performance computing (HPC).

The scheduler that I am most familiar with, SGE, supports the
proposition that compute hosts are heterogeneous and that they each have
a fixed software and/or hardware configuration. As a result, users need
to specify resources, such as SW packages &/or #CPUs &/or memory needed
for a given job. These requirements in turn control where a given job
can run. QMAKE, the integration of GNU Make with the SGE scheduler,
further allows a make recipe step to specify specific resources for a
SGE job to process the make step.
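
A submission in that model looks roughly like this (a sketch; the
parallel environment and the per-host "software" resource are
site-specific examples):

  # Ask for 8 slots, 16G of memory, and a host that provides bwa:
  qsub -pe smp 8 -l h_vmem=16G -l bwa=true align.sh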

While SGE is dated and can be a bear to use, it provides a useful
yardstick for HPC/Cluster functionality. So it is useful to consider how
Guix(SD) might impact this model. Presumably a defining characteristic
of GuixSD clusters is that the software configuration of compute hosts
no longer needs to be fixed and the user can "dial in" a specific SW
configuration for each job step.  This is in many ways a good thing. But
it also generates new requirements. How does one specify the SW config
for a given job or recipe step:

1) VM image?

2) VM?

3) Installed System Packages?

4) Installed (user) packages?

Based on my experiments with Guix/Debian, GuixSD, VMs, and VM images it
is not obvious to me which of these levels of abstraction is
appropriate. Perhaps any mix should be supported. In any case, tools to
manage this aspect of a GuixSD cluster are needed. And they need to be
integrated with the cluster scheduler to produce a manageable GuixSD HPC
cluster.

The most forward-thinking group that I know discarded their cluster
hardware a year ago to replace it with starcluster
(http://star.mit.edu/cluster/). Starcluster automates the creation,
care, and feeding of HPC clusters on AWS using the Grid Engine
scheduler and AMIs. The group has a full-time "starcluster jockey" who
manages their cluster and they seem quite happy with the approach. So
you may want to consider starcluster as a model when you think of
cluster management requirements.



Re: Guix on clusters and in HPC

2016-10-21 Thread Roel Janssen

Ricardo Wurmus writes:

> Roel Janssen  writes:
>
>> * Network-aware guix-daemon
>>
>>   From a user's point of view it would be cool to have a network-aware
>>   guix-daemon.  In our cluster, we have shared storage, on which we have
>>   the store, but manipulating the store through guix-daemon is now limited
>>   to a single node (and a single request per profile).  Having `guix' talk
>>   with `guix-daemon' over a network allows users to install stuff from
>>   any node, instead of a specific node.
>
> That’s on the list as
>
>   * client needs to be able to 
> [[http://debbugs.gnu.org/cgi/bugreport.cgi?bug=20381][connect remotely]] 
> instead of using 
> [[http://debbugs.gnu.org/cgi/bugreport.cgi?bug=20381#5][‘socat’ hack]]
>
> I’m currently using the socat hack at the MDC with a wrapper to make it
> seamless for the users.
>
>> * Profile management
>>
>>   The abstraction of profiles is an awesome feature of FPM, but the user
>>   interface is missing.  We could do better here.
>>
>>   Switch the default profile
>>   (and prepend values of environment variables to the current values):
>>   $ guix profile --switch=/path/to/shared/profile
>
> This could be a wrapper doing essentially this:
>
> bash  # sub-shell
> source /path/to/shared/profile/etc/profile
> …
>
>>   Reset to default profile (and environment variable values without the
>>   profile we just unset):
>>   $ guix profile --reset
>
> Using the above wrapper it would be equivalent to just:
>
> exit  # exit the sub-shell
>
> Does this make sense or is more needed here?
> We’re using the above workflow at the MDC.  It’s a little verbose and
> requires users to keep track of the shell in which they are operating,
> but this basically works.  Would be nice to abstract this away and hide
> it behind a nicer user interface (e.g. “guix environment save” and “guix
> environment load”).

Well, I envisioned that `guix profile --switch' would actually change
the symlink `$HOME/.guix-profile' to another profile, so that it applies
to all shells you start after you provide the command.  But maybe `guix
environment' would be better suited anyway.
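
For what it’s worth, the sub-shell wrapper described above could be as
small as this (a sketch; the default profile path is an example):

  #!/usr/bin/env bash
  # “Switch” to a shared profile in a sub-shell; ‘exit’ restores the
  # previous environment.
  profile=${1:-/path/to/shared/profile}
  bash --rcfile <(cat ~/.bashrc 2>/dev/null; \
                  echo ". \"$profile/etc/profile\"") -i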

Kind regards,
Roel Janssen



Re: Guix on clusters and in HPC

2016-10-21 Thread Ricardo Wurmus

Ludovic Courtès  writes:

> Your thoughts about the point about Galaxy?

I talked to one of the Galaxy core developers at a conference and they
told me they have implemented Docker support recently.  Essentially,
they build software in a minimal Docker system and then extract the
binaries such that they can be run without Docker.

What they suggest is to add Guix support simply by using Guix inside of
Docker…  Obviously, I’m not a fan of this because of how inelegant this
all seems.  When it comes to bringing Guix to Galaxy I think we have
cultural problems to overcome, not really technical issues.

~~ Ricardo




Re: Guix on clusters and in HPC

2016-10-20 Thread Ludovic Courtès
Thomas Danckaert  skribis:

> From: l...@gnu.org (Ludovic Courtès)
> Subject: Guix on clusters and in HPC
> Date: Tue, 18 Oct 2016 16:20:43 +0200
>
>> So I’ve come up with an initial list of work items going from the
>> immediate needs to crazy ideas (batch scheduler integration!) that
>> hopefully make sense to cluster/HPC people.  I’d be happy to get
>> feedback, suggestions, etc. from whoever is interested!
>
> Here's a semi-crazy suggestion: some HPC people really like the intel
> compiler/math kernel library, so a way to switch toolchains on
> packages, and somehow integrate “foreign” toolchains would suit
> them. But that might be “advocating the use of non-free software” (in
> fact I almost feel like a heretic for bringing this up :-) ...).

Yeah, I’m aware of this.  In section 5 of
, we wrote:

  GNU Guix does not provide proprietary software packages.
  Unfortunately, proprietary software is still relatively common in HPC,
  be it linear algebra libraries or GPU support.  Yet, we see it as a
  strength more than a limitation.  Often, these “black boxes”
  inherently limit reproducibility—how is one going to reproduce a
  software environment without permission to run the software in the
  first place? What if the software depends on the ability to “call
  home” to function at all? More importantly, we view reproducible
  software environments and reproducible science as a tool towards
  improved and shared knowledge; developers who deny the freedom to
  study and modify their code work against this goal.

As heretic as it may seem in HPC circles ;-), I stand by this.

Regarding GPU support, there’s light on the horizon with GCC’s new PTX
backend, OpenACC support, and with libgomp’s offloading support.

We’ll see!

Ludo’.



Re: Guix on clusters and in HPC

2016-10-20 Thread Ludovic Courtès
Hi Roel,

Roel Janssen  skribis:

> Here are some aspects I think we need:
>
> * Network-aware guix-daemon

Of course!

> * Profile management
>
>   The abstraction of profiles is an awesome feature of FPM, but the user
>   interface is missing.  We could do better here.
>
>   Switch the default profile
>   (and prepend values of environment variables to the current values):
>   $ guix profile --switch=/path/to/shared/profile
>
>   Reset to default profile (and environment variable values without the
>   profile we just unset):
>   $ guix profile --reset
>
>   Create an isolated environment based on a profile:
>   $ guix environment --profile=/path/to/profile --pure --ad-hoc

I can see the desire of having something that more closely resembles
what “modules” does, but I have the same questions as Ricardo.
Essentially, how much would it help compared to what’s already
available?  (Honest question.)

In general, adding simpler UIs is a good idea IMO; it’s just that I’m
unsure what’s “better” in this case.

> * Workflow management/execution
>
>   Add automatic program execution with its own vocabulary.  I think
>   "workflow management" boils down to execution of a G-exp, but the
>   results do not necessarily need to be stored in the store (because the
>   data it works on is probably managed by an external data management
>   system).  A powerful feature of GNU Guix is its domain-specific
>   language for describing software packages.  We could add
>   domain-specific parts for workflow management (a `workflow' data type
>   and a `task' or `process' data type gets us there more or less).
>
>   With workflow management we are only interested in the "build
>   function", not the "source code" or the "build output".
>
>   You are probably aware that I worked on this for some time, so I could
>   share the data types I have and the execution engine parts I have.

Yes, definitely!  This is what I had in mind, hence the reference to
.

Obviously if there’s already code, it’s even better.  :-)

>   The HPC-specific part of this is the compatibility with existing job
>   scheduling systems and data management systems.

Do you mean that it integrates with a job scheduler?

> * Document on why we need super user privileges on the Guix daemon
>
>   Probably an infamous point by now.  By design, the Linux kernel keeps
>   control over all processes.  With GNU Guix, we need some control over
>   the environment in which a process runs (disable network access,
>   change the user that executes a process), and the environment in which
>   the output lives (chown, chmod, to allow multiple users to use the
>   build output).  Instead of hitting the wall of "we are not going to
>   run this thing with root privileges", we could present our sysadmins
>   with a document for the reasons, the design decisions and the actual
>   code involved in super user privilege stuff.
>
>   This is something I am working on as well, but help is always welcome
>   :-).

Good point.

mentions it when talking about --disable-chroot at the end, but this
could be improved.

That’s it?  No crazy ideas?  ;-)

Your thoughts about the point about Galaxy?

Thanks for your feedback!

Ludo’.



Re: Guix on clusters and in HPC

2016-10-19 Thread Thomas Danckaert

From: l...@gnu.org (Ludovic Courtès)
Subject: Guix on clusters and in HPC
Date: Tue, 18 Oct 2016 16:20:43 +0200


So I’ve come up with an initial list of work items going from the
immediate needs to crazy ideas (batch scheduler integration!) that
hopefully make sense to cluster/HPC people.  I’d be happy to get
feedback, suggestions, etc. from whoever is interested!


Here's a semi-crazy suggestion: some HPC people really like the intel 
compiler/math kernel library, so a way to switch toolchains on 
packages, and somehow integrate “foreign” toolchains would suit them. 
But that might be “advocating the use of non-free software” (in fact 
I almost feel like a heretic for bringing this up :-) ...).


best,

Thomas


Re: Guix on clusters and in HPC

2016-10-18 Thread Roel Janssen

Ludovic Courtès writes:

> Hello,
>
> I’m trying to gather a “wish list” of things to be done to facilitate
> the use of Guix on clusters and for high-performance computing (HPC).
>
> Ricardo and I wrote about the advantages, shortcomings, and perspectives
> before:
>
>   http://elephly.net/posts/2015-04-17-gnu-guix.html
>   https://hal.inria.fr/hal-01161771/en
>
> I know that Pjotr, Roel, Ben, Eric and maybe others also have experience
> and ideas on what should be done (and maybe even code? :-)).
>
> So I’ve come up with an initial list of work items going from the
> immediate needs to crazy ideas (batch scheduler integration!) that
> hopefully make sense to cluster/HPC people.  I’d be happy to get
> feedback, suggestions, etc. from whoever is interested!
>
> (The reason I’m asking is that I’m considering submitting a proposal at
> Inria to work on some of these things.)
>
> TIA!  :-)

Here are some aspects I think we need:

* Network-aware guix-daemon

  From a user's point of view it would be cool to have a network-aware
  guix-daemon.  In our cluster, we have a shared storage, on which we have
  the store, but manipulating the store through guix-daemon is now limited
  to a single node (and a single request per profile).  Having `guix' talk
  with `guix-daemon' over a network allows users to install stuff from
  any node, instead of a specific node.

* Profile management

  The abstraction of profiles is an awesome feature of FPM, but the user
  interface is missing.  We could do better here.

  Switch the default profile
  (and prepend values of environment variables to the current values):
  $ guix profile --switch=/path/to/shared/profile

  Reset to default profile (and environment variable values without the
  profile we just unset):
  $ guix profile --reset

  Create an isolated environment based on a profile:
  $ guix environment --profile=/path/to/profile --pure --ad-hoc

* Workflow management/execution

  Add automatic program execution with its own vocabulary.  I think
  "workflow management" boils down to execution of a G-exp, but the
  results do not necessarily need to be stored in the store (because the
  data it works on is probably managed by an external data management
  system).  A powerful feature of GNU Guix is its domain-specific
  language for describing software packages.  We could add
  domain-specific parts for workflow management (a `workflow' data type
  and a `task' or `process' data type gets us there more or less).

  With workflow management we are only interested in the "build
  function", not the "source code" or the "build output".

  You are probably aware that I worked on this for some time, so I could
  share the data types I have and the execution engine parts I have.

  The HPC-specific part of this is the compatibility with existing job
  scheduling systems and data management systems.

* Document on why we need super user privileges on the Guix daemon

  Probably an infamous point by now.  By design, the Linux kernel keeps
  control over all processes.  With GNU Guix, we need some control over
  the environment in which a process runs (disable network access,
  change the user that executes a process), and the environment in which
  the output lives (chown, chmod, to allow multiple users to use the
  build output).  Instead of hitting the wall of "we are not going to
  run this thing with root privileges", we could present our sysadmins
  with a document for the reasons, the design decisions and the actual
  code involved in super user privilege stuff.

  This is something I am working on as well, but help is always welcome
  :-).


Kind regards,
Roel Janssen



Re: Guix on clusters and in HPC

2016-10-18 Thread Ludovic Courtès
Christopher Allan Webber  skribis:

> Great!  I wonder how much of the need for cluster / HPC stuff overlaps
> with the desiderata of "Guix Ops" / "guix deploy"?

I think it’s quite different.  Cluster nodes usually run a
vendor-provided GNU/Linux distro (often the ancient RPM-based ones) on
top of which admins add their own thing, which could be Guix.

Of course in an ideal world people would run GuixSD on all the cluster
nodes and use “guix deploy” to that end, but that’s not what I had in
mind here.  :-)

Ludo’.



Re: Guix on clusters and in HPC

2016-10-18 Thread Christopher Allan Webber
Ludovic Courtès writes:

> (The reason I’m asking is that I’m considering submitting a proposal at
> Inria to work on some of these things.)

Great!  I wonder how much of the need for cluster / HPC stuff overlaps
with the desiderata of "Guix Ops" / "guix deploy"?