Re: Performance on NFS

2017-06-18 Thread Ricardo Wurmus

Ludovic Courtès  writes:

[…]

> So the problem is probably not NFS in this case but rather RPC
> performance.
>
> However, I can’t help with this until you drop ‘guixr’ and use
> GUIX_DAEMON_SOCKET=guix:// instead.  Hint hint.  ;-)

FWIW: I’ve been using GUIX_DAEMON_SOCKET=guix:// for a few days now.  It
allows for a much simpler “guixr”, which now only prints extra info and
refuses to run on the login node.  It’s really nice!
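Roughly, such a wrapper boils down to something like this (a hypothetical
sketch, not the actual script; the host name, port, and login-node check
are assumptions):

--8<---cut here---start->8---
#!/bin/sh
# Point the client at the remote guix-daemon, print a hint, and refuse
# to run on the login node.
if [ "$(hostname -s)" = "login1" ]; then          # assumed login node name
  echo "guixr: please run this from a compute node" >&2
  exit 1
fi
export GUIX_DAEMON_SOCKET=guix://hpcguix:1234     # assumed host and port
echo "guixr: talking to guix-daemon at $GUIX_DAEMON_SOCKET" >&2
exec guix "$@"
--8<---cut here---end--->8---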

I haven’t noticed any difference in performance, but I certainly could
test patches.

Performance differs a lot between packages.  For example, “guix build
bwa” (~7 seconds) is a lot faster than “guix build python2-numpy” (>40
seconds).

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net




Re: Performance on NFS

2017-06-17 Thread Roel Janssen

Ludovic Courtès writes:

> Hi!
>
> Roel Janssen  skribis:
>
>> I applied the patch, and here are the results:
>>
>> [roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure -- true
>> The following derivations will be built:
>>/gnu/store/0hz8g844432b5h9zbqr9cpsjy0brg15h-profile.drv
>>/gnu/store/wkksb7bbx3jr0p6p5cj4kkphbwday0yd-info-dir.drv
>>/gnu/store/cd2mwx9qprdy23p7j3pik2zs14nifn36-manual-database.drv
>> Creating manual page database for 1 packages... done in 1.816 s
>>
>> real 1m14.686s
>> user 0m5.761s
>> sys  0m0.498s
>> [roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure -- true
>>
>> real 0m34.100s
>> user 0m5.599s
>> sys  0m0.414s
>> [roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure -- true
>>
>> real 0m33.821s
>> user 0m5.140s
>> sys  0m0.432s
>
> You’re telling me it’s just as bad as before, right?

Sorry for the somewhat hasty response.

Well, before, the speed was more variable.  Now it seems to be pretty
stable: around ~30 to ~35 seconds with grafting, and ~15 to ~20 seconds
without grafting.

I really appreciate the optimization effort, and I do feel it is
improving.

>
>> [roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure 
>> --no-substitutes --no-grafts -- true
>> The following derivations will be built:
>>/gnu/store/rvh0imjdimwm90nzr0fmr5gmp97lyiix-profile.drv
>>/gnu/store/5hm3v4afjf9gix92ixqzv9bwc11a608s-fonts-dir.drv
>>
>> real 0m37.200s
>> user 0m3.408s
>> sys  0m0.284s
>> [roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure 
>> --no-substitutes --no-grafts -- true
>>
>> real 0m19.415s
>> user 0m3.466s
>> sys  0m0.306s
>> [roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure 
>> --no-substitutes --no-grafts -- true
>>
>> real 0m18.850s
>> user 0m3.536s
>> sys  0m0.346s
>> [roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure 
>> --no-grafts -- true
>>
>> real 0m16.003s
>> user 0m3.246s
>> sys  0m0.301s
>> [roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure 
>> --no-grafts -- true
>>
>> real 0m18.205s
>> user 0m3.470s
>> sys  0m0.314s
>> [roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure 
>> --no-substitutes -- true
>>
>> real 0m33.731s
>> user 0m5.111s
>> sys  0m0.428s
>> [roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure 
>> --no-substitutes -- true
>>
>> real 0m30.993s
>> user 0m5.049s
>> sys  0m0.458s
>>
>> Why is grafting so slow, even if it doesn't have to graft anything?
>
> Grafting leads to a bunch of additional RPCs:
>
> --8<---cut here---start->8---
> $ GUIX_PROFILING=rpc ./pre-inst-env guix build coreutils
> /gnu/store/mskh7zisxa313anqv68c5lr4hajldjc5-coreutils-8.27-debug
> /gnu/store/xbvwxf4k5njnb3hn93xwqlppjkiz4hdv-coreutils-8.27
> Remote procedure call summary: 379 RPCs
>   build-things                   ...     1
>   built-in-builders              ...     1
>   valid-path?                    ...     5
>   query-substitutable-path-infos ...     8
>   query-references               ...    22
>   query-valid-derivers           ...    48
>   add-text-to-store              ...   294
> $ GUIX_PROFILING=rpc ./pre-inst-env guix build coreutils --no-grafts
> /gnu/store/mskh7zisxa313anqv68c5lr4hajldjc5-coreutils-8.27-debug
> /gnu/store/xbvwxf4k5njnb3hn93xwqlppjkiz4hdv-coreutils-8.27
> Remote procedure call summary: 294 RPCs
>   built-in-builders              ...     1
>   query-substitutable-path-infos ...     1
>   build-things                   ...     1
>   valid-path?                    ...     5
>   add-text-to-store              ...   286
> --8<---cut here---end--->8---
>
> So the problem is probably not NFS in this case but rather RPC
> performance.
>
> However, I can’t help with this until you drop ‘guixr’ and use
> GUIX_DAEMON_SOCKET=guix:// instead.  Hint hint.  ;-)

This is what guixr is already doing, see:
https://github.com/UMCUGenetics/guix-additions/blob/master/umcu/packages/guix.scm#L95

So I went a little bit further and did this:
[roel@hpcguix ~]$ export GUIX_DAEMON_SOCKET="/gnu/daemon-socket/socket"
[roel@hpcguix ~]$ export NIX_STATE_DIR=/gnu

This means that when I run "guix" on the same machine that runs
guix-daemon, communicating over the UNIX socket, I should not experience
a performance problem beyond a little bit of NFS latency.

Here are the timings:

[roel@hpcguix ~]$ time guix environment --ad-hoc coreutils --pure --no-grafts 
-- true

real    0m16.293s
user    0m2.550s
sys     0m0.274s
[roel@hpcguix ~]$ time guix environment --ad-hoc coreutils --pure --no-grafts 
-- true

real    0m15.746s
user    0m2.411s
sys     0m0.260s
[roel@hpcguix ~]$ time guix environment --ad-hoc coreutils --pure -- true

real    0m32.821s
user    0m4.342s
sys     0m0.375s

[roel@hpcguix ~]$ time guix environment --ad-hoc coreutils --pure -- true

real    0m31.140s
user

Re: Performance on NFS

2017-06-17 Thread Ludovic Courtès
Hi!

Roel Janssen  skribis:

> I applied the patch, and here are the results:
>
> [roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure -- true
> The following derivations will be built:
>/gnu/store/0hz8g844432b5h9zbqr9cpsjy0brg15h-profile.drv
>/gnu/store/wkksb7bbx3jr0p6p5cj4kkphbwday0yd-info-dir.drv
>/gnu/store/cd2mwx9qprdy23p7j3pik2zs14nifn36-manual-database.drv
> Creating manual page database for 1 packages... done in 1.816 s
>
> real  1m14.686s
> user  0m5.761s
> sys   0m0.498s
> [roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure -- true
>
> real  0m34.100s
> user  0m5.599s
> sys   0m0.414s
> [roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure -- true
>
> real  0m33.821s
> user  0m5.140s
> sys   0m0.432s

You’re telling me it’s just as bad as before, right?

> [roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure 
> --no-substitutes --no-grafts -- true
> The following derivations will be built:
>/gnu/store/rvh0imjdimwm90nzr0fmr5gmp97lyiix-profile.drv
>/gnu/store/5hm3v4afjf9gix92ixqzv9bwc11a608s-fonts-dir.drv
>
> real  0m37.200s
> user  0m3.408s
> sys   0m0.284s
> [roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure 
> --no-substitutes --no-grafts -- true
>
> real  0m19.415s
> user  0m3.466s
> sys   0m0.306s
> [roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure 
> --no-substitutes --no-grafts -- true
>
> real  0m18.850s
> user  0m3.536s
> sys   0m0.346s
> [roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure 
> --no-grafts -- true
>
> real  0m16.003s
> user  0m3.246s
> sys   0m0.301s
> [roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure 
> --no-grafts -- true
>
> real  0m18.205s
> user  0m3.470s
> sys   0m0.314s
> [roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure 
> --no-substitutes -- true
>
> real  0m33.731s
> user  0m5.111s
> sys   0m0.428s
> [roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure 
> --no-substitutes -- true
>
> real  0m30.993s
> user  0m5.049s
> sys   0m0.458s
>
> Why is grafting so slow, even if it doesn't have to graft anything?

Grafting leads to a bunch of additional RPCs:

--8<---cut here---start->8---
$ GUIX_PROFILING=rpc ./pre-inst-env guix build coreutils
/gnu/store/mskh7zisxa313anqv68c5lr4hajldjc5-coreutils-8.27-debug
/gnu/store/xbvwxf4k5njnb3hn93xwqlppjkiz4hdv-coreutils-8.27
Remote procedure call summary: 379 RPCs
  build-things                   ...     1
  built-in-builders              ...     1
  valid-path?                    ...     5
  query-substitutable-path-infos ...     8
  query-references               ...    22
  query-valid-derivers           ...    48
  add-text-to-store              ...   294
$ GUIX_PROFILING=rpc ./pre-inst-env guix build coreutils --no-grafts
/gnu/store/mskh7zisxa313anqv68c5lr4hajldjc5-coreutils-8.27-debug
/gnu/store/xbvwxf4k5njnb3hn93xwqlppjkiz4hdv-coreutils-8.27
Remote procedure call summary: 294 RPCs
  built-in-builders              ...     1
  query-substitutable-path-infos ...     1
  build-things                   ...     1
  valid-path?                    ...     5
  add-text-to-store              ...   286
--8<---cut here---end--->8---

So the problem is probably not NFS in this case but rather RPC
performance.

However, I can’t help with this until you drop ‘guixr’ and use
GUIX_DAEMON_SOCKET=guix:// instead.  Hint hint.  ;-)

Thanks for your feedback,
Ludo’.



Re: Performance on NFS

2017-06-17 Thread Roel Janssen
Hello!

Ludovic Courtès writes:

> Hello!
>
> I just pushed another improvement:
> d27cc3bfaafe6b5b0831e88afb1c46311d382a0b significantly reduces the
> number of ‘stat’ calls when discovering .scm files.  I’d be curious to
> see what impact it has on NFS.
>
> Commit cbee955901b3e252ebdeb8066a2196055149198a also reduces the number
> of ‘stat’ calls.
>
> After that I have:
>
> --8<---cut here---start->8---
> $ ./pre-inst-env strace -c  guix environment --ad-hoc coreutils -- true 
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  27.03    0.006916           1     13585           read
>  24.64    0.006306           1      9701      8029 stat
>  21.25    0.005438           1     10698           write
>  11.05    0.002828        2828         1           wait4
>   3.60    0.000922           1      1598       122 open
>   2.88    0.000738           4       203        28 futex
>   1.79    0.000458           1       670           mmap
>   1.75    0.000448           1       642           mprotect
>   1.64    0.000419           0      1476           close
>   1.36    0.000347           0      1411         3 lseek
>   0.74    0.000190           2       107           sendfile
>   0.70    0.000180           4        51           getdents
>   0.62    0.000159           0       886           fstat
>   0.54    0.000137           1       196           lstat
>   0.22    0.000057           1        99           brk
>   0.09    0.000024           4         6           clone
>   0.02    0.000005           1         5           munmap
>   0.02    0.000005           0        19           rt_sigprocmask
>   0.02    0.000004           0        27           clock_gettime
>   0.01    0.000003           3         1           getrandom
>   0.01    0.000002           2         1           connect
>   0.01    0.000002           1         3           prlimit64
>   0.00    0.000001           1         2         2 access
>   0.00    0.000001           1         1           socket
>   0.00    0.000000           0         5           rt_sigaction
>   0.00    0.000000           0         7         1 ioctl
>   0.00    0.000000           0         1           madvise
>   0.00    0.000000           0         1           execve
>   0.00    0.000000           0         1           uname
>   0.00    0.000000           0         3           fcntl
>   0.00    0.000000           0         1           getcwd
>   0.00    0.000000           0         1           statfs
>   0.00    0.000000           0         1           arch_prctl
>   0.00    0.000000           0         2           sched_getaffinity
>   0.00    0.000000           0         1           set_tid_address
>   0.00    0.000000           0         1           set_robust_list
>   0.00    0.000000           0         3           pipe2
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.025590                 41417      8185 total
> $ git describe
> v0.13.0-804-gf4453df9a
> --8<---cut here---end--->8---
>
> The number of ‘stat’ calls is a function of the number of entries in
> ‘GUILE_LOAD_PATH’ and ‘GUILE_LOAD_COMPILED_PATH’.  I can’t think of any
> easy way to reduce it further.
>
> The many ‘read’ and ‘write’ are due to the unbuffered port used for
> RPCs, and definitely not great when talking to a remote store.  I’ve
> tried to add some buffering but that turned out to be trickier than I
> had hoped for.
>
> To be continued!
>
> Ludo’.

I applied the patch, and here are the results:

[roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure -- true
The following derivations will be built:
   /gnu/store/0hz8g844432b5h9zbqr9cpsjy0brg15h-profile.drv
   /gnu/store/wkksb7bbx3jr0p6p5cj4kkphbwday0yd-info-dir.drv
   /gnu/store/cd2mwx9qprdy23p7j3pik2zs14nifn36-manual-database.drv
Creating manual page database for 1 packages... done in 1.816 s

real    1m14.686s
user    0m5.761s
sys     0m0.498s
[roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure -- true

real    0m34.100s
user    0m5.599s
sys     0m0.414s
[roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure -- true

real    0m33.821s
user    0m5.140s
sys     0m0.432s
[roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure 
--no-substitutes --no-grafts -- true
The following derivations will be built:
   /gnu/store/rvh0imjdimwm90nzr0fmr5gmp97lyiix-profile.drv
   /gnu/store/5hm3v4afjf9gix92ixqzv9bwc11a608s-fonts-dir.drv

real    0m37.200s
user    0m3.408s
sys     0m0.284s
[roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure 
--no-substitutes --no-grafts -- true

real    0m19.415s
user    0m3.466s
sys     0m0.306s
[roel@hpcguix guix]$ time guixr environment --ad-hoc coreutils --pure 
--no-substitutes --no-grafts -- true

real    0m18.850s
user    0m3.536s
sys     0m0.346s
[roel@hpcguix guix]$ 

Performance on NFS

2017-06-16 Thread Ludovic Courtès
Hello!

I just pushed another improvement:
d27cc3bfaafe6b5b0831e88afb1c46311d382a0b significantly reduces the
number of ‘stat’ calls when discovering .scm files.  I’d be curious to
see what impact it has on NFS.

Commit cbee955901b3e252ebdeb8066a2196055149198a also reduces the number
of ‘stat’ calls.

After that I have:

--8<---cut here---start->8---
$ ./pre-inst-env strace -c  guix environment --ad-hoc coreutils -- true 
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 27.03    0.006916           1     13585           read
 24.64    0.006306           1      9701      8029 stat
 21.25    0.005438           1     10698           write
 11.05    0.002828        2828         1           wait4
  3.60    0.000922           1      1598       122 open
  2.88    0.000738           4       203        28 futex
  1.79    0.000458           1       670           mmap
  1.75    0.000448           1       642           mprotect
  1.64    0.000419           0      1476           close
  1.36    0.000347           0      1411         3 lseek
  0.74    0.000190           2       107           sendfile
  0.70    0.000180           4        51           getdents
  0.62    0.000159           0       886           fstat
  0.54    0.000137           1       196           lstat
  0.22    0.000057           1        99           brk
  0.09    0.000024           4         6           clone
  0.02    0.000005           1         5           munmap
  0.02    0.000005           0        19           rt_sigprocmask
  0.02    0.000004           0        27           clock_gettime
  0.01    0.000003           3         1           getrandom
  0.01    0.000002           2         1           connect
  0.01    0.000002           1         3           prlimit64
  0.00    0.000001           1         2         2 access
  0.00    0.000001           1         1           socket
  0.00    0.000000           0         5           rt_sigaction
  0.00    0.000000           0         7         1 ioctl
  0.00    0.000000           0         1           madvise
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1           uname
  0.00    0.000000           0         3           fcntl
  0.00    0.000000           0         1           getcwd
  0.00    0.000000           0         1           statfs
  0.00    0.000000           0         1           arch_prctl
  0.00    0.000000           0         2           sched_getaffinity
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           set_robust_list
  0.00    0.000000           0         3           pipe2
------ ----------- ----------- --------- --------- ----------------
100.00    0.025590                 41417      8185 total
$ git describe
v0.13.0-804-gf4453df9a
--8<---cut here---end--->8---

The number of ‘stat’ calls is a function of the number of entries in
‘GUILE_LOAD_PATH’ and ‘GUILE_LOAD_COMPILED_PATH’.  I can’t think of any
easy way to reduce it further.
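(A rough way to gauge that factor, assuming each module lookup probes
every load-path entry for both a .scm and a .go candidate:

--8<---cut here---start->8---
echo "$GUILE_LOAD_PATH"          | tr ':' '\n' | grep -c .
echo "$GUILE_LOAD_COMPILED_PATH" | tr ':' '\n' | grep -c .
--8<---cut here---end--->8---

The ‘stat’ count then scales roughly with these counts times the number
of modules loaded.)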

The many ‘read’ and ‘write’ are due to the unbuffered port used for
RPCs, and definitely not great when talking to a remote store.  I’ve
tried to add some buffering but that turned out to be trickier than I
had hoped for.

To be continued!

Ludo’.



Re: Performance on NFS

2017-06-12 Thread Ludovic Courtès
Hi!

Roel Janssen <r...@gnu.org> skribis:

> From these timings, I don't think it has a big impact.

Wait, wait.  If we take the best timings of each series of runs, we get:

  hpc-submit1: 26.4s -> 18.0s (-30%)
  hpc-guix:    26.2s -> 22.7s (-13%)

This is arguably insufficient but still non-negligible.  I’ve committed
it as b46712159c15f72fc28b71d17d5a7c74fcb64ed0.

With commit 015f17e8b9eff97f656852180ac51c75438d7f9d, the number of
open(2) calls for that same command drops from 991 to 795 (including 122
errors).  I suspect we can’t reduce it further:

--8<---cut here---start->8---
$ ./pre-inst-env strace -o ,,s guix build coreutils -n
$ grep '^open.*' < ,,s |wc -l
795
$ grep '^open.*\.go"' < ,,s |wc -l
563
$ grep '^open.*\.patch"' < ,,s |wc -l
29
$ grep '^open.*\.scm"' < ,,s |wc -l
6
--8<---cut here---end--->8---

Could you check how this affects performance on your NFS system?

There’s possibly another low-hanging fruit, which is to disable file
name canonicalization (via ‘%file-port-name-canonicalization’).  It
needs special care though, so I’ll try that later.
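(For reference, a minimal sketch of what that would look like, assuming
Guile’s ‘%file-port-name-canonicalization’ fluid accepts #f to turn
canonicalization off:

--8<---cut here---start->8---
;; Run a thunk with port file-name canonicalization disabled, so that
;; opening files skips the extra path-resolution work.
(with-fluids ((%file-port-name-canonicalization #f))
  ;; Hypothetical file name, relative to a Guix source checkout.
  (call-with-input-file "gnu/packages/base.scm" read))
--8<---cut here---end--->8---
)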

> This makes me wonder, can't we replace the disk-intensive stuff with a
> database?  If we only have to read the files on disk once, after which
> we extracted the information (the hashes?) needed to compute which
> links have to be created to make an environment, then actually
> creating the environment can be as fast as only creating those links.

Essentially we need to compute derivations as a function of local files
(sent to the daemon with ‘add-to-store’) and other derivations.  We
cannot avoid that.
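(Schematically, and leaving out everything Guix layers on top, the
add-to-store side looks like this:

--8<---cut here---start->8---
;; A local file's contents become a store item via an add-to-store RPC;
;; derivations then refer to the resulting /gnu/store path.
(use-modules (guix store))
(with-store store
  (add-text-to-store store "example.txt" "hello\n"))
;; => "/gnu/store/…-example.txt"  (hash elided)
--8<---cut here---end--->8---
)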

In the case of a remote server, communications with the daemon play an
important role too.  Have you tried setting ‘GUIX_DAEMON_SOCKET’ as
suggested before instead of using the “socat hack”?  I think this should
be faster (see commit 950d51c9d9a5107c5dac279da7d2e431134b5f43.)

HTH,
Ludo’.



Re: Performance on NFS

2017-06-12 Thread Roel Janssen

Ludovic Courtès writes:

> Hi Roel,
>
> Roel Janssen  skribis:
>
>> You should know that we have 'submit' nodes that use the guixr wrapper
>> script to connect to the guix-daemon that runs on the 'hpcguix' node.
>>
>> Both have a /gnu mounted by a storage subsystem.
>>
>> I couldn't run the second command on a 'submit' node.  But I could run
>> it in the 'hpcguix' node.
>
> OK.
>
> Side note: I think you can replace your ‘guixr’ wrapper by just doing:
>
>   export GUIX_DAEMON_SOCKET=guix://hpcguix:1234
>
> See .
>
>> The first command:
>> --
>>
>> [roel@hpc-submit1 ~]$ time guixr environment --ad-hoc coreutils --pure -- 
>> true
>>
>> real    0m38.415s
>> user    0m6.075s
>> sys     0m0.611s
>>
>> [roel@hpcguix ~]$ time guix environment --ad-hoc coreutils --pure -- true
>>
>> real    0m27.054s
>> user    0m4.254s
>> sys     0m0.383s
>>
>>
>> The second command:
>> ---
>>
>> [roel@hpcguix ~]$ time guix environment --ad-hoc -e '(@ (gnu packages base) 
>> coreutils)' --pure --no-substitutes --no-grafts -- true  
>>  
>>
>> The following derivations will be built:
>>/gnu/store/9wczighnyz1bz43j4wawf09z180g3ywv-profile.drv
>>/gnu/store/ffsyhajbdcp1lcq6x65czghya1iydly8-info-dir.drv
>>/gnu/store/5gyl3l23ps6f8dgay4awybwq7n9j9pzk-fonts-dir.drv
>>/gnu/store/l2mwj2q4vnq2v5raxz64ra7jyphd2jyd-manual-database.drv
>> Creating manual page database for 1 packages... done in 5.524 s
>>
>> real    1m6.812s
>> user    0m2.969s
>> sys     0m0.325s
>> [roel@hpcguix ~]$ time guix environment --ad-hoc -e '(@ (gnu packages base) 
>> coreutils)' --pure --no-substitutes --no-grafts -- true
>>
>> real    0m23.357s
>> user    0m2.802s
>> sys     0m0.340s
>>
>>
>> I suspect that the difference between the two commands is that one only
>> looks for one module, while the other looks in all modules.  Looking at
>> the second run, I suppose the difference is quite small.
>
> Yeah, -e doesn’t seem to be much faster (there are still a lot of
> modules to load anyway.)
>
> At any rate, let’s see what we can do; 23 seconds is not okay.
>
> I did a quick experiment:
>
> --8<---cut here---start->8---
> $ strace -o ,,s -s 123 guix environment --ad-hoc -e '(@ (gnu packages base) 
> coreutils)' --pure --no-substitutes --no-grafts -- true
> $ grep ^open ,,s |wc -l
> 1095
> $ grep '^open.*ENOENT' ,,s |wc -l
> 136
> $ grep -E '^(open|stat|lstat).*patches/gcc-arm-bug' ,,s |wc -l
> 27
> $ grep -E '^(open|stat|lstat).*guix/build/utils' ,,s |wc -l
> 2190
> --8<---cut here---end--->8---
>
> After the patch below, I get:
>
> --8<---cut here---start->8---
> $ grep -E '^(open|stat|lstat).*guix/build/utils' ,,s2 |wc -l
> 14
> $ grep -E '^(open|stat|lstat).*patches/gcc-arm-bug' ,,s2 |wc -l
> 4
> --8<---cut here---end--->8---
>
> Here’s the big picture before and after:
>
> --8<---cut here---start->8---
> $ strace -c guix environment --ad-hoc -e '(@ (gnu packages base) coreutils)' 
> --pure --no-substitutes --no-grafts -- true
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  32.55    0.009781           1     10463      9158 stat
>  15.55    0.004673           1      8780           write
>  11.26    0.003385        3385         1           wait4
>   7.94    0.002387          20       122        12 futex
>   6.38    0.001917           0      5052         4 lstat
>   5.70    0.001713           2      1095       136 open
>   5.54    0.001664           1      2919           read
>   3.02    0.000909           2       525           mmap
>   2.96    0.000889         148         6           clone
>   2.50    0.000751           2       481           mprotect
>   2.00    0.000600           1       959           close
>   1.56    0.000469           1       898         3 lseek
>   1.10    0.000330           3       100           sendfile
>   0.88    0.000264           0       541           fstat
>   0.42    0.000127           1       175           brk
>   0.15    0.000044           2        22           rt_sigaction
>   0.09    0.000026           5         5           munmap
>   0.06    0.000019           1        18           rt_sigprocmask
>   0.06    0.000019           2        10         3 ioctl
>   0.03    0.000010           3         3         3 access
>   0.03    0.000010           2         6         1 fcntl
>   0.03    0.000009           0        23           clock_gettime
>   0.03    0.000008           2         5           prlimit64
>   0.02    0.000006           6         1           statfs
>   0.02    0.000005           2         3           pipe2
>   0.01    0.000004 

Re: Performance on NFS

2017-06-09 Thread Ludovic Courtès
Hi Roel,

Roel Janssen  skribis:

> You should know that we have 'submit' nodes that use the guixr wrapper
> script to connect to the guix-daemon that runs on the 'hpcguix' node.
>
> Both have a /gnu mounted by a storage subsystem.
>
> I couldn't run the second command on a 'submit' node.  But I could run
> it in the 'hpcguix' node.

OK.

Side note: I think you can replace your ‘guixr’ wrapper by just doing:

  export GUIX_DAEMON_SOCKET=guix://hpcguix:1234

See .

> The first command:
> --
>
> [roel@hpc-submit1 ~]$ time guixr environment --ad-hoc coreutils --pure -- true
>
> real    0m38.415s
> user    0m6.075s
> sys     0m0.611s
>
> [roel@hpcguix ~]$ time guix environment --ad-hoc coreutils --pure -- true
>
> real    0m27.054s
> user    0m4.254s
> sys     0m0.383s
>
>
> The second command:
> ---
>
> [roel@hpcguix ~]$ time guix environment --ad-hoc -e '(@ (gnu packages base) 
> coreutils)' --pure --no-substitutes --no-grafts -- true   
>   
>  
> The following derivations will be built:
>/gnu/store/9wczighnyz1bz43j4wawf09z180g3ywv-profile.drv
>/gnu/store/ffsyhajbdcp1lcq6x65czghya1iydly8-info-dir.drv
>/gnu/store/5gyl3l23ps6f8dgay4awybwq7n9j9pzk-fonts-dir.drv
>/gnu/store/l2mwj2q4vnq2v5raxz64ra7jyphd2jyd-manual-database.drv
> Creating manual page database for 1 packages... done in 5.524 s
>
> real    1m6.812s
> user    0m2.969s
> sys     0m0.325s
> [roel@hpcguix ~]$ time guix environment --ad-hoc -e '(@ (gnu packages base) 
> coreutils)' --pure --no-substitutes --no-grafts -- true
>
> real    0m23.357s
> user    0m2.802s
> sys     0m0.340s
>
>
> I suspect that the difference between the two commands is that one only
> looks for one module, while the other looks in all modules.  Looking at
> the second run, I suppose the difference is quite small.

Yeah, -e doesn’t seem to be much faster (there are still a lot of
modules to load anyway.)

At any rate, let’s see what we can do; 23 seconds is not okay.

I did a quick experiment:

--8<---cut here---start->8---
$ strace -o ,,s -s 123 guix environment --ad-hoc -e '(@ (gnu packages base) 
coreutils)' --pure --no-substitutes --no-grafts -- true
$ grep ^open ,,s |wc -l
1095
$ grep '^open.*ENOENT' ,,s |wc -l
136
$ grep -E '^(open|stat|lstat).*patches/gcc-arm-bug' ,,s |wc -l
27
$ grep -E '^(open|stat|lstat).*guix/build/utils' ,,s |wc -l
2190
--8<---cut here---end--->8---

After the patch below, I get:

--8<---cut here---start->8---
$ grep -E '^(open|stat|lstat).*guix/build/utils' ,,s2 |wc -l
14
$ grep -E '^(open|stat|lstat).*patches/gcc-arm-bug' ,,s2 |wc -l
4
--8<---cut here---end--->8---

Here’s the big picture before and after:

--8<---cut here---start->8---
$ strace -c guix environment --ad-hoc -e '(@ (gnu packages base) coreutils)' 
--pure --no-substitutes --no-grafts -- true
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 32.55    0.009781           1     10463      9158 stat
 15.55    0.004673           1      8780           write
 11.26    0.003385        3385         1           wait4
  7.94    0.002387          20       122        12 futex
  6.38    0.001917           0      5052         4 lstat
  5.70    0.001713           2      1095       136 open
  5.54    0.001664           1      2919           read
  3.02    0.000909           2       525           mmap
  2.96    0.000889         148         6           clone
  2.50    0.000751           2       481           mprotect
  2.00    0.000600           1       959           close
  1.56    0.000469           1       898         3 lseek
  1.10    0.000330           3       100           sendfile
  0.88    0.000264           0       541           fstat
  0.42    0.000127           1       175           brk
  0.15    0.000044           2        22           rt_sigaction
  0.09    0.000026           5         5           munmap
  0.06    0.000019           1        18           rt_sigprocmask
  0.06    0.000019           2        10         3 ioctl
  0.03    0.000010           3         3         3 access
  0.03    0.000010           2         6         1 fcntl
  0.03    0.000009           0        23           clock_gettime
  0.03    0.000008           2         5           prlimit64
  0.02    0.000006           6         1           statfs
  0.02    0.000005           2         3           pipe2
  0.01    0.000004           2         2           getpid
  0.01    0.000004           4         1           sysinfo
  0.01    0.000003           3         1           dup2
  0.01    0.000003           2         2           arch_prctl
  0.01

Re: Performance on NFS

2017-06-07 Thread Roel Janssen

Ludovic Courtès writes:

> How does:
>
>   time guix environment --ad-hoc coreutils --pure -- true
>
> compare to:
>
>   time guix environment --ad-hoc -e '(@ (gnu packages base) coreutils)' 
> --pure -- true
>
> ?  That would give us an estimate of how much the cache I describe would
> help.
>
> Thanks,
> Ludo’.

You should know that we have 'submit' nodes that use the guixr wrapper
script to connect to the guix-daemon that runs on the 'hpcguix' node.

Both have a /gnu mounted by a storage subsystem.

I couldn't run the second command on a 'submit' node.  But I could run
it in the 'hpcguix' node.


The first command:
--

[roel@hpc-submit1 ~]$ time guixr environment --ad-hoc coreutils --pure -- true

real    0m38.415s
user    0m6.075s
sys     0m0.611s

[roel@hpcguix ~]$ time guix environment --ad-hoc coreutils --pure -- true

real    0m27.054s
user    0m4.254s
sys     0m0.383s


The second command:
---

[roel@hpcguix ~]$ time guix environment --ad-hoc -e '(@ (gnu packages base) 
coreutils)' --pure --no-substitutes --no-grafts -- true 
 
The following derivations will be built:
   /gnu/store/9wczighnyz1bz43j4wawf09z180g3ywv-profile.drv
   /gnu/store/ffsyhajbdcp1lcq6x65czghya1iydly8-info-dir.drv
   /gnu/store/5gyl3l23ps6f8dgay4awybwq7n9j9pzk-fonts-dir.drv
   /gnu/store/l2mwj2q4vnq2v5raxz64ra7jyphd2jyd-manual-database.drv
Creating manual page database for 1 packages... done in 5.524 s

real    1m6.812s
user    0m2.969s
sys     0m0.325s
[roel@hpcguix ~]$ time guix environment --ad-hoc -e '(@ (gnu packages base) 
coreutils)' --pure --no-substitutes --no-grafts -- true

real    0m23.357s
user    0m2.802s
sys     0m0.340s


I suspect that the difference between the two commands is that one only
looks for one module, while the other looks in all modules.  Looking at
the second run, I suppose the difference is quite small.

Kind regards,
Roel Janssen



Re: Performance on NFS

2017-06-07 Thread Ricardo Wurmus

Ludovic Courtès  writes:

> I’m not sure if SMB is faster than NFS, is it?  9p (which Linux
> supports) might work well.

Sadly, our “storage appliances” only offer NFS or SMB (and maybe SCP).
I’d gladly try *anything* to get away from our slow NFS.

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net




Performance on NFS

2017-06-07 Thread Ludovic Courtès
Ricardo Wurmus  skribis:

> Roel Janssen  writes:

[...]

>> The reason we have /gnu NFS-mounted on the build node is that we can
>> then very easily replace the node when it would have a hardware failure,
>> without even losing the ability to run programs that were already
>> installed.
>
> That’s the same reason I have for keeping /gnu on NFS.  I have been
> experimenting with lsync (writing to local disks and then pushing
> changes asynchronously from local-gnu to remote-gnu), but I wasn’t
> convinced it would be reliable.
>
> I’m not sure if this would help, though, when the nodes keep mounting
> /gnu over NFS.

The nodes don’t do as much I/O to the store as the daemon does, so I
think it makes sense to use a local store for the node that runs
guix-daemon.  In general, Guix aside, building software on NFS is
unavoidably slow.

> Is there a way to put /gnu on a Samba share instead?  I’d like to give
> that a try, but I’m not sure about what would happen to ownership and
> permissions (I don’t worry about hardlinks because I disabled
> deduplication).

I’m not sure if SMB is faster than NFS, is it?  9p (which Linux
supports) might work well.

Thanks,
Ludo’.



Performance on NFS

2017-06-07 Thread Ludovic Courtès
Hi Roel,

Roel Janssen  skribis:

>> That is going to make builds very slow.  I would suggest having
>> guix-daemon operate on the local /gnu (that’s what Ricardo did, IIRC).
>
> That definitely speeds things up about 50%.  So, running the environment
> command from a node that is connected to the node that runs
> guix-daemon, will take about 30 seconds instead of one minute.
>
> This could be made faster by letting 'guix-daemon' do more of the work,
> and let the client-side 'guix' only do the minimum.

Right, but it’s the client that computes the derivation, which involves
quite a bit of I/O and computation.
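Schematically, the client-side work looks like this (a simplified
sketch; the real code paths add grafting, profile handling, and
caching):

--8<---cut here---start->8---
(use-modules (guix store) (guix packages) (guix derivations)
             (gnu packages base))
(with-store store
  ;; Lowering the package to a derivation happens on the client and
  ;; involves loading package modules plus many add-text-to-store RPCs…
  (let ((drv (package-derivation store coreutils)))
    ;; …and only then is the daemon asked to build it.
    (build-derivations store (list drv))
    (derivation->output-path drv)))
--8<---cut here---end--->8---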

> (I know.. this remote-guix-daemon stuff voids the warranty.. ;))
>
> The NFS overhead is pretty large.  Maybe we can better tune it, and if
> so, document how to tune it for GNU Guix.  I already talked to our
> storage expert, and tuning boils down to using fast disks, a
> large-enough NFS cache, and low-latency network equipment.
>
> The reason we have /gnu NFS-mounted on the build node is that we can
> then very easily replace the node when it would have a hardware failure,
> without even losing the ability to run programs that were already
> installed.

I guess you could also use a RAID device for /gnu/store.  Or you could
have a local /gnu/store and replicate it elsewhere, maybe
opportunistically via ‘guix publish’.
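For the ‘guix publish’ route, the setup would look roughly like this
(host name, port, and key file location are assumptions):

--8<---cut here---start->8---
# On the machine holding the local /gnu/store:
guix publish --port=8080 &

# On the replica, trust that machine's signing key once
# (by default it lives in /etc/guix/signing-key.pub)…
guix archive --authorize < signing-key.pub

# …then fetch store items as substitutes instead of rebuilding them:
guix build coreutils --substitute-urls="http://hpcguix:8080"
--8<---cut here---end--->8---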

>> We should make sure these 1782 calls are all warranted (e.g., make sure
>> we don’t open the same file twice; this could be easily diagnosed by
>> looking at the output of ‘strace’.)
>>
>> But then, as you say, it’s a process that opens lots of files.
>>
>> We could cache package lookups, and maybe that’d help a little bit.
>> Currently we use ‘fold-packages’, which loads every single package
>> module, and then we look for a package named “coreutils” among them;
>> with a cache, we could directly load (gnu packages base) and its
>> dependencies.
>>
>> Ludo’.
>
> Running 'guix package -A' only takes a couple of seconds.  So I don't
> think that loading the Scheme modules is really a big issue.

How does:

  time guix environment --ad-hoc coreutils --pure -- true

compare to:

  time guix environment --ad-hoc -e '(@ (gnu packages base) coreutils)' --pure 
-- true

?  That would give us an estimate of how much the cache I describe would
help.

Thanks,
Ludo’.