Re: /gnu/store/.links/

2018-02-09 Thread Leo Famulari
On Fri, Feb 09, 2018 at 03:24:00PM +0100, Pjotr Prins wrote:
> Hmmm. I think this is better handled at the file system level if
> people want deduplication. These systems will be more common.

In general, yes! But filesystems with this feature are still not widely
deployed...



Re: /gnu/store/.links/

2018-02-09 Thread Mark H Weaver
l...@gnu.org (Ludovic Courtès) writes:

> Pjotr Prins writes:
>
>> On Fri, Feb 09, 2018 at 01:11:23PM +0100, Ricardo Wurmus wrote:
>
> [...]
>
>>> I don’t know about scalability.  This number is still well below the
>>> limits of ext4 file systems, but accessing a big directory listing like
>>> that can be slow.  I would feel a little better about this if we split
>>> it up into different prefix directories (like it’s done for browser
>>> caches).  I don’t think it’s necessary, though.
>>
>> For ext4 it is going to be an issue. Anyway, we'll see what happens.
>
> In practice, when the maximum number of links is reached, we simply
> transparently skip deduplication.

Ideally, we should at some point change the daemon to break
/gnu/store/.links up into several subdirectories, as is done for log
files in /var/log/guix/drvs.  The main complication is dealing with the
transition between the old layout and the new.
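
To make the layout idea concrete, here is a minimal sketch of a prefix-based
split with a lookup that accepts both layouts during a transition.  The
function names, the two-character prefix, and the fallback order are
assumptions for illustration; they are not taken from the daemon.

```python
from pathlib import Path
from typing import Optional


def sharded_link_path(links_dir: Path, content_hash: str, prefix_len: int = 2) -> Path:
    """Map a hash to links_dir/<prefix>/<hash> instead of links_dir/<hash>.

    Hypothetical layout: the prefix length of 2 is an assumption.
    """
    if len(content_hash) <= prefix_len:
        raise ValueError("hash is too short to shard")
    return links_dir / content_hash[:prefix_len] / content_hash


def find_link(links_dir: Path, content_hash: str, prefix_len: int = 2) -> Optional[Path]:
    """Look up an entry, preferring the new layout and falling back to the old one."""
    new_path = sharded_link_path(links_dir, content_hash, prefix_len)
    if new_path.exists():
        return new_path
    old_path = links_dir / content_hash  # flat layout, as in the current .links
    if old_path.exists():
        return old_path
    return None
```

Checking the old flat path as a fallback would let existing entries keep
working while new ones are created under prefix directories, at the cost of
one extra lookup on a miss.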

   Mark



Re: /gnu/store/.links/

2018-02-09 Thread Pjotr Prins
On Fri, Feb 09, 2018 at 06:00:02PM +0100, Ludovic Courtès wrote:
> In practice, when the maximum number of links is reached, we simply
> transparently skip deduplication.  See this commit:
> 
>   commit 12b6c951cf5ca6055a22a2eec85665353f5510e5
>   Author: Ludovic Courtès 
>   Date:   Fri Oct 28 20:34:15 2016 +0200
> 
>   daemon: Do not error out when deduplication fails due to ENOSPC.
> 
>   This solves a problem whereby if /gnu/store/.links had enough entries,
>   ext4's directory index would be full, leading to link(2) returning
>   ENOSPC.
> 
>   * nix/libstore/optimise-store.cc (LocalStore::optimisePath_): Upon
>   ENOSPC from link(2), print a message and return instead of throwing a
>   'SysError'.
> 
> It does scale well, and it’s been here “forever”.

OK. My mindset is probably ext2...

> If you’re wondering how much gets deduplicated, see
> .
> :-)

Fancy that :)

Pj.



Re: /gnu/store/.links/

2018-02-09 Thread Ludovic Courtès
Pjotr Prins writes:

> On Fri, Feb 09, 2018 at 01:11:23PM +0100, Ricardo Wurmus wrote:

[...]

>> I don’t know about scalability.  This number is still well below the
>> limits of ext4 file systems, but accessing a big directory listing like
>> that can be slow.  I would feel a little better about this if we split
>> it up into different prefix directories (like it’s done for browser
>> caches).  I don’t think it’s necessary, though.
>
> For ext4 it is going to be an issue. Anyway, we'll see what happens.

In practice, when the maximum number of links is reached, we simply
transparently skip deduplication.  See this commit:

  commit 12b6c951cf5ca6055a22a2eec85665353f5510e5
  Author: Ludovic Courtès 
  Date:   Fri Oct 28 20:34:15 2016 +0200

  daemon: Do not error out when deduplication fails due to ENOSPC.

  This solves a problem whereby if /gnu/store/.links had enough entries,
  ext4's directory index would be full, leading to link(2) returning
  ENOSPC.

  * nix/libstore/optimise-store.cc (LocalStore::optimisePath_): Upon
  ENOSPC from link(2), print a message and return instead of throwing a
  'SysError'.

It does scale well, and it’s been here “forever”.
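
The behaviour described in that commit message can be sketched roughly as
follows.  This is a Python illustration, not the daemon's C++ code; the
function name try_link and the injectable link parameter are assumptions
made so the example can be exercised without filling a directory index.

```python
import errno
import os
import sys


def try_link(source, link_path, link=os.link):
    """Create a hard link; on ENOSPC, print a message and skip instead of failing.

    Returns True if the link was created, False if it was skipped.
    Any other error is re-raised unchanged.
    """
    try:
        link(source, link_path)
        return True
    except OSError as e:
        if e.errno == errno.ENOSPC:
            # Deduplication is only an optimisation, so skipping is safe.
            print(f"cannot link '{link_path}' to '{source}': {e.strerror}",
                  file=sys.stderr)
            return False
        raise
```

Only ENOSPC is treated as "skip"; other failures still propagate, which
matches the narrow scope the commit message describes.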

If you’re wondering how much gets deduplicated, see
.
:-)
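
For a rough local estimate, one can also look at hard-link counts.  The
sketch below assumes that each entry in the links directory holds one link
itself plus one link per store file sharing that content; the function name
estimate_savings is hypothetical and this is not an existing Guix command.

```python
import os


def estimate_savings(links_dir):
    """Estimate bytes saved by deduplication from hard-link counts.

    Assumption: an entry's st_nlink is 1 (the entry itself) plus the number
    of files elsewhere that share its content.
    """
    saved = 0
    with os.scandir(links_dir) as entries:
        for entry in entries:
            st = entry.stat(follow_symlinks=False)
            copies = st.st_nlink - 1  # links outside links_dir
            if copies > 1:
                # Without deduplication, each copy would occupy st_size bytes.
                saved += (copies - 1) * st.st_size
    return saved
```

It would be called as estimate_savings("/gnu/store/.links"), and on a large
store the scan itself may take a while.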

Ludo’.



Re: /gnu/store/.links/

2018-02-09 Thread Pjotr Prins
On Fri, Feb 09, 2018 at 01:11:23PM +0100, Ricardo Wurmus wrote:
> 
> Hi Pjotr,
> 
> > What is
> >
> >   ls -1 /gnu/store/.links/|wc -l
> >   495938
> >
> > Never saw it before. Does this scale?
> 
> It’s used for optional file deduplication.  It is enabled by default,
> but you can disable it with a daemon option on file systems that
> deduplicate data at the block level.

Hmmm. I think this is better handled at the file system level if
people want deduplication. These systems will be more common.

> I don’t know about scalability.  This number is still well below the
> limits of ext4 file systems, but accessing a big directory listing like
> that can be slow.  I would feel a little better about this if we split
> it up into different prefix directories (like it’s done for browser
> caches).  I don’t think it’s necessary, though.

For ext4 it is going to be an issue. Anyway, we'll see what happens.
Thanks for explaining.

Pj.



Re: /gnu/store/.links/

2018-02-09 Thread Ricardo Wurmus

Hi Pjotr,

> What is
>
>   ls -1 /gnu/store/.links/|wc -l
>   495938
>
> Never saw it before. Does this scale?

It’s used for optional file deduplication.  It is enabled by default,
but you can disable it with a daemon option on file systems that
deduplicate data at the block level.
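
As a rough illustration of file-level deduplication through a directory of
hard links, consider the sketch below.  It hashes raw file contents with
SHA-256 and names the entry after the digest; the daemon's actual hashing
and naming scheme is not described in this thread, so those details and the
function name deduplicate are assumptions.

```python
import hashlib
import os
from pathlib import Path


def deduplicate(path: Path, links_dir: Path) -> bool:
    """Make `path` share storage with any earlier file that has the same content.

    Returns True if `path` was replaced by a hard link to an existing entry.
    """
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    entry = links_dir / digest
    if not entry.exists():
        # First file with this content: register it under its hash.
        os.link(path, entry)
        return False
    if os.path.samefile(path, entry):
        return False  # already deduplicated
    # Link to the existing entry under a temporary name, then rename over
    # `path` so the replacement is atomic.
    tmp = path.with_name(path.name + ".tmp-link")
    os.link(entry, tmp)
    os.replace(tmp, path)
    return True
```

Every distinct file content adds one entry to the links directory, which is
why the directory can grow to hundreds of thousands of entries.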

I don’t know about scalability.  This number is still well below the
limits of ext4 file systems, but accessing a big directory listing like
that can be slow.  I would feel a little better about this if we split
it up into different prefix directories (like it’s done for browser
caches).  I don’t think it’s necessary, though.

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net