On Thu, Mar 21, 2019 at 2:45 PM Dan van der Ster <[email protected]> wrote:
>
> On Thu, Mar 21, 2019 at 8:51 AM Gregory Farnum <[email protected]> wrote:
> >
> > On Wed, Mar 20, 2019 at 6:06 PM Dan van der Ster <[email protected]> 
> > wrote:
> >>
> >> On Tue, Mar 19, 2019 at 9:43 AM Erwin Bogaard <[email protected]> 
> >> wrote:
> >> >
> >> > Hi,
> >> >
> >> >
> >> >
> >> > For a number of applications we use, there is a lot of file duplication. 
> >> > This wastes precious storage space, which I would like to avoid.
> >> >
> >> > When using a local disk, I can use a hard link to let all duplicate 
> >> > files point to the same inode (use “rdfind”, for example).
> >> >
> >> >
> >> >
> >> > As there isn’t any deduplication in Ceph(FS), I’m wondering if I can use 
> >> > hard links on CephFS in the same way as I do on ‘regular’ file systems 
> >> > like ext4 and xfs.
> >> >
> >> > 1. Is it advisable to use hard links on CephFS? (It isn’t in the ‘best 
> >> > practices’: http://docs.ceph.com/docs/master/cephfs/app-best-practices/)
> >> >
> >> > 2. Is there any performance (dis)advantage?
> >> >
> >> > 3. When using hard links, are there actual space savings, or is there 
> >> > some trickery happening?
> >> >
> >> > 4. Are there any issues (other than the regular hard link ‘gotchas’) I 
> >> > need to keep in mind when combining hard links with CephFS?
> >>
> >> The only issue we've seen is this: if you hardlink b to a, then rm a,
> >> and never stat b, the inode is added to the "stray" directory. By
> >> default there is a limit of 1 million stray entries -- so if you
> >> accumulate files in this state, eventually users will be unable to rm
> >> any files until you stat the `b` files.
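
For anyone who wants to see the shape of the problem: the sequence Dan
describes is just ordinary hard-link handling at the POSIX level. A minimal
sketch (the temp directory stands in for a CephFS mount; on a local
filesystem nothing interesting happens, but on CephFS the remove is what
parks the inode in the stray directory):

```python
import os
import tempfile

# Illustrative sketch of the sequence that strands an inode in the MDS
# stray directory. These are ordinary POSIX operations; only on CephFS
# does the unlink leave the inode parked in the stray dir until the
# surviving link is stat'ed again.
d = tempfile.mkdtemp()          # stand-in for a directory on a CephFS mount
a, b = os.path.join(d, "a"), os.path.join(d, "b")

with open(a, "w") as f:
    f.write("data")
os.link(a, b)                   # b is a second hard link to the same inode
os.remove(a)                    # on CephFS: inode moves to the stray directory
os.stat(b)                      # a later stat of b reintegrates the inode
print(open(b).read())           # prints "data": the file itself was never at risk
```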
> >
> >
> > Eek. Do you know if we have any tickets about that issue? It's easy to see 
> > how that happens, but it definitely isn't a good user experience!
>
> I'm not aware of a ticket -- I had thought it was just a fact of life
> with hardlinks and cephfs.

I think it is for now, but as you've demonstrated, that's not really a
good situation, and I'm sure we can figure out some way of
automatically merging inodes into their remaining link parents.
I've created a ticket at http://tracker.ceph.com/issues/38849

> After hitting this issue in prod, we found the explanation here in
> this old thread (with your useful post ;) ):
>
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-October/013621.html
>
> Our immediate workaround was to increase `mds_bal_fragment_size_max`
> (e.g. to 200000).
> In our env we now monitor num_strays in case these get out of control again.
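
That monitoring can be as simple as parsing the MDS perf counters. A rough
sketch, assuming you feed it the JSON printed by
`ceph daemon mds.<name> perf dump` (the sample and the alert threshold below
are made up for illustration, not real cluster output):

```python
import json

def num_strays(perf_dump_json: str) -> int:
    """Extract the mds_cache.num_strays counter from an MDS perf dump."""
    return json.loads(perf_dump_json)["mds_cache"]["num_strays"]

# Trimmed, illustrative sample of a perf dump; real output contains
# many more sections and counters.
sample = '{"mds_cache": {"num_strays": 123456, "num_strays_delayed": 0}}'

# Alert well before the stray count approaches the effective limit
# (mds_bal_fragment_size_max per stray shard); threshold is hypothetical.
if num_strays(sample) > 100000:
    print("warning: stray count is getting high")
```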
>
> BTW, now thinking about this more... isn't directory fragmentation
> supposed to let the stray dir grow to unlimited shards? (on our side
> it seems limited to 10 shards). Maybe this is just some configuration
> issue on our side?

Sounds like I haven't missed a change here: the stray directory is a
special system directory that doesn't get fragmented like normal ones
do. We just set it up (hard-coded even, IIRC, but maybe a config
option) so that each MDS gets 10 of them after the first time somebody
managed to make it large enough that a single stray directory object
got too large. o_0
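
For anyone poking at this on their own cluster: as I recall, those ten
per-MDS stray directories have fixed inode numbers starting at 0x600, so
their head dirfrag objects in the metadata pool are named accordingly. A
quick sketch of the naming (treat the 0x600 base as an implementation
detail, not a stable interface):

```python
def stray_dirfrag_objects(count: int = 10, first_ino: int = 0x600) -> list:
    """Object names of the head dirfrags for rank 0's stray directories.

    The 0x600 base inode number is an MDS implementation detail; the
    object name format is "<inode-in-hex>.<frag>".
    """
    return ["{:x}.00000000".format(first_ino + i) for i in range(count)]

# Handy for spot checks like `rados -p <metadata-pool> stat <obj>`.
print(stray_dirfrag_objects())
```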
-Greg
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
