On Sat, Feb 9, 2019 at 8:10 AM Hector Martin <hec...@marcansoft.com> wrote:
>
> Hi list,
>
> As I understand it, CephFS implements hard links as effectively "smart
> soft links", where one link is the primary for the inode and the others
> effectively reference it. When it comes to directories, the size for a
> hardlinked file is only accounted for in recursive stats for the
> "primary" link. This is good (no double-accounting).
>
> I'd like to be able to control *which* of those hard links is the
> primary, post-facto, to control what directory their size is accounted
> under. I want to write a tool that takes some rules as to which
> directories should be "preferred" for containing the master link, and
> corrects it if necessary (by recursively stating everything and looking
> for files with the same inode number to enumerate all links).
>
> To swap out a primary link with another I came up with this sequence:
>
> link("old_primary", "tmp1")
> symlink("tmp1", "tmp2")
> rename("tmp2", "old_primary") // old_primary replaced with another inode
> stat("/otherdir/new_primary") // new_primary hopefully takes over stray
> rename("tmp1", "old_primary")  // put things back the way they were
>
> The idea is that, since renames of hardlinks over themselves are a no-op
> in POSIX and won't work, I need to use an intermediate symlink step to
> ensure continuity of access to the old file; this isn't 100% transparent
> but it beats e.g. removing old_primary and re-linking new_primary over
> it (which would cause old_primary to vanish for a short time, which is
> undesirable). Hopefully the stat() ensures that the new_primary is what
> takes over the stray inode. This seems to work in practice; if there is
> a better way, I'd like to hear it.
>
> Figuring out which link is the primary is a bigger issue. Only
> directories report recursive stats where this matters, not files
> themselves. On a directory with hardlinked files, if ceph.dir.rfiles >
> sum(ceph.dir.rfiles for each subdir) + count(files with nlinks == 1)
> then some hardlinked files are primary; I could attempt to use this
> formula and then just do the above dance for every hardlinked file to
> move the primaries off, but this seems fragile and likely to break in
> certain situations (or do needless work). Any other ideas?
>
How about reading the backtrace directly? Something equivalent to:

rados -p cephfs1_data getxattr xxxxxxx.00000000 parent >/tmp/parent
ceph-dencoder import /tmp/parent type inode_backtrace_t decode dump_json
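
For scripting this across many files, the lookup can be wrapped in two small helpers. This is only a rough sketch under some assumptions: the object name is taken to be the inode number in hex followed by ".00000000", and the dump_json output is assumed to contain an "ancestors" list ordered from the file's own dentry up towards the root, each entry with a "dname" field. The function names and the sample data are made up for illustration, and the backtrace may lag behind recent renames, so treat the result as a hint rather than ground truth.

```python
import json


def backtrace_object_name(ino: int) -> str:
    """Name of the RADOS object assumed to carry the 'parent' xattr.

    Assumption: hex inode number followed by '.00000000'.
    """
    return f"{ino:x}.00000000"


def primary_path(backtrace: dict) -> str:
    """Rebuild the path recorded in a decoded backtrace.

    Assumption: 'ancestors' runs from the file's own dentry up
    towards the root, so the names are reversed to form a path.
    """
    names = [entry["dname"] for entry in backtrace["ancestors"]]
    return "/" + "/".join(reversed(names))


# Hypothetical dump_json output, for illustration only.
sample = json.loads("""
{
  "ino": 1099511627776,
  "ancestors": [
    {"dirino": 1099511627777, "dname": "file", "version": 5},
    {"dirino": 1, "dname": "dir", "version": 9}
  ],
  "pool": 1,
  "old_pools": []
}
""")

object_name = backtrace_object_name(sample["ino"])
recorded_path = primary_path(sample)
```

The inode number would come from os.stat(path).st_ino on any of the links; a link whose path (relative to the filesystem root) matches the recorded path is presumably the primary.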


> Thanks,
> --
> Hector Martin (hec...@marcansoft.com)
> Public Key: https://mrcn.st/pub
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com