On 12/13/23 13:28, Kent Overstreet wrote:
> On Wed, Dec 13, 2023 at 08:37:57AM +0100, Donald Buczek wrote:
>> Probably not for the specific applications I mentioned (backup, mirror,
>> accounting). These are intended to run continuously, slowly and unnoticed
>> in the background, so they are memory and i/o throttled via cgroups anyway
>> and one is even using sleep after so-and-so many stat calls to reduce
>> its impact.
>>
>> If they could tell a directory from a snapshot, I would probably stop them
>> from walking into snapshots. And if not, the snapshot id is all that is
>> needed to tell a clone in a snapshot from a hardlink. So these don't really
>> need the filehandle.
> 
> Perhaps we should allocate a bit for differentiating a snapshot from a
> non snapshot subvolume?
Are there non-snapshot subvolumes?

From debugfs bcachefs/../btrees, I've got the impression that every
volume starts with a (single) snapshot.

New filesystem:

subvolumes
==========
u64s 10 type subvolume 0:1:0 len 0 ver 0: root 4096 snapshot id 4294967295 
parent 0

snapshots
=========
u64s 10 type snapshot 0:4294967295:0 len 0 ver 0: is_subvol 1 deleted 0 parent  
        0 children          0          0 subvol 1 tree 1 depth 0 skiplist 0 0 0

`bcachefs subvolume create /mnt/v`

subvolumes
==========
u64s 10 type subvolume 0:1:0 len 0 ver 0: root 4096 snapshot id 4294967295 
parent 0
u64s 10 type subvolume 0:2:0 len 0 ver 0: root 1207959552 snapshot id 
4294967294 parent 0

snapshots
=========
u64s 10 type snapshot 0:4294967294:0 len 0 ver 0: is_subvol 1 deleted 0 parent  
        0 children          0          0 subvol 2 tree 2 depth 0 skiplist 0 0 0
u64s 10 type snapshot 0:4294967295:0 len 0 ver 0: is_subvol 1 deleted 0 parent  
        0 children          0          0 subvol 1 tree 1 depth 0 skiplist 0 0 0

`bcachefs subvolume snapshot /mnt/v /mnt/s`

subvolumes
==========
u64s 10 type subvolume 0:1:0 len 0 ver 0: root 4096 snapshot id 4294967295 
parent 0
u64s 10 type subvolume 0:2:0 len 0 ver 0: root 1207959552 snapshot id 
4294967292 parent 0
u64s 10 type subvolume 0:3:0 len 0 ver 0: root 1207959552 snapshot id 
4294967293 parent 2

snapshots
=========
u64s 10 type snapshot 0:4294967292:0 len 0 ver 0: is_subvol 1 deleted 0 parent 
4294967294 children          0          0 subvol 2 tree 2 depth 1 skiplist 
4294967294 4294967294 4294967294
u64s 10 type snapshot 0:4294967293:0 len 0 ver 0: is_subvol 1 deleted 0 parent 
4294967294 children          0          0 subvol 3 tree 2 depth 1 skiplist 
4294967294 4294967294 4294967294
u64s 10 type snapshot 0:4294967294:0 len 0 ver 0: is_subvol 0 deleted 0 parent  
        0 children 4294967293 4294967292 subvol 0 tree 2 depth 0 skiplist 0 0 0
u64s 10 type snapshot 0:4294967295:0 len 0 ver 0: is_subvol 1 deleted 0 parent  
        0 children          0          0 subvol 1 tree 1 depth 0 skiplist 0 0 0

Now reading and interpreting the filehandles:

/mnt/.     type  177 : 00 10 00 00 00 00 00 00 01 00 00 00 00 00 00 00 : inode 
0000000000001000 subvolume 00000001 generation 00000000
/mnt/v     type  177 : 00 00 00 48 00 00 00 00 02 00 00 00 00 00 00 00 : inode 
0000000048000000 subvolume 00000002 generation 00000000
/mnt/s     type  177 : 00 00 00 48 00 00 00 00 03 00 00 00 00 00 00 00 : inode 
0000000048000000 subvolume 00000003 generation 00000000
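
The interpretation above can be reproduced with a few lines of Python. This
is only a sketch: the layout (little-endian 64-bit inode, 32-bit subvolume,
32-bit generation) is an assumption inferred from the dumps in this mail,
not taken from the bcachefs sources, and `decode_handle` is a made-up helper
name.

```python
import struct

# Assumed layout, inferred from the hex dumps above (not from bcachefs code):
#   u64 inode, u32 subvolume, u32 generation, all little-endian.
HANDLE_FORMAT = "<QII"

def decode_handle(hex_dump: str):
    """Decode a 16-byte handle given as space-separated hex bytes."""
    raw = bytes.fromhex(hex_dump)  # fromhex() ignores the spaces
    if len(raw) != struct.calcsize(HANDLE_FORMAT):
        raise ValueError(f"expected 16 bytes, got {len(raw)}")
    inode, subvolume, generation = struct.unpack(HANDLE_FORMAT, raw)
    return inode, subvolume, generation

# The three handles shown above.
HANDLES = {
    "/mnt/.": "00 10 00 00 00 00 00 00 01 00 00 00 00 00 00 00",
    "/mnt/v": "00 00 00 48 00 00 00 00 02 00 00 00 00 00 00 00",
    "/mnt/s": "00 00 00 48 00 00 00 00 03 00 00 00 00 00 00 00",
}

if __name__ == "__main__":
    for path, dump in HANDLES.items():
        inode, subvolume, generation = decode_handle(dump)
        print(f"{path:8} inode {inode:016x} subvolume {subvolume:08x} "
              f"generation {generation:08x}")
```

Under that assumed layout the decoded inode numbers (0x1000 = 4096 and
0x48000000 = 1207959552) are the same values shown as `root` in the
subvolume entries, and /mnt/v and /mnt/s differ only in the subvolume field.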


So is there really a type difference between the objects created by
`bcachefs subvolume create` and `bcachefs subvolume snapshot`? It appears
that they both point to a subvolume which points to a snapshot in the
snapshot tree.
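
To make the comparison concrete, here is a small model of the last dump
(after `bcachefs subvolume snapshot /mnt/v /mnt/s`). It is a sketch only:
the values are copied by hand from the output above, and the class names,
field names and the `describe` helper are made up for illustration, not
bcachefs identifiers.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Subvolume:
    root: int        # root inode
    snapshot: int    # snapshot id the subvolume points to
    parent: int      # parent subvolume, 0 if none

@dataclass(frozen=True)
class SnapshotNode:
    is_subvol: bool
    parent: int      # parent node in the snapshot tree, 0 if none
    children: tuple
    subvol: int
    tree: int

# Values transcribed from the third debugfs dump above.
SUBVOLUMES = {
    1: Subvolume(root=4096, snapshot=4294967295, parent=0),
    2: Subvolume(root=1207959552, snapshot=4294967292, parent=0),
    3: Subvolume(root=1207959552, snapshot=4294967293, parent=2),
}
SNAPSHOTS = {
    4294967292: SnapshotNode(True, 4294967294, (0, 0), 2, 2),
    4294967293: SnapshotNode(True, 4294967294, (0, 0), 3, 2),
    4294967294: SnapshotNode(False, 0, (4294967293, 4294967292), 0, 2),
    4294967295: SnapshotNode(True, 0, (0, 0), 1, 1),
}

def describe(subvol_id: int) -> dict:
    """Follow subvolume -> snapshot node and report what the dump shows."""
    sv = SUBVOLUMES[subvol_id]
    node = SNAPSHOTS[sv.snapshot]
    return {
        "snapshot_node": sv.snapshot,
        "is_subvol": node.is_subvol,
        "node_parent": node.parent,
        "tree": node.tree,
        "subvol_parent": sv.parent,
    }

if __name__ == "__main__":
    for subvol_id in (2, 3):  # /mnt/v and /mnt/s
        print(subvol_id, describe(subvol_id))
```

In this model /mnt/v (subvolume 2) and /mnt/s (subvolume 3) look the same
from the snapshot tree: both point to a leaf node with is_subvol 1 under the
shared interior node 4294967294. The only difference visible in the dump is
the `parent` field of the subvolume entry (0 versus 2); whether that field
is meant to distinguish the two cases is not something the dump alone
establishes.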

Best

  Donald


>> In the thread it was assumed that there are other (unspecified)
>> applications which need the filehandle and currently use name_to_handle_at().
>>
>> I thought it was self-evident that a single syscall to retrieve all
>> information atomically is better than a set of syscalls. Each additional
>> syscall has overhead and you need to be concerned with the data changing
>> between the calls.
> 
> All other things being equal, yeah it would be. But things are never
> equal :)
> 
> Expanding struct statx is not going to be as easy as hoped, so we need
> to be a bit careful how we use the remaining space, and since as Dave
> pointed out the filehandle isn't needed for checking uniqueness unless
> nlink > 1 it's not really a hotpath in any application I can think of.
> 
> (If anyone does know of an application where it might matter, now's the
> time to bring it up!)
> 
>> Userspace nfs server as an example of an application, where visible
>> performance is more relevant, was already mentioned by someone else.
> 
> I'd love to hear confirmation from someone more intimately familiar with
> NFS, but AFAIK it shouldn't matter there; the filehandle exists to
> support resuming IO or other operations to a file (because the server
> can go away and come back). If all the client did was a stat, there's no
> need for a filehandle - that's not needed until a file is opened.

-- 
Donald Buczek
buc...@molgen.mpg.de
Tel: +49 30 8413 1433

