On Wed, 06 Dec 2023, Donald Buczek wrote:
> On 12/4/23 23:45, NeilBrown wrote:
> 
> >> Hmm, isn't a (struct file_handle) filehandle supposed to be opaque?
> >> I'm not sure if it is a good idea to stick filesystem-specific
> >> meta-information (like a subvolume ident) into special bits of opaque
> >> numbers (whether inode number or filehandle) at all.
> > 
> > Yes, a filehandle is opaque.  Digging around inside it would be wrong.
> > 
> > My understanding was that your primary need was not to know where
> > subvolumes were, but to know when two files were different even though
> > they had the same inode number.  filehandles can be used to tell if two
> > files are different with total reliability.
> 
> You are right, these are two orthogonal requests and they were not
> clearly separated in this thread, thus the misunderstanding. 
> 
> I think, unique identifiers for different objects are a must.  I think
> in bcachefs, the filehandles returned by name_to_handle_at() already
> are unique, so this could be used by userspace to work around
> duplicate inode numbers right away.  Even the hashing you proposed
> could be done in userspace if it wants to fold the data into to some
> existing format with inode numbers. 
> 
> >> The information could also be made available via specialized ioctl(s)
> >> as an alternative to xattr.
> > 
> > ioctl would be worse than xattr.  statx extension would be best if you
> > need something that filehandle cannot provide.
> > 
> >>
> >> One disadvantage of embedded in filehandle, xattr or ioctl is that
> >> userspace would need an extra syscall for each directory just to see
> >> if this is a subvolume or a snapshot, which most of the time it is
> >> not.  Maybe, even an otherwise unnecessary open is needed to get a
> >> consistent view between multiple information-retrieving syscalls.  If
> >> only statx() was extendable.
> > 
> > uhmm... statx is extendable...
> > Adding a STATX_ATTR_SUBVOLUME_ROOT attribute flag shouldn't be too
> > controversial, if a clear need and use-case were presented.
> 
> When the inode numbers are made unique one way or the other but tools
> like backup or mirror have no way to identify snapshots, they would
> just copy the snapshots removing the deduplication on the destination.
> If they had a way to figure out, that they are about to walk into a
> new volume, they could instead do a more sensible thing like stop and
> cry for help or whatever is explicitly configured.
> 
> statx() would be very nice, because you can avoid additional system
> calls.  I wish, we could get the file_handle in the same call, but I
> assume that is not possible to be added to statx() because of the
> potential big and unknown length.  However, I assume the combination
> of volume-ident and inode number, which both would be available from
> one statx() call, are also unique, so this should do, too. 

The file handle is limited in size to 128 bytes.  The current statx
structure doesn't have that much spare space, but I don't see a problem
with having it return a larger structure if that is explicitly
requested.  Of course I don't know what other developers might think of
that.

Extending statx to return a "root-of-subvol" flag or possibly a
"subvol-id" u64 should be uncontroversial and would allow backup tools
and similar to easily ignore subvols.  This would need to be driven the
the bcachefs team of course.

Thanks,
NeilBrown

Reply via email to