On Mon, May 03, 2010 at 10:36:29PM -0400, Edward Ned Harvey wrote:
> > From: Jonathan Adams [mailto:jonathan.ad...@oracle.com]
> > 
> > Under what circumstances are you doing this?  Is this a kernel driver,
> > a /dev/kmem reader, etc?  Do you want it for any possible file, a file
> > open in some process's file descriptor table, or what?
> 
> Good question.  Tough to answer concisely, because this is the summary of
> weeks of discussion, several dozen messages and ongoing brainstorming on
> zfs-discuss.
> 
> The "big picture" is like this:  Users wish to easily list, or restore, or
> just look at, some old snapshot version of some files or directory.  They
> may be working inside a zfs filesystem, nested within several other layers
> of zfs filesystems.  They need to locate the correct ".zfs" directory, which
> is probably hidden, and since they're not sysadmins, they may not remember
> the name "zfs" much less precisely where to find the right one ... they may
> even be computer novices, browsing a CIFS share, for all we know.

Sure.

> For me, locating the right .zfs directory is easy.  But it's really nice if
> I can make it easier for my users.
> 
> Even after locating the right .zfs directory, the user must navigate *back
> down* the directory structure, to the same directory path where they began
> (except this time they're in a snapshot), and they must do this for every
> snapshot, checking time stamps, in order to find all the different versions
> of some object.  In reality, they won't bother doing this for every
> snapshot; because it's too much work, they'll just make a guess which one
> they want, and repeat if desired.

Understood.

> I've begun developing (it's at a very early stage) something called "zhist",
> written in Python (to be cross-platform compatible, able to run on NFS or
> CIFS clients) to locate all the snapshot versions of some object.
> Optionally just show the versions that are different.  But another user on
> zfs-discuss commented that the pathname of an object is prone to change from
> snapshot to snapshot, by renaming or relocating to a different directory.
> So if possible, I'd like to be intelligent, and locate objects by inode
> number instead of blindly assuming an unchanged pathname.

Understood;  though you can always "check" (i.e. stat(".zfs/path/object") and
see if the inode # matches).  In most cases, the file will probably not have
moved.  I agree that searching the whole space is hard.
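The "check" described above can be sketched in a few lines of Python. This is only an illustration, assuming the object keeps its inode number across snapshots (as the discussion presumes) and that snapshots are reachable under ".zfs/snapshot" at the filesystem root. The function name and its arguments are made up for this example and are not part of zhist.

```python
import os

def same_inode_versions(fs_root, rel_path):
    """Return (snapshot, path, mtime) for each snapshot in which rel_path
    still refers to the same inode as the live object.

    fs_root  -- mount point of the filesystem (the directory holding .zfs)
    rel_path -- path of the object relative to fs_root
    """
    live = os.stat(os.path.join(fs_root, rel_path))
    snap_dir = os.path.join(fs_root, ".zfs", "snapshot")
    matches = []
    for name in sorted(os.listdir(snap_dir)):
        candidate = os.path.join(snap_dir, name, rel_path)
        try:
            st = os.stat(candidate)
        except OSError:
            continue  # absent in this snapshot, or not accessible to this user
        if st.st_ino == live.st_ino:
            matches.append((name, candidate, st.st_mtime))
    return matches
```

Snapshots in which the path is missing or refers to a different inode are simply skipped; those are the cases where the object was renamed or moved and a wider search would be needed.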

> zhist would automate the process:  Look up the inode number of an object in
> the present filesystem, locate the .zfs directory for this filesystem, find
> all the versions of this object in previous snapshots, check the mtime of
> each one, and only print out unique results which are accessible by the
> current user.  All of which is to be completed ... very quickly.

Ah;  have you looked at http://arc.opensolaris.org/caselog/PSARC/2010/105/,
the ZFS diff ARC case?  It wouldn't work over CIFS or NFS, but would give
you the correct name.
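Two of the zhist steps quoted above (locating the .zfs directory and keeping only unique results) might look roughly like this in Python. It is a sketch under assumptions: it relies on ".zfs" being reachable by name from the filesystem root even when it is hidden from directory listings, and both function names are invented for illustration.

```python
import os

def find_zfs_root(path):
    """Walk upward from path and return the nearest ancestor directory
    that contains a ".zfs" entry, or None if there is none."""
    d = os.path.abspath(path)
    if not os.path.isdir(d):
        d = os.path.dirname(d)
    while True:
        # A hidden .zfs does not show up in listdir(), so probe it by name.
        if os.path.isdir(os.path.join(d, ".zfs")):
            return d
        parent = os.path.dirname(d)
        if parent == d:  # reached "/" without finding one
            return None
        d = parent

def unique_by_mtime(versions):
    """Given (snapshot_name, mtime) pairs in snapshot order, keep only the
    first pair seen for each distinct mtime."""
    seen = set()
    result = []
    for name, mtime in versions:
        if mtime not in seen:
            seen.add(mtime)
            result.append((name, mtime))
    return result
```

Because the walk stops at the nearest ancestor, nested filesystems resolve to the innermost one. Note that os.path.abspath() does not resolve symlinks, so a path that passes through a symlink may need os.path.realpath() first.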

> And eventually, made available in GUI form, and over NFS and CIFS, and
> possibly even to work on files (instead of being limited to just
> directories).
> 
> There are many obstacles to overcome before a robust final product is
> available, so the plan is to take simple baby steps, one at a time.
> 
> In order to describe the first baby step, I acknowledge:
> * At present, there is no reference in the filesystem, from a file to its
> parent directory(ies).  So at present, it's only possible to identify the
> path name of a *directory* by inode number.

That's not (quite) true;  ZFS does keep track of parents, and since you are
presumably starting from a path, you can certainly get a path->inode mapping.

> * There is a potential security risk, so even the reverse directory lookup
> by inode can only be done by root.

What I don't understand is what you think an in-kernel solution can do
which is better than a find(1) invocation.  Are you thinking of using
the VFS_VGET() interfaces, then walking up the ".." chains to the root
filesystem?  Doing this in a stable fashion seems hard.
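For reference, walking the ".." chain can be illustrated from user space. The Python sketch below is not the in-kernel VFS_VGET() approach mentioned above: it starts from a directory that is already reachable by path and rebuilds its absolute name by matching device and inode numbers in each parent. It is the classic getcwd-style algorithm, the function name is hypothetical, and it is not safe against concurrent renames, which is part of why doing this stably is hard.

```python
import os

def path_by_dotdot(directory):
    """Rebuild the absolute path of directory by following ".." links
    and finding, in each parent, the entry with the same (dev, inode)."""
    parts = []
    current = directory
    while True:
        here = os.stat(current)
        parent = os.path.join(current, "..")
        up = os.stat(parent)
        if (here.st_dev, here.st_ino) == (up.st_dev, up.st_ino):
            break  # the root directory is its own parent
        for name in os.listdir(parent):
            try:
                st = os.lstat(os.path.join(parent, name))
            except OSError:
                continue  # entry vanished or is not accessible
            if (st.st_dev, st.st_ino) == (here.st_dev, here.st_ino):
                parts.append(name)
                break
        else:
            raise OSError("no entry for %r found in its parent" % current)
        current = parent
    return "/" + "/".join(reversed(parts))
```

Each step costs a full directory listing of the parent, so this is linear in the size of every directory on the way up.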

> * The function is only available in solaris/opensolaris kernel, so it can
> only be done locally (not NFS or CIFS)
> And probably some more important limitations.
> 
> So the first baby step is simply to reverse lookup any directory, as root,
> on a local ZFS filesystem.  Hopefully this can be built into an application
> which root runs, and not purely limited to inside the kernel.  You'd be
> surprised how many people tell me "Can't be done."  So a very limited case
> proof of concept is a good start.

It sounds like what you want is something like "zfs history dataset path",
which would walk the dataset and all of its snapshots and give you every
recorded change that involved path, or a file renamed from path.

> Ideally, the second step is to create a setuid root executable, such as
> sudo, which local "normal" users can run, which will become root for a
> moment, derive the path, check to see if the normal user has access to that
> path name, and upon success, print the results.
> 
> Naturally, the whole process must be nearly instant.  Which means "find
> /tank/.zfs/snapshot -inum 12345" is not acceptable.

> As things develop more, I expect more interest to arise.  But for now, I'm
> all alone.  For now, I only know this can be done in theory, and it's very
> unclear if even the slightest thing necessary for implementation is actually
> available at all.

Certainly there's no stable way to do this at the moment, but I'm sure that
something could be hacked together in the kernel.  How much infrastructure
have you put together?

Cheers,
- jonathan

_______________________________________________
opensolaris-code mailing list
opensolaris-code@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/opensolaris-code
