
Dan Price Mon, 29 Mar 2010 21:54:01 -0700

On Mon 29 Mar 2010 at 11:14AM, Bart Smaalders wrote:
> On 03/29/10 11:01, Matthew Ahrens wrote:
> 
> >How do commands like ls and find handle printing of filenames with
> >arbitrary characters (newlines and such)?
> 
> In general, badly.


Tim,

My concern, which others have hinted at, is that there is a legion of
people who are going to want to consume this information, and there is
great value in making it machine-parseable.  Think of automated
build systems, tripwires, fancy backup/recovery tools, et cetera.

In summary, the current output seems mostly OK if it's for humans, but
the case is ambiguous about who the expected consumer is.  It would
be a tragedy if there weren't a machine-consumable way to get at this
information.

I also have questions about how intelligent a consuming piece of
software must be in order to make sense of this information.  Has anyone
written a proof of concept tool using this?  For example, if a directory
/foo/a is renamed to /foo/b, then an analyzer would need to stat /foo/b
in order to discover that /foo/b is a directory, then traverse into it as
needed.  It would be a shame if everyone who wanted to consume this had
to write the same thousand lines of code (I'm happy to be convinced that
this isn't the case).
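
To make the rename case concrete, here is a minimal sketch of the extra
work a consumer might have to do, assuming the tool reports only the
renamed directory itself and not its children.  The function name
paths_affected_by_rename is hypothetical and is not part of the proposal.

```python
import os
import stat


def paths_affected_by_rename(new_path):
    """Return new_path plus, if it is a directory, everything beneath it.

    Assumption: the consumer was only told that something was renamed to
    new_path, so it has to stat the path to learn what kind of object it is.
    """
    affected = [new_path]
    st = os.lstat(new_path)  # lstat: do not follow a symlink at new_path
    if stat.S_ISDIR(st.st_mode):
        for dirpath, dirnames, filenames in os.walk(new_path):
            dirnames.sort()  # make the traversal order deterministic
            for name in sorted(dirnames + filenames):
                affected.append(os.path.join(dirpath, name))
    return affected
```

Even this small amount of logic would be duplicated by every consumer
unless the tool expands renames itself.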

Some specific questions...

1) In what order are the changes printed?  If I saw:

        +       /myfiles/rename_dir
        R       /myfiles/rename_dir -> /myfiles/rename_dir

My analyzer would need to be smart enough to realize that the second
must have happened before the first, and that both paths need
evaluation.  Right?
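
One way a consumer could avoid depending on the print order is to collect
every path mentioned by any record and evaluate each one once.  The sketch
below assumes the record layout shown in the example above (a change
character, whitespace, then a path, with renames written as
"old -> new"); parse_record and paths_to_evaluate are hypothetical names.
Note that a filename containing " -> " would make the rename record
ambiguous under this assumption.

```python
def parse_record(line):
    """Split one output line into (change_type, path, new_path_or_None)."""
    # Keep trailing whitespace in the path; only drop the newline.
    change, rest = line.lstrip().rstrip("\n").split(None, 1)
    if change == "R":
        old, sep, new = rest.partition(" -> ")
        if not sep:
            raise ValueError("rename record without ' -> ': %r" % line)
        return change, old, new
    return change, rest, None


def paths_to_evaluate(lines):
    """Return each distinct path named by any record, in first-seen order."""
    seen = []
    for line in lines:
        _change, old, new = parse_record(line)
        for path in (old, new):
            if path is not None and path not in seen:
                seen.append(path)
    return seen
```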

2) The meaning of "file/directory" (Don's concern aside) seems ambiguous in
the proposal.  Are we tracking the filesystem *namespace* entry?  Or the
actual object?  I found that not being sure of this made the proposal
hard to evaluate.  Simple thought experiment which confused me:

        snapshot at 1
        rm a/b
        rm a/c
        rmdir a
        echo "foo" > a
        snapshot at 2

Does that yield this?                   Or this?

        -       a/b           |         -       a/b
        -       a/c           |         -       a/c
        -       a             |         M       a
        +       a             |
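
The two readings can be sketched as two different comparisons over the
same pair of snapshots.  This is an illustration only: each snapshot is
modeled as a dict from pathname to a made-up object number (something
like an inode number), hard links are ignored, and neither function is
claimed to be what the proposal implements.

```python
def diff_by_name(old, new):
    """Compare namespace entries: the same name with a new object is 'M'."""
    out = []
    for path in sorted(set(old) | set(new)):
        if path not in new:
            out.append(("-", path))
        elif path not in old:
            out.append(("+", path))
        elif old[path] != new[path]:
            out.append(("M", path))
    return out


def diff_by_object(old, new):
    """Compare objects: a reused name shows up as a '-' plus a '+'."""
    old_ids = {obj: path for path, obj in old.items()}
    new_ids = {obj: path for path, obj in new.items()}
    removed = [("-", p) for o, p in old_ids.items() if o not in new_ids]
    added = [("+", p) for o, p in new_ids.items() if o not in old_ids]
    # List removed children before their parents, added parents first.
    removed.sort(key=lambda r: (-r[1].count("/"), r[1]))
    added.sort(key=lambda r: (r[1].count("/"), r[1]))
    return removed + added


# The thought experiment above, with hypothetical object numbers.
snap1 = {"a": 10, "a/b": 11, "a/c": 12}  # a is a directory
snap2 = {"a": 13}                        # a is now a regular file
```

With these inputs diff_by_object gives the left-hand column and
diff_by_name gives the entries of the right-hand column (in sorted-path
order rather than the order shown).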

3) Output is shown with leading slashes.  Is output shown relative to the
mount point?  Or something else?  (If the former, what if between @a and
@b the mountpoint changed?)

4) I would also vote for a mode which simply outputs a list of
pathnames to investigate for differences.  This would enable:

   zfs diff -someflag a@1 a@2 | xargs do_some_analysis_on_these
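
Given the earlier point about filenames containing newlines, such a mode
would be safest if the pathnames were NUL-terminated rather than
newline-terminated.  A minimal sketch of that encoding follows;
nul_separated is a hypothetical helper, and nothing here is an existing
option of the tool.

```python
import os


def nul_separated(paths):
    """Encode paths as one bytes blob, each path terminated by a NUL byte.

    NUL cannot appear in a pathname, so spaces and newlines inside names
    survive the trip through a pipe unambiguously.
    """
    return b"".join(os.fsencode(p) + b"\0" for p in paths)
```

Output in this form could be consumed by an xargs that supports the -0
option, for example "xargs -0 do_some_analysis_on_these".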


Thanks for tackling this,

        -dp

-- 
Daniel Price, Solaris Kernel Engineering    http://blogs.sun.com/dp

zfs diff [PSARC/2010/105 FastTrack timeout 04/05/2010]
