On 3/29/10 10:54 PM, Dan Price wrote:
> On Mon 29 Mar 2010 at 11:14AM, Bart Smaalders wrote:
>    
>> On 03/29/10 11:01, Matthew Ahrens wrote:
>>
>>      
>>> How do commands like ls and find handle printing of filenames with
>>> arbitrary characters (newlines and such)?
>>>        
>> In general, badly.
>>      
> Tim,
>
> My concern, which others have hinted at, is that there is a legion of
> people who are going to want to consume this information and there is
> great value in making said information machine parseable: automated
> build systems, tripwires, fancy backup/recovery tools, et cetera.
>
> In summary, the current output seems mostly OK if it's for humans, but
> the case is ambiguous about who the expected consumer is.  It would
> be a tragedy if there wasn't a machine consumable way to get at this
> information.
>
>    
I'm adding a -H option for scripting, with parseable output.
> I also have questions about how intelligent a consuming piece of
> software must be in order to make sense of this information.  Has anyone
> written a proof of concept tool using this?  For example, if a directory
> /foo/a is renamed to /foo/b, then an analyzer would need to stat /foo/b
> in order to discover that /foo/b is a directory, then traverse into it
> as needed.  It would be a shame if everyone who wanted to consume this had
> to write the same thousand lines of code (I'm happy to be convinced that
> this isn't the case).
>
> Some specific questions...
>
> 1) In what order are the changes printed?  If I saw:
>
>       +       /myfiles/rename_dir
>       R       /myfiles/rename_dir ->  /myfiles/rename_dir
>
> My analyzer would need to be smart enough to realize that the second
> must have happened before the first, and that both paths need
> evaluation.  Right?
>
>    
This got clarified, I believe.

> 2) The meaning of "file/directory" (Don's concern aside) seems ambiguous in
> the proposal.  Are we tracking the filesystem *namespace* entry?  Or the
> actual object?  I found that not being sure of this made the proposal
> hard to evaluate.  Simple thought experiment which confused me:
>
>       snapshot at 1
>       rm a/b
>       rm a/c
>       rmdir a
>       echo "foo">  a
>       snapshot at 2
>
> Does that yield this?                 Or this?
>
>       -       a/b           |         -       a/b
>       -       a/c           |         -       a/c
>       -       a             |         M       a
>       +       a             |
>
>    
We are tracking the actual file objects.  Running your test with the 
current code:

M       /files/
-       /files/a
-       /files/a/b
-       /files/a/c
+       /files/a

Having slept on this, I think an extra field in the output will help.
A type character could be added in another column, using the same
sorts of symbols that ls -F shows: @ for symbolic link, / for
directory, | for pipe, etc., plus an 'F' to indicate a regular file.
So the above would become:

M       /files/    /
-       /files/a    F
-       /files/a/b    F
-       /files/a/c    F
+       /files/a    F
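
For anyone wanting to script against this, here is a rough sketch of how
the three-column form could be read.  It assumes whitespace-separated
columns exactly as in the example above; the DiffEntry and parse_line
names are made up for illustration, the real -H format may well differ,
and paths containing newlines are not handled.

```python
# Sketch only: assumes columns are laid out as in the example above
# (change, path, type), separated by whitespace. Not the actual -H format.
from typing import NamedTuple


class DiffEntry(NamedTuple):
    change: str  # 'M', '-' or '+' as in the example
    path: str    # path as printed, relative to the current mountpoint
    ftype: str   # '/', 'F', '@', '|', ... as with ls -F, plus 'F'


def parse_line(line: str) -> DiffEntry:
    # Take the change column off the front and the type column off the
    # back, so that spaces inside the path itself survive.
    change, rest = line.strip().split(None, 1)
    path, ftype = rest.rsplit(None, 1)
    return DiffEntry(change, path, ftype)


sample = """\
M       /files/    /
-       /files/a    F
-       /files/a/b    F
-       /files/a/c    F
+       /files/a    F
"""

entries = [parse_line(l) for l in sample.splitlines() if l.strip()]
for e in entries:
    print(e.change, e.ftype, e.path)
```

With the type in its own column, a consumer can tell a directory from a
regular file without having to stat the path.
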
> 3) Output is shown with leading slashes.  Is output shown relative to the
> mount point?  Or something else?  (If the former, what if between @a and
> @b the mountpoint changed?)
>
>    
The output shown is relative to where the dataset is mounted at the
time of the diff.  We don't necessarily know where it was mounted at
the time of any particular snapshot.
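
A consumer that wants paths independent of the current mountpoint could
strip it off itself.  A minimal sketch, assuming the caller already knows
the mountpoint (the relative_to_mount helper is hypothetical, not part of
the proposal):

```python
# Sketch only: rebases a printed path onto the dataset root, assuming
# the caller supplies the mountpoint in effect when the diff was run.
import posixpath


def relative_to_mount(path: str, mountpoint: str) -> str:
    # posixpath.relpath normalizes trailing slashes; the mountpoint
    # itself comes back as "."
    return posixpath.relpath(path, mountpoint)


print(relative_to_mount("/files/a/b", "/files"))
print(relative_to_mount("/files/", "/files"))
```

A path that does not lie under the given mountpoint comes back with
leading ".." components, so a careful consumer would check for that.
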

You are correct that the command should work with clones, too, as
though they are descendants.  For a clone we'd present its paths
relative to where it is mounted.

-tim
