Re: [PATCH v4 1/2] diff --no-index: optionally follow symlinks

2017-03-25 Thread Junio C Hamano
On Sat, Mar 25, 2017 at 2:30 PM, Dennis Kaarsemaker
 wrote:
>
>>  - does this code do a reasonable thing when the path is a symbolic
>>link that points at a directory?  what does it mean to grab
>>st.st_size for such a thing (and then go on to open() and xmmap()
>>it)?
>
> No, it does something entirely unreasonable. I hadn't even thought of
> testing with symlinks to directories, as my ulterior motive was the
> next commit that makes it work with pipes. This will be fixed.

To be quite honest, I do not mind it if the "toplevel pipe that came from the
command line is treated as if it were a regular file" was the only change in
this series, without doing anything for symbolic links. I do not use the
process substitution myself, but I can see why sometimes it is handy to
pass two process invocations on the command line of "diff" (if it were only
one, then "-" with the usual redirection already works, but you cannot do
two command using that syntax).

Perhaps we can have only that part and perfect it first and have it ready for
the next release, postponing the symlink dereferencing, which is a different
issue?


Re: [PATCH v4 1/2] diff --no-index: optionally follow symlinks

2017-03-25 Thread Dennis Kaarsemaker
On Fri, 2017-03-24 at 15:56 -0700, Junio C Hamano wrote:

> diff --git a/diff.c b/diff.c
> > index be11e4ef2b..2afecfb939 100644
> > --- a/diff.c
> > +++ b/diff.c
> > @@ -2815,7 +2815,7 @@ int diff_populate_filespec(struct diff_filespec *s, 
> > unsigned int flags)
> > s->size = xsize_t(st.st_size);
> > if (!s->size)
> > goto empty;
> > -   if (S_ISLNK(st.st_mode)) {
> > +   if (S_ISLNK(s->mode)) {
> 
> This change is conceptually wrong.  s->mode (often) comes from the
> index but in this codepath, after finding that s->oid is not valid
> or we want to read from the working tree instead (several lines
> before this part), we are committed to read from the working tree
> and check things with st.st_* fields, not s->mode, when we decide
> what to do with the thing we find on the filesystem, no?

Hmm, true. It just accidentally does the right thing because s->mode
happens to always match the expectations of this code. I will pass on
more information into diff_populate_filespec so an explicit check can
be done here.

> > @@ -2825,6 +2825,10 @@ int diff_populate_filespec(struct diff_filespec *s, 
> > unsigned int flags)
> > s->should_free = 1;
> > return 0;
> > }
> > +   if (S_ISLNK(st.st_mode)) {
> > +   stat(s->path, );
> > +   s->size = xsize_t(st.st_size);
> > +   }
> > if (size_only)
> > return 0;
> > if ((flags & CHECK_BINARY) &&
> 
> I suspect that this would conflict with a recent topic.  

Possibly. I used the same base commit for the newer versions as that
seems to be your preference. If there is a merge conflict, do you want
me to rebase against current master?

> But more importantly, this inserted code feels doubly wrong.
> 
>  - what allows us to unconditionally do "ah, symbolic link on the
>disk--find the target of the link, not the symbolic link itself"?
>We do not seem to be checking '--dereference' around here.

The implicit check above (which you already noted is faulty) allows us
to do this. So fixing the check above will also involve fixing this.

>  - does this code do a reasonable thing when the path is a symbolic
>link that points at a directory?  what does it mean to grab
>st.st_size for such a thing (and then go on to open() and xmmap()
>it)?

No, it does something entirely unreasonable. I hadn't even thought of
testing with symlinks to directories, as my ulterior motive was the
next commit that makes it work with pipes. This will be fixed.

Thanks very much for the thoroughness of your review!

D.


Re: [PATCH v4 1/2] diff --no-index: optionally follow symlinks

2017-03-24 Thread Junio C Hamano
Dennis Kaarsemaker  writes:

> @@ -52,7 +52,7 @@ static int get_mode(const char *path, int *mode)
>  #endif
>   else if (path == file_from_standard_input)
>   *mode = create_ce_mode(0666);
> - else if (lstat(path, ))
> + else if (dereference ? stat(path, ) : lstat(path, ))
>   return error("Could not access '%s'", path);
>   else
>   *mode = st.st_mode;

This part makes sense---when the caller tells us to stat() we
stat(), otherwise, we lstat().

> diff --git a/diff.c b/diff.c
> index be11e4ef2b..2afecfb939 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -2815,7 +2815,7 @@ int diff_populate_filespec(struct diff_filespec *s, 
> unsigned int flags)
>   s->size = xsize_t(st.st_size);
>   if (!s->size)
>   goto empty;
> - if (S_ISLNK(st.st_mode)) {
> + if (S_ISLNK(s->mode)) {

This change is conceptually wrong.  s->mode (often) comes from the
index but in this codepath, after finding that s->oid is not valid
or we want to read from the working tree instead (several lines
before this part), we are committed to read from the working tree
and check things with st.st_* fields, not s->mode, when we decide
what to do with the thing we find on the filesystem, no?

> @@ -2825,6 +2825,10 @@ int diff_populate_filespec(struct diff_filespec *s, 
> unsigned int flags)
>   s->should_free = 1;
>   return 0;
>   }
> + if (S_ISLNK(st.st_mode)) {
> + stat(s->path, );
> + s->size = xsize_t(st.st_size);
> + }
>   if (size_only)
>   return 0;
>   if ((flags & CHECK_BINARY) &&

I suspect that this would conflict with a recent topic.  

But more importantly, this inserted code feels doubly wrong.

 - what allows us to unconditionally do "ah, symbolic link on the
   disk--find the target of the link, not the symbolic link itself"?
   We do not seem to be checking '--dereference' around here.

 - does this code do a reasonable thing when the path is a symbolic
   link that points at a directory?  what does it mean to grab
   st.st_size for such a thing (and then go on to open() and xmmap()
   it)?

Puzzled.

Thanks.