Re: [PATCH] use refnames instead of "left"/"right" in dirdiffs

2013-03-28 Thread John Keeping
On Thu, Mar 28, 2013 at 02:19:44PM +, John Keeping wrote:
> On Thu, Mar 28, 2013 at 01:46:16PM +0100, Christoph Anton Mitterer wrote:
> > On Wed, 2013-03-27 at 23:07 +, John Keeping wrote: 
> > > That's not going to work well on Windows, is it?
> >
> > Uhm Winwhat? No seriously... doesn't dir-diff fail ther anway? The mkdir
> > right now also uses mkpath with "/"... and I could read in it's
> > documentation that it would automatically translate this, does it?
> 
> I believe Windows generally accepts '/' as a directory separator in
> place of '\'.  In a recent thread Johannes Sixt reported a difftool test
> failure on Windows, so it does work (and we do have code to implicitly
> set --no-symlinks when running on Windows).
> 
> > >  Anything with two dots
> > > in is already forbidden so we don't need to worry about that
> >
> > Sure about this? I mean they're forbidden as refnames, but is this
> > really checked within git-difftool before?
> 
> We've already run git-diff by this point, and that should have
> complained if the refs are invalid, causing difftool to die.
> 
> > > ; I'm not
> > > sure we need to remove forward slashes at all
> >
> > Mhh could be... seems that the cleanup happens via completely removing
> > the $tmpdir path.
> > And for the actual diff it shouldn't matter when there's a partentdir
> > more.
> > 
> > >  until we consider the
> > > "commit containing" syntax ':/fix nasty bug' or 'master^{/fix bug}'.
> >
> > Phew... don't ask me... I'm not that into the git source code... from
> > the filename side, these shouldn't bother, only / an NUL is forbidden
> > (in POSIX,... again I haven't checked win/mac which might not adhere to
> > the standards).
> 
> Filenames on Windows cannot contain any of the following[1]:
> 
> \ / : * ? " < > |
> 
> but it's also potentially annoying that '^' and '{' have special meaning
> in some shells and would need escaping (although I suppose we don't
> really expect users to by typing these directory names in themselves).
> 
> > > I'm more concerned with specifiers containing '^', '@', '{', ':' - see
> > > 'SPECIFYING REVISIONS' in git-rev-parse(1) for the full details of
> > > what's acceptable.
> >
> > Mhh there are likely more problems... I just noted that $ldir/$rdir are
> > used in a call to system() so... that needs to be shell escaped to
> > prevent any bad stuff
> 
> Are there really non-list calls to system?  Providing the only calls
> provide each of the arguments separately (instead of in a single string)
> I think we're OK, but I'm also not a Perl expert.
> 
> > And if perl (me not a perl guy) interprets any such characters specially
> > in strings, it must be handled either.
> > 
> > > At some point I think it may be better to fall back
> > > to the SHA1 of the relevant commit.
> > Think that would be quite a drawback...
> 
> It depends where that happens.  I suspect most people use relatively
> simple ref specifiers which wouldn't get to this stage, but do you
> really want to turn the following into a directory name?
> 
> origin/master@{3 weeks ago}^{/Merge branch 'jk/}^2
> 
> I admit it's something of a contrived example but I hope it illustrates
> my point that sometimes naming the directory after the ref specifier may
> be less useful than just using "left" or the SHA1.

Actually, I think I was wrong here it's probably easier to just do
something you already do, but with a whitelist, like this:

my $leftname, $rightname;
// figure out what refs these point at...
$leftname =~ s/[^a-zA-Z0-9]/_/g;
$rightname =~ s/[^a-zA-Z0-9]/_/g;

We probably want to whitelist some more characters there, but that seems
like the simplest way to tackle it.  And if someone puts in a long
revision then they get a long directory name (or we truncate it at some
point).

> [1] http://support.microsoft.com/kb/177506
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] use refnames instead of "left"/"right" in dirdiffs

2013-03-28 Thread John Keeping
On Thu, Mar 28, 2013 at 01:46:16PM +0100, Christoph Anton Mitterer wrote:
> On Wed, 2013-03-27 at 23:07 +, John Keeping wrote: 
> > That's not going to work well on Windows, is it?
>
> Uhm Winwhat? No seriously... doesn't dir-diff fail ther anway? The mkdir
> right now also uses mkpath with "/"... and I could read in it's
> documentation that it would automatically translate this, does it?

I believe Windows generally accepts '/' as a directory separator in
place of '\'.  In a recent thread Johannes Sixt reported a difftool test
failure on Windows, so it does work (and we do have code to implicitly
set --no-symlinks when running on Windows).

> >  Anything with two dots
> > in is already forbidden so we don't need to worry about that
>
> Sure about this? I mean they're forbidden as refnames, but is this
> really checked within git-difftool before?

We've already run git-diff by this point, and that should have
complained if the refs are invalid, causing difftool to die.

> > ; I'm not
> > sure we need to remove forward slashes at all
>
> Mhh could be... seems that the cleanup happens via completely removing
> the $tmpdir path.
> And for the actual diff it shouldn't matter when there's a partentdir
> more.
> 
> >  until we consider the
> > "commit containing" syntax ':/fix nasty bug' or 'master^{/fix bug}'.
>
> Phew... don't ask me... I'm not that into the git source code... from
> the filename side, these shouldn't bother, only / an NUL is forbidden
> (in POSIX,... again I haven't checked win/mac which might not adhere to
> the standards).

Filenames on Windows cannot contain any of the following[1]:

\ / : * ? " < > |

but it's also potentially annoying that '^' and '{' have special meaning
in some shells and would need escaping (although I suppose we don't
really expect users to by typing these directory names in themselves).

> > I'm more concerned with specifiers containing '^', '@', '{', ':' - see
> > 'SPECIFYING REVISIONS' in git-rev-parse(1) for the full details of
> > what's acceptable.
>
> Mhh there are likely more problems... I just noted that $ldir/$rdir are
> used in a call to system() so... that needs to be shell escaped to
> prevent any bad stuff

Are there really non-list calls to system?  Providing the only calls
provide each of the arguments separately (instead of in a single string)
I think we're OK, but I'm also not a Perl expert.

> And if perl (me not a perl guy) interprets any such characters specially
> in strings, it must be handled either.
> 
> > At some point I think it may be better to fall back
> > to the SHA1 of the relevant commit.
> Think that would be quite a drawback...

It depends where that happens.  I suspect most people use relatively
simple ref specifiers which wouldn't get to this stage, but do you
really want to turn the following into a directory name?

origin/master@{3 weeks ago}^{/Merge branch 'jk/}^2

I admit it's something of a contrived example but I hope it illustrates
my point that sometimes naming the directory after the ref specifier may
be less useful than just using "left" or the SHA1.


[1] http://support.microsoft.com/kb/177506


John
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] use refnames instead of "left"/"right" in dirdiffs

2013-03-28 Thread Christoph Anton Mitterer
Hi John.


On Wed, 2013-03-27 at 23:07 +, John Keeping wrote: 
> That's not going to work well on Windows, is it?
Uhm Winwhat? No seriously... doesn't dir-diff fail ther anway? The mkdir
right now also uses mkpath with "/"... and I could read in it's
documentation that it would automatically translate this, does it?

>  Anything with two dots
> in is already forbidden so we don't need to worry about that
Sure about this? I mean they're forbidden as refnames, but is this
really checked within git-difftool before?

> ; I'm not
> sure we need to remove forward slashes at all
Mhh could be... seems that the cleanup happens via completely removing
the $tmpdir path.
And for the actual diff it shouldn't matter when there's a partentdir
more.


>  until we consider the
> "commit containing" syntax ':/fix nasty bug' or 'master^{/fix bug}'.
Phew... don't ask me... I'm not that into the git source code... from
the filename side, these shouldn't bother, only / an NUL is forbidden
(in POSIX,... again I haven't checked win/mac which might not adhere to
the standards).

> I'm more concerned with specifiers containing '^', '@', '{', ':' - see
> 'SPECIFYING REVISIONS' in git-rev-parse(1) for the full details of
> what's acceptable.
Mhh there are likely more problems... I just noted that $ldir/$rdir are
used in a call to system() so... that needs to be shell escaped to
prevent any bad stuff
And if perl (me not a perl guy) interprets any such characters specially
in strings, it must be handled either.

> At some point I think it may be better to fall back
> to the SHA1 of the relevant commit.
Think that would be quite a drawback...

Cheers,
Chris.


smime.p7s
Description: S/MIME cryptographic signature


Re: [PATCH] use refnames instead of "left"/"right" in dirdiffs

2013-03-27 Thread John Keeping
On Wed, Mar 27, 2013 at 11:13:17PM +0100, Christoph Anton Mitterer wrote:
> Currently, when a dir-diff is made with git-difftool the two revisions are
> stored in two temporary directories ".../left" and ".../right".
> Many difftools show these pathnames in ther UI and therefore it would be 
> helpful
> for users, if actual reference names specified as progam arguments was used
> instead.
> 
> Reference names might contain slash / characters which are not allowed to be
> part of a file name. These must therefore be encoded.
> 
> Also, reference names that would could possibly "break out" of the temporary
> directory (e.g. "/foo", "foo/../bar" or "foo/././bar") must be sanitised.
> * Added a subroutine escape_reference_to_single_directory_name() which 
> encodes a
>   reference name to a valid single directory name.
>   Any occurance of a slash / is replaced by two backslashes \\.
>   Having a backslash \ in a reference name should be forbidden, but just to be
>   save from collisions, any occurance of a backslash \ is replaced by a
>   backslash followed by an underscore \_ at first.
> 
> * Use this new function to construct the pathnames of the temporary 
> directories
>   for the two revisions in dir-diffs.
> 
> Signed-off-by: Christoph Anton Mitterer 
> 
> diff --git a/git-difftool.perl b/git-difftool.perl
> index 12231fb..53e756d 100755
> --- a/git-difftool.perl
> +++ b/git-difftool.perl
> @@ -83,6 +83,28 @@ sub exit_cleanup
> exit($status | ($status >> 8));
>  }
>  
> +sub escape_reference_to_single_directory_name
> +{
> +   # Git allows reference names (see git-check-ref-format(1)) which 
> cannot
> +   # be directly mapped to a single directory name.
> +   #
> +   # This subroutines replaces any occurance of a slash / by two
> +   # backslashes \\.
> +   # Thereby, break-out attempts like "/foo", "foo/../bar" or 
> "foo/././bar"
> +   # are prevented, too.

That's not going to work well on Windows, is it?  Anything with two dots
in is already forbidden so we don't need to worry about that; I'm not
sure we need to remove forward slashes at all, until we consider the
"commit containing" syntax ':/fix nasty bug' or 'master^{/fix bug}'.

I'm more concerned with specifiers containing '^', '@', '{', ':' - see
'SPECIFYING REVISIONS' in git-rev-parse(1) for the full details of
what's acceptable.  At some point I think it may be better to fall back
to the SHA1 of the relevant commit.

> +   #
> +   # Having a backslash \ in a reference name should be forbidden, but 
> just
> +   # to be save from collisions, any occurance of a backslash \ is 
> replaced
> +   # by a backslash followed by an underscore \_ at first.
> +
> +   my ($commit_name)  = @_;
> +
> +   $commit_name =~ s/\\/\\_/g;
> +   $commit_name =~ s/\///g;
> +
> +   return $commit_name;
> +}
> +
>  sub setup_dir_diff
>  {
> my ($repo, $workdir, $symlinks) = @_;
> @@ -169,8 +191,13 @@ EOF
>  
> # Setup temp directories
> my $tmpdir = tempdir('git-difftool.X', CLEANUP => 0, TMPDIR => 1);
> -   my $ldir = "$tmpdir/left";
> -   my $rdir = "$tmpdir/right";
> +   my $ldir = "$tmpdir/" . 
> escape_reference_to_single_directory_name($ARGV[0]);
> +   my $rdir = "$tmpdir/";
> +   if (@ARGV < 2) {
> +   $rdir .= 'HEAD';
> +   } else {
> +   $rdir .= escape_reference_to_single_directory_name($ARGV[1]);
> +   }

I don't think this approach is general enough, since git-difftool
accepts the same range of arguments that git-diff does.

We need to walk over @ARGV here, ignore anything starting with '-', stop
when we reach '--' and handle the arguments found in there including the
three acceptable range forms:

 
..
...

replacing missing commits with HEAD.

> mkpath($ldir) or exit_cleanup($tmpdir, 1);
> mkpath($rdir) or exit_cleanup($tmpdir, 1);
>  
> 


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] use refnames instead of "left"/"right" in dirdiffs

2013-03-27 Thread Christoph Anton Mitterer
Currently, when a dir-diff is made with git-difftool the two revisions are
stored in two temporary directories ".../left" and ".../right".
Many difftools show these pathnames in ther UI and therefore it would be helpful
for users, if actual reference names specified as progam arguments was used
instead.

Reference names might contain slash / characters which are not allowed to be
part of a file name. These must therefore be encoded.

Also, reference names that would could possibly "break out" of the temporary
directory (e.g. "/foo", "foo/../bar" or "foo/././bar") must be sanitised.
* Added a subroutine escape_reference_to_single_directory_name() which encodes a
  reference name to a valid single directory name.
  Any occurance of a slash / is replaced by two backslashes \\.
  Having a backslash \ in a reference name should be forbidden, but just to be
  save from collisions, any occurance of a backslash \ is replaced by a
  backslash followed by an underscore \_ at first.

* Use this new function to construct the pathnames of the temporary directories
  for the two revisions in dir-diffs.

Signed-off-by: Christoph Anton Mitterer 

diff --git a/git-difftool.perl b/git-difftool.perl
index 12231fb..53e756d 100755
--- a/git-difftool.perl
+++ b/git-difftool.perl
@@ -83,6 +83,28 @@ sub exit_cleanup
exit($status | ($status >> 8));
 }
 
+sub escape_reference_to_single_directory_name
+{
+   # Git allows reference names (see git-check-ref-format(1)) which cannot
+   # be directly mapped to a single directory name.
+   #
+   # This subroutines replaces any occurance of a slash / by two
+   # backslashes \\.
+   # Thereby, break-out attempts like "/foo", "foo/../bar" or "foo/././bar"
+   # are prevented, too.
+   #
+   # Having a backslash \ in a reference name should be forbidden, but just
+   # to be save from collisions, any occurance of a backslash \ is replaced
+   # by a backslash followed by an underscore \_ at first.
+
+   my ($commit_name)  = @_;
+
+   $commit_name =~ s/\\/\\_/g;
+   $commit_name =~ s/\///g;
+
+   return $commit_name;
+}
+
 sub setup_dir_diff
 {
my ($repo, $workdir, $symlinks) = @_;
@@ -169,8 +191,13 @@ EOF
 
# Setup temp directories
my $tmpdir = tempdir('git-difftool.X', CLEANUP => 0, TMPDIR => 1);
-   my $ldir = "$tmpdir/left";
-   my $rdir = "$tmpdir/right";
+   my $ldir = "$tmpdir/" . 
escape_reference_to_single_directory_name($ARGV[0]);
+   my $rdir = "$tmpdir/";
+   if (@ARGV < 2) {
+   $rdir .= 'HEAD';
+   } else {
+   $rdir .= escape_reference_to_single_directory_name($ARGV[1]);
+   }
mkpath($ldir) or exit_cleanup($tmpdir, 1);
mkpath($rdir) or exit_cleanup($tmpdir, 1);
 



smime.p7s
Description: S/MIME cryptographic signature