Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-28 Thread Jonathan Nieder
Hi,

Some quick details.

Florian Achleitner wrote:

Anyways, specifying cat-blob-fd is not 
 allowed via the 'option' command (see Documentation and 85c62395).
 It wouldn't make too much sense, because the file descriptor must be set up 
 by 
 the parent.

 But for the fifo, it would, probably.

More precisely, requiring that the cat-blob fd go on the command line
instead of in the stream matches a model where fast-import's three
standard streams are abstract:

 - its input, using the INPUT FORMAT described in git-fast-import(1)
 - its standard output, which echoes progress commands
 - its backflow stream, where responses to cat-blob and ls commands go

The stream format is not necessarily pinned to a Unix model where
input and output are based on filesystems and file descriptors.  We
can imagine a fast-import backend that runs on a remote server and the
transport used for these streams is sockets; for fast-import frontends
to be usable in such a context, the streams they produce must not rely
on particular fd numbers, nor on pathnames (except for mark state
saved to relative paths with --relative-marks) representing anything
in particular.

This goes just as much for a fifo set up on the filesystem where the
fast-import backend runs as for an inherited file descriptor.  In the
current model, such backend-specific details of setup go on the
command line.

   The backward channel is only used by 
 the 
 commands 'cat-blob' and 'ls' of fast-import. If a remote helper wants to use 
 them, it would could make fast-import open the pipe by sending an 'option' 
 command with the fifo filename, otherwise it defaults to stdout (like now) 
 and 
 is rather useless.

I'm getting confused by terminology again.  Let's see if I have the cast
of characters straight:

 - the fast-import backend (e.g., git fast-import or hg-fastimport)
 - the fast-import frontend (e.g., git fast-export or svn-fe)
 - git's generic foreign VCS support plumbing, also known as
   transport-helper.c
 - the remote helper (e.g., git remote-svn or git remote-testgit)

Why would the fast-import backend ever need to open a pipe?  If I want
it to write backflow to a fifo, I can use

mkfifo my-favorite-fifo
git fast-import --cat-blob-fd=3 3my-favorite-fifo

If I want it to write backflow to a pipe, I can use (using ksh syntax)

cat |
git fast-import --cat-blob-fd=3 3p

 This would take the fifo setup out of transport-helper. The remote-helper 
 would 
 have to create it, if it needs it.

We can imagine transport-helper.c learning the name of a fifo set up by
the remote helper by sending it the capabilities command:

git capabilities
helper option
helper import
helper cat-blob-file my-favorite-fifo
helper refspec refs/heads/*:refs/helper/remotename/*
helper

transport-helper.c could then use that information to invoke
fast-import appropriately:

git fast-import --cat-blob-fd=3 3my-favorite-fifo

But this seems like pushing complication onto the remote helper; since
there is expected to be one remote helper per foreign VCS,
implementing the appropriate logic correctly once and for all in
transport-helper.c for all interested remote helpers to take advantage
of seems to me like a better policy.

 Apropos stdout. That leads to another idea. You already suggested that it 
 would be easiest to only use FDs 0..2. Currently stdout and stderr of fast-
 import go to the shell. We could connect stdout to the remote-helper and 
 don't 
 need the additional channel at all.

The complication that makes this strategy not so easy is progress
commands in the fast-import input stream.  (Incidentally, it would be
nice to teach transport-helper.c to display specially formatted
progress commands using a progress bar some day.)

Hoping that clarifies,
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-28 Thread Jonathan Nieder
Florian Achleitner wrote:

 So I should kick printd out?

I think so, yes.

git log -SGIT_TRANSPORT_HELPER_DEBUG transport-helper.c tells me
that that option was added to make the transport-helper machinery make
noise to make it obvious at what stage a remote helper has deadlocked.

GIT_TRANSPORT_HELPER_DEBUG already takes care of that, so there would
not be need for an imitation of that in remote-svn, unless I am
missing something (and even if I am missing something, it seems
complicated enough to be worth moving to another patch where it can be
explained more easily).

[...]
 +
 + printf(import\n);
 + printf(\n);
 + fflush(stdout);
 + return SUCCESS;
 +}
[...]
Maybe the purpose of the flush would
 be more obvious if it were moved to the caller.

 Acutally this goes to the git parent process (not fast-import), waiting for a 
 reply to the command. I think I have to call flush on this side of the pipe. 
 Can you flush it from the reader? This wouldn't have the desired effect, it 
 drops buffered data.

*slaps head*  This is the capabilities command, and it needs to
flush because the reader needs to know what commands it's allowed to
use next before it starts using them.  My brain turned off and I
thought you were emitting an import command rather than advertising
that you support it for some reason.

And 'printf(\n)' was a separate printf because that way, patches
like

printf(import\n);
+   printf(bidi-import\n);
printf(\n);
fflush(stdout);

become simpler.

I'm tempted to suggest a structure like

const char * const capabilities[] = {import};
int i;

for (i = 0; i  ARRAY_SIZE(capabilities); i++)
puts(capabilities[i]);
puts();   /* blank line */

fflush(stdout);

but your original code was fine, too.

[...]
 + /* opening a fifo for usually reading blocks until a writer has opened
 it too. +  * Therefore, we open with RDWR.
 +  */
 + report_fd = open(back_pipe_env, O_RDWR);
 + if(report_fd  0) {
 + die(Unable to open fast-import back-pipe! %s, 
 strerror(errno));
[...]
 I believe it can be solved using RDONLY and WRONLY too. Probably we solve it 
 by not using the fifo at all.
 Currently the blocking comes from the fact, that fast-import doesn't parse 
 it's command line at startup. It rather reads an input line first and decides 
 whether to parse the argv after reading the first input line or at the end of 
 the input. (don't know why)
 remote-svn opens the pipe before sending the first command to fast-import and 
 blocks on the open, while fast-import waits for input -- deadlock.

Thanks for explaining.  Now we've discussed a few different approproaches,
none of which is perfect.

a. use --cat-blob-fd, no FIFO

   Doing this unconditionally would break platforms that don't support
   --cat-blob-fd=(descriptor 2), like Windows, so we'd have to:

   * Make it conditional --- only do it (1) we are not on Windows and
 (2) the remote helper requests backflow by advertising the
 import-bidi capability.

   * Let the remote helper know what's going on by using
 import-bidi instead of import in the command stream to
 initiate the import.

b. use envvars to pass around FIFO path

   This complicates the fast-import interface and makes debugging hard.
   It would be nice to avoid this if we can, but in case we can't, it's
   nice to have the option available.

c. transport-helper.c uses FIFO behind the scenes.

   Like (a), except it would require a fast-import tweak (boo) and
   would work on Windows (yea)

d. use --cat-blob-fd with FIFO

   Early scripted remote-svn prototypes did this to fulfill fetch
   requests.

   It has no advantage over use --cat-blob-fd, no FIFO except being
   easier to implement as a shell script.  I'm listing this just for
   comparison; since (a) looks better in every way, I don't see any
   reason to pursue this one.

Since avoiding deadlocks with bidirectional communication is always a
little subtle, it would be nice for this to be implemented once in
transport-helper.c rather than each remote helper author having to
reimplement it again.  As a result, my knee-jerk ranking is a  c 
b  d.

Sane?
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 14/16] transport-helper: add import|export-marks to fast-import command line.

2012-07-28 Thread Jonathan Nieder
Hi,

Florian Achleitner wrote:

   a515ebe9.
[...]
 Btw, these added capabilities are not mentioned in Docs.

Sverre, any hints about how these (capabilities import-marks file
and export-marks file) are meant to be used?  Why would one use
these instead of feature import-marks / feature export-marks?
Should there be a corresponding capability for relative-marks, too?

Curious,
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make git-svn Use accessors for paths and urls

2012-07-28 Thread Michael G Schwern
On 2012.7.27 8:10 PM, Jonathan Nieder wrote:
 This is the last refactoring patch series.  After this bugs, start
 getting fixed.
 
 I just wanted to say thanks for your thoughtful presentation of this
 code.  I was worried before, but these have been pleasantly submitted.

You're welcome.  I've gained at least three levels in rebasing in the process.


 If you have a chance at some point to offer advice, I'd love to add
 the information to Documentation/SubmittingPatches that was missing.
 Proposed text is ideal, but outline form or a list of missing aspects
 and confusing existing coverage would be fine, too.

Remind me when I'm done with the 1.7 fix please?


-- 
You are wicked and wrong to have broken inside and peeked at the
implementation and then relied upon it.
-- tchrist in 31832.969261130@chthon
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make git-svn Use accessors for paths and urls

2012-07-28 Thread Jonathan Nieder
Michael G Schwern wrote:
 On 2012.7.27 8:10 PM, Jonathan Nieder wrote:

 If you have a chance at some point to offer advice, I'd love to add
 the information to Documentation/SubmittingPatches that was missing.
[...]
 Remind me when I'm done with the 1.7 fix please?

Sure, if I remember to. :)

Thanks,
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/7] Move the canonicalization functions to Git::SVN::Utils

2012-07-28 Thread Michael G. Schwern
From: Michael G. Schwern schw...@pobox.com

So they can be used by others.

I'd like to test them, but they're going to become SVN API wrappers shortly
and those aren't predictable.

No functional change.
---
 git-svn.perl  | 33 +++-
 perl/Git/SVN/Utils.pm | 52 ++-
 2 files changed, 58 insertions(+), 27 deletions(-)

diff --git a/git-svn.perl b/git-svn.perl
index de1ddd1..a857484 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -29,7 +29,13 @@ use Git::SVN::Prompt;
 use Git::SVN::Log;
 use Git::SVN::Migration;
 
-use Git::SVN::Utils qw(fatal can_compress);
+use Git::SVN::Utils qw(
+   fatal
+   can_compress
+   canonicalize_path
+   canonicalize_url
+);
+
 use Git qw(
git_cmd_try
command
@@ -1256,31 +1262,6 @@ sub cmd_mkdirs {
$gs-mkemptydirs($_revision);
 }
 
-sub canonicalize_path {
-   my ($path) = @_;
-   my $dot_slash_added = 0;
-   if (substr($path, 0, 1) ne /) {
-   $path = ./ . $path;
-   $dot_slash_added = 1;
-   }
-   # File::Spec-canonpath doesn't collapse x/../y into y (for a
-   # good reason), so let's do this manually.
-   $path =~ s#/+#/#g;
-   $path =~ s#/\.(?:/|$)#/#g;
-   $path =~ s#/[^/]+/\.\.##g;
-   $path =~ s#/$##g;
-   $path =~ s#^\./## if $dot_slash_added;
-   $path =~ s#^/##;
-   $path =~ s#^\.$##;
-   return $path;
-}
-
-sub canonicalize_url {
-   my ($url) = @_;
-   $url =~ s#^([^:]+://[^/]*/)(.*)$#$1 . canonicalize_path($2)#e;
-   return $url;
-}
-
 # get_svnprops(PATH)
 # --
 # Helper for cmd_propget and cmd_proplist below.
diff --git a/perl/Git/SVN/Utils.pm b/perl/Git/SVN/Utils.pm
index 3d0bfa4..c842d00 100644
--- a/perl/Git/SVN/Utils.pm
+++ b/perl/Git/SVN/Utils.pm
@@ -5,7 +5,12 @@ use warnings;
 
 use base qw(Exporter);
 
-our @EXPORT_OK = qw(fatal can_compress);
+our @EXPORT_OK = qw(
+   fatal
+   can_compress
+   canonicalize_path
+   canonicalize_url
+);
 
 
 =head1 NAME
@@ -56,4 +61,49 @@ sub can_compress {
 }
 
 
+=head3 canonicalize_path
+
+my $canoncalized_path = canonicalize_path($path);
+
+Converts $path into a canonical form which is safe to pass to the SVN
+API as a file path.
+
+=cut
+
+sub canonicalize_path {
+   my ($path) = @_;
+   my $dot_slash_added = 0;
+   if (substr($path, 0, 1) ne /) {
+   $path = ./ . $path;
+   $dot_slash_added = 1;
+   }
+   # File::Spec-canonpath doesn't collapse x/../y into y (for a
+   # good reason), so let's do this manually.
+   $path =~ s#/+#/#g;
+   $path =~ s#/\.(?:/|$)#/#g;
+   $path =~ s#/[^/]+/\.\.##g;
+   $path =~ s#/$##g;
+   $path =~ s#^\./## if $dot_slash_added;
+   $path =~ s#^/##;
+   $path =~ s#^\.$##;
+   return $path;
+}
+
+
+=head3 canonicalize_url
+
+my $canonicalized_url = canonicalize_url($url);
+
+Converts $url into a canonical form which is safe to pass to the SVN
+API as a URL.
+
+=cut
+
+sub canonicalize_url {
+   my ($url) = @_;
+   $url =~ s#^([^:]+://[^/]*/)(.*)$#$1 . canonicalize_path($2)#e;
+   return $url;
+}
+
+
 1;
-- 
1.7.11.3

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths.

2012-07-28 Thread Michael G. Schwern
From: Michael G. Schwern schw...@pobox.com

The SVN API functions will not accept ../foo but their canonicalization
functions will not collapse it.  So we'll have to do it ourselves.

_collapse_dotdot() works better than the existing regex did.

This will be used shortly when canonicalize_path() starts using the
SVN API.
---
 perl/Git/SVN/Utils.pm | 14 +-
 t/Git-SVN/Utils/collapse_dotdot.t | 23 +++
 2 files changed, 36 insertions(+), 1 deletion(-)
 create mode 100644 t/Git-SVN/Utils/collapse_dotdot.t

diff --git a/perl/Git/SVN/Utils.pm b/perl/Git/SVN/Utils.pm
index 9d5d3c5..7314e52 100644
--- a/perl/Git/SVN/Utils.pm
+++ b/perl/Git/SVN/Utils.pm
@@ -72,6 +72,18 @@ API as a file path.
 
 =cut
 
+# Turn foo/../bar into bar
+sub _collapse_dotdot {
+   my $path = shift;
+
+   1 while $path =~ s{/[^/]+/+\.\.}{};
+   1 while $path =~ s{[^/]+/+\.\./}{};
+   1 while $path =~ s{[^/]+/+\.\.}{};
+
+   return $path;
+}
+
+
 sub canonicalize_path {
my ($path) = @_;
my $dot_slash_added = 0;
@@ -83,7 +95,7 @@ sub canonicalize_path {
# good reason), so let's do this manually.
$path =~ s#/+#/#g;
$path =~ s#/\.(?:/|$)#/#g;
-   $path =~ s#/[^/]+/\.\.##g;
+   $path = _collapse_dotdot($path);
$path =~ s#/$##g;
$path =~ s#^\./## if $dot_slash_added;
$path =~ s#^/##;
diff --git a/t/Git-SVN/Utils/collapse_dotdot.t 
b/t/Git-SVN/Utils/collapse_dotdot.t
new file mode 100644
index 000..1da1cce
--- /dev/null
+++ b/t/Git-SVN/Utils/collapse_dotdot.t
@@ -0,0 +1,23 @@
+#!/usr/bin/env perl
+
+use strict;
+use warnings;
+
+use Test::More 'no_plan';
+
+use Git::SVN::Utils;
+my $collapse_dotdot = \Git::SVN::Utils::_collapse_dotdot;
+
+my %tests = (
+   foo/bar/baz   = foo/bar/baz,
+   ..= ..,
+   foo/..= ,
+   /foo/bar/../../baz= /baz,
+   deeply/.././deeply/nested = ./deeply/nested,
+);
+
+for my $arg (keys %tests) {
+   my $want = $tests{$arg};
+
+   is $collapse_dotdot-($arg), $want, _collapse_dotdot('$arg') = $want;
+}
-- 
1.7.11.3

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/7] Remove irrelevant comment.

2012-07-28 Thread Michael G. Schwern
From: Michael G. Schwern schw...@pobox.com

The code doesn't use File::Spec.
---
 perl/Git/SVN/Utils.pm | 2 --
 1 file changed, 2 deletions(-)

diff --git a/perl/Git/SVN/Utils.pm b/perl/Git/SVN/Utils.pm
index deade07..6c8ae53 100644
--- a/perl/Git/SVN/Utils.pm
+++ b/perl/Git/SVN/Utils.pm
@@ -92,8 +92,6 @@ sub canonicalize_path {
$path = ./ . $path;
$dot_slash_added = 1;
}
-   # File::Spec-canonpath doesn't collapse x/../y into y (for a
-   # good reason), so let's do this manually.
$path =~ s#/+#/#g;
$path =~ s#/\.(?:/|$)#/#g;
$path = _collapse_dotdot($path);
-- 
1.7.11.3

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 6/7] Switch path canonicalization to use the SVN API.

2012-07-28 Thread Michael G. Schwern
From: Michael G. Schwern schw...@pobox.com

All tests pass with SVN 1.6.  SVN 1.7 remains broken, not worrying
about it yet.

SVN changed its path canonicalization API between 1.6 and 1.7.
http://svnbook.red-bean.com/en/1.6/svn.developer.usingapi.html#svn.developer.usingapi.urlpath
http://svnbook.red-bean.com/en/1.7/svn.developer.usingapi.html#svn.developer.usingapi.urlpath

The SVN API does not accept foo/.. but it also doesn't canonicalize
it.  We have to do it ourselves.
---
 perl/Git/SVN/Utils.pm | 21 +
 1 file changed, 21 insertions(+)

diff --git a/perl/Git/SVN/Utils.pm b/perl/Git/SVN/Utils.pm
index 6c8ae53..7ae6fac 100644
--- a/perl/Git/SVN/Utils.pm
+++ b/perl/Git/SVN/Utils.pm
@@ -86,6 +86,27 @@ sub _collapse_dotdot {
 
 
 sub canonicalize_path {
+   my $path = shift;
+
+   # The 1.7 way to do it
+   if ( defined SVN::_Core::svn_dirent_canonicalize ) {
+   $path = _collapse_dotdot($path);
+   return SVN::_Core::svn_dirent_canonicalize($path);
+   }
+   # The 1.6 way to do it
+   elsif ( defined SVN::_Core::svn_path_canonicalize ) {
+   $path = _collapse_dotdot($path);
+   return SVN::_Core::svn_path_canonicalize($path);
+   }
+   # No SVN API canonicalization is available, do it ourselves
+   else {
+   $path = _canonicalize_path_ourselves($path);
+   return $path;
+   }
+}
+
+
+sub _canonicalize_path_ourselves {
my ($path) = @_;
my $dot_slash_added = 0;
if (substr($path, 0, 1) ne /) {
-- 
1.7.11.3

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 7/7] Make Git::SVN and Git::SVN::Ra canonicalize paths and urls.

2012-07-28 Thread Michael G. Schwern
From: Michael G. Schwern schw...@pobox.com

This canonicalizes paths and urls as early as possible so we don't
have to remember to do it at the point of use.  It will fix a swath
of SVN 1.7 problems in one go.

Its ok to double canonicalize things.

SVN 1.7 still fails, still not worrying about that.
---
 perl/Git/SVN.pm| 6 --
 perl/Git/SVN/Ra.pm | 6 +-
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/perl/Git/SVN.pm b/perl/Git/SVN.pm
index b0ed3ea..798f6c4 100644
--- a/perl/Git/SVN.pm
+++ b/perl/Git/SVN.pm
@@ -27,6 +27,8 @@ use Git::SVN::Utils qw(
fatal
can_compress
join_paths
+   canonicalize_path
+   canonicalize_url
 );
 
 my $can_use_yaml;
@@ -2305,7 +2307,7 @@ sub path {
 
 if( @_ ) {
 my $path = shift;
-$self-{path} = $path;
+$self-{path} = canonicalize_path($path);
 return;
 }
 
@@ -2318,7 +2320,7 @@ sub url {
 
 if( @_ ) {
 my $url = shift;
-$self-{url} = $url;
+$self-{url} = canonicalize_url($url);
 return;
 }
 
diff --git a/perl/Git/SVN/Ra.pm b/perl/Git/SVN/Ra.pm
index 27dcdd5..ef7b3dd 100644
--- a/perl/Git/SVN/Ra.pm
+++ b/perl/Git/SVN/Ra.pm
@@ -3,6 +3,10 @@ use vars qw/@ISA $config_dir $_ignore_refs_regex 
$_log_window_size/;
 use strict;
 use warnings;
 use SVN::Client;
+use Git::SVN::Utils qw(
+   canonicalize_url
+);
+
 use SVN::Ra;
 BEGIN {
@ISA = qw(SVN::Ra);
@@ -138,7 +142,7 @@ sub url {
 
 if( @_ ) {
 my $url = shift;
-$self-{url} = $url;
+$self-{url} = canonicalize_url($url);
 return;
 }
 
-- 
1.7.11.3

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Fix git-svn for SVN 1.7

2012-07-28 Thread Michael G. Schwern
This patch series fixes git-svn for SVN 1.7 tested against SVN 1.7.5 and
1.6.18.  Patch 7/8 is where SVN 1.7 starts passing.

There is one exception.  t9100-git-svn-basic.sh fails 11-13.  This appears
to be due to a bug in SVN to do with symlinks.  Leave that for somebody
else, this is the final submission in the series.

The work was difficult because the code relies on simple string equalty
when comparing URLs and paths.  Turning on canonicalization in one part
of the code would cause another part to fail if it also did not
canonicalize.  There's likely still issues.

A better solution would be to have path and URL objects which overload
the eq operator and automatically stringify canonicalized and escaped.

This patch series should be placed on top of the previous which
made accessors canonicalize.  I'm getting a bit ahead of things
by submitting it now, but I'd like to get the review process
rolling.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/8] Fix typo in test

2012-07-28 Thread Michael G. Schwern
From: Michael G. Schwern schw...@pobox.com

Test to check that the migration got rid of the old style git-svn directory.
It wasn't failing, just throwing a message to STDERR.
---
 t/t9107-git-svn-migrate.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t9107-git-svn-migrate.sh b/t/t9107-git-svn-migrate.sh
index 289fc31..cfb4453 100755
--- a/t/t9107-git-svn-migrate.sh
+++ b/t/t9107-git-svn-migrate.sh
@@ -32,7 +32,7 @@ test_expect_success 'initialize old-style (v0) git svn 
layout' '
echo $svnrepo  $GIT_DIR/git-svn/info/url 
echo $svnrepo  $GIT_DIR/svn/info/url 
git svn migrate 
-   ! test -d $GIT_DIR/git svn 
+   ! test -d $GIT_DIR/git-svn 
git rev-parse --verify refs/${remotes_git_svn}^0 
git rev-parse --verify refs/remotes/svn^0 
test $(git config --get svn-remote.svn.url) = $svnrepo 
-- 
1.7.11.3

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/8] Replace hand rolled URL escapes with canonicalization

2012-07-28 Thread Michael G. Schwern
From: Michael G. Schwern schw...@pobox.com

Continuing to move towards getting everything canonicalizing the same way.

* Git::SVN-init_remote_config and Git::SVN::Ra-minimize_url both
  have to canonicalize the same way else init_remote_config
  will incorrectly think they're different URLs causing
  t9107-git-svn-migrate.sh to fail.
---
 git-svn.perl   | 24 +++-
 perl/Git/SVN.pm|  2 +-
 perl/Git/SVN/Ra.pm | 27 +--
 3 files changed, 9 insertions(+), 44 deletions(-)

diff --git a/git-svn.perl b/git-svn.perl
index 6e3e240..6e97545 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -1412,24 +1412,6 @@ sub cmd_commit_diff {
}
 }
 
-sub escape_uri_only {
-   my ($uri) = @_;
-   my @tmp;
-   foreach (split m{/}, $uri) {
-   s/([^~\w.%+-]|%(?![a-fA-F0-9]{2}))/sprintf(%%%02X,ord($1))/eg;
-   push @tmp, $_;
-   }
-   join('/', @tmp);
-}
-
-sub escape_url {
-   my ($url) = @_;
-   if ($url =~ m#^([^:]+)://([^/]*)(.*)$#) {
-   my ($scheme, $domain, $uri) = ($1, $2, escape_uri_only($3));
-   $url = $scheme://$domain$uri;
-   }
-   $url;
-}
 
 sub cmd_info {
my $path = canonicalize_path(defined($_[0]) ? $_[0] : .);
@@ -1457,18 +1439,18 @@ sub cmd_info {
my $full_url = $url . ($fullpath eq  ?  : /$fullpath);
 
if ($_url) {
-   print escape_url($full_url), \n;
+   print canonicalize_url($full_url), \n;
return;
}
 
my $result = Path: $path\n;
$result .= Name:  . basename($path) . \n if $file_type ne dir;
-   $result .= URL:  . escape_url($full_url) . \n;
+   $result .= URL:  . canonicalize_url($full_url) . \n;
 
eval {
my $repos_root = $gs-repos_root;
Git::SVN::remove_username($repos_root);
-   $result .= Repository Root:  . escape_url($repos_root) . \n;
+   $result .= Repository Root:  . canonicalize_url($repos_root) 
. \n;
};
if ($@) {
$result .= Repository Root: (offline)\n;
diff --git a/perl/Git/SVN.pm b/perl/Git/SVN.pm
index cb6d83a..4219e5b 100644
--- a/perl/Git/SVN.pm
+++ b/perl/Git/SVN.pm
@@ -296,7 +296,7 @@ sub find_existing_remote {
 
 sub init_remote_config {
my ($self, $url, $no_write) = @_;
-   $url =~ s!/+$!!; # strip trailing slash
+   $url = canonicalize_url($url);
my $r = read_all_remotes();
my $existing = find_existing_remote($url, $r);
if ($existing) {
diff --git a/perl/Git/SVN/Ra.pm b/perl/Git/SVN/Ra.pm
index ef7b3dd..ed9dbe9 100644
--- a/perl/Git/SVN/Ra.pm
+++ b/perl/Git/SVN/Ra.pm
@@ -66,24 +66,6 @@ sub _auth_providers () {
\@rv;
 }
 
-sub escape_uri_only {
-   my ($uri) = @_;
-   my @tmp;
-   foreach (split m{/}, $uri) {
-   s/([^~\w.%+-]|%(?![a-fA-F0-9]{2}))/sprintf(%%%02X,ord($1))/eg;
-   push @tmp, $_;
-   }
-   join('/', @tmp);
-}
-
-sub escape_url {
-   my ($url) = @_;
-   if ($url =~ m#^(https?)://([^/]+)(.*)$#) {
-   my ($scheme, $domain, $uri) = ($1, $2, escape_uri_only($3));
-   $url = $scheme://$domain$uri;
-   }
-   $url;
-}
 
 sub new {
my ($class, $url) = @_;
@@ -119,7 +101,7 @@ sub new {
$Git::SVN::Prompt::_no_auth_cache = 1;
}
} # no warnings 'once'
-   my $self = SVN::Ra-new(url = escape_url($url), auth = $baton,
+   my $self = SVN::Ra-new(url = canonicalize_url($url), auth = $baton,
  config = $config,
  pool = SVN::Pool-new,
  auth_provider_callbacks = $callbacks);
@@ -314,7 +296,7 @@ sub gs_do_switch {
 
if ($old_url =~ m#^svn(\+ssh)?://# ||
($full_url =~ m#^https?://# 
-escape_url($full_url) ne $full_url)) {
+canonicalize_url($full_url) ne $full_url)) {
$_[0] = undef;
$self = undef;
$RA = undef;
@@ -327,7 +309,7 @@ sub gs_do_switch {
}
 
$ra ||= $self;
-   $url_b = escape_url($url_b);
+   $url_b = canonicalize_url($url_b);
my $reporter = $ra-do_switch($rev_b, '', 1, $url_b, $editor, $pool);
my @lock = (::compare_svn_version('1.2.0') = 0) ? (undef) : ();
$reporter-set_path('', $rev_a, 0, @lock, $pool);
@@ -582,7 +564,8 @@ sub minimize_url {
$ra-get_log(, $latest, 0, 1, 0, 1, sub {});
};
} while ($@  ($c = shift @components));
-   $url;
+
+   return canonicalize_url($url);
 }
 
 sub can_do_switch {
-- 
1.7.11.3

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/8] SVN 1.7 will truncate not-a%40{0} to just not-a.

2012-07-28 Thread Michael G. Schwern
From: Michael G. Schwern schw...@pobox.com

Rather than guess what SVN is going to do for each version, make the test use
the branch name that was actually created.
---
 t/t9118-git-svn-funky-branch-names.sh | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/t/t9118-git-svn-funky-branch-names.sh 
b/t/t9118-git-svn-funky-branch-names.sh
index 63fc982..193d3ca 100755
--- a/t/t9118-git-svn-funky-branch-names.sh
+++ b/t/t9118-git-svn-funky-branch-names.sh
@@ -32,6 +32,11 @@ test_expect_success 'setup svnrepo' '
start_httpd
'
 
+# SVN 1.7 will truncate not-a%40{0] to just not-a.
+# Look at what SVN wound up naming the branch and use that.
+# Be sure to escape the @ if it shows up.
+non_reflog=`svn_cmd ls $svnrepo/pr ject/branches | grep not-a | sed 's/\///' 
| sed 's/@/%40/'`
+
 test_expect_success 'test clone with funky branch names' '
git svn clone -s $svnrepo/pr ject project 
(
@@ -42,7 +47,7 @@ test_expect_success 'test clone with funky branch names' '
git rev-parse refs/remotes/%2Eleading_dot 
git rev-parse refs/remotes/trailing_dot%2E 
git rev-parse refs/remotes/trailing_dotlock%2Elock 
-   git rev-parse refs/remotes/not-a%40{0}reflog
+   git rev-parse refs/remotes/$non_reflog
)
'
 
-- 
1.7.11.3

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 8/8] Remove some ad hoc canonicalizations.

2012-07-28 Thread Michael G. Schwern
From: Michael G. Schwern schw...@pobox.com

---
 git-svn.perl| 8 
 perl/Git/SVN.pm | 1 -
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/git-svn.perl b/git-svn.perl
index 3d120d5..56d1ba7 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -36,6 +36,7 @@ use Git::SVN::Utils qw(
canonicalize_url
join_paths
add_path_to_url
+   join_paths
 );
 
 use Git qw(
@@ -1598,7 +1599,7 @@ sub post_fetch_checkout {
 
 sub complete_svn_url {
my ($url, $path) = @_;
-   $path =~ s#/+$##;
+   $path = canonicalize_path($path);
 
# If the path is not a URL...
if ($path !~ m#^[a-z\+]+://#) {
@@ -1617,7 +1618,7 @@ sub complete_url_ls_init {
print STDERR W: $switch not specified\n;
return;
}
-   $repo_path =~ s#/+$##;
+   $repo_path = canonicalize_path($repo_path);
if ($repo_path =~ m#^[a-z\+]+://#) {
$ra = Git::SVN::Ra-new($repo_path);
$repo_path = '';
@@ -1638,9 +1639,8 @@ sub complete_url_ls_init {
}
command_oneline('config', $k, $gs-url) unless $orig_url;
 
-   my $remote_path = $gs-path . /$repo_path;
+   my $remote_path = join_paths( $gs-path, $repo_path );
$remote_path =~ s{%([0-9A-F]{2})}{chr hex($1)}ieg;
-   $remote_path =~ s#/+#/#g;
$remote_path =~ s#^/##g;
$remote_path .= /* if $remote_path !~ /\*/;
my ($n) = ($switch =~ /^--(\w+)/);
diff --git a/perl/Git/SVN.pm b/perl/Git/SVN.pm
index e5f7acc..3c68c09 100644
--- a/perl/Git/SVN.pm
+++ b/perl/Git/SVN.pm
@@ -460,7 +460,6 @@ sub new {
}
{
my $path = $self-path;
-   $path =~ s{/+}{/}g;
$path =~ s{\A/}{};
$path =~ s{/\z}{};
$self-path($path);
-- 
1.7.11.3

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 6/8] Add function to append a path to a URL.

2012-07-28 Thread Michael G. Schwern
From: Michael G. Schwern schw...@pobox.com

Remove the ad-hoc versions.

This is mostly to normalize the process and ensure the URLs produced
don't have double slashes or anything.

Also provides a place to fix the corner case where a file path
contains a percent sign.
---
 git-svn.perl  |  3 ++-
 perl/Git/SVN.pm   | 33 +++--
 perl/Git/SVN/Ra.pm|  8 
 perl/Git/SVN/Utils.pm | 27 +++
 t/Git-SVN/Utils/add_path_to_url.t | 27 +++
 5 files changed, 75 insertions(+), 23 deletions(-)
 create mode 100644 t/Git-SVN/Utils/add_path_to_url.t

diff --git a/git-svn.perl b/git-svn.perl
index 6b90765..3d120d5 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -35,6 +35,7 @@ use Git::SVN::Utils qw(
canonicalize_path
canonicalize_url
join_paths
+   add_path_to_url
 );
 
 use Git qw(
@@ -1436,7 +1437,7 @@ sub cmd_info {
# canonicalize_path() will return  to make libsvn 1.5.x happy,
$path = . if $path eq ;
 
-   my $full_url = canonicalize_url( $url . ($fullpath eq  ?  : 
/$fullpath) );
+   my $full_url = canonicalize_url( add_path_to_url( $url, $fullpath ) );
 
if ($_url) {
print $full_url\n;
diff --git a/perl/Git/SVN.pm b/perl/Git/SVN.pm
index 4219e5b..22bf207 100644
--- a/perl/Git/SVN.pm
+++ b/perl/Git/SVN.pm
@@ -29,6 +29,7 @@ use Git::SVN::Utils qw(
join_paths
canonicalize_path
canonicalize_url
+   add_path_to_url
 );
 
 my $can_use_yaml;
@@ -564,8 +565,7 @@ sub _set_svm_vars {
# username is of no interest
$src =~ s{(^[a-z\+]*://)[^/@]*@}{$1};
 
-   my $replace = $ra-url;
-   $replace .= /$path if length $path;
+   my $replace = add_path_to_url($ra-url, $path);
 
my $section = svn-remote.$self-{repo_id};
tmp_config($section.svm-source, $src);
@@ -582,7 +582,7 @@ sub _set_svm_vars {
my $path = $self-path;
my %tried;
while (length $path) {
-   my $try = $self-url . /$path;
+   my $try = add_path_to_url($self-url, $path);
unless ($tried{$try}) {
return $ra if $self-read_svm_props($ra, $path, $r);
$tried{$try} = 1;
@@ -591,7 +591,7 @@ sub _set_svm_vars {
}
die Path: '$path' should be ''\n if $path ne '';
return $ra if $self-read_svm_props($ra, $path, $r);
-   $tried{$self-url./$path} = 1;
+   $tried{ add_path_to_url($self-url, $path) } = 1;
 
if ($ra-{repos_root} eq $self-url) {
die @err, (map {   $_\n } keys %tried), \n;
@@ -603,7 +603,7 @@ sub _set_svm_vars {
$path = $ra-{svn_path};
$ra = Git::SVN::Ra-new($ra-{repos_root});
while (length $path) {
-   my $try = $ra-url ./$path;
+   my $try = add_path_to_url($ra-url, $path);
unless ($tried{$try}) {
$ok = $self-read_svm_props($ra, $path, $r);
last if $ok;
@@ -613,7 +613,7 @@ sub _set_svm_vars {
}
die Path: '$path' should be ''\n if $path ne '';
$ok ||= $self-read_svm_props($ra, $path, $r);
-   $tried{$ra-url ./$path} = 1;
+   $tried{ add_path_to_url($ra-url, $path) } = 1;
if (!$ok) {
die @err, (map {   $_\n } keys %tried), \n;
}
@@ -933,20 +933,19 @@ sub rewrite_uuid {
 
 sub metadata_url {
my ($self) = @_;
-   ($self-rewrite_root || $self-url) .
-  (length $self-path ? '/' . $self-path : '');
+   my $url = $self-rewrite_root || $self-url;
+   return add_path_to_url( $url, $self-path );
 }
 
 sub full_url {
my ($self) = @_;
-   $self-url . (length $self-path ? '/' . $self-path : '');
+   return add_path_to_url( $self-url, $self-path );
 }
 
 sub full_pushurl {
my ($self) = @_;
if ($self-{pushurl}) {
-   return $self-{pushurl} . (length $self-path ? '/' .
-  $self-path : '');
+   return add_path_to_url( $self-{pushurl}, $self-path );
} else {
return $self-full_url;
}
@@ -1114,7 +1113,7 @@ sub find_parent_branch {
my $r = $i-{copyfrom_rev};
my $repos_root = $self-ra-{repos_root};
my $url = $self-ra-url;
-   my $new_url = $url . $branch_from;
+   my $new_url = add_path_to_url( $url, $branch_from );
print STDERR  Found possible branch point: ,
  $new_url = , $self-full_url, , $r\n
  unless $::_q  1;
@@ -1443,12 +1442,11 @@ sub find_extra_svk_parents {
for my $ticket ( @tickets ) {
my ($uuid, $path, $rev) = split /:/, $ticket;
if ( $uuid eq $self-ra_uuid ) {
-   my $url = $self-url;
-   my $repos_root = $url;
+  

[PATCH 7/8] Turn on canonicalization on newly minted URLs.

2012-07-28 Thread Michael G. Schwern
From: Michael G. Schwern schw...@pobox.com

Go through all the spots that use the new add_path_to_url() to
make a new URL and canonicalize them.

* copyfrom_path has to be canonicalized else find_parent_branch
  will get confused

* due to the `canonicalize_url($full_url) ne $full_url)` line of
  logic in gs_do_switch(), $full_url is left alone until after.

At this point SVN 1.7 passes except for 3 tests in
t9100-git-svn-basic.sh that look like an SVN bug to do with
symlinks.
---
 perl/Git/SVN.pm| 19 ++-
 perl/Git/SVN/Ra.pm |  9 -
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/perl/Git/SVN.pm b/perl/Git/SVN.pm
index 22bf207..e5f7acc 100644
--- a/perl/Git/SVN.pm
+++ b/perl/Git/SVN.pm
@@ -362,6 +362,8 @@ sub init_remote_config {
 sub find_by_url { # repos_root and, path are optional
my ($class, $full_url, $repos_root, $path) = @_;
 
+   $full_url = canonicalize_url($full_url);
+
return undef unless defined $full_url;
remove_username($full_url);
remove_username($repos_root) if defined $repos_root;
@@ -400,6 +402,11 @@ sub find_by_url { # repos_root and, path are optional
}
$p =~ s#^\Q$z\E(?:/|$)#$prefix# or next;
}
+
+   # remote fetch paths are not URI escaped.  Decode ours
+   # so they match
+   $p = uri_decode($p);
+
foreach my $f (keys %$fetch) {
next if $f ne $p;
return Git::SVN-new($fetch-{$f}, $repo_id, $f);
@@ -934,18 +941,18 @@ sub rewrite_uuid {
 sub metadata_url {
my ($self) = @_;
my $url = $self-rewrite_root || $self-url;
-   return add_path_to_url( $url, $self-path );
+   return canonicalize_url( add_path_to_url( $url, $self-path ) );
 }
 
 sub full_url {
my ($self) = @_;
-   return add_path_to_url( $self-url, $self-path );
+   return canonicalize_url( add_path_to_url( $self-url, $self-path ) );
 }
 
 sub full_pushurl {
my ($self) = @_;
if ($self-{pushurl}) {
-   return add_path_to_url( $self-{pushurl}, $self-path );
+   return canonicalize_url( add_path_to_url( $self-{pushurl}, 
$self-path ) );
} else {
return $self-full_url;
}
@@ -1113,7 +1120,7 @@ sub find_parent_branch {
my $r = $i-{copyfrom_rev};
my $repos_root = $self-ra-{repos_root};
my $url = $self-ra-url;
-   my $new_url = add_path_to_url( $url, $branch_from );
+   my $new_url = canonicalize_url( add_path_to_url( $url, $branch_from ) );
print STDERR  Found possible branch point: ,
  $new_url = , $self-full_url, , $r\n
  unless $::_q  1;
@@ -1869,7 +1876,9 @@ sub make_log_entry {
$email ||= $author\@$uuid;
$commit_email ||= $author\@$uuid;
} elsif ($self-use_svnsync_props) {
-   my $full_url = add_path_to_url( $self-svnsync-{url}, 
$self-path );
+   my $full_url = canonicalize_url(
+   add_path_to_url( $self-svnsync-{url}, $self-path )
+   );
remove_username($full_url);
my $uuid = $self-svnsync-{uuid};
$log_entry{metadata} = $full_url\@$rev $uuid;
diff --git a/perl/Git/SVN/Ra.pm b/perl/Git/SVN/Ra.pm
index 788e642..29b46e8 100644
--- a/perl/Git/SVN/Ra.pm
+++ b/perl/Git/SVN/Ra.pm
@@ -5,6 +5,7 @@ use warnings;
 use SVN::Client;
 use Git::SVN::Utils qw(
canonicalize_url
+   canonicalize_path
add_path_to_url
 );
 
@@ -102,6 +103,7 @@ sub new {
$Git::SVN::Prompt::_no_auth_cache = 1;
}
} # no warnings 'once'
+
my $self = SVN::Ra-new(url = $url, auth = $baton,
  config = $config,
  pool = SVN::Pool-new,
@@ -200,6 +202,7 @@ sub get_log {
qw/copyfrom_path copyfrom_rev action/;
if ($s{'copyfrom_path'}) {
$s{'copyfrom_path'} =~ s/$prefix_regex//;
+   $s{'copyfrom_path'} = 
canonicalize_path($s{'copyfrom_path'});
}
$_[0]{$p} = \%s;
}
@@ -303,7 +306,11 @@ sub gs_do_switch {
$ra = Git::SVN::Ra-new($full_url);
$ra_invalid = 1;
} elsif ($old_url ne $full_url) {
-   SVN::_Ra::svn_ra_reparent($self-{session}, $full_url, $pool);
+   SVN::_Ra::svn_ra_reparent(
+   $self-{session},
+   canonicalize_url($full_url),
+   $pool
+   );
$self-url($full_url);
$reparented = 1;
}
-- 
1.7.11.3

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More 

[PATCH 3/8] Improve our URL canonicalization to be more like SVN 1.7's.

2012-07-28 Thread Michael G. Schwern
From: Michael G. Schwern schw...@pobox.com

Previously, our URL canonicalization didn't do much of anything.
Now it actually escapes and collapses slashes.  This is mostly a cut  paste
of escape_url from git-svn.

This is closer to how SVN 1.7's canonicalization behaves.  Doing it with
1.6 lets us chase down some problems caused by more effective canonicalization
without having to deal with all the other 1.7 issues on top of that.

* Remote URLs have to be canonicalized otherwise Git::SVN-find_existing_remote
  will think they're different.

* The SVN remote is now written to the git config canonicalized.  That
  should be ok.  Adjust a test to account for that.
---
 perl/Git/SVN.pm|  4 ++--
 perl/Git/SVN/Utils.pm  | 19 +--
 t/Git-SVN/Utils/canonicalize_url.t | 26 ++
 t/t9107-git-svn-migrate.sh |  4 +++-
 4 files changed, 48 insertions(+), 5 deletions(-)
 create mode 100644 t/Git-SVN/Utils/canonicalize_url.t

diff --git a/perl/Git/SVN.pm b/perl/Git/SVN.pm
index 798f6c4..cb6d83a 100644
--- a/perl/Git/SVN.pm
+++ b/perl/Git/SVN.pm
@@ -201,9 +201,9 @@ sub read_all_remotes {
} elsif (m!^(.+)\.usesvmprops=\s*(.*)\s*$!) {
$r-{$1}-{svm} = {};
} elsif (m!^(.+)\.url=\s*(.*)\s*$!) {
-   $r-{$1}-{url} = $2;
+   $r-{$1}-{url} = canonicalize_url($2);
} elsif (m!^(.+)\.pushurl=\s*(.*)\s*$!) {
-   $r-{$1}-{pushurl} = $2;
+   $r-{$1}-{pushurl} = canonicalize_url($2);
} elsif (m!^(.+)\.ignore-refs=\s*(.*)\s*$!) {
$r-{$1}-{ignore_refs_regex} = $2;
} elsif (m!^(.+)\.(branches|tags)=$svn_refspec$!) {
diff --git a/perl/Git/SVN/Utils.pm b/perl/Git/SVN/Utils.pm
index 7ae6fac..dab6e4d 100644
--- a/perl/Git/SVN/Utils.pm
+++ b/perl/Git/SVN/Utils.pm
@@ -147,10 +147,25 @@ sub canonicalize_url {
 }
 
 
+sub _canonicalize_url_path {
+   my ($uri_path) = @_;
+
+   my @parts;
+   foreach my $part (split m{/+}, $uri_path) {
+   $part =~ 
s/([^~\w.%+-]|%(?![a-fA-F0-9]{2}))/sprintf(%%%02X,ord($1))/eg;
+   push @parts, $part;
+   }
+
+   return join('/', @parts);
+}
+
 sub _canonicalize_url_ourselves {
my ($url) = @_;
-   $url =~ s#^([^:]+://[^/]*/)(.*)$#$1 . canonicalize_path($2)#e;
-   return $url;
+   if ($url =~ m#^([^:]+)://([^/]*)(.*)$#) {
+   my ($scheme, $domain, $uri) = ($1, $2, 
_canonicalize_url_path(canonicalize_path($3)));
+   $url = $scheme://$domain$uri;
+   }
+   $url;
 }
 
 
diff --git a/t/Git-SVN/Utils/canonicalize_url.t 
b/t/Git-SVN/Utils/canonicalize_url.t
new file mode 100644
index 000..05795ab
--- /dev/null
+++ b/t/Git-SVN/Utils/canonicalize_url.t
@@ -0,0 +1,26 @@
+#!/usr/bin/env perl
+
+# Test our own home rolled URL canonicalizer.  Test the private one
+# directly because we can't predict what the SVN API is doing to do.
+
+use strict;
+use warnings;
+
+use Test::More 'no_plan';
+
+use Git::SVN::Utils;
+my $canonicalize_url = \Git::SVN::Utils::_canonicalize_url_ourselves;
+
+my %tests = (
+   http://x.com;  = http://x.com;,
+   http://x.com/; = http://x.com;,
+   http://x.com/foo/bar;  = http://x.com/foo/bar;,
+   http://x.com//foo//bar//;  = http://x.com/foo/bar;,
+   http://x.com/  /%/= http://x.com/%20%20/%25;,
+);
+
+for my $arg (keys %tests) {
+   my $want = $tests{$arg};
+
+   is $canonicalize_url-($arg), $want, canonicalize_url('$arg') = 
$want;
+}
diff --git a/t/t9107-git-svn-migrate.sh b/t/t9107-git-svn-migrate.sh
index cfb4453..ee73013 100755
--- a/t/t9107-git-svn-migrate.sh
+++ b/t/t9107-git-svn-migrate.sh
@@ -27,6 +27,8 @@ test_expect_success 'setup old-looking metadata' '
 head=`git rev-parse --verify refs/heads/git-svn-HEAD^0`
 test_expect_success 'git-svn-HEAD is a real HEAD' test -n '$head'
 
+svnrepo_escaped=`echo $svnrepo | sed 's/ /%20/'`
+
 test_expect_success 'initialize old-style (v0) git svn layout' '
mkdir -p $GIT_DIR/git-svn/info $GIT_DIR/svn/info 
echo $svnrepo  $GIT_DIR/git-svn/info/url 
@@ -35,7 +37,7 @@ test_expect_success 'initialize old-style (v0) git svn 
layout' '
! test -d $GIT_DIR/git-svn 
git rev-parse --verify refs/${remotes_git_svn}^0 
git rev-parse --verify refs/remotes/svn^0 
-   test $(git config --get svn-remote.svn.url) = $svnrepo 
+   test $(git config --get svn-remote.svn.url) = $svnrepo_escaped 
test `git config --get svn-remote.svn.fetch` = \
  :refs/${remotes_git_svn}
'
-- 
1.7.11.3

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Enable parallelism in git submodule update.

2012-07-28 Thread Heiko Voigt
Hi Stefan,

neat patch. See below for a few notes.

On Fri, Jul 27, 2012 at 11:37:34AM -0700, Stefan Zager wrote:
 diff --git a/git-submodule.sh b/git-submodule.sh
 index dba4d39..761420a 100755
 --- a/git-submodule.sh
 +++ b/git-submodule.sh
 @@ -491,6 +492,20 @@ cmd_update()
   -r|--rebase)
   update=rebase
   ;;
 + -j|--jobs)
 + case $2 in
 + ''|-*)
 + jobs=0
 + ;;
 + *)
 + jobs=$2
 + shift
 + ;;
 + esac
 + # Don't preserve this arg.
 + shift
 + continue
 + ;;
   --reference)
   case $2 in '') usage ;; esac
   reference=--reference=$2
 @@ -529,6 +544,12 @@ cmd_update()
   cmd_init -- $@ || return
   fi
  
 + if test $jobs != 1
 + then
 + module_list $@ | awk '{print $4}' | xargs -L 1 -P $jobs git 
 submodule update $orig_args

I do not see orig_args set anywhere in submodule.sh. It seems the
existing usage of it in cmd_status() is a leftover from commit
98dbe63 when this variable got renamed to orig_flags.

I will follow up with a patch to that location.

Another problem here is the passing of arguments. Have a look at
a7eff1a8 to see how this was solved for other locations.

The next thing I noticed is that the parallelism is not recursive. You
drop the option and only execute the first depth in parallel. How about
using the amount of modules defined by arguments left in $@ as an
indicator whether you need to fork parallel execution or not. If there
is exactly one you do the update if there are more you do the parallel
thing. That way you can just keep passing the --jobs flag to the
subprocesses.

The next question to solve is UI: Since the output lines of the parallel
update jobs will be mixed we need some way to distinguish them. Imagine
one of the update fails somewhere how do we find out which it was?

Two possible solutions come to my mind:

 1. Prefix each line with a job number. This way you can distinguish
which process outputted what and still have immediate feedback.

 2. Cache the output (to stderr and stdout) of each job and output it
once one job is done. I imagine this needs some infrastructure which
we need to implement. We already have some ideas how to collect such
output in C here[1].

I would prefer solution 2 since the output of 1 will be hard to read but
I guess we could start with 1 and then move over to 2 later on.

Cheers Heiko

[1] http://article.gmane.org/gmane.comp.version-control.git/197747
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Enable parallelism in git submodule update.

2012-07-28 Thread Heiko Voigt
Hi,

On Fri, Jul 27, 2012 at 04:25:58PM -0700, Junio C Hamano wrote:
 Stefan Zager sza...@google.com writes:
 
  On Fri, Jul 27, 2012 at 2:38 PM, Junio C Hamano gits...@pobox.com wrote:
 
  Stefan Zager sza...@google.com writes:
 
   + module_list $@ | awk '{print $4}' | xargs -L 1 -P
  $jobs git submodule update $orig_args
 
  Capital-P option to xargs is not even in POSIX, no?
 
  I wasn't aware of that, but you appear to be correct.  Don't know if you
  have a policy about that, but anecdotally, -P is supported on my linux,
  mac, and win/msys systems.
 
 About policy, we use POSIX as a rough yardstick to warn us that we
 might be breaking people on minority platforms.  We do _not_ say It
 is in POSIX, so it is safe to use it, but we say It is not even in
 POSIX, so we need to think twice.  We do not usually say Linux,
 Mac and Windows are the only things that matter, and they all
 support it.
 
 Of course, any set of rules have exceptions ;-) There are a few
 things to which we say Even though it is not in POSIX, everybody
 who matters supports it, and without taking advantage of it, what we
 want to achieve will become too cumbersome to express.

I was about to write that since this is limited to a given --jobs
options the majority platforms should be enough as a start and others
could add a parallelism mechanism later. Its only a matter of efficiency
and not features.

But if you look at my other post to this thread I described that we need
some UI output extension so the user can still make sense of it.
In short: The user should be able distinguish which job said what.

I was already thinking about how an output caching could be implemented in
core git. How about exposing it as a git command like this?

git run [-jnumber] ...

It works like the xargs call above except that it caches each jobs
output to stderr and stdout until its done and then replays the output
to stderr/out in the correct order.

We could design the code so that it can be reused later on to do the
caching in parallel fetch/push/... .

What do you think? If we decide to go this route I would have a look
into whipping something up.

Cheers Heiko
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] buitin_config: return postitive status in get_value

2012-07-28 Thread Nikolai Vladimirov
Returning -1 instead of 1 results in wrong exit status(255) since
the output of get_value is passed to exit().

'git config missing_section' should now return proper exit status = 1,
as specified by the git config documentation.

Signed-off-by: Nikolai Vladimirov niko...@vladimiroff.com
---
 builtin/config.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/config.c b/builtin/config.c
index 8cd08da..c262cbb 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -160,7 +160,7 @@ static int show_config(const char *key_, const char 
*value_, void *cb)
 
 static int get_value(const char *key_, const char *regex_)
 {
-   int ret = -1;
+   int ret = 1;
char *global = NULL, *xdg = NULL, *repo_config = NULL;
const char *system_wide = NULL, *local;
struct config_include_data inc = CONFIG_INCLUDE_INIT;
-- 
1.7.11.2

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] cleanup argument passing in submodule status command

2012-07-28 Thread Heiko Voigt
In commit 98dbe63 the variable $orig_args was renamed to $orig_flags.
One location in cmd_status() was missed.

Note: This is a code cleanup and does not fix any bugs. As a side effect
the variables containing the parsed flags to git submodule status are
passed down recursively. So everything was already behaving as expected.

Signed-off-by: Heiko Voigt hvo...@hvoigt.net
---
 git-submodule.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index dba4d39..3a3f0a4 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -961,7 +961,7 @@ cmd_status()
prefix=$displaypath/
clear_local_git_env
cd $sm_path 
-   eval cmd_status $orig_args
+   eval cmd_status $orig_flags
) ||
die $(eval_gettext Failed to recurse into submodule 
path '\$sm_path')
fi
-- 
1.7.12.rc0.23.g3c7cae0

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] buitin_config: return postitive status in get_value

2012-07-28 Thread Nguyen Thai Ngoc Duy
On Sat, Jul 28, 2012 at 6:42 PM, Nikolai Vladimirov
niko...@vladimiroff.com wrote:
 Returning -1 instead of 1 results in wrong exit status(255) since
 the output of get_value is passed to exit().

 'git config missing_section' should now return proper exit status = 1,
 as specified by the git config documentation.

I'm curious. Why is -1 (or 255) wrong? It was introduced in the first
version of get_value in 4ddba79 (git-config-set: add more options -
2005-11-20). Back then it returned -1 when there's regex compile error
to distinguish with 0 and 1 (but git-config-set.txt in the same commit
did not get update about exit code). Maybe we should update document
instead of the code.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] buitin_config: return postitive status in get_value

2012-07-28 Thread Nguyen Thai Ngoc Duy
On Sat, Jul 28, 2012 at 04:18:49PM +0300, Nikolay Vladimirov wrote:
 But the behavior now seems kind of strange, or maybe I'm missing something:
 # git config foobar; echo $?
 error: key does not contain a section: foobar
 255
 
 # git config foobar.info; echo $?
 1
 
 git version 1.7.11.2
 
 I would generally expect the both to behave the same way.

Then the following patch may be better because it leaves other cases
untouched (I'm not saying that we should or should not do it though)

-- 8 --
diff --git a/builtin/config.c b/builtin/config.c
index 8cd08da..d048ebf 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -199,8 +199,10 @@ static int get_value(const char *key_, const char *regex_)
goto free_strings;
}
} else {
-   if (git_config_parse_key(key_, key, NULL))
+   if (git_config_parse_key(key_, key, NULL)) {
+   ret = 1;
goto free_strings;
+   }
}
 
if (regex_) {
-- 8 --

-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/7] Change canonicalize_url() to use the SVN 1.7 API when available.

2012-07-28 Thread Jonathan Nieder
Hi,

Michael G. Schwern wrote:

 --- a/perl/Git/SVN/Utils.pm
 +++ b/perl/Git/SVN/Utils.pm
[...]
 @@ -100,6 +102,20 @@ API as a URL.
  =cut
  
  sub canonicalize_url {
 + my $url = shift;
 +
 + # The 1.7 way to do it
 + if ( defined SVN::_Core::svn_uri_canonicalize ) {
 + return SVN::_Core::svn_uri_canonicalize($url);
 + }
 + # There wasn't a 1.6 way to do it, so we do it ourself.
 + else {
 + return _canonicalize_url_ourselves($url);
 + }
 +}
 +
 +
 +sub _canonicalize_url_ourselves {
   my ($url) = @_;
   $url =~ s#^([^:]+://[^/]*/)(.*)$#$1 . canonicalize_path($2)#e;

Leaves me a bit nervous.

What effect should we expect this change to have?  Is our emulation
of svn_uri_canonicalize already perfect and this change just a little
futureproofing in case svn_uri_canonicalize gets even better, or is
this a trap waiting to happen when new callers of canonicalize_url
start relying on, e.g., %-encoding of special characters?

If I am reading Subversion r873487 correctly, in ancient times,
svn_path_canonicalize() did the appropriate tweaking for URIs.  Today
its implementation is comforting:

const char *
svn_path_canonicalize(const char *path, apr_pool_t *pool)
{
  if (svn_path_is_url(path))
return svn_uri_canonicalize(path, pool);
  else
return svn_dirent_canonicalize(path, pool);
}

It might be easier to rely on that on pre-1.7 systems.

Thanks,
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/7] Switch path canonicalization to use the SVN API.

2012-07-28 Thread Jonathan Nieder
Michael G. Schwern wrote:

 --- a/perl/Git/SVN/Utils.pm
 +++ b/perl/Git/SVN/Utils.pm
 @@ -86,6 +86,27 @@ sub _collapse_dotdot {
  
  
  sub canonicalize_path {
 + my $path = shift;
 +
 + # The 1.7 way to do it
 + if ( defined SVN::_Core::svn_dirent_canonicalize ) {
 + $path = _collapse_dotdot($path);
 + return SVN::_Core::svn_dirent_canonicalize($path);
 + }
 + # The 1.6 way to do it
 + elsif ( defined SVN::_Core::svn_path_canonicalize ) {
 + $path = _collapse_dotdot($path);
 + return SVN::_Core::svn_path_canonicalize($path);
 + }
 + # No SVN API canonicalization is available, do it ourselves
 + else {

When would this else case trip?  Would it be safe to make it
return an error message, or even to do something like the following?

sub canonicalize_path {
my $path = shift;
$path = _collapse_dotdot($path);

# Subversion 1.7 split svn_path_canonicalize() into
# svn_dirent_canonicalize() and svn_uri_canonicalize().
if (!defined SVN::_Core::svn_dirent_canonicalize) {
return SVN::_Core::svn_path_canonicalize($path);
}

return SVN::_Core::svn_dirent_canonicalize($path);
}

Thanks,
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/7] Make Git::SVN and Git::SVN::Ra canonicalize paths and urls.

2012-07-28 Thread Jonathan Nieder
Michael G. Schwern wrote:

 This canonicalizes paths and urls as early as possible so we don't
 have to remember to do it at the point of use.

Yay!  Am I correct in imagining this makes the following sequence of
commands[1] no longer trip an assertion failure in svn_path_join[2]
with SVN 1.6?

git svn init -Thttp://trac-hacks.org/svn/tagsplugin/trunk \
-thttp://trac-hacks.org/svn/tagsplugin/tags \
-bhttp://trac-hacks.org/svn/tagsplugin/branches
git svn fetch

[1] http://bugs.debian.org/616168
[2] 
  $ git svn fetch
  W: Ignoring error from SVN, path probably does not exist: (160013): 
Filesystem has no item: File not found: revision 100, path '/tagsplugin'
  W: Do not be alarmed at the above message git-svn is just searching 
aggressively for old history.
  This may take a while on large repositories
  perl: 
/build/buildd-subversion_1.6.17dfsg-4-i386-MgNPeW/subversion-1.6.17dfsg/subversion/libsvn_subr/path.c:115:
 svn_path_join: Assertion `svn_path_is_canonical(component, pool)' failed.
  error: git-svn died of signal 6
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/8] SVN 1.7 will truncate not-a%40{0} to just not-a.

2012-07-28 Thread Jonathan Nieder
Michael G. Schwern wrote:

 Rather than guess what SVN is going to do for each version, make the test use
 the branch name that was actually created.
[...]
 - git rev-parse refs/remotes/not-a%40{0}reflog
 + git rev-parse refs/remotes/$non_reflog

Doesn't this defeat the point of the testcase (checking that git-svn
is able to avoid creating git refs containing @{, following the rules
from git-check-ref-format(1))?

Do you know when SVN truncates the directory name?  Would historical
SVN repositories or historical SVN servers be able to have a directory
named with a %40 in it, or has this been disallowed completely,
leaving problematic historical repositories to be dumped with old SVN,
tweaked, and reloaded with new SVN?

Thanks,
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/3] null sha1 in trees

2012-07-28 Thread Jeff King
I recently came across a tree in the wild that had a submodule entry
whose sha1 was the null sha1 (i.e., all-zeros). It triggered an
interesting bug in the diff code, which is fixed by patch 1.

Unfortunately, I have no clue how this tree came about. I'm assuming it
was simply a bug somewhere in git, where the entry should have had
another sha1, or possibly been removed from the tree entirely.. Patches
2 and 3 tighten up our checks for null sha1s in a few places, which
might help detect such a bug earlier.

I assume I have never seen this with a non-submodule entry because such
a tree would fail the usual connectivity checks during fsck or during a
transfer. However, since we don't enforce connectivity on submodule
entries, nothing blocked the creation and propagation of such an entry.

I'm not at liberty to share the repository in question, but if anybody
has specific things to look for, I'd be happy to investigate further.

The patches are:

  [1/3]: diff: do not use null sha1 as a sentinel value

This is the actual bug-fix, and I hope is obviously a good thing to do.

  [2/3]: do not write null sha1s to on-disk index

This one tries to tighten our writing a bit. There are unfortunately a
lot of different code paths that create trees in git. I hope by catching
the index write as a choke-point, we can prevent bugs from spreading.
However, there are a lot of tree-writers that update an index in-core
and then write a tree out directly. I would not be surprised if this
does not catch the bug by itself, but I think it is a step in the right
direction.

  [3/3]: fsck: detect null sha1 in tree entries

And this one will at least let us notice the bug once it has happened.
And if transfer.fsckObjects is set, it will prevent bogus trees from
passing between repositories, containing any damage.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] diff: do not use null sha1 as a sentinel value

2012-07-28 Thread Jeff King
The diff code represents paths using the diff_filespec
struct. This struct has a sha1 to represent the sha1 of the
content at that path, as well as a sha1_valid member which
indicates whether its sha1 field is actually useful. If
sha1_valid is not true, then the filespec represents a
working tree file (e.g., for the no-index case, or for when
the index is not up-to-date).

The diff_filespec is only used internally, though. At the
interfaces to the diff subsystem, callers feed the sha1
directly, and we create a diff_filespec from it. It's at
that point that we look at the sha1 and decide whether it is
valid or not; callers may pass the null sha1 as a sentinel
value to indicate that it is not.

We should not typically see the null sha1 coming from any
other source (e.g., in the index itself, or from a tree).
However, a corrupt tree might have a null sha1, which would
cause diff --patch to accidentally diff the working tree
version of a file instead of treating it as a blob.

This patch extends the edges of the diff interface to accept
a sha1_valid flag whenever we accept a sha1, and to use
that flag when creating a filespec. In some cases, this
means passing the flag through several layers, making the
code change larger than would be desirable.

One alternative would be to simply die() upon seeing
corrupted trees with null sha1s. However, this fix more
directly addresses the problem (while bogus sha1s in a tree
are probably a bad thing, it is really the sentinel
confusion sending us down the wrong code path that is what
makes it devastating). And it means that git is more capable
of examining and debugging these corrupted trees. For
example, you can still diff --raw such a tree to find out
when the bogus entry was introduced; you just cannot do a
--patch diff (just as you could not with any other
corrupted tree, as we do not have any content to diff).

Signed-off-by: Jeff King p...@peff.net
---
 builtin.h  |  2 +-
 builtin/blame.c|  9 ++---
 builtin/cat-file.c |  2 +-
 builtin/diff.c |  8 +++--
 combine-diff.c |  4 +--
 diff-lib.c | 20 ++-
 diff-no-index.c|  2 +-
 diff.c | 16 +
 diff.h |  5 +++
 diffcore-rename.c  |  2 +-
 diffcore.h |  2 +-
 revision.c |  2 ++
 t/t4054-diff-bogus-tree.sh | 83 ++
 tree-diff.c|  8 ++---
 14 files changed, 134 insertions(+), 31 deletions(-)
 create mode 100755 t/t4054-diff-bogus-tree.sh

diff --git a/builtin.h b/builtin.h
index ba6626b..8e37752 100644
--- a/builtin.h
+++ b/builtin.h
@@ -43,7 +43,7 @@ extern int check_pager_config(const char *cmd);
 struct diff_options;
 extern void setup_diff_pager(struct diff_options *);
 
-extern int textconv_object(const char *path, unsigned mode, const unsigned 
char *sha1, char **buf, unsigned long *buf_size);
+extern int textconv_object(const char *path, unsigned mode, const unsigned 
char *sha1, int sha1_valid, char **buf, unsigned long *buf_size);
 
 extern int cmd_add(int argc, const char **argv, const char *prefix);
 extern int cmd_annotate(int argc, const char **argv, const char *prefix);
diff --git a/builtin/blame.c b/builtin/blame.c
index 0d50273..f2c48ef 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -110,6 +110,7 @@ static int diff_hunks(mmfile_t *file_a, mmfile_t *file_b, 
long ctxlen,
 int textconv_object(const char *path,
unsigned mode,
const unsigned char *sha1,
+   int sha1_valid,
char **buf,
unsigned long *buf_size)
 {
@@ -117,7 +118,7 @@ int textconv_object(const char *path,
struct userdiff_driver *textconv;
 
df = alloc_filespec(path);
-   fill_filespec(df, sha1, mode);
+   fill_filespec(df, sha1, sha1_valid, mode);
textconv = get_textconv(df);
if (!textconv) {
free_filespec(df);
@@ -142,7 +143,7 @@ static void fill_origin_blob(struct diff_options *opt,
 
num_read_blob++;
if (DIFF_OPT_TST(opt, ALLOW_TEXTCONV) 
-   textconv_object(o-path, o-mode, o-blob_sha1, file-ptr, 
file_size))
+   textconv_object(o-path, o-mode, o-blob_sha1, 1, 
file-ptr, file_size))
;
else
file-ptr = read_sha1_file(o-blob_sha1, type, 
file_size);
@@ -2123,7 +2124,7 @@ static struct commit *fake_working_tree_commit(struct 
diff_options *opt,
switch (st.st_mode  S_IFMT) {
case S_IFREG:
if (DIFF_OPT_TST(opt, ALLOW_TEXTCONV) 
-   textconv_object(read_from, mode, null_sha1, 
buf_ptr, buf_len))
+   textconv_object(read_from, mode, null_sha1, 0, 
buf_ptr, buf_len))
strbuf_attach(buf, buf_ptr, buf_len, 

[PATCH 2/3] do not write null sha1s to on-disk index

2012-07-28 Thread Jeff King
We should never need to write the null sha1 into an index
entry (short of the 1 in 2^160 chance that somebody actually
has content that hashes to it). If we attempt to do so, it
is much more likely that it is a bug, since we use the null
sha1 as a sentinel value to mean not valid.

The presence of null sha1s in the index (which can come
from, among other things, update-index --cacheinfo, or by
reading a corrupted tree) can cause problems for later
readers, because they cannot distinguish the literal null
sha1 from its use a sentinel value.  For example, git
diff-files on such an entry would make it appear as if it
is stat-dirty, and until recently, the diff code assumed
such an entry meant that we should be diffing a working tree
file rather than a blob.

Ideally, we would stop such entries from entering even our
in-core index. However, we do sometimes legitimately add
entries with null sha1s in order to represent these sentinel
situations; simply forbidding them in add_index_entry breaks
a lot of the existing code. However, we can at least make
sure that our in-core sentinel representation never makes it
to disk.

To be thorough, we will test an attempt to add both a blob
and a submodule entry. In the former case, we might run into
problems anyway because we will be missing the blob object.
But in the latter case, we do not enforce connectivity
across gitlink entries, making this our only point of
enforcement. The current implementation does not care which
type of entry we are seeing, but testing both cases helps
future-proof the test suite in case that changes.

Signed-off-by: Jeff King p...@peff.net
---
I confess to not being clear on all of the instances in which a null
sha1 might enter the in-core index. I did try modifying add_index_entry,
but the breakage was pretty severe. I traced through a few instances,
and it seemed to be mostly related to unmerged entries.

 read-cache.c  |  2 ++
 t/t2107-update-index-basic.sh | 19 +++
 2 files changed, 21 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index 2f8159f..d2be78e 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1800,6 +1800,8 @@ int write_index(struct index_state *istate, int newfd)
continue;
if (!ce_uptodate(ce)  is_racy_timestamp(istate, ce))
ce_smudge_racily_clean_entry(ce);
+   if (is_null_sha1(ce-sha1))
+   return error(cache entry has null sha1: %s, ce-name);
if (ce_write_entry(c, newfd, ce, previous_name)  0)
return -1;
}
diff --git a/t/t2107-update-index-basic.sh b/t/t2107-update-index-basic.sh
index 809fafe..0dbbb00 100755
--- a/t/t2107-update-index-basic.sh
+++ b/t/t2107-update-index-basic.sh
@@ -29,4 +29,23 @@ test_expect_success 'update-index -h with corrupt index' '
grep [Uu]sage: git update-index broken/usage
 '
 
+test_expect_success '--cacheinfo does not accept blob null sha1' '
+   echo content file 
+   git add file 
+   git rev-parse :file expect 
+   test_must_fail git update-index --cacheinfo 100644 $_z40 file 
+   git rev-parse :file actual 
+   test_cmp expect actual
+'
+
+test_expect_success '--cacheinfo does not accept gitlink null sha1' '
+   git init submodule 
+   (cd submodule  test_commit foo) 
+   git add submodule 
+   git rev-parse :submodule expect 
+   test_must_fail git update-index --cacheinfo 16 $_z40 submodule 
+   git rev-parse :submodule actual 
+   test_cmp expect actual
+'
+
 test_done
-- 
1.7.11.3.42.g90758bf

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] fsck: detect null sha1 in tree entries

2012-07-28 Thread Jeff King
Short of somebody happening to beat the 1 in 2^160 odds of
actually generating content that hashes to the null sha1, we
should never see this value in a tree entry. So let's have
fsck warn if it it seen.

As in the previous commit, we test both blob and submodule
entries to future-proof the test suite against the
implementation depending on connectivity to notice the
error.

Signed-off-by: Jeff King p...@peff.net
---
I left this as a warning, because git can still mostly handle such
trees, which may aid in debugging or salvaging data.

 fsck.c  |  8 +++-
 t/t1450-fsck.sh | 26 ++
 2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/fsck.c b/fsck.c
index 4c63b2c..7395ef6 100644
--- a/fsck.c
+++ b/fsck.c
@@ -139,6 +139,7 @@ static int verify_ordered(unsigned mode1, const char 
*name1, unsigned mode2, con
 static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
 {
int retval;
+   int has_null_sha1 = 0;
int has_full_path = 0;
int has_empty_name = 0;
int has_zero_pad = 0;
@@ -157,9 +158,12 @@ static int fsck_tree(struct tree *item, int strict, 
fsck_error error_func)
while (desc.size) {
unsigned mode;
const char *name;
+   const unsigned char *sha1;
 
-   tree_entry_extract(desc, name, mode);
+   sha1 = tree_entry_extract(desc, name, mode);
 
+   if (is_null_sha1(sha1))
+   has_null_sha1 = 1;
if (strchr(name, '/'))
has_full_path = 1;
if (!*name)
@@ -207,6 +211,8 @@ static int fsck_tree(struct tree *item, int strict, 
fsck_error error_func)
}
 
retval = 0;
+   if (has_null_sha1)
+   retval += error_func(item-object, FSCK_WARN, contains 
entries pointing to null sha1);
if (has_full_path)
retval += error_func(item-object, FSCK_WARN, contains full 
pathnames);
if (has_empty_name)
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 5b79c51..bf7a2cd 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -213,4 +213,30 @@ test_expect_success 'rev-list --verify-objects with bad 
sha1' '
grep -q error: sha1 mismatch 63ff 
out
 '
 
+_bz='\0'
+_bz5=$_bz$_bz$_bz$_bz$_bz
+_bz20=$_bz5$_bz5$_bz5$_bz5
+
+test_expect_success 'fsck notices blob entry pointing to null sha1' '
+   (git init null-blob 
+cd null-blob 
+sha=$(printf 100644 file$_bz$_bz20 |
+  git hash-object -w --stdin -t tree) 
+ git fsck 2out 
+ cat out 
+ grep warning.*null sha1 out
+   )
+'
+
+test_expect_success 'fsck notices submodule entry pointing to null sha1' '
+   (git init null-commit 
+cd null-commit 
+sha=$(printf 16 submodule$_bz$_bz20 |
+  git hash-object -w --stdin -t tree) 
+ git fsck 2out 
+ cat out 
+ grep warning.*null sha1 out
+   )
+'
+
 test_done
-- 
1.7.11.3.42.g90758bf
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] t: add missing executable bit to t7409

2012-07-28 Thread Jeff King

Signed-off-by: Jeff King p...@peff.net
---
 0 files changed
 mode change 100644 = 100755 t/t7409-submodule-detached-worktree.sh

diff --git a/t/t7409-submodule-detached-worktree.sh 
b/t/t7409-submodule-detached-worktree.sh
old mode 100644
new mode 100755
-- 
1.7.12.rc0.15.g2184f59
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t1450-fsck (sometimes/often) failes on Mac OS X

2012-07-28 Thread Heiko Voigt
Hi,

just to verify that this is unlikely just a hardware issue on one machine. I
today experienced this failure on master as well.

On Mon, Jul 16, 2012 at 06:06:26PM +0200, Torsten B?gershausen wrote:
 
 Am 16.07.2012 um 09:57 schrieb Thomas Rast:
 
  Torsten Bögershausen tbo...@web.de writes:
  What OS X are you running?  I started a loop
  
   while : ; do ./t1450-fsck.sh || break; done
  
  and it hasn't failed yet.  It is
  
   $ uname -a
   Darwin mackeller.inf.ethz.ch 11.4.0 Darwin Kernel Version 11.4.0: Mon Apr  
  9 19:32:15 PDT 2012; root:xnu-1699.26.8~1/RELEASE_X86_64 x86_64
  
  uname -a
 Darwin birne.lan 10.8.0 Darwin Kernel Version 10.8.0: Tue Jun  7 16:33:36 PDT 
 2011; root:xnu-1504.15.3~1/RELEASE_I386 i386

$ uname -a
Darwin book.hvoigt.net 10.8.0 Darwin Kernel Version 10.8.0: Tue Jun  7 16:33:36 
PDT 2011; root:xnu-1504.15.3~1/RELEASE_I386 i386

My bisect run did also not reveal anything useful. I tried to use valgrind.
Interestingly just by prepending my valgrind directory to the PATH the test
passes. When changing PATH further it sometimes passes and sometimes not.
Reopening a new shell it reliably fails.

The commit I am experiencing this with is cdd159b. This one reliably fails for
me with the correct path setting :-)

To me this smells a little bit like using a dangling pointer or uninitialized
memory since threading seems to be out of the game. A dangling pointer only
available on a certain OS X version?

If I modify the PATH after adding the valgrind bin folder so that it matches
the same amount of characters as before I get the failure again.

It seems the error only occurs if my PATH is exactly 280 characters long. E.g.:

export 
PATH=/usr/local/valgrind/bin:/Users/hvoigt/.local/bin:/usr/bin:/bin:/usr/local/bin:/sw/bin/:/aa

When running under valgrind test 13 (the original one) passes but 2 fails. Not
sure if this is a false positive[1].

If I add those eight bytes here the tests pass for me without valgrind:

diff --git a/environment.c b/environment.c
index 85edd7f..988f836 100644
--- a/environment.c
+++ b/environment.c
@@ -131,7 +131,7 @@ static void setup_git_env(void)
}
git_index_file = getenv(INDEX_ENVIRONMENT);
if (!git_index_file) {
-   git_index_file = xmalloc(strlen(git_dir) + 7);
+   git_index_file = xmalloc(strlen(git_dir) + 7 + 8);
sprintf(git_index_file, %s/index, git_dir);
}
git_graft_file = getenv(GRAFT_ENVIRONMENT);

But maybe thats just a coincidence and totally unrelated.

The valgrind error can be fixed by this change:

diff --git a/sha1_file.c b/sha1_file.c
index 4ccaf7a..631d0dd 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -251,7 +251,7 @@ static int link_alt_odb_entry(const char * entry, int len, 
const char * relative
const char *objdir = get_object_directory();
struct alternate_object_database *ent;
struct alternate_object_database *alt;
-   int pfxlen, entlen;
+   int pfxlen, entlen, objdirlen;
struct strbuf pathbuf = STRBUF_INIT;
 
if (!is_absolute_path(entry)  relative_base) {
@@ -298,7 +298,8 @@ static int link_alt_odb_entry(const char * entry, int len, 
const char * relative
return -1;
}
}
-   if (!memcmp(ent-base, objdir, pfxlen)) {
+   objdirlen = strlen(objdir);
+   if (!memcmp(ent-base, objdir, pfxlen  objdirlen ? objdirlen : 
pfxlen)) {
free(ent);
return -1;
}

I will format this into a proper patch independent from this mail.

These are my observations.

Cheers Heiko

[1]
$ ./t1450-fsck.sh --valgrind
[...]
expecting success: 
mkdir another 
(
cd another 
git init 
echo ../../../.git/objects .git/objects/info/alternates 
test_commit C fileC one 
git fsck --no-dangling ../actual 21
) 
test_cmp empty actual

Initialized empty Git repository in /Users/hvoigt/Repository/git/t/trash 
directory.t1450-fsck/another/.git/
==42686== Invalid read of size 8
==42686==at 0x100625064: bcmp (in /usr/lib/libSystem.B.dylib)
==42686==by 0x100112846: link_alt_odb_entries (in 
/Users/hvoigt/Repository/git/t/valgrind/../../git)
==42686==by 0x1001129C0: read_info_alternates (in 
/Users/hvoigt/Repository/git/t/valgrind/../../git)
==42686==by 0x100112B8C: prepare_alt_odb (in 
/Users/hvoigt/Repository/git/t/valgrind/../../git)
==42686==by 0x1001148B7: prepare_packed_git (in 
/Users/hvoigt/Repository/git/t/valgrind/../../git)
==42686==by 0x100116A8B: find_pack_entry (in 
/Users/hvoigt/Repository/git/t/valgrind/../../git)
==42686==by 0x100118008: has_sha1_file (in 
/Users/hvoigt/Repository/git/t/valgrind/../../git)
==42686== 

[PATCH] link_alt_odb_entry: fix read over array bounds reported by valgrind

2012-07-28 Thread Heiko Voigt
pfxlen can be longer than the path in objdir when relative_base contains
the path to gits object directory.

Signed-off-by: Heiko Voigt hvo...@hvoigt.net
---
 sha1_file.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 4ccaf7a..631d0dd 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -251,7 +251,7 @@ static int link_alt_odb_entry(const char * entry, int len, 
const char * relative
const char *objdir = get_object_directory();
struct alternate_object_database *ent;
struct alternate_object_database *alt;
-   int pfxlen, entlen;
+   int pfxlen, entlen, objdirlen;
struct strbuf pathbuf = STRBUF_INIT;
 
if (!is_absolute_path(entry)  relative_base) {
@@ -298,7 +298,8 @@ static int link_alt_odb_entry(const char * entry, int len, 
const char * relative
return -1;
}
}
-   if (!memcmp(ent-base, objdir, pfxlen)) {
+   objdirlen = strlen(objdir);
+   if (!memcmp(ent-base, objdir, pfxlen  objdirlen ? objdirlen : 
pfxlen)) {
free(ent);
return -1;
}
-- 
1.7.12.rc0.23.gc9a5ac4

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t1450-fsck (sometimes/often) failes on Mac OS X

2012-07-28 Thread Thomas Rast
Heiko Voigt hvo...@hvoigt.net writes:

 if (!git_index_file) {
 -   git_index_file = xmalloc(strlen(git_dir) + 7);
 +   git_index_file = xmalloc(strlen(git_dir) + 7 + 8);
 sprintf(git_index_file, %s/index, git_dir);
 }
[...]
 -   if (!memcmp(ent-base, objdir, pfxlen)) {
 +   objdirlen = strlen(objdir);
 +   if (!memcmp(ent-base, objdir, pfxlen  objdirlen ? objdirlen : 
 pfxlen)) {
[...]
 Initialized empty Git repository in /Users/hvoigt/Repository/git/t/trash 
 directory.t1450-fsck/another/.git/
 ==42686== Invalid read of size 8
 ==42686==at 0x100625064: bcmp (in /usr/lib/libSystem.B.dylib)
 ==42686==by 0x100112846: link_alt_odb_entries (in 
 /Users/hvoigt/Repository/git/t/valgrind/../../git)
 ==42686==by 0x1001129C0: read_info_alternates (in 
 /Users/hvoigt/Repository/git/t/valgrind/../../git)
[...]
 ==42686==  Address 0x100faca78 is 8 bytes inside a block of size 13 alloc'd
 ==42686==at 0x10029C679: malloc (vg_replace_malloc.c:266)
 ==42686==by 0x1001349CD: xmalloc (in 
 /Users/hvoigt/Repository/git/t/valgrind/../../git)
 ==42686==by 0x1000C23F5: setup_git_env (in 
 /Users/hvoigt/Repository/git/t/valgrind/../../git)

To me that looks just like a false positive.  memcmp (which seems to be
the same as bcmp) can load 8 bytes from an aligned address even if these
are only partially within the block being compared, since an aligned
load can never partially fault (it must all be within the same page).
Valgrind normally substitutes its own routines for memcmp etc. to
correctly handle this, but this does not seem to happen in your case for
some reason.

Then again I am not entirely sure how you could verify that this theory
is correct :-)

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Bug: Recursive submodules fail when the repo path contains spaces

2012-07-28 Thread Heiko Voigt
On Tue, Jul 24, 2012 at 01:33:44PM -0700, Justin Spahr-Summers wrote:
 Here's some real output, with a couple specific names removed, starting from 
 the root of the top-level repository (where External/twui is a submodule):
 
 $ cd External/twui
 $ git submodule add git://github.com/petejkim/expecta.git TwUITests/expecta
 Cloning into 'TwUITests/expecta'...
 remote: Counting objects: 988, done.
 remote: Compressing objects: 100% (404/404), done.
 remote: Total 988 (delta 680), reused 842 (delta 535)
 Receiving objects: 100% (988/988), 156.30 KiB, done.
 Resolving deltas: 100% (680/680), done.
 fatal: Not a git repository: ../../../../../../../../Volumes/drive name with 
 spaces/Users/justin/Documents/Programming/project name with 
 spaces/.git/modules/External/twui/modules/TwUITests/expecta

Is this a copypaste artefact or is the path in the error truncated?

Cheers Heiko
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Bug: Recursive submodules fail when the repo path contains spaces

2012-07-28 Thread Justin Spahr-Summers
On Saturday, 28. July 2012 at 09:21, Heiko Voigt wrote:
 On Tue, Jul 24, 2012 at 01:33:44PM -0700, Justin Spahr-Summers wrote:
  Here's some real output, with a couple specific names removed, starting 
  from the root of the top-level repository (where External/twui is a 
  submodule):
  
  $ cd External/twui
  $ git submodule add git://github.com/petejkim/expecta.git 
  (http://github.com/petejkim/expecta.git) TwUITests/expecta
  Cloning into 'TwUITests/expecta'...
  remote: Counting objects: 988, done.
  remote: Compressing objects: 100% (404/404), done.
  remote: Total 988 (delta 680), reused 842 (delta 535)
  Receiving objects: 100% (988/988), 156.30 KiB, done.
  Resolving deltas: 100% (680/680), done.
  fatal: Not a git repository: ../../../../../../../../Volumes/drive name 
  with spaces/Users/justin/Documents/Programming/project name with 
  spaces/.git/modules/External/twui/modules/TwUITests/expecta
 
 
 
 Is this a copypaste artefact or is the path in the error truncated? 
There's apparently some weird wrapping from my email client (the git error is 
on a single line in the terminal), but there is no truncation. That's the full 
path. Thanks for clarifying, though.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH] t3300-*.sh: Fix a TAP parse error

2012-07-28 Thread Ramsay Jones
Jonathan Nieder wrote:
 [...]
 [1] For example, what should/will happen if someone uses test_must_fail,
 test_might_fail, etc., within the test_fixture script? Should they simply
 be banned within a text_fixture?
 
 Why wouldn't they act just like they do in test_expect_success blocks?

Heh, well they do indeed act just like they do in text_expect_success blocks!

I spent only about 20 minutes writing test_fixture, playing with it, and then
deciding to shelve it for now. Again, I wanted a *quick* fix for the TAP
parse error, so that it would make it into v1.7.12. :(

Having now spent a further 30 minutes, I can see that I did a better job than
I thought! :-P

Actually, scratch that; rather I should say that Junio and the other authors
of the test infrastructure did such a good job (particularly with separation
of concerns), that I lucked into a good implementation.

I still haven't done any serious testing, so if I subsequently find any
problems, then the lousy implementation is my fault! ;-)

 FWIW I find Junio's test_setup name more self-explanatory.  What
 mnemonic should I be using to remember the _fixture name?

I don't have a problem with 'test_setup' either; test-fixture comes from the
various xUnit unit-test libraries. (I think Kent Beck et.al. wrote JUnit first
and then it was ported to various other languages. eg cppUnit for C++).

Briefly, a test-fixture provides a context or common environment, via code for
test setup and teardown, in which to run one or more tests.

HTH

ATB,
Ramsay Jones


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/2] test results for v1.7.12-rc0 on cygwin

2012-07-28 Thread Ramsay Jones
Hi Junio,

I actually tested v1.7.12-rc0-22-gcdd159b, thus:

$ time $(GIT_SKIP_TESTS='t0061.3 t0070.3 t9010 t9300' make test 
test-outp1 21)

real137m11.901s
user118m55.071s
sys 59m50.695s
$

the result, ignoring the explicity skipped tests (nothing new there), was
three additional failures:

t1100.5, t7502.21 and t7810.59

I will be sending a patch to fix t1100.5 (patch #1).

The failure in t7502-commit.sh seems to be the result of commit 8c5b1ae1
(ident: reject bogus email addresses with IDENT_STRICT, 24-05-2012)
being more strict with non fully qualified hostnames. I get the tell me
who you are message and the error:

fatal: unable to auto-detect email address (got 'ramsay@toshiba.(none)')

However, I *think* I saw that Jeff has submitted a fix for this already.

Unfortunately, I was unable to reproduce the final failure in t7810-grep.sh.
I tried, among other things, to provoke a failure thus:

$ for i in $(seq 100); do
 if ! ./t7810-grep.sh -i -v; then
 break;
 fi
 done
$

but, apart from chewing on the cpu for about 50 minutes, it didn't result
in a failure. :(

However, after looking at test 59, it seems to me to be a stale (redundant)
test. So, patch #2 removes that test! :-D [I wish I could reproduce the
failure because I don't like not knowing why it failed, but ...]

ATB,
Ramsay Jones
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] t1100-*.sh: Fix an intermittent test failure

2012-07-28 Thread Ramsay Jones

In particular, the final test ('flags and then non flags') fails
intermittently, depending on how much time elapsed between the
invocations of git commit-tree when creating the commits which
later have their commit id's compared. For example, if the commits
for childid-3 and childid-4 are created 1 or more seconds apart,
then the commits, which would otherwise be identical, will have
different commit id's.

In order to make the test reproducible, we remove the variability
by setting the author and committer times to a well defined state.
We accomplish this with a single call to 'test_tick' at the start
of the test.

Signed-off-by: Ramsay Jones ram...@ramsay1.demon.co.uk
---
 t/t1100-commit-tree-options.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/t1100-commit-tree-options.sh b/t/t1100-commit-tree-options.sh
index a3b7723..f8457f9 100755
--- a/t/t1100-commit-tree-options.sh
+++ b/t/t1100-commit-tree-options.sh
@@ -47,6 +47,7 @@ test_expect_success \
 
 
 test_expect_success 'flags and then non flags' '
+   test_tick 
echo comment text |
git commit-tree $(cat treeid) commitid 
echo comment text |
-- 
1.7.11.2

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] t7810-*.sh: Remove redundant test

2012-07-28 Thread Ramsay Jones

Since commit bbc09c22 (grep: rip out support for external grep,
12-01-2010), test number 60 (grep -C1 hunk mark between files) is
essentially the same as test number 59.

Test 59 was intended to verify the behaviour of git-grep resulting
from multiple invocations of an external grep. As part of the test,
it creates and adds 1024 files to the index, which is now wasted
effort.

Remove test 59, since it is now redundant.

Signed-off-by: Ramsay Jones ram...@ramsay1.demon.co.uk
---
 t/t7810-grep.sh | 11 ---
 1 file changed, 11 deletions(-)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 24e9b19..523d041 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -399,17 +399,6 @@ test_expect_success 'grep -q, silently report matches' '
test_cmp empty actual
 '
 
-# Create 1024 file names that sort between y and z to make sure
-# the two files are handled by different calls to an external grep.
-# This depends on MAXARGS in builtin-grep.c being 1024 or less.
-c32=0 1 2 3 4 5 6 7 8 9 a b c d e f g h i j k l m n o p q r s t u v
-test_expect_success 'grep -C1, hunk mark between files' '
-   for a in $c32; do for b in $c32; do : y-$a$b; done; done 
-   git add y-?? 
-   git grep -C1 ^[yz] actual 
-   test_cmp expected actual
-'
-
 test_expect_success 'grep -C1 hunk mark between files' '
git grep -C1 ^[yz] actual 
test_cmp expected actual
-- 
1.7.11.2

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/7] Change canonicalize_url() to use the SVN 1.7 API when available.

2012-07-28 Thread Michael G Schwern
On 2012.7.28 6:50 AM, Jonathan Nieder wrote:
 --- a/perl/Git/SVN/Utils.pm
 +++ b/perl/Git/SVN/Utils.pm
 [...]
 @@ -100,6 +102,20 @@ API as a URL.
  =cut
  
  sub canonicalize_url {
 +my $url = shift;
 +
 +# The 1.7 way to do it
 +if ( defined SVN::_Core::svn_uri_canonicalize ) {
 +return SVN::_Core::svn_uri_canonicalize($url);
 +}
 +# There wasn't a 1.6 way to do it, so we do it ourself.
 +else {
 +return _canonicalize_url_ourselves($url);
 +}
 +}
 +
 +
 +sub _canonicalize_url_ourselves {
  my ($url) = @_;
  $url =~ s#^([^:]+://[^/]*/)(.*)$#$1 . canonicalize_path($2)#e;
 
 Leaves me a bit nervous.

As it should, SVN dumped a mess on us.


 What effect should we expect this change to have?  Is our emulation
 of svn_uri_canonicalize already perfect and this change just a little
 futureproofing in case svn_uri_canonicalize gets even better, or is
 this a trap waiting to happen when new callers of canonicalize_url
 start relying on, e.g., %-encoding of special characters?

This change is *just* about sliding in the SVN API call and seeing if git-svn
still works with SVN 1.6.  It should have no effect on SVN 1.6.  These patches
are a very slow and careful refactoring doing just one thing at a time.  Every
time I tried to do too many things at once, tests broke and I had to tease the
patch apart.

At this point in the patch series the code is not ready for canonicalization.
 Until 3/8 in the next patch series, canonicalize_url() basically does nothing
on SVN 1.6 so the code has never had to deal with the problem.  3/8 deals with
improving _canonicalize_url_ourselves() to work more like
svn_uri_canonicalize() and thus turns on canonicalization for SVN 1.6 and
deals with the breakage.


 If I am reading Subversion r873487 correctly, in ancient times,
 svn_path_canonicalize() did the appropriate tweaking for URIs.  Today
 its implementation is comforting:
 
   const char *
   svn_path_canonicalize(const char *path, apr_pool_t *pool)
   {
 if (svn_path_is_url(path))
   return svn_uri_canonicalize(path, pool);
 else
   return svn_dirent_canonicalize(path, pool);
   }
 
 It might be easier to rely on that on pre-1.7 systems.

I didn't know about that.  I don't know what your SVN backwards compat
requirements are, but if that behavior goes back far enough in SVN to satisfy
you folks, then canonicalize_url() should fall back to
SVN::_Core::svn_path_canonicalize().  But try it at the end of the patch
series.  The code has to be prepared for canonicalization first.  Then how it
actually does it can be improved.


-- 
Defender of Lexical Encapsulation
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/7] Switch path canonicalization to use the SVN API.

2012-07-28 Thread Michael G Schwern
On 2012.7.28 6:55 AM, Jonathan Nieder wrote:
 Michael G. Schwern wrote:
 --- a/perl/Git/SVN/Utils.pm
 +++ b/perl/Git/SVN/Utils.pm
 @@ -86,6 +86,27 @@ sub _collapse_dotdot {
  
  
  sub canonicalize_path {
 +my $path = shift;
 +
 +# The 1.7 way to do it
 +if ( defined SVN::_Core::svn_dirent_canonicalize ) {
 +$path = _collapse_dotdot($path);
 +return SVN::_Core::svn_dirent_canonicalize($path);
 +}
 +# The 1.6 way to do it
 +elsif ( defined SVN::_Core::svn_path_canonicalize ) {
 +$path = _collapse_dotdot($path);
 +return SVN::_Core::svn_path_canonicalize($path);
 +}
 +# No SVN API canonicalization is available, do it ourselves
 +else {
 
 When would this else case trip?

When svn_path_canonicalize() does not exist in the SVN API, presumably because
their SVN is too old.


 Would it be safe to make it
 return an error message, or even to do something like the following?

I don't know what your SVN backwards compat requirements are, or when
svn_path_canonicalize() appears in the API, so I left it as is.  git-svn's
home rolled path canonicalization worked and its no work to leave it working.
 No reason to break it IMO.


   sub canonicalize_path {
   my $path = shift;
   $path = _collapse_dotdot($path);
 
   # Subversion 1.7 split svn_path_canonicalize() into
   # svn_dirent_canonicalize() and svn_uri_canonicalize().
   if (!defined SVN::_Core::svn_dirent_canonicalize) {
   return SVN::_Core::svn_path_canonicalize($path);
   }
 
   return SVN::_Core::svn_dirent_canonicalize($path);
   }

As a side note...
If they don't have Mars bar, get me a Twix.  Else get me a Mars bar.
If they have a Mars bar, get me one.  Else get me a Twix.


-- 
Look at me talking when there's science to do.
When I look out there it makes me glad I'm not you.
I've experiments to be run.
There is research to be done
On the people who are still alive.
-- Jonathan Coulton, Still Alive
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/7] Make Git::SVN and Git::SVN::Ra canonicalize paths and urls.

2012-07-28 Thread Michael G Schwern
On 2012.7.28 7:11 AM, Jonathan Nieder wrote:
 Yay!  Am I correct in imagining this makes the following sequence of
 commands[1] no longer trip an assertion failure in svn_path_join[2]
 with SVN 1.6?
 
   git svn init -Thttp://trac-hacks.org/svn/tagsplugin/trunk \
   -thttp://trac-hacks.org/svn/tagsplugin/tags \
   -bhttp://trac-hacks.org/svn/tagsplugin/branches
   git svn fetch

Works For Me™!

  ...
  Checked out HEAD:
http://trac-hacks.org/svn/tagsplugin/trunk r11776


-- 
Insulting our readers is part of our business model.
http://somethingpositive.net/sp07122005.shtml
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/7] Change canonicalize_url() to use the SVN 1.7 API when available.

2012-07-28 Thread Jonathan Nieder
Michael G Schwern wrote:
 On 2012.7.28 6:50 AM, Jonathan Nieder wrote:

 If I am reading Subversion r873487 correctly, in ancient times,
 svn_path_canonicalize() did the appropriate tweaking for URIs.  Today
 its implementation is comforting:
 
  const char *
  svn_path_canonicalize(const char *path, apr_pool_t *pool)
  {
if (svn_path_is_url(path))
  return svn_uri_canonicalize(path, pool);
else
  return svn_dirent_canonicalize(path, pool);
  }
[...]
 I didn't know about that.  I don't know what your SVN backwards compat
 requirements are, but if that behavior goes back far enough in SVN to satisfy
 you folks, then canonicalize_url() should fall back to
 SVN::_Core::svn_path_canonicalize().

svn_path_canonicalize() has been usable for this kind of thing since
SVN 1.1, possibly earlier.

  But try it at the end of the patch
 series.  The code has to be prepared for canonicalization first.  Then how it
 actually does it can be improved.

Since this part of the series is not tested with SVN 1.7, this is
basically adding dead code, right?  That could be avoided by
reordering the changes to keep canonicalize_url as-is until later in
the series when the switchover is safe.

Thanks.  Will play around with this code more.

Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/8] SVN 1.7 will truncate not-a%40{0} to just not-a.

2012-07-28 Thread Michael G Schwern
On 2012.7.28 7:16 AM, Jonathan Nieder wrote:
 Michael G. Schwern wrote:
 
 Rather than guess what SVN is going to do for each version, make the test use
 the branch name that was actually created.
 [...]
 -git rev-parse refs/remotes/not-a%40{0}reflog
 +git rev-parse refs/remotes/$non_reflog
 
 Doesn't this defeat the point of the testcase (checking that git-svn
 is able to avoid creating git refs containing @{, following the rules
 from git-check-ref-format(1))?

Unless I messed up, entirely possible as I'm not a shell programmer, the test
is still useful for testing SVN 1.6.  Under SVN 1.6 $non_reflog should be
'not-a%40{0}reflog' as before.


 Do you know when SVN truncates the directory name?

IIRC its silently does it during the svn cp.


 Would historical
 SVN repositories or historical SVN servers be able to have a directory
 named with a %40 in it, or has this been disallowed completely,
 leaving problematic historical repositories to be dumped with old SVN,
 tweaked, and reloaded with new SVN?

Dunno, lemme check...

$ source ~/bin/svn16
$ svnadmin --version
svnadmin, version 1.6.18 (r1303927)
...
$ svnadmin create svnrepo
$ mkdir project project/trunk project/branches project/tags
$ echo foo  project/trunk/foo
$ svn import -m 'test import' project
file:///Users/schwern/tmp/test/svnrepo/project
Adding project/tags
Adding project/trunk
Adding project/trunk/foo
Adding project/branches

Committed revision 1.
$ rm -rf project/
$ svn cp -m 'reflog' file:///Users/schwern/tmp/test/svnrepo/project/trunk
'file:///Users/schwern/tmp/test/svnrepo/project/branches/not-a%40{0}reflog'

Committed revision 2.
$ svn ls file:///Users/schwern/tmp/test/svnrepo/project/branches
not-a@{0}reflog/
$ source ~/bin/svn17
$ svn --version
svn, version 1.7.5 (r1336830)
...
$ svn ls file:///Users/schwern/tmp/test/svnrepo/project/branches
not-a@{0}reflog/

If you make it with SVN 1.6 its still there with SVN 1.7.  That's good, it
means you can ship a prebuilt repository and check it against SVN 1.7.

The bad news is the new code segfaults on it.  I don't know if that's the SVN
1.7 API choking on its own stuff or because of my changes or both.  If you set
up the test I can try and fix it.  Otherwise I'll just flounder in shell.


-- 
I went to college, which is a lot like being in the Army, except when
 stupid people yell at me for stupid things, I can hit them.
-- Jonathan Schwarz
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/7] Change canonicalize_url() to use the SVN 1.7 API when available.

2012-07-28 Thread Michael G Schwern
On 2012.7.28 12:30 PM, Jonathan Nieder wrote:
 I didn't know about that.  I don't know what your SVN backwards compat
 requirements are, but if that behavior goes back far enough in SVN to satisfy
 you folks, then canonicalize_url() should fall back to
 SVN::_Core::svn_path_canonicalize().
 
 svn_path_canonicalize() has been usable for this kind of thing since
 SVN 1.1, possibly earlier.

Great!  Then _canonicalize_url_ourselves() can probably be replaced with that.
 Just take my advice and do it after 1.7 is working and the code is ready for
canonicalization.


  But try it at the end of the patch
 series.  The code has to be prepared for canonicalization first.  Then how it
 actually does it can be improved.
 
 Since this part of the series is not tested with SVN 1.7, this is
 basically adding dead code, right?  That could be avoided by
 reordering the changes to keep canonicalize_url as-is until later in
 the series when the switchover is safe.

I would suggest that worrying whether a few lines of code are introduced now
or 10 patches later in the same branch which is all going to be merged in one
go (and retesting the patches after it) is not the most important thing.  The
code needs humans looking over it and deciding if canonicalizations were
missed or applied inappropriately.  Or hey, work on that path and url object
idea that makes a lot of real code mess go away.


-- 
ROCKS FALL! EVERYONE DIES!
http://www.somethingpositive.net/sp05032002.shtml
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/7] Change canonicalize_url() to use the SVN 1.7 API when available.

2012-07-28 Thread Jonathan Nieder
Jonathan Nieder wrote:
 Michael G Schwern wrote:

 I would suggest that worrying whether a few lines of code are introduced now
 or 10 patches later in the same branch which is all going to be merged in one
 go (and retesting the patches after it) is not the most important thing.
[...]
 In that case they should be one patch, I'd think.

 The advantage of introducing changes gradually is that (1) the changes
 can be examined and tested one at a time, and (2) if later a change
 proves to be problematic, it can be isolated, understood, and fixed
 more easily.  The strategy you are suggesting would have neither of
 those advantages.

(To avoid confusion: by The strategy you are suggesting I mean
introducing dead code first and activating it later, not the path and
url object idea.  The path and url object approach would be very
nice. :))

Sorry for the lack of clarity.
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/7] Change canonicalize_url() to use the SVN 1.7 API when available.

2012-07-28 Thread Michael G Schwern
On 2012.7.28 1:02 PM, Jonathan Nieder wrote:
 Jonathan Nieder wrote:
 Michael G Schwern wrote:
 
 I would suggest that worrying whether a few lines of code are introduced now
 or 10 patches later in the same branch which is all going to be merged in 
 one
 go (and retesting the patches after it) is not the most important thing.
 [...]
 In that case they should be one patch, I'd think.

 The advantage of introducing changes gradually is that (1) the changes
 can be examined and tested one at a time, and (2) if later a change
 proves to be problematic, it can be isolated, understood, and fixed
 more easily.  The strategy you are suggesting would have neither of
 those advantages.
 
 (To avoid confusion: by The strategy you are suggesting I mean
 introducing dead code first and activating it later, not the path and
 url object idea.  The path and url object approach would be very
 nice. :))

If this is all a topic branch then it doesn't matter much whether a couple
lines of code is introduced at patch 8 of a branch or patch 13.  Sure, it
matters a little, but...
https://secure.wikimedia.org/wikipedia/en/wiki/Opportunity_cost

If it *isn't* going in a topic branch, if its not visible as a collected work
in history, if its going to be rebased on top of master, then yeah I can see
why you're so concerned.


-- 
Alligator sandwich, and make it snappy!
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


bug (?) in send email

2012-07-28 Thread Christoph Miebach

Hello!

send-email (tested versions 1.7.9.2 and 1.7.10.4) breaks email addresses.

Steps to reproduce:


Modify file.

git commit --author=Michał Tz name_1...@some.com modified.file -m Test

git format-patch -o patches origin

Now, the patch seems to have the address right, see [1]

git send-email  --to myown.addr...@mail.com --suppress-cc=author 
patches/0001-Test.patch


But checking my inbox now shows an email starting with:
From: Michał Tz name 1...@some.com

So the address is splitted at the underscore.

Furthermore, if I don't use --suppress-cc=author, the CC field shows the 
right address.


Regards

Christoph

[1]
less patches/0001-Test.patch
From: =?UTF-8?q?Micha=C5=82=20Tz?= name_1...@some.com

git show
Author: Michał Tz name_1...@some.com
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Enhancements to git-protocoll

2012-07-28 Thread Fredrik Gustafsson
Hi,
sometimes git communicates with something that's not git on the other
side (gitolite and github for example).

Sometimes the server wants to communicate directly to the git user.

git isn't really designed for this. gitolite solves this by do user
interaction on STDERR instead. The bad thing about this is that it can
only be one-direction communication, for example error messages.

If git would allow for the user to interact direct with the server, a
lot of cool and and userfriendly features could be developed.

For example:
gitolite has something called wild repos[1]. The management is
cumbersome and if you misspell when you clone a repo you might instead
create a new repo.

This could have been avoided with a simply:
Do you want to create a new repo[Yn]

To fix this, git protocol should have a command for printing input to
STDOUT and accepting input on STDIN which should be sent to the server.
And a command to switch back to orginal of course.

The server could then switch to user interaction, do that and then
switch back to normal operation.

Before eventually starting to implement this, I would know your
opinions. This feature would be wortless if it's not in the official git.

[1] http://sitaramc.github.com/gitolite/wild.html
-- 
Med vänliga hälsningar
Fredrik Gustafsson

tel: 0733-608274
e-post: iv...@iveqy.com
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] log: remove redundant check for merge commit

2012-07-28 Thread Martin von Zweigbergk
Sorry, I meant to CC the list. See below.

On Sat, Jul 28, 2012 at 9:56 PM, Martin von Zweigbergk
martin.von.zweigbe...@gmail.com wrote:
 On Fri, Jul 27, 2012 at 11:52 PM, Junio C Hamano gits...@pobox.com wrote:
 Junio C Hamano gits...@pobox.com writes:

 It seems to have some interaction with your other topic, though.
 These two patches alone will pass the existing tests, but merging it
 with mz/rebase-range breaks t3412.  I didn't look into it, but
 perhaps this breaks git cherry in some way?

 Yes, it breaks git cherry quite badly, by not ignoring merges at
 all. I incorrectly assumed that ignore_merges was about revision
 traversal, but now I think it's only diff output from 'git log' (and
 possibly others). What I think tricked me was seeing that
 ignore_merges = 1 closely followed by a comment saying ignore
 merges. But now I think the explicit code to ignore merges is
 necessary (as show by the failing test case), but can be replaced by
 rev_info.max_parents = 1. Setting ignore_merges = 1, OTOH, now
 seems doubly redundant: not only does it set the same value as was set
 in init_revisions, but it's also irrelevant. Since cherry doesn't
 generate any diff output, I think ignore_merges is never used.
 Flipping the values of all of ignore_merges, combine_merges and
 diff does not have any effect on test cases at least. I hope my
 explanation makes some sense at least...

 I'll send a reroll when I get time.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html