git-svn: Use prefix by default?

2013-08-22 Thread Thomas Ferris Nicolaisen
I've recently been forced back into using git-svn, and while I was at
it, I noticed that git-svn generally behaves a lot better when it is
initialized using the --prefix option.

For example, I make a standard-layout svn clone:

$ git svn clone -s https://svn.company.com/repos/project-foo/

.. and end up with this .gitconfig:

[svn-remote svn]
url = https://svn.company.com/repos/
fetch = project-foo/trunk:refs/remotes/trunk
branches = project-foo/branches/*:refs/remotes/*
tags = project-foo/tags/*:refs/remotes/tags/*

And my remote branches looks like this:
remotes/trunk
remotes/feat-bar

Now, let's say I want to work on the feat-bar branch, so I attempt to
create a tracking branch by the same name:

$ git checkout feat-bar  # will detach head
$ git checkout remotes/feat-bar  # will detach head
$ git checkout -t remotes/feat-bar # fatal: Missing branch name; try -b
$ git checkout -tb remotes/feat-bar # Branch remotes/feat-bar set up
to track local branch master.

Well, that's not what I wanted.. So I end up doing it the good
old-fashioned way:

$ git checkout -tb feat-bar remotes/feat-bar # works

Now I am up and rolling, but I get this warning with every checkout or rebase:

 warning: refname 'feat-bar' is ambiguous.

So, let's see what happens when I create the svn clone using a
--prefix=mirror/ instead. Here's the config:

[svn-remote svn]
url = https://svn.company.com/repos/
fetch = project-foo/trunk:refs/remotes/mirror/trunk
branches = project-foo/branches/*:refs/remotes/mirror/*
tags = project-foo/tags/*:refs/remotes/mirror/tags/*

Here are my remote branches:
 remotes/mirror/trunk
 remotes/mirror/feat-bar

Let's create the tracking branch:

$ git checkout feat-bar  # Branch feat-bar set up to track remote
branch feat-bar from mirror.

Voila, worked on the first try. And no more warnings about ambiguous refnames.

So now my question is: If using a prefix is so healthy for git's
branch tracking conventions, why doesn't git-svn default to use some
prefix like 'origin' or something when initializing a git svn clone?

The examples were made using 1.8.1.2 on OSX by the way.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCHv6 1/2] repack: rewrite the shell script in C

2013-08-22 Thread Johannes Sixt

Am 22.08.2013 00:15, schrieb Stefan Beller:

On 08/21/2013 10:56 PM, Junio C Hamano wrote:

Stefan Beller stefanbel...@googlemail.com writes:

+static int delta_base_offset = 1;
+char *packdir;


Does this have to be global?


As the path is pretty obvious (get_object_directory() + /pack),
we could however also construct it again in the signal handler.


I would advise against doing that. The recomputation would call malloc(), 
which is not async-signal-safe. (It would not be the first case where we 
call forbidden functions from signal handlers, but we need not pile more 
on top of them.)


-- Hannes

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] repack: rewrite the shell script in C.

2013-08-22 Thread Johannes Sixt

Am 21.08.2013 10:49, schrieb Matthieu Moy:

Stefan Beller stefanbel...@googlemail.com writes:

+   for_each_string_list_item(item, names) {
+   for (ext = 0; ext  2; ext++) {
+   char *fname, *fname_old;
+   fname = mkpathdup(%s/%s%s, packdir, item-string, 
exts[ext]);
+   if (!file_exists(fname)) {
+   free(fname);
+   continue;
+   }
+
+   fname_old = mkpath(%s/old-%s%s, packdir, 
item-string, exts[ext]);
+   if (file_exists(fname_old))
+   unlink(fname_old);


Unchecked returned value.


Good catch! The original was 'rm -f ...  mv ... || failed=t'


+   /* Now the ones with the same name are out of the way... */
+   for_each_string_list_item(item, names) {
+   for (ext = 0; ext  2; ext++) {
+   char *fname, *fname_old;
+   struct stat statbuffer;
+   fname = mkpathdup(%s/pack-%s%s, packdir, 
item-string, exts[ext]);
+   fname_old = mkpath(%s-%s%s, packtmp, item-string, 
exts[ext]);
+   if (!stat(fname_old, statbuffer)) {
+   statbuffer.st_mode = ~S_IWUSR | ~S_IWGRP | 
~S_IWOTH;
+   chmod(fname_old, statbuffer.st_mode);


Unchecked return value.


The original was an unchecked 'chmod a-w', so we don't care.

Of course, we could mimic the original better by issuing warnings.




+   /* Remove the old- files */
+   for_each_string_list_item(item, names) {
+   char *fname;
+   fname = mkpath(%s/old-pack-%s.idx, packdir, item-string);
+   if (remove_path(fname))
+   die_errno(_(removing '%s' failed), fname);
+
+   fname = mkpath(%s/old-pack-%s.pack, packdir, item-string);
+   if (remove_path(fname))
+   die_errno(_(removing '%s' failed), fname);


Does this have to be a fatal error? If I read correctly, it wasn't fatal
in the shell version.


Good catch.

-- Hannes

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] repack: rewrite the shell script in C.

2013-08-22 Thread Johannes Sixt

Am 21.08.2013 15:07, schrieb Matthieu Moy:

Stefan Beller stefanbel...@googlemail.com writes:


But as these follow up changes heavily rely on the very first patch
I will first try to get that right, meaning accepted into pu.
Then I can send patches with these proposals such as making more
functions.


I think it's better to get the style right before, to avoid doubling the
review effort (review a hard-to-review patch first, and then re-review a
style-fix one).


If by style fix you mean coding style fix, I agree.

But, IMO, refactoring the long function can wait because the long function 
is easier to compare to the shell script, and I think that is more 
important later when you need to dig the history.


It is already too late to save review effort.

-- Hannes

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Document the HTTP transport protocols

2013-08-22 Thread Duy Nguyen
On Thu, Aug 22, 2013 at 5:00 AM, Jeff King p...@peff.net wrote:
 On Wed, Aug 21, 2013 at 08:45:13PM +0700, Nguyen Thai Ngoc Duy wrote:

  On the topic, C Git's (maybe) violations on this spec are:

   - The client does not strip trailing slashes from $GIT_URL before
 sending to the server, as described in section URL Format.

 Yeah. We get the basic gist right by not adding an extra / if there is
 already a trailing slash (so you do not have http://host/path//info/refs;).
 But we do not go out of our way to remove multiple slashes that the user
 hands out (either at the end or in the middle of the URL). I doubt that
 it matters in practice.

It may make writing rewrite/matching patterns in http server a tiny
bit harder, but should not be a big deal. I agree with Junio this
could be something a new contributor can work on to get familiar with
(scary, imo) transport/connect code.

I agree with the rest of your comments that those violations do not
matter much (when I raised them I did not mean fix Git, just
checking if I missed anything) and the document is ok as-is.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] index-pack: optionally reject packs with duplicate objects

2013-08-22 Thread Duy Nguyen
On Thu, Aug 22, 2013 at 3:52 AM, Jeff King p...@peff.net wrote:
 @@ -68,6 +81,16 @@ const char *write_idx_file(const char *index_name, struct 
 pack_idx_entry **objec
 else
 sorted_by_sha = list = last = NULL;

 +   if (opts-duplicates == WRITE_IDX_DUPLICATES_REJECT) {
 +   struct pack_idx_entry **dup;
 +
 +   dup = find_duplicate(sorted_by_sha, nr_objects,
 +sizeof(*sorted_by_sha), sha1_compare);
 +   if (dup)
 +   die(pack has duplicate entries for %s,
 +   sha1_to_hex((*dup)-sha1));
 +   }
 +
 if (opts-flags  WRITE_IDX_VERIFY) {
 assert(index_name);
 f = sha1fd_check(index_name);

write_idx_file() is called after index-pack processes all delta
objects. Could resolve_deltas() go cyclic with certain duplicate
object setup?
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] index-pack: optionally reject packs with duplicate objects

2013-08-22 Thread Jeff King
On Thu, Aug 22, 2013 at 08:45:19PM +0700, Nguyen Thai Ngoc Duy wrote:

 On Thu, Aug 22, 2013 at 3:52 AM, Jeff King p...@peff.net wrote:
  @@ -68,6 +81,16 @@ const char *write_idx_file(const char *index_name, 
  struct pack_idx_entry **objec
  else
  sorted_by_sha = list = last = NULL;
 
  +   if (opts-duplicates == WRITE_IDX_DUPLICATES_REJECT) {
  +   struct pack_idx_entry **dup;
  +
  +   dup = find_duplicate(sorted_by_sha, nr_objects,
  +sizeof(*sorted_by_sha), sha1_compare);
  +   if (dup)
  +   die(pack has duplicate entries for %s,
  +   sha1_to_hex((*dup)-sha1));
  +   }
  +
  if (opts-flags  WRITE_IDX_VERIFY) {
  assert(index_name);
  f = sha1fd_check(index_name);
 
 write_idx_file() is called after index-pack processes all delta
 objects. Could resolve_deltas() go cyclic with certain duplicate
 object setup?

Good question. I'm not sure. I'll check it out.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH] git-remote-mediawiki: reset private ref after non-dumb push

2013-08-22 Thread Felipe Contreras
On Wed, Aug 21, 2013 at 4:36 PM, Matthieu Moy
matthieu@grenoble-inp.fr wrote:
 Felipe Contreras felipe.contre...@gmail.com writes:

 On Tue, Aug 13, 2013 at 8:31 AM, Matthieu Moy matthieu@imag.fr wrote:

 Felipe: Is this the right fix for git-remote-mediawiki? Any better idea?

 Why not keep track of the revisions yourself? You can have file where
 you store which was the last revision that was fetched.

 I don't really understand the point of the private namespace anymore I
 guess. Why do we have both refs/remotes/$remote and
 refs/$foreign_vcs/$remote, if they are always kept in sync?

They are not always in sync; if a push fails, the private namespace is
not updated.

 Keeping the last imported revision in a separate file would be possible,
 but then we'd have information about the remote in one file plus two
 refs, and I don't understand why we need to split the information in so
 many places. A ref seemed the right tool to store a revision.

As I said, they are not exactly the same. It is possible refs/remotes
point to a mercurial revision on the remote server, and refs/hg points
to a mercurial revision on the local internal repository, and they are
not the same.

-- 
Felipe Contreras
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] t3404 incremental improvements

2013-08-22 Thread Eric Sunshine
On Wed, Aug 21, 2013 at 7:25 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 This set of patches was meant to be a re-roll of [1] addressing Junio's
 comments, however [1] graduated to 'next' before I found time to work on
 it further, so these are instead incremental patches atop 'next'.

 Just FYI, 'next' will be rewound once the upcoming release is done,
 so we have a chance to rewind and squash.

How would we go about this? Is there something I can do to streamline
the squashing?

Unfortunately, the various fix-up patches do not have a one-to-one
correspondence to the original three patches in 'next'.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] t3404 incremental improvements

2013-08-22 Thread Junio C Hamano
Eric Sunshine sunsh...@sunshineco.com writes:

 On Wed, Aug 21, 2013 at 7:25 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 This set of patches was meant to be a re-roll of [1] addressing Junio's
 comments, however [1] graduated to 'next' before I found time to work on
 it further, so these are instead incremental patches atop 'next'.

 Just FYI, 'next' will be rewound once the upcoming release is done,
 so we have a chance to rewind and squash.

 How would we go about this? Is there something I can do to streamline
 the squashing?

 Unfortunately, the various fix-up patches do not have a one-to-one
 correspondence to the original three patches in 'next'.

The most stream-lined way would be to send a replacement series
early next week, by which time hopefully the 1.8.4 final is out; as
long as the end-results of applying the series are the same, we know
that the new code we will be using is the same as the code already
in 'next' that people have been testing.

That way, there is no risk of me screwing up while trying to wiggle
the existing patches and ending up with a split that do not match a
logical progression of the series you would expect to see.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] repack: rewrite the shell script in C (squashing proposal)

2013-08-22 Thread Stefan Beller
This patch is meant to be squashed into bb4335a21441a0
(repack: rewrite the shell script in C), I'll do so when rerolling
the series. For reviewing I'll just send this patch.

* Remove comments, which likely get out of date (authorship is kept in
  git anyway)
* rename get_pack_filenames to get_non_kept_pack_filenames
* catch return value of unlink and fail as the shell version did
* beauty fixes to remove_temporary_files as Junio proposed
* install signal handling after static variables packdir, packtmp are set
* remove adding the empty string to the buffer.
* fix the rollback mechanism (wrong variable name)

Signed-off-by: Stefan Beller stefanbel...@googlemail.com
---
 builtin/repack.c | 78 ++--
 1 file changed, 36 insertions(+), 42 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index 1f13e0d..e0d1f17 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -1,8 +1,3 @@
-/*
- * The shell version was written by Linus Torvalds (2005) and many others.
- * This is a translation into C by Stefan Beller (2013)
- */
-
 #include builtin.h
 #include cache.h
 #include dir.h
@@ -13,9 +8,8 @@
 #include string-list.h
 #include argv-array.h
 
-/* enabled by default since 22c79eab (2008-06-25) */
 static int delta_base_offset = 1;
-char *packdir;
+static char *packdir, *packtmp;
 
 static const char *const git_repack_usage[] = {
N_(git repack [options]),
@@ -41,18 +35,16 @@ static void remove_temporary_files(void)
DIR *dir;
struct dirent *e;
 
-   /* .git/objects/pack */
-   strbuf_addstr(buf, get_object_directory());
-   strbuf_addstr(buf, /pack);
-   dir = opendir(buf.buf);
-   if (!dir) {
-   strbuf_release(buf);
+   dir = opendir(packdir);
+   if (!dir)
return;
-   }
 
-   /* .git/objects/pack/.tmp-$$-pack-* */
+   strbuf_addstr(buf, packdir);
+
+   /* dirlen holds the length of the path before the file name */
dirlen = buf.len + 1;
-   strbuf_addf(buf, /.tmp-%d-pack-, (int)getpid());
+   strbuf_addf(buf, %s, packtmp);
+   /* prefixlen holds the length of the prefix */
prefixlen = buf.len - dirlen;
 
while ((e = readdir(dir))) {
@@ -73,11 +65,16 @@ static void remove_pack_on_signal(int signo)
raise(signo);
 }
 
-static void get_pack_filenames(struct string_list *fname_list)
+/*
+ * Adds all packs hex strings to the fname list, which do not
+ * have a corresponding .keep file.
+ */
+static void get_non_kept_pack_filenames(struct string_list *fname_list)
 {
DIR *dir;
struct dirent *e;
char *fname;
+   size_t len;
 
if (!(dir = opendir(packdir)))
return;
@@ -86,7 +83,7 @@ static void get_pack_filenames(struct string_list *fname_list)
if (suffixcmp(e-d_name, .pack))
continue;
 
-   size_t len = strlen(e-d_name) - strlen(.pack);
+   len = strlen(e-d_name) - strlen(.pack);
fname = xmemdupz(e-d_name, len);
 
if (!file_exists(mkpath(%s/%s.keep, packdir, fname)))
@@ -95,14 +92,14 @@ static void get_pack_filenames(struct string_list 
*fname_list)
closedir(dir);
 }
 
-static void remove_redundant_pack(const char *path, const char *sha1)
+static void remove_redundant_pack(const char *path_prefix, const char *hex)
 {
const char *exts[] = {.pack, .idx, .keep};
int i;
struct strbuf buf = STRBUF_INIT;
size_t plen;
 
-   strbuf_addf(buf, %s/%s, path, sha1);
+   strbuf_addf(buf, %s/%s, path_prefix, hex);
plen = buf.len;
 
for (i = 0; i  ARRAY_SIZE(exts); i++) {
@@ -115,15 +112,14 @@ static void remove_redundant_pack(const char *path, const 
char *sha1)
 int cmd_repack(int argc, const char **argv, const char *prefix)
 {
const char *exts[2] = {.idx, .pack};
-   char *packtmp;
struct child_process cmd;
struct string_list_item *item;
struct argv_array cmd_args = ARGV_ARRAY_INIT;
struct string_list names = STRING_LIST_INIT_DUP;
-   struct string_list rollback = STRING_LIST_INIT_DUP;
+   struct string_list rollback = STRING_LIST_INIT_NODUP;
struct string_list existing_packs = STRING_LIST_INIT_DUP;
struct strbuf line = STRBUF_INIT;
-   int count_packs, ext, ret;
+   int nr_packs, ext, ret, failed;
FILE *out;
 
/* variables to be filled by option parsing */
@@ -173,11 +169,11 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
argc = parse_options(argc, argv, prefix, builtin_repack_options,
git_repack_usage, 0);
 
-   sigchain_push_common(remove_pack_on_signal);
-
packdir = mkpathdup(%s/pack, get_object_directory());
packtmp = mkpathdup(%s/.tmp-%d-pack, packdir, (int)getpid());
 
+   sigchain_push_common(remove_pack_on_signal);
+

[PATCHv2] git-diff: Clarify operation when not inside a repository.

2013-08-22 Thread Dale R. Worley
Clarify documentation for git-diff:  State that when not inside a
repository, --no-index is implied (and thus two arguments are
mandatory).

Clarify error message from diff-no-index to inform user that CWD is
not inside a repository and thus two arguments are mandatory.

Signed-off-by: Dale Worley wor...@ariadne.com
---


The error message has been updated from [PATCH].  git diff outside a
repository now produces:

Not a git repository
To compare two paths outside a working tree:
usage: git diff [--no-index] path path

This should inform the user of his error regardless of whether he
intended to perform a within-repository git diff or an
out-of-repository git diff.

This message is closer to the message that other Git commands produce:

fatal: Not a git repository (or any parent up to mount parent )
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).

git diff --no-index produces the same message as before (since the
user is clearly invoking the non-repository behavior):

usage: git diff --no-index path path

Regarding the change to git-diff.txt, perhaps forced ... by executing
'git diff' outside of a working tree is not the best wording, but it
should be clear to the reader that (1) it is possible to execute 'git
diff' outside of a working tree, and (2) when doing so, the behavior
will be as if '--no-index' was specified.

I've also added some comments for the new code.


 Documentation/git-diff.txt |3 ++-
 diff-no-index.c|   12 +++-
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt
index 78d6d50..9f74989 100644
--- a/Documentation/git-diff.txt
+++ b/Documentation/git-diff.txt
@@ -31,7 +31,8 @@ two blob objects, or changes between two files on disk.
 +
 If exactly two paths are given and at least one points outside
 the current repository, 'git diff' will compare the two files /
-directories. This behavior can be forced by --no-index.
+directories. This behavior can be forced by --no-index or by 
+executing 'git diff' outside of a working tree.
 
 'git diff' [--options] --cached [commit] [--] [path...]::
 
diff --git a/diff-no-index.c b/diff-no-index.c
index e66fdf3..9734ec3 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -215,9 +215,19 @@ void diff_no_index(struct rev_info *revs,
 path_inside_repo(prefix, argv[i+1])))
return;
}
-   if (argc != i + 2)
+   if (argc != i + 2) {
+   if (!no_index) {
+   /* There was no --no-index and there were not two
+* paths.  It is possible that the user intended
+* to do an inside-repository operation. */
+   fprintf(stderr, Not a git repository\n);
+   fprintf(stderr,
+   To compare two paths outside a working 
tree:\n);
+   }
+   /* Give the usage message for non-repository usage and exit. */
usagef(git diff %s path path,
   no_index ? --no-index : [--no-index]);
+   }
 
diff_setup(revs-diffopt);
for (i = 1; i  argc - 2; ) {
-- 
1.7.7.6

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] repack: rewrite the shell script in C (squashing proposal)

2013-08-22 Thread Junio C Hamano
Stefan Beller stefanbel...@googlemail.com writes:

 @@ -41,18 +35,16 @@ static void remove_temporary_files(void)
   DIR *dir;
   struct dirent *e;
  
 + dir = opendir(packdir);
 + if (!dir)
   return;
  
 + strbuf_addstr(buf, packdir);
 +
 + /* dirlen holds the length of the path before the file name */
   dirlen = buf.len + 1;
 + strbuf_addf(buf, %s, packtmp);
 + /* prefixlen holds the length of the prefix */

Thanks to the name of the variable that is self-describing, this
comment does not add much value.

But it misses the whole point of my suggestion in the earlier
message to phrase these like so:

/* Point at the slash at the end of .../objects/pack/ */
dirlen = strlen(packdir) + 1;
/* Point at the dash at the end of .../.tmp-%d-pack- */
prefixlen = buf.len - dirlen;

to clarify what the writer considers as the prefix is, which may
be quite different from what the readers think the prefix is.  In
.tmp-2342-pack-0d8beaa5b76e824c9869f0d1f1b19ec7acf4982f.pack, is
the prefix .tmp-2342-, .tmp-2342-pack, or .tmp-2342-pack-?

  int cmd_repack(int argc, const char **argv, const char *prefix)
  {
 ...
   packdir = mkpathdup(%s/pack, get_object_directory());
   packtmp = mkpathdup(%s/.tmp-%d-pack, packdir, (int)getpid());
  
 + sigchain_push_common(remove_pack_on_signal);
 +
   argv_array_push(cmd_args, pack-objects);
   argv_array_push(cmd_args, --keep-true-parents);
   argv_array_push(cmd_args, --honor-pack-keep);
 ...
 + rollback_failure.items[i].string,
 + rollback_failure.items[i].string);
   }
   exit(1);
   }

The scripted version uses

trap 'rm -f $PACKTMP-*' 0 1 2 3 15

so remove_temporary_files() needs to be called before exiting from
the program without getting killed by a signal.

Thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Eric Sunshine
On Wed, Aug 21, 2013 at 5:41 PM, Junio C Hamano gits...@pobox.com wrote:
 Brian Gernhardt br...@gernhardtsoftware.com writes:

 With 2eac2a4: ls-files -k: a directory only can be killed if the index has 
 a non-directory applied, t3010 fails test 3 validate git ls-files -k 
 output.  It ends up missing the pathx/ju/nk file.

 OS X 10.8.4
 Xcode 4.6.3
 clang Apple LLVM version 4.2 (clang-425.0.28) (based on LLVM 3.2svn)

 Very interesting, as it obviously does not reproduce for me.

I can confirm this failure on OS X, however, I am somewhat confused by
the follow-up t3010 changes in 3c56875176390eee. Are the t3010 changes
supposed to fail without 2eac2a4cc4bdc8d7 applied? For me, on Linux,
the tests succeed whether 2eac2a4cc4bdc8d7 is applied or not. On OS X,
the tests succeed without 2eac2a4cc4bdc8d7 but fail with it applied.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2] git-diff: Clarify operation when not inside a repository.

2013-08-22 Thread Junio C Hamano
wor...@alum.mit.edu (Dale R. Worley) writes:

 The error message has been updated from [PATCH].  git diff outside a
 repository now produces:

 Not a git repository
 To compare two paths outside a working tree:
 usage: git diff [--no-index] path path

 This should inform the user of his error regardless of whether he
 intended to perform a within-repository git diff or an
 out-of-repository git diff.

 This message is closer to the message that other Git commands produce:

 fatal: Not a git repository (or any parent up to mount parent )
 Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).

 git diff --no-index produces the same message as before (since the
 user is clearly invoking the non-repository behavior):

 usage: git diff --no-index path path

The above result looks good and I find the reasoning stated here
very sound.

 Regarding the change to git-diff.txt, perhaps forced ... by executing
 'git diff' outside of a working tree is not the best wording, but it
 should be clear to the reader that (1) it is possible to execute 'git
 diff' outside of a working tree, and (2) when doing so, the behavior
 will be as if '--no-index' was specified.

Then perhaps we can avoid the confusing forced by phrasing it like
so?

This behaviour can be forced by --no-index.  Also 'git diff
path path' outside of a working tree can be used to compare
two named paths.

Let's step back a bit, though.  The original text is:

'git diff' [--options] [--] [path...]::

This form is to view the changes you made relative to
the index (staging area for the next commit).  In other
words, the differences are what you _could_ tell Git to
further add to the index but you still haven't.  You can
stage these changes by using linkgit:git-add[1].
+
If exactly two paths are given and at least one points outside
the current repository, 'git diff' will compare the two files /
directories. This behavior can be forced by --no-index.

which _primarily_ explains how the index and the working tree
contents are compared, but also mixes the description of the
--no-index hack, which is quite different.  As its name suggests,
it is not about comparing with the index---in fact, it is not even
about Git at all.  Just a pair of random paths that do not have
anything to do with Git are compared.

I suspect that it may be a good idea to split the section altogether
to reduce confusion like what triggered this thread, e.g.

'git diff' [--options] [--] [path...]::

This form is to view the changes you made relative to
the index (staging area for the next commit).  In other
words, the differences are what you _could_ tell Git to
further add to the index but you still haven't.  You can
stage these changes by using linkgit:git-add[1].

'git diff' --no-index [--options] [--] path path::

This form is to compare the given two paths on the
filesystem.  When run in a working tree controlled by
Git, if at least one of the paths points outside the
working tree, or when run outside a working tree
controlled by Git, you can omit the `--no-index` option.

For now, I'll queue your version as-is modulo style fixes, while
waiting for others to help polishing the documentation better.

 I've also added some comments for the new code.

Thanks.

  Documentation/git-diff.txt |3 ++-
  diff-no-index.c|   12 +++-
  2 files changed, 13 insertions(+), 2 deletions(-)

 diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt
 index 78d6d50..9f74989 100644
 --- a/Documentation/git-diff.txt
 +++ b/Documentation/git-diff.txt
 @@ -31,7 +31,8 @@ two blob objects, or changes between two files on disk.
  +
  If exactly two paths are given and at least one points outside
  the current repository, 'git diff' will compare the two files /
 -directories. This behavior can be forced by --no-index.
 +directories. This behavior can be forced by --no-index or by 
 +executing 'git diff' outside of a working tree.
  
  'git diff' [--options] --cached [commit] [--] [path...]::
  
 diff --git a/diff-no-index.c b/diff-no-index.c
 index e66fdf3..9734ec3 100644
 --- a/diff-no-index.c
 +++ b/diff-no-index.c
 @@ -215,9 +215,19 @@ void diff_no_index(struct rev_info *revs,
path_inside_repo(prefix, argv[i+1])))
   return;
   }
 - if (argc != i + 2)
 + if (argc != i + 2) {
 + if (!no_index) {
 + /* There was no --no-index and there were not two
 +  * paths.  It is possible that the user intended
 +  * to do an inside-repository operation. */
 + fprintf(stderr, Not a git repository\n);
 + fprintf(stderr,
 + To compare two 

Re: [RFC PATCHv6 1/2] repack: rewrite the shell script in C

2013-08-22 Thread Jonathan Nieder
Junio C Hamano wrote:
 Stefan Beller stefanbel...@googlemail.com writes:

 The motivation of this patch is to get closer to a goal of being
 able to have a core subset of git functionality built in to git.
 That would mean

  * people on Windows could get a copy of at least the core parts
of Git without having to install a Unix-style shell

  * people deploying to servers don't have to rewrite the #! line
or worry about the PATH and quality of installed POSIX
utilities, if they are only using the built-in part written
in C

I am not sure what is meant by the latter.  Rewriting #! is part of
 any scripted Porcelain done by the top-level Makefile, and I do not
 think we have seen any problem reports on it.

 As to quality of ... utilities, I think the real issue some people
 in the thread had was not about deploying to servers but about
 installing in a minimalistic chrooted environment where standard
 tools may be lacking.

Thanks for a sanity check.  Yeah, the second item should be about
minimal chroots, not my sloppy guess about some hypothetical bad
operating system with untrustworthy tools.

Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Junio C Hamano
Eric Sunshine sunsh...@sunshineco.com writes:

 I can confirm this failure on OS X, however, I am somewhat confused by
 the follow-up t3010 changes in 3c56875176390eee. Are the t3010 changes
 supposed to fail without 2eac2a4cc4bdc8d7 applied? For me, on Linux,
 the tests succeed whether 2eac2a4cc4bdc8d7 is applied or not. On OS X,
 the tests succeed without 2eac2a4cc4bdc8d7 but fail with it applied.

The 2eac2a4c (ls-files -k: a directory only can be killed if the
index has a non-directory, 2013-08-15) is NOT a correctness fix.

It is an optimization to avoid scanning directories that are known
not to be killed when ls-files -k is asked to list killed
paths. The original code without the patch is correct already; it
just is too inefficient because it scans all the directories.  It is
not surprising if the test added by 3c568751 (t3010: update to
demonstrate ls-files -k optimization pitfalls, 2013-08-15) passes
without 2eac2a4c.

As its log message explains, 3c568751 (t3010: update to demonstrate
ls-files -k optimization pitfalls, 2013-08-15) is to catch a case
where an earlier something like this patch (which is the draft for
2eac2a4c) posted to the list would have broken.  That draft patch
was correct only for the case where the top-level directory is
killed, but was broken when a subdirectory (e.g. pathx/ju) is
killed.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Eric Sunshine
On Thu, Aug 22, 2013 at 5:16 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 I can confirm this failure on OS X, however, I am somewhat confused by
 the follow-up t3010 changes in 3c56875176390eee. Are the t3010 changes
 supposed to fail without 2eac2a4cc4bdc8d7 applied? For me, on Linux,
 the tests succeed whether 2eac2a4cc4bdc8d7 is applied or not. On OS X,
 the tests succeed without 2eac2a4cc4bdc8d7 but fail with it applied.

 The 2eac2a4c (ls-files -k: a directory only can be killed if the
 index has a non-directory, 2013-08-15) is NOT a correctness fix.

 It is an optimization to avoid scanning directories that are known
 not to be killed when ls-files -k is asked to list killed
 paths. The original code without the patch is correct already; it
 just is too inefficient because it scans all the directories.  It is
 not surprising if the test added by 3c568751 (t3010: update to
 demonstrate ls-files -k optimization pitfalls, 2013-08-15) passes
 without 2eac2a4c.

 As its log message explains, 3c568751 (t3010: update to demonstrate
 ls-files -k optimization pitfalls, 2013-08-15) is to catch a case
 where an earlier something like this patch (which is the draft for
 2eac2a4c) posted to the list would have broken.  That draft patch
 was correct only for the case where the top-level directory is
 killed, but was broken when a subdirectory (e.g. pathx/ju) is
 killed.

Thanks for the explanation.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Junio C Hamano
Eric Sunshine sunsh...@sunshineco.com writes:

 On Thu, Aug 22, 2013 at 5:16 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 I can confirm this failure on OS X, however,...

 Thanks for the explanation.

Now, I am curious how it breaks on OS X.

My suspition is that ignore_case may have something to do with it,
but what 2eac2a4c (ls-files -k: a directory only can be killed if
the index has a non-directory, 2013-08-15) uses are the bog-standard
cache_name_exists() and directory_exists_in_index(), so one of these
internal API implementation has trouble on case insensitive
filesystems, perhaps?  I dunno.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Eric Sunshine
On Thu, Aug 22, 2013 at 5:32 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 On Thu, Aug 22, 2013 at 5:16 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 I can confirm this failure on OS X, however,...

 Thanks for the explanation.

 Now, I am curious how it breaks on OS X.

 My suspition is that ignore_case may have something to do with it,
 but what 2eac2a4c (ls-files -k: a directory only can be killed if
 the index has a non-directory, 2013-08-15) uses are the bog-standard
 cache_name_exists() and directory_exists_in_index(), so one of these
 internal API implementation has trouble on case insensitive
 filesystems, perhaps?  I dunno.

That's exactly my suspicion at the moment. It's an obvious difference
between Linux and OS X. I'm just in the process of trying to compare
between the two platforms.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Junio C Hamano
Eric Sunshine sunsh...@sunshineco.com writes:

 On Thu, Aug 22, 2013 at 5:32 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 On Thu, Aug 22, 2013 at 5:16 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 I can confirm this failure on OS X, however,...

 Thanks for the explanation.

 Now, I am curious how it breaks on OS X.

 My suspition is that ignore_case may have something to do with it,
 but what 2eac2a4c (ls-files -k: a directory only can be killed if
 the index has a non-directory, 2013-08-15) uses are the bog-standard
 cache_name_exists() and directory_exists_in_index(), so one of these
 internal API implementation has trouble on case insensitive
 filesystems, perhaps?  I dunno.

 That's exactly my suspicion at the moment. It's an obvious difference
 between Linux and OS X. I'm just in the process of trying to compare
 between the two platforms.

Or perhaps de-d_type does not exist?  In such a case, we end up
doing get_index_dtype() via get_dtype(), but in this codepath I
suspect that we do not want to.  We are interested in the type of
the entity on the filesystem.


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Eric Sunshine
On Thu, Aug 22, 2013 at 5:43 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:
 On Thu, Aug 22, 2013 at 5:32 PM, Junio C Hamano gits...@pobox.com wrote:
 Now, I am curious how it breaks on OS X.

 My suspition is that ignore_case may have something to do with it,
 but what 2eac2a4c (ls-files -k: a directory only can be killed if
 the index has a non-directory, 2013-08-15) uses are the bog-standard
 cache_name_exists() and directory_exists_in_index(), so one of these
 internal API implementation has trouble on case insensitive
 filesystems, perhaps?  I dunno.

 That's exactly my suspicion at the moment. It's an obvious difference
 between Linux and OS X. I'm just in the process of trying to compare
 between the two platforms.

 Or perhaps de-d_type does not exist?  In such a case, we end up
 doing get_index_dtype() via get_dtype(), but in this codepath I
 suspect that we do not want to.  We are interested in the type of
 the entity on the filesystem.

de-d_type exists on both platforms. get_dtype() is never called.

However, I did discover that treat_path() is being invoked fewer times
on OSX than on Linux. For instance, in the repository created by
t3010, treat_path() is called 19 times on Linux, but only 17 times on
OSX.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv2 0/6] duplicate objects and delta cycles, oh my!

2013-08-22 Thread Jeff King
On Thu, Aug 22, 2013 at 10:43:05AM -0400, Jeff King wrote:

  write_idx_file() is called after index-pack processes all delta
  objects. Could resolve_deltas() go cyclic with certain duplicate
  object setup?
 
 Good question. I'm not sure. I'll check it out.

I think the answer is no, based on both reasoning and testing (both of
which are included in patches 3-4 of the series below).

So here's my re-roll of the series.

  [1/6]: test-sha1: add a binary output mode

New in this iteration; the previous version piped test-sha1 into
perl to create the pack trailer, but with this simple change we can
drop the perl dependency.

  [2/6]: sha1-lookup: handle duplicate keys with GIT_USE_LOOKUP

Same code as before. I've factored the pack-creation bits from the
tests into lib-pack.sh, so they can be reused elsewhere when we want
to create bogus packs (and patches 3-4 reuse them here).

  [3/6]: add tests for indexing packs with delta cycles
  [4/6]: test index-pack on packs with recoverable delta cycles

New tests covering delta cycles.

  [5/6]: index-pack: optionally reject packs with duplicate objects

Similar to before, but I converted the config flag to a simple
boolean (since we scrapped the fix of the tri-state allow,
reject, fix).

  [6/6]: default pack.indexDuplicates to false

This flips the safety check on by default everywhere (before, it was
left off for index-pack).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/6] test-sha1: add a binary output mode

2013-08-22 Thread Jeff King
The test-sha1 helper program will run our internal sha1
routines over its input and output the 40-byte hex sha1.
Sometimes, however, it is useful to have the binary 20-byte
sha1 (and it's a pain to convert back in the shell). Let's
add a -b option to output the binary version.

Signed-off-by: Jeff King p...@peff.net
---
 test-sha1.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/test-sha1.c b/test-sha1.c
index 80daba9..e57eae1 100644
--- a/test-sha1.c
+++ b/test-sha1.c
@@ -5,10 +5,15 @@ int main(int ac, char **av)
git_SHA_CTX ctx;
unsigned char sha1[20];
unsigned bufsz = 8192;
+   int binary = 0;
char *buffer;
 
-   if (ac == 2)
-   bufsz = strtoul(av[1], NULL, 10) * 1024 * 1024;
+   if (ac == 2) {
+   if (!strcmp(av[1], -b))
+   binary = 1;
+   else
+   bufsz = strtoul(av[1], NULL, 10) * 1024 * 1024;
+   }
 
if (!bufsz)
bufsz = 8192;
@@ -42,6 +47,10 @@ int main(int ac, char **av)
git_SHA1_Update(ctx, buffer, this_sz);
}
git_SHA1_Final(sha1, ctx);
-   puts(sha1_to_hex(sha1));
+
+   if (binary)
+   fwrite(sha1, 1, 20, stdout);
+   else
+   puts(sha1_to_hex(sha1));
exit(0);
 }
-- 
1.8.4.rc2.28.g6bb5f3f

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Junio C Hamano
Eric Sunshine sunsh...@sunshineco.com writes:

 Status update: For the 'pathx' directory created by the t3010 test,
 directory_exists_in_index() returns false on OSX, but true is returned
 on Linux.

Because a regular pathx/ju is in the index at that point, the
correct answer directory_exists_in_index() should give for 'pathx'
is index_directory, not index_nonexistent, I think.



--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/6] sha1-lookup: handle duplicate keys with GIT_USE_LOOKUP

2013-08-22 Thread Jeff King
The sha1_entry_pos function tries to be smart about
selecting the middle of a range for its binary search by
looking at the value differences between the lo and hi
constraints. However, it is unable to cope with entries with
duplicate keys in the sorted list.

We may hit a point in the search where both our lo and
hi point to the same key. In this case, the range of
values between our endpoints is 0, and trying to scale the
difference between our key and the endpoints over that range
is undefined (i.e., divide by zero). The current code
catches this with an assert(lov  hiv).

Moreover, after seeing that the first 20 byte of the key are
the same, we will try to establish a value from the 21st
byte. Which is nonsensical.

Instead, we can detect the case that we are in a run of
duplicates, and simply do a final comparison against any one
of them (since they are all the same, it does not matter
which). If the keys match, we have found our entry (or one
of them, anyway).  If not, then we know that we do not need
to look further, as we must be in a run of the duplicate
key.

Furthermore, we know that one of our endpoints must be
the edge of the run of duplicates. For example, given this
sequence:

 idx 0 1 2 3 4 5
 key A C C C C D

If we are searching for B, we might hit the duplicate run
at lo=1, hi=3 (e.g., by first mi=3, then mi=0). But we can
never have lo  1, because B  C. That is, if our key is
less than the run, we know that lo is the edge, but we can
say nothing of hi. Similarly, if our key is greater than
the run, we know that hi is the edge, but we can say
nothing of lo. But that is enough for us to return not
only not found, but show the position at which we would
insert the new item.

Signed-off-by: Jeff King p...@peff.net
---
 sha1-lookup.c | 23 
 t/lib-pack.sh | 78 +++
 t/t5308-pack-detect-duplicates.sh | 73 
 3 files changed, 174 insertions(+)
 create mode 100644 t/lib-pack.sh
 create mode 100755 t/t5308-pack-detect-duplicates.sh

diff --git a/sha1-lookup.c b/sha1-lookup.c
index c4dc55d..614cbb6 100644
--- a/sha1-lookup.c
+++ b/sha1-lookup.c
@@ -204,7 +204,30 @@ int sha1_entry_pos(const void *table,
 * byte 0 thru (ofs-1) are the same between
 * lo and hi; ofs is the first byte that is
 * different.
+*
+* If ofs==20, then no bytes are different,
+* meaning we have entries with duplicate
+* keys. We know that we are in a solid run
+* of this entry (because the entries are
+* sorted, and our lo and hi are the same,
+* there can be nothing but this single key
+* in between). So we can stop the search.
+* Either one of these entries is it (and
+* we do not care which), or we do not have
+* it.
 */
+   if (ofs == 20) {
+   mi = lo;
+   mi_key = base + elem_size * mi + key_offset;
+   cmp = memcmp(mi_key, key, 20);
+   if (!cmp)
+   return mi;
+   if (cmp  0)
+   return -1 - hi;
+   else
+   return -1 - lo;
+   }
+
hiv = hi_key[ofs_0];
if (ofs_0  19)
hiv = (hiv  8) | hi_key[ofs_0+1];
diff --git a/t/lib-pack.sh b/t/lib-pack.sh
new file mode 100644
index 000..61c5d19
--- /dev/null
+++ b/t/lib-pack.sh
@@ -0,0 +1,78 @@
+#!/bin/sh
+#
+# Support routines for hand-crafting weird or malicious packs.
+#
+# You can make a complete pack like:
+#
+#   pack_header 2 foo.pack 
+#   pack_obj e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 foo.pack 
+#   pack_obj e68fe8129b546b101aee9510c5328e7f21ca1d18 foo.pack 
+#   pack_trailer foo.pack
+
+# Print the big-endian 4-byte octal representation of $1
+uint32_octal() {
+   n=$1
+   printf '\%o' $(($n / 16777216)); n=$((n % 16777216))
+   printf '\%o' $(($n /65536)); n=$((n %65536))
+   printf '\%o' $(($n /  256)); n=$((n %  256))
+   printf '\%o' $(($n   ));
+}
+
+# Print the big-endian 4-byte binary representation of $1
+uint32_binary() {
+   printf $(uint32_octal $1)
+}
+
+# Print a pack header, version 2, for a pack with $1 objects
+pack_header() {
+   printf 'PACK' 
+   printf '\0\0\0\2' 
+   uint32_binary $1
+}
+
+# Print the pack data for object $1, as a delta against object $2 (or as a full
+# object if $2 is missing or empty). The output is 

[PATCH 3/6] add tests for indexing packs with delta cycles

2013-08-22 Thread Jeff King
If we receive a broken or malicious pack from a remote, we
will feed it to index-pack. As index-pack processes the
objects as a stream, reconstructing and hashing each object
to get its name, it is not very susceptible to doing the
wrong with bad data (it simply notices that the data is
bogus and aborts).

However, one question raised on the list is whether it could
be susceptible to problems during the delta-resolution
phase. In particular, can a cycle in the packfile deltas
cause us to go into an infinite loop or cause any other
problem?

The answer is no.

We cannot have a cycle of delta-base offsets, because they
go only in one direction (the OFS_DELTA object mentions its
base by an offset towards the beginning of the file, and we
explicitly reject negative offsets).

We can have a cycle of REF_DELTA objects, which refer to
base objects by sha1 name. However, index-pack does not know
these sha1 names ahead of time; it has to reconstruct the
objects to get their names, and it cannot do so if there is
a delta cycle (in other words, it does not even realize
there is a cycle, but only that there are items that cannot
be resolved).

Even though we can reason out that index-pack should handle
this fine, let's add a few tests to make sure it behaves
correctly.

Signed-off-by: Jeff King p...@peff.net
---
 t/lib-pack.sh| 22 +
 t/t5309-pack-delta-cycles.sh | 59 
 2 files changed, 81 insertions(+)
 create mode 100755 t/t5309-pack-delta-cycles.sh

diff --git a/t/lib-pack.sh b/t/lib-pack.sh
index 61c5d19..e028c40 100644
--- a/t/lib-pack.sh
+++ b/t/lib-pack.sh
@@ -55,6 +55,28 @@ pack_obj() {
printf '\062\170\234\143\267\3\0\0\116\0\106'
return
;;
+   01d7713666f4de822776c7622c10f1b07de280dc)
+   printf '\165\1\327\161\66\146\364\336\202\47\166' 
+   printf '\307\142\54\20\361\260\175\342\200\334\170' 
+   printf '\234\143\142\142\142\267\003\0\0\151\0\114'
+   return
+   ;;
+   esac
+   ;;
+
+   # blob containing \7\0
+   01d7713666f4de822776c7622c10f1b07de280dc)
+   case $2 in
+   '')
+   printf '\062\170\234\143\147\0\0\0\20\0\10'
+   return
+   ;;
+   e68fe8129b546b101aee9510c5328e7f21ca1d18)
+   printf '\165\346\217\350\22\233\124\153\20\32\356' 
+   printf '\225\20\305\62\216\177\41\312\35\30\170\234' 
+   printf '\143\142\142\142\147\0\0\0\53\0\16'
+   return
+   ;;
esac
;;
esac
diff --git a/t/t5309-pack-delta-cycles.sh b/t/t5309-pack-delta-cycles.sh
new file mode 100755
index 000..9b3f2c3
--- /dev/null
+++ b/t/t5309-pack-delta-cycles.sh
@@ -0,0 +1,59 @@
+#!/bin/sh
+
+test_description='test index-pack handling of delta cycles in packfiles'
+. ./test-lib.sh
+. ../lib-pack.sh
+
+# Two similar-ish objects that we have computed deltas between.
+A=01d7713666f4de822776c7622c10f1b07de280dc
+B=e68fe8129b546b101aee9510c5328e7f21ca1d18
+
+# double-check our hand-constucted packs
+test_expect_success 'index-pack works with a single delta (A-B)' '
+   clear_packs 
+   {
+   pack_header 2 
+   pack_obj $A $B 
+   pack_obj $B
+   } ab.pack 
+   pack_trailer ab.pack 
+   git index-pack --stdin ab.pack 
+   git cat-file -t $A 
+   git cat-file -t $B
+'
+
+test_expect_success 'index-pack works with a single delta (B-A)' '
+   clear_packs 
+   {
+   pack_header 2 
+   pack_obj $A 
+   pack_obj $B $A
+   } ba.pack 
+   pack_trailer ba.pack 
+   git index-pack --stdin ba.pack 
+   git cat-file -t $A 
+   git cat-file -t $B
+'
+
+test_expect_success 'index-pack detects missing base objects' '
+   clear_packs 
+   {
+   pack_header 1 
+   pack_obj $A $B
+   } missing.pack 
+   pack_trailer missing.pack 
+   test_must_fail git index-pack --fix-thin --stdin missing.pack
+'
+
+test_expect_success 'index-pack detects REF_DELTA cycles' '
+   clear_packs 
+   {
+   pack_header 2 
+   pack_obj $A $B 
+   pack_obj $B $A
+   } cycle.pack 
+   pack_trailer cycle.pack 
+   test_must_fail git index-pack --fix-thin --stdin cycle.pack
+'
+
+test_done
-- 
1.8.4.rc2.28.g6bb5f3f

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/6] test index-pack on packs with recoverable delta cycles

2013-08-22 Thread Jeff King
The previous commit added tests to show that index-pack
correctly bails in unrecoverable situations. There are some
situations where the data could be recovered, but it is not
currently:

  1. If we can break the cycle using an object from another
 pack via --fix-thin.

  2. If we can break the cycle using a duplicate of one of
 the objects found in the same pack.

Note that neither of these is particularly high priority; a
delta cycle within a pack should never occur, and we have no
record of even a buggy git implementation creating such a
pack.

However, it's worth adding these tests for two reasons. One,
to document that we do not currently handle the situation,
even though it is possible. And two, to exercise the code
that runs in this situation; even though it fails, by
running it we can confirm that index-pack detects the
situation and aborts, and does not misbehave (e.g., by
following the cycle in an infinite loop).

In both cases, we hit an assert that aborts index-pack.

Signed-off-by: Jeff King p...@peff.net
---
 t/t5309-pack-delta-cycles.sh | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/t/t5309-pack-delta-cycles.sh b/t/t5309-pack-delta-cycles.sh
index 9b3f2c3..38e1809 100755
--- a/t/t5309-pack-delta-cycles.sh
+++ b/t/t5309-pack-delta-cycles.sh
@@ -56,4 +56,22 @@ test_expect_success 'index-pack detects REF_DELTA cycles' '
test_must_fail git index-pack --fix-thin --stdin cycle.pack
 '
 
+test_expect_failure 'failover to an object in another pack' '
+   clear_packs 
+   git index-pack --stdin ab.pack 
+   git index-pack --stdin --fix-thin cycle.pack
+'
+
+test_expect_failure 'failover to a duplicate object in the same pack' '
+   clear_packs 
+   {
+   pack_header 3 
+   pack_obj $A $B 
+   pack_obj $B $A 
+   pack_obj $A
+   } recoverable.pack 
+   pack_trailer recoverable.pack 
+   git index-pack --fix-thin --stdin recoverable.pack
+'
+
 test_done
-- 
1.8.4.rc2.28.g6bb5f3f

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Eric Sunshine
On Thu, Aug 22, 2013 at 7:12 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 Status update: For the 'pathx' directory created by the t3010 test,
 directory_exists_in_index() returns false on OSX, but true is returned
 on Linux.

 Because a regular pathx/ju is in the index at that point, the
 correct answer directory_exists_in_index() should give for 'pathx'
 is index_directory, not index_nonexistent, I think.

directory_exists_in_index() and directory_exists_in_index_icase() are
behaving differently. You can replicate the problem on Linux by
enabling core.ignorecase in the test (sans gmail whitespace damage):

--8--
diff --git a/t/t3010-ls-files-killed-modified.sh b/t/t3010-ls-files-killed-modif
index 3120efd..8c76160 100755
--- a/t/t3010-ls-files-killed-modified.sh
+++ b/t/t3010-ls-files-killed-modified.sh
@@ -89,7 +89,7 @@ test_expect_success 'git ls-files -k to show killed files.' '
: path9 
touch path10 
pathx/ju/nk 
-   git ls-files -k .output
+   git -c core.ignorecase=true ls-files -k .output
 '
--8--
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 0/6] duplicate objects and delta cycles, oh my!

2013-08-22 Thread Nicolas Pitre
On Thu, 22 Aug 2013, Jeff King wrote:

 On Thu, Aug 22, 2013 at 10:43:05AM -0400, Jeff King wrote:
 
   write_idx_file() is called after index-pack processes all delta
   objects. Could resolve_deltas() go cyclic with certain duplicate
   object setup?
  
  Good question. I'm not sure. I'll check it out.
 
 I think the answer is no, based on both reasoning and testing (both of
 which are included in patches 3-4 of the series below).
 
 So here's my re-roll of the series.

This looks all good to me.


Acked-by: Nicolas Pitre n...@fluxnic.net


Nicolas
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


git-p4 out of memory for very large repository

2013-08-22 Thread Corey Thompson
Hello,

Has anyone actually gotten git-p4 to clone a large Perforce repository?
I have one codebase in particular that gets to about 67%, then
consistently gets get-fast-import (and often times a few other
processes) killed by the OOM killer.

I've found some patches out there that claim to resolve this, but
they're all for versions of git-p4.py from several years ago.  Not only
will they not apply cleanly, but as far as I can tell the issues that
these patches are meant to address aren't in the current version,
anyway.

Any suggestions would be greatly appreciated.

Thanks,
Corey
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/2] fix t3010 failure when core.ignorecase=true

2013-08-22 Thread Eric Sunshine
This series fixes a bug in dir.c which causes t3010 to fail [1] when
core.ignorecase is true. The problem is that
directory_exists_in_index(dirname,len) and
directory_exists_in_index_icase() behave differently if dirname[len] is
not a '/', even though this is beyond end-of-string.  2eac2a4cc4bdc8d7
(ls-files -k: a directory only can be killed if the index has a
non-directory; 2013-08-15) adds a caller which neglects to ensure that
the the required '/' is present, hence the failure.

I am not happy with the fix, which is too add a '/' after the last
character in dirname just at the call site introduced by
2eac2a4cc4bdc8d7.

The reason for my unhappiness is that directory_exists_in_index_icase()
makes the assumption, not only that it can access the character beyond
the end-of-string, but also that that character will unconditionally be
'/'. I presume that this was done for the sake of speed (existing
callers always had a '/' beyond end-of-string), but it feels like an
ugly wart. Since the required trailing '/' is purely an implementation
detail of directory_exists_in_index_icase(), and not of
directory_exists_in_index(), a cleaner fix would be for
directory_exists_in_index_icase() to add the '/' it needs, and not
expect the passed in dirname to have a '/' after its last character.
Unfortunately, such a fix would probably negate any optimization benefit
gained by the present implementation.

[1]: http://thread.gmane.org/gmane.comp.version-control.git/232727

Eric Sunshine (2):
  t3103: demonstrate dir.c:treat_one_path() core.ignorecase failure
  dir: test_one_path: fix inconsistent behavior due to missing '/'

 dir.c   | 12 +---
 t/t3103-ls-tree-misc.sh | 15 +++
 2 files changed, 24 insertions(+), 3 deletions(-)

-- 
1.8.4.rc4.529.g78818d7

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] t3103: demonstrate dir.c:treat_one_path() core.ignorecase failure

2013-08-22 Thread Eric Sunshine
2eac2a4cc4bdc8d7 (ls-files -k: a directory only can be killed if the
index has a non-directory; 2013-08-15) adds a caller of
directory_exists_in_index(dirname,len) which forgets to satisfy the
undocumented requirement that a '/' must be present at dirname[len]
(despite being past the end-of-string). This oversight leads to
incorrect behavior when core.ignorecase is true. Demonstrate this.

Signed-off-by: Eric Sunshine sunsh...@sunshineco.com
---
 t/t3103-ls-tree-misc.sh | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/t/t3103-ls-tree-misc.sh b/t/t3103-ls-tree-misc.sh
index 09dcf04..fd95991 100755
--- a/t/t3103-ls-tree-misc.sh
+++ b/t/t3103-ls-tree-misc.sh
@@ -4,6 +4,7 @@ test_description='
 Miscellaneous tests for git ls-tree.
 
  1. git ls-tree fails in presence of tree damage.
+ 2. git ls-tree acts sanely with core.ignorecase.
 
 '
 
@@ -21,4 +22,18 @@ test_expect_success 'ls-tree fails with non-zero exit code 
on broken tree' '
test_must_fail git ls-tree -r HEAD
 '
 
+test_expect_failure 'ls-tree directory core.ignorecase' '
+   cat expect -\EOF 
+   d/e/f
+   EOF
+   mkdir d 
+   d/e 
+   git update-index --add -- d/e 
+   rm d/e 
+   mkdir d/e 
+   d/e/f 
+   git -c core.ignorecase=true ls-files -k actual 
+   test_cmp expect actual
+'
+
 test_done
-- 
1.8.4.rc4.529.g78818d7

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] dir: test_one_path: fix inconsistent behavior due to missing '/'

2013-08-22 Thread Eric Sunshine
Although undocumented, directory_exists_in_index_icase(dirname,len)
unconditionally assumes the presence of a '/' at dirname[len] (despite
being past the end-of-string). Callers are expected to respect this
assumption by ensuring that a '/' is present beyond the last character
of the passed path. directory_exists_in_index(), on the other hand,
does not assume nor care about a trailing '/' beyond end-of-string.

2eac2a4cc4bdc8d7 (ls-files -k: a directory only can be killed if the
index has a non-directory; 2013-08-15) adds a caller which forgets to
ensure the trailing '/', thus leading to inconsistent behavior between
directory_exists_in_index() and directory_exists_in_index_icase()
depending upon the setting of core.ignorecase. Fix this problem.

This also fixes an initially-unnoticed failure in a t3010 test added by
3c56875176390eee (t3010: update to demonstrate ls-files -k
optimization pitfalls; 2013-08-15) when core.ignorecase is true.

Signed-off-by: Eric Sunshine sunsh...@sunshineco.com
---
 dir.c   | 12 +---
 t/t3103-ls-tree-misc.sh |  2 +-
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/dir.c b/dir.c
index edd666a..a52c6f9 100644
--- a/dir.c
+++ b/dir.c
@@ -1160,9 +1160,15 @@ static enum path_treatment treat_one_path(struct 
dir_struct *dir,
 */
if ((dir-flags  DIR_COLLECT_KILLED_ONLY) 
(dtype == DT_DIR) 
-   !has_path_in_index 
-   (directory_exists_in_index(path-buf, path-len) == 
index_nonexistent))
-   return path_none;
+   !has_path_in_index) {
+   strbuf_addch(path, '/');
+   if (directory_exists_in_index(path-buf, path-len - 1) ==
+   index_nonexistent) {
+   strbuf_setlen(path, path-len - 1);
+   return path_none;
+   }
+   strbuf_setlen(path, path-len - 1);
+   }
 
exclude = is_excluded(dir, path-buf, dtype);
 
diff --git a/t/t3103-ls-tree-misc.sh b/t/t3103-ls-tree-misc.sh
index fd95991..9fb1706 100755
--- a/t/t3103-ls-tree-misc.sh
+++ b/t/t3103-ls-tree-misc.sh
@@ -22,7 +22,7 @@ test_expect_success 'ls-tree fails with non-zero exit code on 
broken tree' '
test_must_fail git ls-tree -r HEAD
 '
 
-test_expect_failure 'ls-tree directory core.ignorecase' '
+test_expect_success 'ls-tree directory core.ignorecase' '
cat expect -\EOF 
d/e/f
EOF
-- 
1.8.4.rc4.529.g78818d7

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Eric Sunshine
On Thu, Aug 22, 2013 at 7:15 PM, Eric Sunshine sunsh...@sunshineco.com wrote:
 On Thu, Aug 22, 2013 at 7:12 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 Status update: For the 'pathx' directory created by the t3010 test,
 directory_exists_in_index() returns false on OSX, but true is returned
 on Linux.

 Because a regular pathx/ju is in the index at that point, the
 correct answer directory_exists_in_index() should give for 'pathx'
 is index_directory, not index_nonexistent, I think.

 directory_exists_in_index() and directory_exists_in_index_icase() are
 behaving differently. You can replicate the problem on Linux by
 enabling core.ignorecase in the test (sans gmail whitespace damage):

 --8--
 diff --git a/t/t3010-ls-files-killed-modified.sh 
 b/t/t3010-ls-files-killed-modif
 index 3120efd..8c76160 100755
 --- a/t/t3010-ls-files-killed-modified.sh
 +++ b/t/t3010-ls-files-killed-modified.sh
 @@ -89,7 +89,7 @@ test_expect_success 'git ls-files -k to show killed files.' 
 '
 : path9 
 touch path10 
 pathx/ju/nk 
 -   git ls-files -k .output
 +   git -c core.ignorecase=true ls-files -k .output
  '
 --8--

I sent a patch [1] which resolves the problem, although the solution
is not especially pretty (due to some ugliness in the existing
implementation).

[1]: http://thread.gmane.org/gmane.comp.version-control.git/232796
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Junio C Hamano
Eric Sunshine sunsh...@sunshineco.com writes:

 I sent a patch [1] which resolves the problem, although the solution
 is not especially pretty (due to some ugliness in the existing
 implementation).

Yeah, thanks.

I tend to agree with you that fixing the icase callee not to rely
on having the trailing slash (which is looking past the end of the
given string), instead of working that breakage around on the
caller's side like your patch did, would be a better alternative,
though.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html