[PATCH] diff: don't read index when --no-index is given

2013-12-09 Thread Thomas Gummerer
git diff --no-index ... currently reads the index, during setup, when
calling gitmodules_config().  In the usual case this gives us some
performance drawbacks, but it's especially annoying if there is a broken
index file.

Avoid calling the unnecessary gitmodules_config() when the --no-index
option is given.  Also add a test to guard against similar breakages in the 
future.

Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
---
 builtin/diff.c   | 13 +++--
 t/t4053-diff-no-index.sh |  6 ++
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/builtin/diff.c b/builtin/diff.c
index adb93a9..47c0833 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -257,7 +257,7 @@ int cmd_diff(int argc, const char **argv, const char 
*prefix)
int blobs = 0, paths = 0;
const char *path = NULL;
struct blobinfo blob[2];
-   int nongit;
+   int nongit, no_index = 0;
int result = 0;
 
/*
@@ -282,9 +282,18 @@ int cmd_diff(int argc, const char **argv, const char 
*prefix)
 *
 * Other cases are errors.
 */
+   for (i = 1; i  argc; i++) {
+   if (!strcmp(argv[i], --))
+   break;
+   if (!strcmp(argv[i], --no-index)) {
+   no_index = 1;
+   break;
+   }
+   }
 
prefix = setup_git_directory_gently(nongit);
-   gitmodules_config();
+   if (!no_index)
+   gitmodules_config();
git_config(git_diff_ui_config, NULL);
 
init_revisions(rev, prefix);
diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
index 979e983..a24ae4d 100755
--- a/t/t4053-diff-no-index.sh
+++ b/t/t4053-diff-no-index.sh
@@ -29,4 +29,10 @@ test_expect_success 'git diff --no-index relative path 
outside repo' '
)
 '
 
+test_expect_success 'git diff --no-index with broken index' '
+   cd repo 
+   echo broken .git/index 
+   test_expect_code 0 git diff --no-index a ../non/git/a
+'
+
 test_done
-- 
1.8.4.2

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] diff: don't read index when --no-index is given

2013-12-09 Thread Jonathan Nieder
Thomas Gummerer wrote:

 git diff --no-index ... currently reads the index, during setup, when
 calling gitmodules_config().  In the usual case this gives us some
 performance drawbacks,

Makes sense.

but it's especially annoying if there is a broken
 index file.

Is this really a normal case?  It makes sense that as a side-effect it
is easier to use git diff --no-index as a general-purpose tool while
investigating a broken repo, but I would have thought that quickly
learning a repo is broken is a good thing in any case.

A little more information about the context where this came up would
be helpful, I guess.

[...]
 --- a/builtin/diff.c
 +++ b/builtin/diff.c
[...]
 @@ -282,9 +282,18 @@ int cmd_diff(int argc, const char **argv, const char 
 *prefix)
*
* Other cases are errors.
*/
 + for (i = 1; i  argc; i++) {
 + if (!strcmp(argv[i], --))
 + break;
 + if (!strcmp(argv[i], --no-index)) {
 + no_index = 1;
 + break;
 + }

setup_revisions() uses the same logic that doesn't handle options that
take arguments in their unstuck form (e.g., --word-diff-regex --),
so this is probably not a regression, though I haven't checked. :)

[...]
   prefix = setup_git_directory_gently(nongit);
 - gitmodules_config();
 + if (!no_index)
 + gitmodules_config();

Perhaps we can improve performance and behavior by skipping the
setup_git_directory_gently() call, too?

That would help with the repairing-broken-repository case by
working even if .git/config or .git/HEAD is broken, and I think
it is more intuitive that the repository-local configuration is
ignored by git diff --no-index.  It would also help with
performance by avoiding some filesystem access.

[...]
 +test_expect_success 'git diff --no-index with broken index' '
 + cd repo 
 + echo broken .git/index 
 + test_expect_code 0 git diff --no-index a ../non/git/a

Clever.  I wouldn't use test_expect_code 0, since that's
already implied by including the git diff --no-index call
in the  chain.

Thanks and hope that helps,
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] diff: don't read index when --no-index is given

2013-12-09 Thread Jens Lehmann
Am 09.12.2013 16:16, schrieb Jonathan Nieder:
 Thomas Gummerer wrote:
 
 git diff --no-index ... currently reads the index, during setup, when
 calling gitmodules_config().  In the usual case this gives us some
 performance drawbacks,
 
 Makes sense.

Hmm, but this will disable the submodule specific ignore configuration
options defined in the .gitmodules file, no? (E.g. when diffing two
directories containing submodules)

but it's especially annoying if there is a broken
 index file.

 Is this really a normal case?  It makes sense that as a side-effect it
 is easier to use git diff --no-index as a general-purpose tool while
 investigating a broken repo, but I would have thought that quickly
 learning a repo is broken is a good thing in any case.

But I agree that dying with index file corrupt is a bit strange when
calling diff with --no-index. Wouldn't adding a gently option (which
could then warn instead of dying) to gitmodules_config() be a better
solution here?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] diff: don't read index when --no-index is given

2013-12-09 Thread Jonathan Nieder
Jens Lehmann wrote:
 Am 09.12.2013 16:16, schrieb Jonathan Nieder:
 Thomas Gummerer wrote:

 git diff --no-index ... currently reads the index, during setup, when
 calling gitmodules_config().  In the usual case this gives us some
 performance drawbacks,

 Makes sense.

 Hmm, but this will disable the submodule specific ignore configuration
 options defined in the .gitmodules file, no? (E.g. when diffing two
 directories containing submodules)

Yes.  That seems like a good behavior.

git diff --no-index was invented as just a fancy version of 'diff
-uR, without any awareness of the current git repository.  That means
that at least in principle, .gitmodules and .gitignore should not
affect it.

[...]
   Wouldn't adding a gently option (which
 could then warn instead of dying) to gitmodules_config() be a better
 solution here?

I don't think so.

Thanks and hope that helps,
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] diff: don't read index when --no-index is given

2013-12-09 Thread Thomas Gummerer
Jonathan Nieder jrnie...@gmail.com writes:

 Thomas Gummerer wrote:

 git diff --no-index ... currently reads the index, during setup, when
 calling gitmodules_config().  In the usual case this gives us some
 performance drawbacks,

 Makes sense.

but it's especially annoying if there is a broken
 index file.

 Is this really a normal case?  It makes sense that as a side-effect it
 is easier to use git diff --no-index as a general-purpose tool while
 investigating a broken repo, but I would have thought that quickly
 learning a repo is broken is a good thing in any case.

 A little more information about the context where this came up would
 be helpful, I guess.

It came up while I was working on index-v5, where I had to investigate
quite a few repositories where the index was broken, especially when I
was changing the index format slightly.  For example I would take one
version, use test-dump-cache-tree to dump the cache tree to a file,
change the format slightly, use test-dump-cache-tree again, and check
the difference with git diff --no-index.

This might not be a very common use case, but maybe the patch might help
someone else too. (In addition to the performance improvements)

I'm not sure how much diff --no-index is used normally, but when the
index is broken that would be detected relatively soon anyway.  I'm not
so worried about one more command working when it's broken.

 [...]
 --- a/builtin/diff.c
 +++ b/builtin/diff.c
 [...]
 @@ -282,9 +282,18 @@ int cmd_diff(int argc, const char **argv, const char 
 *prefix)
   *
   * Other cases are errors.
   */
 +for (i = 1; i  argc; i++) {
 +if (!strcmp(argv[i], --))
 +break;
 +if (!strcmp(argv[i], --no-index)) {
 +no_index = 1;
 +break;
 +}

 setup_revisions() uses the same logic that doesn't handle options that
 take arguments in their unstuck form (e.g., --word-diff-regex --),
 so this is probably not a regression, though I haven't checked. :)

The same logic is used in diff_no_index(), so I think it should be fine.

 [...]
  prefix = setup_git_directory_gently(nongit);
 -gitmodules_config();
 +if (!no_index)
 +gitmodules_config();

 Perhaps we can improve performance and behavior by skipping the
 setup_git_directory_gently() call, too?

 That would help with the repairing-broken-repository case by
 working even if .git/config or .git/HEAD is broken, and I think
 it is more intuitive that the repository-local configuration is
 ignored by git diff --no-index.  It would also help with
 performance by avoiding some filesystem access.

Yes, I think that would make sense, thanks.  I tested it, and it didn't
change the performance, but it's still a good change for the broken
repository case.  Will change in the re-roll and add a test for that.

 [...]
 +test_expect_success 'git diff --no-index with broken index' '
 +cd repo 
 +echo broken .git/index 
 +test_expect_code 0 git diff --no-index a ../non/git/a

 Clever.  I wouldn't use test_expect_code 0, since that's
 already implied by including the git diff --no-index call
 in the  chain.

Thanks, will change.  Thanks a lot for your review.  Will send a re-roll
soon.

 Thanks and hope that helps,
 Jonathan

--
Thomas
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] diff: don't read index when --no-index is given

2013-12-09 Thread Junio C Hamano
Thomas Gummerer t.gumme...@gmail.com writes:

 git diff --no-index ... currently reads the index, during setup, when
 calling gitmodules_config().  In the usual case this gives us some
 performance drawbacks, but it's especially annoying if there is a broken
 index file.

 Avoid calling the unnecessary gitmodules_config() when the --no-index
 option is given.  Also add a test to guard against similar breakages in the 
 future.

 Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
 ---
  builtin/diff.c   | 13 +++--
  t/t4053-diff-no-index.sh |  6 ++
  2 files changed, 17 insertions(+), 2 deletions(-)

 diff --git a/builtin/diff.c b/builtin/diff.c
 index adb93a9..47c0833 100644
 --- a/builtin/diff.c
 +++ b/builtin/diff.c
 @@ -257,7 +257,7 @@ int cmd_diff(int argc, const char **argv, const char 
 *prefix)
   int blobs = 0, paths = 0;
   const char *path = NULL;
   struct blobinfo blob[2];
 - int nongit;
 + int nongit, no_index = 0;
   int result = 0;
  
   /*
 @@ -282,9 +282,18 @@ int cmd_diff(int argc, const char **argv, const char 
 *prefix)
*
* Other cases are errors.
*/
 + for (i = 1; i  argc; i++) {
 + if (!strcmp(argv[i], --))
 + break;
 + if (!strcmp(argv[i], --no-index)) {
 + no_index = 1;
 + break;
 + }
 + }

This seems to duplicate only half the logic at the beginning of
diff_no_index(), right?  E.g., running git diff /var/tmp/[12]
inside a working tree that is controlled by a Git repository when
/var/tmp/ is outside, we do want to behave as if the command line
were git diff --no-index /var/tmp/[12], but this half duplication
makes these two behave differently, no?

I think the issue you are trying to address is worth tackling, but I
wonder if a bit of preparatory refactoring is necessary to avoid the
partial duplication.

   prefix = setup_git_directory_gently(nongit);
 - gitmodules_config();
 + if (!no_index)
 + gitmodules_config();
   git_config(git_diff_ui_config, NULL);
  
   init_revisions(rev, prefix);
 diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
 index 979e983..a24ae4d 100755
 --- a/t/t4053-diff-no-index.sh
 +++ b/t/t4053-diff-no-index.sh
 @@ -29,4 +29,10 @@ test_expect_success 'git diff --no-index relative path 
 outside repo' '
   )
  '
  
 +test_expect_success 'git diff --no-index with broken index' '
 + cd repo 
 + echo broken .git/index 
 + test_expect_code 0 git diff --no-index a ../non/git/a
 +'
 +
  test_done
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] diff: don't read index when --no-index is given

2013-12-09 Thread Thomas Gummerer

[Sorry for not seeing this before sending out v2.]

Junio C Hamano gits...@pobox.com writes:

 Thomas Gummerer t.gumme...@gmail.com writes:

 git diff --no-index ... currently reads the index, during setup, when
 calling gitmodules_config().  In the usual case this gives us some
 performance drawbacks, but it's especially annoying if there is a broken
 index file.

 Avoid calling the unnecessary gitmodules_config() when the --no-index
 option is given.  Also add a test to guard against similar breakages in the 
 future.

 Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
 ---
  builtin/diff.c   | 13 +++--
  t/t4053-diff-no-index.sh |  6 ++
  2 files changed, 17 insertions(+), 2 deletions(-)

 diff --git a/builtin/diff.c b/builtin/diff.c
 index adb93a9..47c0833 100644
 --- a/builtin/diff.c
 +++ b/builtin/diff.c
 @@ -257,7 +257,7 @@ int cmd_diff(int argc, const char **argv, const char 
 *prefix)
  int blobs = 0, paths = 0;
  const char *path = NULL;
  struct blobinfo blob[2];
 -int nongit;
 +int nongit, no_index = 0;
  int result = 0;

  /*
 @@ -282,9 +282,18 @@ int cmd_diff(int argc, const char **argv, const char 
 *prefix)
   *
   * Other cases are errors.
   */
 +for (i = 1; i  argc; i++) {
 +if (!strcmp(argv[i], --))
 +break;
 +if (!strcmp(argv[i], --no-index)) {
 +no_index = 1;
 +break;
 +}
 +}

 This seems to duplicate only half the logic at the beginning of
 diff_no_index(), right?  E.g., running git diff /var/tmp/[12]
 inside a working tree that is controlled by a Git repository when
 /var/tmp/ is outside, we do want to behave as if the command line
 were git diff --no-index /var/tmp/[12], but this half duplication
 makes these two behave differently, no?

Yes you're right, I missed that.  git diff /var/tmp/[12] inside a
working tree with a broken index has the same problems, which should be
fixed too.  I'll try to fix that and send a new patch afterwards.

 I think the issue you are trying to address is worth tackling, but I
 wonder if a bit of preparatory refactoring is necessary to avoid the
 partial duplication.

  prefix = setup_git_directory_gently(nongit);
 -gitmodules_config();
 +if (!no_index)
 +gitmodules_config();
  git_config(git_diff_ui_config, NULL);

  init_revisions(rev, prefix);
 diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
 index 979e983..a24ae4d 100755
 --- a/t/t4053-diff-no-index.sh
 +++ b/t/t4053-diff-no-index.sh
 @@ -29,4 +29,10 @@ test_expect_success 'git diff --no-index relative path 
 outside repo' '
  )
  '

 +test_expect_success 'git diff --no-index with broken index' '
 +cd repo 
 +echo broken .git/index 
 +test_expect_code 0 git diff --no-index a ../non/git/a
 +'
 +
  test_done

--
Thomas
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html