Re: [PATCH] diff: don't read index when --no-index is given

2013-12-09 Thread Thomas Gummerer

[Sorry for not seeing this before sending out v2.]

Junio C Hamano  writes:

> Thomas Gummerer  writes:
>
>> git diff --no-index ... currently reads the index, during setup, when
>> calling gitmodules_config().  In the usual case this gives us some
>> performance drawbacks, but it's especially annoying if there is a broken
>> index file.
>>
>> Avoid calling the unnecessary gitmodules_config() when the --no-index
>> option is given.  Also add a test to guard against similar breakages in the 
>> future.
>>
>> Signed-off-by: Thomas Gummerer 
>> ---
>>  builtin/diff.c   | 13 +++--
>>  t/t4053-diff-no-index.sh |  6 ++
>>  2 files changed, 17 insertions(+), 2 deletions(-)
>>
>> diff --git a/builtin/diff.c b/builtin/diff.c
>> index adb93a9..47c0833 100644
>> --- a/builtin/diff.c
>> +++ b/builtin/diff.c
>> @@ -257,7 +257,7 @@ int cmd_diff(int argc, const char **argv, const char 
>> *prefix)
>>  int blobs = 0, paths = 0;
>>  const char *path = NULL;
>>  struct blobinfo blob[2];
>> -int nongit;
>> +int nongit, no_index = 0;
>>  int result = 0;
>>
>>  /*
>> @@ -282,9 +282,18 @@ int cmd_diff(int argc, const char **argv, const char 
>> *prefix)
>>   *
>>   * Other cases are errors.
>>   */
>> +for (i = 1; i < argc; i++) {
>> +if (!strcmp(argv[i], "--"))
>> +break;
>> +if (!strcmp(argv[i], "--no-index")) {
>> +no_index = 1;
>> +break;
>> +}
>> +}
>
> This seems to duplicate only half the logic at the beginning of
> diff_no_index(), right?  E.g., running "git diff /var/tmp/[12]"
> inside a working tree that is controlled by a Git repository when
> /var/tmp/ is outside, we do want to behave as if the command line
> were "git diff --no-index /var/tmp/[12]", but this half duplication
> makes these two behave differently, no?

Yes you're right, I missed that.  "git diff /var/tmp/[12]" inside a
working tree with a broken index has the same problems, which should be
fixed too.  I'll try to fix that and send a new patch afterwards.

> I think the issue you are trying to address is worth tackling, but I
> wonder if a bit of preparatory refactoring is necessary to avoid the
> partial duplication.
>
>>  prefix = setup_git_directory_gently(&nongit);
>> -gitmodules_config();
>> +if (!no_index)
>> +gitmodules_config();
>>  git_config(git_diff_ui_config, NULL);
>>
>>  init_revisions(&rev, prefix);
>> diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
>> index 979e983..a24ae4d 100755
>> --- a/t/t4053-diff-no-index.sh
>> +++ b/t/t4053-diff-no-index.sh
>> @@ -29,4 +29,10 @@ test_expect_success 'git diff --no-index relative path 
>> outside repo' '
>>  )
>>  '
>>
>> +test_expect_success 'git diff --no-index with broken index' '
>> +cd repo &&
>> +echo broken >.git/index &&
>> +test_expect_code 0 git diff --no-index a ../non/git/a
>> +'
>> +
>>  test_done

--
Thomas
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] diff: don't read index when --no-index is given

2013-12-09 Thread Junio C Hamano
Thomas Gummerer  writes:

> git diff --no-index ... currently reads the index, during setup, when
> calling gitmodules_config().  In the usual case this gives us some
> performance drawbacks, but it's especially annoying if there is a broken
> index file.
>
> Avoid calling the unnecessary gitmodules_config() when the --no-index
> option is given.  Also add a test to guard against similar breakages in the 
> future.
>
> Signed-off-by: Thomas Gummerer 
> ---
>  builtin/diff.c   | 13 +++--
>  t/t4053-diff-no-index.sh |  6 ++
>  2 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/diff.c b/builtin/diff.c
> index adb93a9..47c0833 100644
> --- a/builtin/diff.c
> +++ b/builtin/diff.c
> @@ -257,7 +257,7 @@ int cmd_diff(int argc, const char **argv, const char 
> *prefix)
>   int blobs = 0, paths = 0;
>   const char *path = NULL;
>   struct blobinfo blob[2];
> - int nongit;
> + int nongit, no_index = 0;
>   int result = 0;
>  
>   /*
> @@ -282,9 +282,18 @@ int cmd_diff(int argc, const char **argv, const char 
> *prefix)
>*
>* Other cases are errors.
>*/
> + for (i = 1; i < argc; i++) {
> + if (!strcmp(argv[i], "--"))
> + break;
> + if (!strcmp(argv[i], "--no-index")) {
> + no_index = 1;
> + break;
> + }
> + }

This seems to duplicate only half the logic at the beginning of
diff_no_index(), right?  E.g., running "git diff /var/tmp/[12]"
inside a working tree that is controlled by a Git repository when
/var/tmp/ is outside, we do want to behave as if the command line
were "git diff --no-index /var/tmp/[12]", but this half duplication
makes these two behave differently, no?

I think the issue you are trying to address is worth tackling, but I
wonder if a bit of preparatory refactoring is necessary to avoid the
partial duplication.

>   prefix = setup_git_directory_gently(&nongit);
> - gitmodules_config();
> + if (!no_index)
> + gitmodules_config();
>   git_config(git_diff_ui_config, NULL);
>  
>   init_revisions(&rev, prefix);
> diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
> index 979e983..a24ae4d 100755
> --- a/t/t4053-diff-no-index.sh
> +++ b/t/t4053-diff-no-index.sh
> @@ -29,4 +29,10 @@ test_expect_success 'git diff --no-index relative path 
> outside repo' '
>   )
>  '
>  
> +test_expect_success 'git diff --no-index with broken index' '
> + cd repo &&
> + echo broken >.git/index &&
> + test_expect_code 0 git diff --no-index a ../non/git/a
> +'
> +
>  test_done
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] diff: don't read index when --no-index is given

2013-12-09 Thread Jonathan Nieder
Thomas Gummerer wrote:

>  For example I would take one
> version, use test-dump-cache-tree to dump the cache tree to a file,
> change the format slightly, use test-dump-cache-tree again, and check
> the difference with "git diff --no-index".

Makes a lot of sense.  Thanks for explaining.

Regards,
Jonathan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] diff: don't read index when --no-index is given

2013-12-09 Thread Thomas Gummerer
Jonathan Nieder  writes:

> Thomas Gummerer wrote:
>
>> git diff --no-index ... currently reads the index, during setup, when
>> calling gitmodules_config().  In the usual case this gives us some
>> performance drawbacks,
>
> Makes sense.
>
>>but it's especially annoying if there is a broken
>> index file.
>
> Is this really a normal case?  It makes sense that as a side-effect it
> is easier to use "git diff --no-index" as a general-purpose tool while
> investigating a broken repo, but I would have thought that quickly
> learning a repo is broken is a good thing in any case.
>
> A little more information about the context where this came up would
> be helpful, I guess.

It came up while I was working on index-v5, where I had to investigate
quite a few repositories where the index was broken, especially when I
was changing the index format slightly.  For example I would take one
version, use test-dump-cache-tree to dump the cache tree to a file,
change the format slightly, use test-dump-cache-tree again, and check
the difference with "git diff --no-index".

This might not be a very common use case, but maybe the patch might help
someone else too. (In addition to the performance improvements)

I'm not sure how much diff --no-index is used normally, but when the
index is broken that would be detected relatively soon anyway.  I'm not
so worried about one more command working when it's broken.

> [...]
>> --- a/builtin/diff.c
>> +++ b/builtin/diff.c
> [...]
>> @@ -282,9 +282,18 @@ int cmd_diff(int argc, const char **argv, const char 
>> *prefix)
>>   *
>>   * Other cases are errors.
>>   */
>> +for (i = 1; i < argc; i++) {
>> +if (!strcmp(argv[i], "--"))
>> +break;
>> +if (!strcmp(argv[i], "--no-index")) {
>> +no_index = 1;
>> +break;
>> +}
>
> setup_revisions() uses the same logic that doesn't handle options that
> take arguments in their "unstuck" form (e.g., "--word-diff-regex --"),
> so this is probably not a regression, though I haven't checked. :)

The same logic is used in diff_no_index(), so I think it should be fine.

> [...]
>>  prefix = setup_git_directory_gently(&nongit);
>> -gitmodules_config();
>> +if (!no_index)
>> +gitmodules_config();
>
> Perhaps we can improve performance and behavior by skipping the
> setup_git_directory_gently() call, too?
>
> That would help with the repairing-broken-repository case by
> working even if .git/config or .git/HEAD is broken, and I think
> it is more intuitive that the repository-local configuration is
> ignored by "git diff --no-index".  It would also help with
> performance by avoiding some filesystem access.

Yes, I think that would make sense, thanks.  I tested it, and it didn't
change the performance, but it's still a good change for the broken
repository case.  Will change in the re-roll and add a test for that.

> [...]
>> +test_expect_success 'git diff --no-index with broken index' '
>> +cd repo &&
>> +echo broken >.git/index &&
>> +test_expect_code 0 git diff --no-index a ../non/git/a
>
> Clever.  I wouldn't use "test_expect_code 0", since that's
> already implied by including the "git diff --no-index" call
> in the && chain.

Thanks, will change.  Thanks a lot for your review.  Will send a re-roll
soon.

> Thanks and hope that helps,
> Jonathan

--
Thomas
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] diff: don't read index when --no-index is given

2013-12-09 Thread Jonathan Nieder
Jens Lehmann wrote:
> Am 09.12.2013 16:16, schrieb Jonathan Nieder:
>> Thomas Gummerer wrote:

>>> git diff --no-index ... currently reads the index, during setup, when
>>> calling gitmodules_config().  In the usual case this gives us some
>>> performance drawbacks,
>>
>> Makes sense.
>
> Hmm, but this will disable the submodule specific ignore configuration
> options defined in the .gitmodules file, no? (E.g. when diffing two
> directories containing submodules)

Yes.  That seems like a good behavior.

"git diff --no-index" was invented as just a fancy version of 'diff
-uR", without any awareness of the current git repository.  That means
that at least in principle, .gitmodules and .gitignore should not
affect it.

[...]
>   Wouldn't adding a "gently" option (which
> could then warn instead of dying) to gitmodules_config() be a better
> solution here?

I don't think so.

Thanks and hope that helps,
Jonathan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] diff: don't read index when --no-index is given

2013-12-09 Thread Jens Lehmann
Am 09.12.2013 16:16, schrieb Jonathan Nieder:
> Thomas Gummerer wrote:
> 
>> git diff --no-index ... currently reads the index, during setup, when
>> calling gitmodules_config().  In the usual case this gives us some
>> performance drawbacks,
> 
> Makes sense.

Hmm, but this will disable the submodule specific ignore configuration
options defined in the .gitmodules file, no? (E.g. when diffing two
directories containing submodules)

>>but it's especially annoying if there is a broken
>> index file.
>
> Is this really a normal case?  It makes sense that as a side-effect it
> is easier to use "git diff --no-index" as a general-purpose tool while
> investigating a broken repo, but I would have thought that quickly
> learning a repo is broken is a good thing in any case.

But I agree that dying with "index file corrupt" is a bit strange when
calling diff with --no-index. Wouldn't adding a "gently" option (which
could then warn instead of dying) to gitmodules_config() be a better
solution here?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] diff: don't read index when --no-index is given

2013-12-09 Thread Jonathan Nieder
Thomas Gummerer wrote:

> git diff --no-index ... currently reads the index, during setup, when
> calling gitmodules_config().  In the usual case this gives us some
> performance drawbacks,

Makes sense.

>but it's especially annoying if there is a broken
> index file.

Is this really a normal case?  It makes sense that as a side-effect it
is easier to use "git diff --no-index" as a general-purpose tool while
investigating a broken repo, but I would have thought that quickly
learning a repo is broken is a good thing in any case.

A little more information about the context where this came up would
be helpful, I guess.

[...]
> --- a/builtin/diff.c
> +++ b/builtin/diff.c
[...]
> @@ -282,9 +282,18 @@ int cmd_diff(int argc, const char **argv, const char 
> *prefix)
>*
>* Other cases are errors.
>*/
> + for (i = 1; i < argc; i++) {
> + if (!strcmp(argv[i], "--"))
> + break;
> + if (!strcmp(argv[i], "--no-index")) {
> + no_index = 1;
> + break;
> + }

setup_revisions() uses the same logic that doesn't handle options that
take arguments in their "unstuck" form (e.g., "--word-diff-regex --"),
so this is probably not a regression, though I haven't checked. :)

[...]
>   prefix = setup_git_directory_gently(&nongit);
> - gitmodules_config();
> + if (!no_index)
> + gitmodules_config();

Perhaps we can improve performance and behavior by skipping the
setup_git_directory_gently() call, too?

That would help with the repairing-broken-repository case by
working even if .git/config or .git/HEAD is broken, and I think
it is more intuitive that the repository-local configuration is
ignored by "git diff --no-index".  It would also help with
performance by avoiding some filesystem access.

[...]
> +test_expect_success 'git diff --no-index with broken index' '
> + cd repo &&
> + echo broken >.git/index &&
> + test_expect_code 0 git diff --no-index a ../non/git/a

Clever.  I wouldn't use "test_expect_code 0", since that's
already implied by including the "git diff --no-index" call
in the && chain.

Thanks and hope that helps,
Jonathan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] diff: don't read index when --no-index is given

2013-12-09 Thread Thomas Gummerer
git diff --no-index ... currently reads the index, during setup, when
calling gitmodules_config().  In the usual case this gives us some
performance drawbacks, but it's especially annoying if there is a broken
index file.

Avoid calling the unnecessary gitmodules_config() when the --no-index
option is given.  Also add a test to guard against similar breakages in the 
future.

Signed-off-by: Thomas Gummerer 
---
 builtin/diff.c   | 13 +++--
 t/t4053-diff-no-index.sh |  6 ++
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/builtin/diff.c b/builtin/diff.c
index adb93a9..47c0833 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -257,7 +257,7 @@ int cmd_diff(int argc, const char **argv, const char 
*prefix)
int blobs = 0, paths = 0;
const char *path = NULL;
struct blobinfo blob[2];
-   int nongit;
+   int nongit, no_index = 0;
int result = 0;
 
/*
@@ -282,9 +282,18 @@ int cmd_diff(int argc, const char **argv, const char 
*prefix)
 *
 * Other cases are errors.
 */
+   for (i = 1; i < argc; i++) {
+   if (!strcmp(argv[i], "--"))
+   break;
+   if (!strcmp(argv[i], "--no-index")) {
+   no_index = 1;
+   break;
+   }
+   }
 
prefix = setup_git_directory_gently(&nongit);
-   gitmodules_config();
+   if (!no_index)
+   gitmodules_config();
git_config(git_diff_ui_config, NULL);
 
init_revisions(&rev, prefix);
diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
index 979e983..a24ae4d 100755
--- a/t/t4053-diff-no-index.sh
+++ b/t/t4053-diff-no-index.sh
@@ -29,4 +29,10 @@ test_expect_success 'git diff --no-index relative path 
outside repo' '
)
 '
 
+test_expect_success 'git diff --no-index with broken index' '
+   cd repo &&
+   echo broken >.git/index &&
+   test_expect_code 0 git diff --no-index a ../non/git/a
+'
+
 test_done
-- 
1.8.4.2

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html