Re: [PATCH] diff: don't read index when --no-index is given
[Sorry for not seeing this before sending out v2.] Junio C Hamano writes: > Thomas Gummerer writes: > >> git diff --no-index ... currently reads the index, during setup, when >> calling gitmodules_config(). In the usual case this gives us some >> performance drawbacks, but it's especially annoying if there is a broken >> index file. >> >> Avoid calling the unnecessary gitmodules_config() when the --no-index >> option is given. Also add a test to guard against similar breakages in the >> future. >> >> Signed-off-by: Thomas Gummerer >> --- >> builtin/diff.c | 13 +++-- >> t/t4053-diff-no-index.sh | 6 ++ >> 2 files changed, 17 insertions(+), 2 deletions(-) >> >> diff --git a/builtin/diff.c b/builtin/diff.c >> index adb93a9..47c0833 100644 >> --- a/builtin/diff.c >> +++ b/builtin/diff.c >> @@ -257,7 +257,7 @@ int cmd_diff(int argc, const char **argv, const char >> *prefix) >> int blobs = 0, paths = 0; >> const char *path = NULL; >> struct blobinfo blob[2]; >> -int nongit; >> +int nongit, no_index = 0; >> int result = 0; >> >> /* >> @@ -282,9 +282,18 @@ int cmd_diff(int argc, const char **argv, const char >> *prefix) >> * >> * Other cases are errors. >> */ >> +for (i = 1; i < argc; i++) { >> +if (!strcmp(argv[i], "--")) >> +break; >> +if (!strcmp(argv[i], "--no-index")) { >> +no_index = 1; >> +break; >> +} >> +} > > This seems to duplicate only half the logic at the beginning of > diff_no_index(), right? E.g., running "git diff /var/tmp/[12]" > inside a working tree that is controlled by a Git repository when > /var/tmp/ is outside, we do want to behave as if the command line > were "git diff --no-index /var/tmp/[12]", but this half duplication > makes these two behave differently, no? Yes you're right, I missed that. "git diff /var/tmp/[12]" inside a working tree with a broken index has the same problems, which should be fixed too. I'll try to fix that and send a new patch afterwards. > I think the issue you are trying to address is worth tackling, but I > wonder if a bit of preparatory refactoring is necessary to avoid the > partial duplication. > >> prefix = setup_git_directory_gently(&nongit); >> -gitmodules_config(); >> +if (!no_index) >> +gitmodules_config(); >> git_config(git_diff_ui_config, NULL); >> >> init_revisions(&rev, prefix); >> diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh >> index 979e983..a24ae4d 100755 >> --- a/t/t4053-diff-no-index.sh >> +++ b/t/t4053-diff-no-index.sh >> @@ -29,4 +29,10 @@ test_expect_success 'git diff --no-index relative path >> outside repo' ' >> ) >> ' >> >> +test_expect_success 'git diff --no-index with broken index' ' >> +cd repo && >> +echo broken >.git/index && >> +test_expect_code 0 git diff --no-index a ../non/git/a >> +' >> + >> test_done -- Thomas -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] diff: don't read index when --no-index is given
Thomas Gummerer writes: > git diff --no-index ... currently reads the index, during setup, when > calling gitmodules_config(). In the usual case this gives us some > performance drawbacks, but it's especially annoying if there is a broken > index file. > > Avoid calling the unnecessary gitmodules_config() when the --no-index > option is given. Also add a test to guard against similar breakages in the > future. > > Signed-off-by: Thomas Gummerer > --- > builtin/diff.c | 13 +++-- > t/t4053-diff-no-index.sh | 6 ++ > 2 files changed, 17 insertions(+), 2 deletions(-) > > diff --git a/builtin/diff.c b/builtin/diff.c > index adb93a9..47c0833 100644 > --- a/builtin/diff.c > +++ b/builtin/diff.c > @@ -257,7 +257,7 @@ int cmd_diff(int argc, const char **argv, const char > *prefix) > int blobs = 0, paths = 0; > const char *path = NULL; > struct blobinfo blob[2]; > - int nongit; > + int nongit, no_index = 0; > int result = 0; > > /* > @@ -282,9 +282,18 @@ int cmd_diff(int argc, const char **argv, const char > *prefix) >* >* Other cases are errors. >*/ > + for (i = 1; i < argc; i++) { > + if (!strcmp(argv[i], "--")) > + break; > + if (!strcmp(argv[i], "--no-index")) { > + no_index = 1; > + break; > + } > + } This seems to duplicate only half the logic at the beginning of diff_no_index(), right? E.g., running "git diff /var/tmp/[12]" inside a working tree that is controlled by a Git repository when /var/tmp/ is outside, we do want to behave as if the command line were "git diff --no-index /var/tmp/[12]", but this half duplication makes these two behave differently, no? I think the issue you are trying to address is worth tackling, but I wonder if a bit of preparatory refactoring is necessary to avoid the partial duplication. > prefix = setup_git_directory_gently(&nongit); > - gitmodules_config(); > + if (!no_index) > + gitmodules_config(); > git_config(git_diff_ui_config, NULL); > > init_revisions(&rev, prefix); > diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh > index 979e983..a24ae4d 100755 > --- a/t/t4053-diff-no-index.sh > +++ b/t/t4053-diff-no-index.sh > @@ -29,4 +29,10 @@ test_expect_success 'git diff --no-index relative path > outside repo' ' > ) > ' > > +test_expect_success 'git diff --no-index with broken index' ' > + cd repo && > + echo broken >.git/index && > + test_expect_code 0 git diff --no-index a ../non/git/a > +' > + > test_done -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] diff: don't read index when --no-index is given
Thomas Gummerer wrote: > For example I would take one > version, use test-dump-cache-tree to dump the cache tree to a file, > change the format slightly, use test-dump-cache-tree again, and check > the difference with "git diff --no-index". Makes a lot of sense. Thanks for explaining. Regards, Jonathan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] diff: don't read index when --no-index is given
Jonathan Nieder writes: > Thomas Gummerer wrote: > >> git diff --no-index ... currently reads the index, during setup, when >> calling gitmodules_config(). In the usual case this gives us some >> performance drawbacks, > > Makes sense. > >>but it's especially annoying if there is a broken >> index file. > > Is this really a normal case? It makes sense that as a side-effect it > is easier to use "git diff --no-index" as a general-purpose tool while > investigating a broken repo, but I would have thought that quickly > learning a repo is broken is a good thing in any case. > > A little more information about the context where this came up would > be helpful, I guess. It came up while I was working on index-v5, where I had to investigate quite a few repositories where the index was broken, especially when I was changing the index format slightly. For example I would take one version, use test-dump-cache-tree to dump the cache tree to a file, change the format slightly, use test-dump-cache-tree again, and check the difference with "git diff --no-index". This might not be a very common use case, but maybe the patch might help someone else too. (In addition to the performance improvements) I'm not sure how much diff --no-index is used normally, but when the index is broken that would be detected relatively soon anyway. I'm not so worried about one more command working when it's broken. > [...] >> --- a/builtin/diff.c >> +++ b/builtin/diff.c > [...] >> @@ -282,9 +282,18 @@ int cmd_diff(int argc, const char **argv, const char >> *prefix) >> * >> * Other cases are errors. >> */ >> +for (i = 1; i < argc; i++) { >> +if (!strcmp(argv[i], "--")) >> +break; >> +if (!strcmp(argv[i], "--no-index")) { >> +no_index = 1; >> +break; >> +} > > setup_revisions() uses the same logic that doesn't handle options that > take arguments in their "unstuck" form (e.g., "--word-diff-regex --"), > so this is probably not a regression, though I haven't checked. :) The same logic is used in diff_no_index(), so I think it should be fine. > [...] >> prefix = setup_git_directory_gently(&nongit); >> -gitmodules_config(); >> +if (!no_index) >> +gitmodules_config(); > > Perhaps we can improve performance and behavior by skipping the > setup_git_directory_gently() call, too? > > That would help with the repairing-broken-repository case by > working even if .git/config or .git/HEAD is broken, and I think > it is more intuitive that the repository-local configuration is > ignored by "git diff --no-index". It would also help with > performance by avoiding some filesystem access. Yes, I think that would make sense, thanks. I tested it, and it didn't change the performance, but it's still a good change for the broken repository case. Will change in the re-roll and add a test for that. > [...] >> +test_expect_success 'git diff --no-index with broken index' ' >> +cd repo && >> +echo broken >.git/index && >> +test_expect_code 0 git diff --no-index a ../non/git/a > > Clever. I wouldn't use "test_expect_code 0", since that's > already implied by including the "git diff --no-index" call > in the && chain. Thanks, will change. Thanks a lot for your review. Will send a re-roll soon. > Thanks and hope that helps, > Jonathan -- Thomas -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] diff: don't read index when --no-index is given
Jens Lehmann wrote: > Am 09.12.2013 16:16, schrieb Jonathan Nieder: >> Thomas Gummerer wrote: >>> git diff --no-index ... currently reads the index, during setup, when >>> calling gitmodules_config(). In the usual case this gives us some >>> performance drawbacks, >> >> Makes sense. > > Hmm, but this will disable the submodule specific ignore configuration > options defined in the .gitmodules file, no? (E.g. when diffing two > directories containing submodules) Yes. That seems like a good behavior. "git diff --no-index" was invented as just a fancy version of 'diff -uR", without any awareness of the current git repository. That means that at least in principle, .gitmodules and .gitignore should not affect it. [...] > Wouldn't adding a "gently" option (which > could then warn instead of dying) to gitmodules_config() be a better > solution here? I don't think so. Thanks and hope that helps, Jonathan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] diff: don't read index when --no-index is given
Am 09.12.2013 16:16, schrieb Jonathan Nieder: > Thomas Gummerer wrote: > >> git diff --no-index ... currently reads the index, during setup, when >> calling gitmodules_config(). In the usual case this gives us some >> performance drawbacks, > > Makes sense. Hmm, but this will disable the submodule specific ignore configuration options defined in the .gitmodules file, no? (E.g. when diffing two directories containing submodules) >>but it's especially annoying if there is a broken >> index file. > > Is this really a normal case? It makes sense that as a side-effect it > is easier to use "git diff --no-index" as a general-purpose tool while > investigating a broken repo, but I would have thought that quickly > learning a repo is broken is a good thing in any case. But I agree that dying with "index file corrupt" is a bit strange when calling diff with --no-index. Wouldn't adding a "gently" option (which could then warn instead of dying) to gitmodules_config() be a better solution here? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] diff: don't read index when --no-index is given
Thomas Gummerer wrote: > git diff --no-index ... currently reads the index, during setup, when > calling gitmodules_config(). In the usual case this gives us some > performance drawbacks, Makes sense. >but it's especially annoying if there is a broken > index file. Is this really a normal case? It makes sense that as a side-effect it is easier to use "git diff --no-index" as a general-purpose tool while investigating a broken repo, but I would have thought that quickly learning a repo is broken is a good thing in any case. A little more information about the context where this came up would be helpful, I guess. [...] > --- a/builtin/diff.c > +++ b/builtin/diff.c [...] > @@ -282,9 +282,18 @@ int cmd_diff(int argc, const char **argv, const char > *prefix) >* >* Other cases are errors. >*/ > + for (i = 1; i < argc; i++) { > + if (!strcmp(argv[i], "--")) > + break; > + if (!strcmp(argv[i], "--no-index")) { > + no_index = 1; > + break; > + } setup_revisions() uses the same logic that doesn't handle options that take arguments in their "unstuck" form (e.g., "--word-diff-regex --"), so this is probably not a regression, though I haven't checked. :) [...] > prefix = setup_git_directory_gently(&nongit); > - gitmodules_config(); > + if (!no_index) > + gitmodules_config(); Perhaps we can improve performance and behavior by skipping the setup_git_directory_gently() call, too? That would help with the repairing-broken-repository case by working even if .git/config or .git/HEAD is broken, and I think it is more intuitive that the repository-local configuration is ignored by "git diff --no-index". It would also help with performance by avoiding some filesystem access. [...] > +test_expect_success 'git diff --no-index with broken index' ' > + cd repo && > + echo broken >.git/index && > + test_expect_code 0 git diff --no-index a ../non/git/a Clever. I wouldn't use "test_expect_code 0", since that's already implied by including the "git diff --no-index" call in the && chain. Thanks and hope that helps, Jonathan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] diff: don't read index when --no-index is given
git diff --no-index ... currently reads the index, during setup, when calling gitmodules_config(). In the usual case this gives us some performance drawbacks, but it's especially annoying if there is a broken index file. Avoid calling the unnecessary gitmodules_config() when the --no-index option is given. Also add a test to guard against similar breakages in the future. Signed-off-by: Thomas Gummerer --- builtin/diff.c | 13 +++-- t/t4053-diff-no-index.sh | 6 ++ 2 files changed, 17 insertions(+), 2 deletions(-) diff --git a/builtin/diff.c b/builtin/diff.c index adb93a9..47c0833 100644 --- a/builtin/diff.c +++ b/builtin/diff.c @@ -257,7 +257,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix) int blobs = 0, paths = 0; const char *path = NULL; struct blobinfo blob[2]; - int nongit; + int nongit, no_index = 0; int result = 0; /* @@ -282,9 +282,18 @@ int cmd_diff(int argc, const char **argv, const char *prefix) * * Other cases are errors. */ + for (i = 1; i < argc; i++) { + if (!strcmp(argv[i], "--")) + break; + if (!strcmp(argv[i], "--no-index")) { + no_index = 1; + break; + } + } prefix = setup_git_directory_gently(&nongit); - gitmodules_config(); + if (!no_index) + gitmodules_config(); git_config(git_diff_ui_config, NULL); init_revisions(&rev, prefix); diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh index 979e983..a24ae4d 100755 --- a/t/t4053-diff-no-index.sh +++ b/t/t4053-diff-no-index.sh @@ -29,4 +29,10 @@ test_expect_success 'git diff --no-index relative path outside repo' ' ) ' +test_expect_success 'git diff --no-index with broken index' ' + cd repo && + echo broken >.git/index && + test_expect_code 0 git diff --no-index a ../non/git/a +' + test_done -- 1.8.4.2 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html