NOTE: This series builds on sg/clean-nested-repo-with-ignored, as it
(among other things) modifies his testcase from expect_failure
to expect_success. Also, Peff is probably the only one who
remembers v1 (and even he may have forgotten it): v1 was posted
a year and a half ago.
This patch series fixes a few issues with git-clean:
* Failure to clean when multiple pathspecs are specified, reported both
in April 2018[1] and again in May 2019[2].
* Failure to preserve both tracked and untracked files within a nested
Git repository reported a few weeks ago by SZEDER[3].
[1] https://public-inbox.org/git/[email protected]/
[2] https://public-inbox.org/git/[email protected]/
[3] https://public-inbox.org/git/[email protected]/
Changes since v2:
* Removed the RFC label
* Fixed up a few things SZEDER pointed out in v2 -- some commit
message improvements, and an extra safety precaution for the
final patch.
Stuff I'd like reviewer to focus on:
* Read the (long) commit messages for patches 9 & 10; do folks agree
with my declaration of "correct" behavior and my changes to the
few regression tests in those patches?
Elijah Newren (12):
t7300: add testcases showing failure to clean specified pathspecs
dir: fix typo in comment
dir: fix off-by-one error in match_pathspec_item
dir: also check directories for matching pathspecs
dir: make the DO_MATCH_SUBMODULE code reusable for a non-submodule
case
dir: if our pathspec might match files under a dir, recurse into it
dir: add commentary explaining match_pathspec_item's return value
git-clean.txt: do not claim we will delete files with -n/--dry-run
clean: disambiguate the definition of -d
clean: avoid removing untracked files in a nested git repository
clean: rewrap overly long line
clean: fix theoretical path corruption
Documentation/git-clean.txt | 16 +++++-----
builtin/clean.c | 15 +++++++--
dir.c | 63 +++++++++++++++++++++++++++----------
dir.h | 8 +++--
t/t7300-clean.sh | 44 +++++++++++++++++++++++---
5 files changed, 112 insertions(+), 34 deletions(-)
Range-diff:
1: 82328e2033 ! 1: fe35ab8cc3 t7300: Add some testcases showing failure to
clean specified pathspecs
@@ Metadata
Author: Elijah Newren <[email protected]>
## Commit message ##
- t7300: Add some testcases showing failure to clean specified pathspecs
+ t7300: add testcases showing failure to clean specified pathspecs
Someone brought me a testcase where multiple git-clean invocations were
required to clean out unwanted files:
2: 5c1f58fd9d = 2: 707d287d79 dir: fix typo in comment
3: 0e8b415af3 = 3: bb316e82b2 dir: fix off-by-one error in
match_pathspec_item
4: 30b3ede443 ! 4: 56319f934a dir: Directories should be checked for
matching pathspecs too
@@ Metadata
Author: Elijah Newren <[email protected]>
## Commit message ##
- dir: Directories should be checked for matching pathspecs too
+ dir: also check directories for matching pathspecs
Even if a directory doesn't match a pathspec, it is possible, depending
on the precise pathspecs, that some file underneath it might. So we
5: bab01f4cda ! 5: 81593a565c dir: Make the DO_MATCH_SUBMODULE code
reusable for a non-submodule case
@@ Metadata
Author: Elijah Newren <[email protected]>
## Commit message ##
- dir: Make the DO_MATCH_SUBMODULE code reusable for a non-submodule case
+ dir: make the DO_MATCH_SUBMODULE code reusable for a non-submodule case
The specific checks done in match_pathspec_item for the
DO_MATCH_SUBMODULE
case are useful for other cases which have nothing to do with
submodules.
6: c619ab4b3e ! 6: 9566823a0f dir: If our pathspec might match files under
a dir, recurse into it
@@ Metadata
Author: Elijah Newren <[email protected]>
## Commit message ##
- dir: If our pathspec might match files under a dir, recurse into it
+ dir: if our pathspec might match files under a dir, recurse into it
For git clean, if a directory is entirely untracked and the user did
not
specify -d (corresponding to DIR_SHOW_IGNORED_TOO), then we usually do
@@ Commit message
the given directory which match one of the pathspecs being added to the
entries list.
- Two particular considerations for this patch:
+ Two notes of potential interest to future readers:
- * If we want to only recurse into a directory when it is specifically
+ * If we wanted to only recurse into a directory when it is
specifically
matched rather than matched-via-glob (e.g. '*.c'), then we could do
so via making the final non-zero return in match_pathspec_item be
MATCHED_RECURSIVELY instead of
MATCHED_RECURSIVELY_LEADING_PATHSPEC.
- (See final patch of this RFC series for details; note that the
- relative order of MATCHED_RECURSIVELY_LEADING_PATHSPEC and
- MATCHED_RECURSIVELY are important for such a change.))
+ (Note that the relative order of
MATCHED_RECURSIVELY_LEADING_PATHSPEC
+ and MATCHED_RECURSIVELY are important for such a change.) I was
+ leaving open that possibility while writing an RFC asking for the
+ behavior we want, but even though we don't want it, that knowledge
+ might help you understand the code flow better.
* There is a growing amount of logic in read_directory_recursive()
for
- deciding whether to recurse into a subdirectory. However, there is
- a comment immediately preceding this logic that says to recurse if
+ deciding whether to recurse into a subdirectory. However, there
is a
+ comment immediately preceding this logic that says to recurse if
instructed by treat_path(). It may be better for the logic in
- read_directory_recursive() to be moved to treat_path() (or another
- function it calls, such as treat_directory()), but I did not feel
- strongly about this and just left the logic where it was while
- adding to it. Do others have strong opinions on this?
+ read_directory_recursive() to ultimately be moved to treat_path()
(or
+ another function it calls, such as treat_directory()), but I have
+ left that for someone else to tackle in the future.
Signed-off-by: Elijah Newren <[email protected]>
7: 5e1686a30e = 7: 7821898ba7 dir: add commentary explaining
match_pathspec_item's return value
8: d92637f961 = 8: 13def5df57 git-clean.txt: do not claim we will delete
files with -n/--dry-run
9: 4550a30df8 = 9: e6b274abf7 clean: disambiguate the definition of -d
10: 0985be2793 ! 10: 5f4ef14765 clean: avoid removing untracked files in a
nested git repository
@@ Commit message
Also, clean up some grammar errors describing this functionality in the
git-clean manpage.
- Finally, there is one somewhat related bug which this patch does not
- fix, coming from the opposite angle. If the user runs
- git clean -ffd
- to force deletion of untracked nested repositories, and within an
- untracked nested repo the user has ignored files (according to the
inner
- OR outer repositories' .gitignore), then not only will those ignored
- files be left alone but the .git/ subdirectory of the nested repo will
- be left alone too. I am not completely sure if this should be
- considered a bug (though it seems like it since the lack of the
- untracked file would result in the .git/ subdirectory being deleted),
- but in any event it is very minor compared to accidentally deleting
user
- data and I did not dive into it.
+ Finally, there are still a couple bugs with -ffd not cleaning out
enough
+ (e.g. missing the nested .git) and with -ffdX possibly cleaning out
the
+ wrong files (paying attention to outer .gitignore instead of inner).
+ This patch does not address these cases at all (and does not change the
+ behavior relative to those flags), it only fixes the handling when
given
+ a single -f. See
+ https://public-inbox.org/git/[email protected]/ for
more
+ discussion of the -ffd[X?] bugs.
Signed-off-by: Elijah Newren <[email protected]>
11: 2d36e3f7cb = 11: 4e30e62eb1 clean: rewrap overly long line
12: 3c5d1ff16a < -: ---------- clean: fix theoretical path corruption
-: ---------- > 12: de2444f7cb clean: fix theoretical path corruption
--
2.23.0.173.gad11b3a635.dirty