Re: [PATCH v1 1/1] dir: teach status to show ignored directories
On Fri, Aug 11, 2017 at 10:48 AM, Jameson Miller wrote: > The set of files listed by "--ignored" changes when different > values are given to "--untracked-files". If would be nice to > be able to make the ignored output independent of the untracked > settings. This patch attempts to do that while maintaining > backward compatibility with the existing behavior. So after rereading this, maybe we could have '--ignored[=as-untracked, new-mode]' with as-untracked as default? This would satisfy your goal, while still maintaining backwards compatibility. > We want to see the paths that explicitly match an ignore pattern. > This means that if a directory only contains ignored files, but the > directory itself is not explicitly ignored, then we want to see the > individual files. This is slightly different than the current behavior > of "--ignored". > > I am open to suggestions on how to present the options to control > this behavior. I would strongly favor not introducing yet another flag to tweak the behavior of another command line option. (The UX of Git is already bloated, IMHO. But that is just my general stance on things) So we'd have two proposals: (1) The current patch proposes --show-ignored-directory that modifies the behavior of other flags (2) extend an already existing option to take another mode (which I propose) Strong arguments for (1) would be if the flag is orthogonal and apply to the other flags in a uniform way, i.e.: status --no-ignored --untracked=no --show-ignored-directory would make sense and produce an output that a user doesn't find confusing. Arguments for (2), are * less documentation for users (see below) * if the two flags are truly orthogonal, test effort is decreased * no exposure of an API that may yield strange results when the user gives strange input. >>> This >>> change exposes a flag to report all untracked files while not showing >>> individual files in ignored directories. >> >> By the description up to here, it sounds as if you want to introduce >> mode for --untracked, e.g. "normal-adjusted-for-ignored" (it's a bad >> suggestion)? However the patch seems to add an orthogonal flag, >> such that >> >>status --no-ignored --untracked=no --show-ignored-directory >> >> would also be possible. What is a reasonable expectation for >> the output of such? > > > The current patch does add another flag. This flag only has meaning if > the "--ignored" and "--untracked=all" flags are also specified. Another > option I had considered is to let the "--ignored" flag take an argument. I would think that is a better solution, as then untracked and ignored files (and their parameters) will be orthogonal in the long run. Things like --untracked=all --ignored=none --untracked=none --ignored=only-dirs --untracked=reasonable-default --ignored=other-reasonable-default will be possible and eventually the default for ignored files may be independent of untracked settings. > Then, we could express this new behavior through (for example) a > "--ignored=exact" flag to reflect the fact that this new option > returns paths that match the ignore pattern, and does not > collapse directories that contain only ignored files. Yes, and we would not need to worry about how the new flag interacts with other flags. I would prefer that solution as it seems the API will be cleaner that way. Less things depend on each other, e.g. as a user I do not need to know about the 'other' flag, be it ignored or untracked to solve my problem. If we'd go with this third flag, we'd need to (a) document it, but (b) also have to add a sentence to untracked and ignored flag "In the corner case of also having the third flag, we change behavior once again" which I'd dislike from a users perspective. >>> Commands: >>> 1) status >>> 2) status --ignored >>> 3) status --ignored --untracked-files=all >>> 4) status --ignored --untracked-files=all --show-ignored-directory >>> (2) is --untracked-files=normal I'd presume, such that this flag >>> can be understood as a tweak to "normal" based on the similar size >>> between 2 and 4? (The timing improvement from 2 to 4 is huge though). > > (2) is --untracked-files=normal. Although the count of ignored > files similar between 2 and 4, I consider this flag more of a > tweak on 3, as we want the untracked files reported with > the "--untracked=all" flag. The counts between 2 and 4 are > similar in this case because most of the ignored files are > contained in ignored directories. > > Our application calls status including the following flags: > > --porcelain=v2 --ignored --untracked-files=all --ignore-submodules=none > > This means we have bad performance compared to just "git status" > when there is a large number of files in ignored directories > With this new behavior, our application would move from case 3 to > case 4 for this repository. > > You also point out the timing difference between case 2 and 4. I > think there is an optimiza
Re: [PATCH v1 1/1] dir: teach status to show ignored directories
On 08/11/2017 01:39 PM, Brandon Williams wrote: On 08/10, Jameson Miller wrote: Teach Git to optionally show ignored directories when showing all untracked files. The git status command exposes the options to report ignored and/or untracked files. However, when reporting all untracked files (--untracked-files=all), all individual ignored files are reported as well. It is not currently possible to get the reporting behavior of the --ignored flag, while also reporting all untracked files. This change exposes a flag to report all untracked files while not showing individual files in ignored directories. Motivation: Our application (Visual Studio) needs all untracked files listed individually, but does not need all ignored files listed individually. Reporting all ignored files can affect the time it takes for status to run. For a representative repository, here are some measurements showing a large perf improvement for this scenario: | Command | Reported ignored entries | Time (s) | | --- | | | | 1 | 0| 1.3 | | 2 | 1024 | 4.2 | | 3 | 174904 | 7.5 | | 4 | 1046 | 1.6 | Commands: 1) status 2) status --ignored 3) status --ignored --untracked-files=all 4) status --ignored --untracked-files=all --show-ignored-directory This changes exposes a --show-ignored-directory flag to the git status command. This flag is utilized when running git status with the --ignored and --untracked-files options to not list ignored individual ignored files contained in directories that match an ignore pattern. I can't help feeling that there is a better way express this with a better UI. I'm not saying this is wrong, I'm just not sure how --show-ignored-directory would work when not paired with --ignored and --untracked-files. Does it require --ignored to also be given? Yes. This flag only has meaning when --ignored and --untracked=all are specified. I am open to other suggestions on how to express this. Another option might be to modify the "--ignored" flag to take an argument, which would allow more explicit control over how ignored paths are reported. This way, we could specify (for example) "--ignored=explicit" to control the output of ignored paths. Part of the perf improvement comes from the tweak to read_directory_recursive to stop scanning the file system after it encounters the first file. When a directory is ignored, all it needs to determine is if the directory is empty or not. The logic currently keeps scanning the file system until it finds an untracked file. However, as the directory is ignored, all the contained contents are also marked excluded. For ignored directories that contain a large number of files, this can take some time. Signed-off-by: Jameson Miller
Re: [PATCH v1 1/1] dir: teach status to show ignored directories
On 08/10/2017 04:03 PM, Stefan Beller wrote: On Thu, Aug 10, 2017 at 11:49 AM, Jameson Miller wrote: Welcome to the Git mailing list. :) Thank you for the welcome and the review! I will include the suggested code changes in the next patch version. Teach Git to optionally show ignored directories when showing all untracked files. The git status command exposes the options to report ignored and/or untracked files. However, when reporting all untracked files (--untracked-files=all), all individual ignored files are reported as well. It is not currently possible to get the reporting behavior of the --ignored flag, while also reporting all untracked files. Trying to understand this based off the documentation for --untracked=all and --ignored, I realize that the documentation for --ignored seems to be lacking as I do not understand what the --ignored behavior is in combination with --untracked=[all, normal, no] The set of files listed by "--ignored" changes when different values are given to "--untracked-files". If would be nice to be able to make the ignored output independent of the untracked settings. This patch attempts to do that while maintaining backward compatibility with the existing behavior. When "--ignored" is used by itself ("--untracked-files=normal"), ignored directories (both directories that explicitly match a directory ignore pattern -and- directories containing only ignored files) are listed. Individual files within are *not* listed. When "--ignored" is used with "--untracked-files=all", git always lists the individual files within the ignored directories and DOES NOT collapse the output to just the containing (explicitly or implicitly ignored) directory. This can cause a massive performance problem when there are a lot of ignored files in a well-defined set of ignored directories. For example, on Windows, Visual Studio creates a bin/ and obj/ directory inside your project where it writes all .obj files. Normal usage is to explicitly ignore those 2 directory names in your .gitignores (rather than or in addition to *.obj). We just want to see the "bin/" and "obj/" paths regardless of which "--untracked" flag is passed in. We want to see the paths that explicitly match an ignore pattern. This means that if a directory only contains ignored files, but the directory itself is not explicitly ignored, then we want to see the individual files. This is slightly different than the current behavior of "--ignored". I am open to suggestions on how to present the options to control this behavior. This change exposes a flag to report all untracked files while not showing individual files in ignored directories. By the description up to here, it sounds as if you want to introduce mode for --untracked, e.g. "normal-adjusted-for-ignored" (it's a bad suggestion)? However the patch seems to add an orthogonal flag, such that status --no-ignored --untracked=no --show-ignored-directory would also be possible. What is a reasonable expectation for the output of such? The current patch does add another flag. This flag only has meaning if the "--ignored" and "--untracked=all" flags are also specified. Another option I had considered is to let the "--ignored" flag take an argument. Then, we could express this new behavior through (for example) a "--ignored=exact" flag to reflect the fact that this new option returns paths that match the ignore pattern, and does not collapse directories that contain only ignored files. Motivation: Our application (Visual Studio) needs all untracked files listed individually, but does not need all ignored files listed individually. For parsing output, I would strongly recommend --porcelain[=2], but that is a tangent. Reporting all ignored files can affect the time it takes for status to run. For a representative repository, here are some measurements showing a large perf improvement for this scenario: | Command | Reported ignored entries | Time (s) | | --- | | | | 1 | 0| 1.3 | | 2 | 1024 | 4.2 | | 3 | 174904 | 7.5 | | 4 | 1046 | 1.6 | Commands: 1) status 2) status --ignored 3) status --ignored --untracked-files=all 4) status --ignored --untracked-files=all --show-ignored-directory (2) is --untracked-files=normal I'd presume, such that this flag can be understood as a tweak to "normal" based on the similar size between 2 and 4? (The timing improvement from 2 to 4 is huge though). (2) is --untracked-files=normal. Although the count of ignored files similar between 2 and 4, I consider this flag more of a tweak on 3, as we want the untracked files reported with the "--untracked=all" flag. The counts between 2 and 4 are similar in this case because most of the ignored files are contained in ignored directories. Our application calls status including the following flags: --porcelain=
Re: [PATCH v1 1/1] dir: teach status to show ignored directories
On 08/10, Jameson Miller wrote: > Teach Git to optionally show ignored directories when showing all > untracked files. The git status command exposes the options to report > ignored and/or untracked files. However, when reporting all untracked > files (--untracked-files=all), all individual ignored files are reported > as well. It is not currently possible to get the reporting behavior of > the --ignored flag, while also reporting all untracked files. This > change exposes a flag to report all untracked files while not showing > individual files in ignored directories. > > Motivation: > Our application (Visual Studio) needs all untracked files listed > individually, but does not need all ignored files listed individually. > Reporting all ignored files can affect the time it takes for status > to run. For a representative repository, here are some measurements > showing a large perf improvement for this scenario: > > | Command | Reported ignored entries | Time (s) | > | --- | | | > | 1 | 0| 1.3 | > | 2 | 1024 | 4.2 | > | 3 | 174904 | 7.5 | > | 4 | 1046 | 1.6 | > > Commands: > 1) status > 2) status --ignored > 3) status --ignored --untracked-files=all > 4) status --ignored --untracked-files=all --show-ignored-directory > > This changes exposes a --show-ignored-directory flag to the git status > command. This flag is utilized when running git status with the > --ignored and --untracked-files options to not list ignored individual > ignored files contained in directories that match an ignore pattern. I can't help feeling that there is a better way express this with a better UI. I'm not saying this is wrong, I'm just not sure how --show-ignored-directory would work when not paired with --ignored and --untracked-files. Does it require --ignored to also be given? > > Part of the perf improvement comes from the tweak to > read_directory_recursive to stop scanning the file system after it > encounters the first file. When a directory is ignored, all it needs to > determine is if the directory is empty or not. The logic currently keeps > scanning the file system until it finds an untracked file. However, as > the directory is ignored, all the contained contents are also marked > excluded. For ignored directories that contain a large number of files, > this can take some time. > > Signed-off-by: Jameson Miller -- Brandon Williams
Re: [PATCH v1 1/1] dir: teach status to show ignored directories
On Thu, Aug 10, 2017 at 11:49 AM, Jameson Miller wrote: Welcome to the Git mailing list. :) > Teach Git to optionally show ignored directories when showing all > untracked files. The git status command exposes the options to report > ignored and/or untracked files. However, when reporting all untracked > files (--untracked-files=all), all individual ignored files are reported > as well. It is not currently possible to get the reporting behavior of > the --ignored flag, while also reporting all untracked files. Trying to understand this based off the documentation for --untracked=all and --ignored, I realize that the documentation for --ignored seems to be lacking as I do not understand what the --ignored behavior is in combination with --untracked=[all, normal, no] > This > change exposes a flag to report all untracked files while not showing > individual files in ignored directories. By the description up to here, it sounds as if you want to introduce mode for --untracked, e.g. "normal-adjusted-for-ignored" (it's a bad suggestion)? However the patch seems to add an orthogonal flag, such that status --no-ignored --untracked=no --show-ignored-directory would also be possible. What is a reasonable expectation for the output of such? > Motivation: > Our application (Visual Studio) needs all untracked files listed > individually, but does not need all ignored files listed individually. For parsing output, I would strongly recommend --porcelain[=2], but that is a tangent. > Reporting all ignored files can affect the time it takes for status > to run. For a representative repository, here are some measurements > showing a large perf improvement for this scenario: > > | Command | Reported ignored entries | Time (s) | > | --- | | | > | 1 | 0| 1.3 | > | 2 | 1024 | 4.2 | > | 3 | 174904 | 7.5 | > | 4 | 1046 | 1.6 | > > Commands: > 1) status > 2) status --ignored > 3) status --ignored --untracked-files=all > 4) status --ignored --untracked-files=all --show-ignored-directory (2) is --untracked-files=normal I'd presume, such that this flag can be understood as a tweak to "normal" based on the similar size between 2 and 4? (The timing improvement from 2 to 4 is huge though). > This changes exposes a --show-ignored-directory flag to the git status s/changes/change/ > command. This flag is utilized when running git status with the > --ignored and --untracked-files options to not list ignored individual > ignored files contained in directories that match an ignore pattern. > > Part of the perf improvement comes from the tweak to > read_directory_recursive to stop scanning the file system after it > encounters the first file. When a directory is ignored, all it needs to > determine is if the directory is empty or not. The logic currently keeps > scanning the file system until it finds an untracked file. However, as > the directory is ignored, all the contained contents are also marked > excluded. For ignored directories that contain a large number of files, > this can take some time. I think it is possible to ignore a directory and still track files in it, what are the implications of this flag on a tracked (and changed) file in an ignored dir? What happens to empty directories that match an ignore pattern? > @@ -1362,6 +1363,8 @@ int cmd_status(int argc, const char **argv, const char > *prefix) > N_("ignore changes to submodules, optional when: all, > dirty, untracked. (Default: all)"), > PARSE_OPT_OPTARG, NULL, (intptr_t)"all" }, > OPT_COLUMN(0, "column", &s.colopts, N_("list untracked files > in columns")), > + OPT_BOOL(0, "show-ignored-directory", &show_ignored_directory, Is it possible to directly read into s.show_ignored_directory here? > +test_expect_success 'Verify behavior of status on folders with ignored > files' ' > + test_when_finished "git clean -fdx" && > + git status --porcelain=v2 --ignored --untracked-files=all > --show-ignored-directory >output && > + test_i18ncmp expect output > +' > + > +# Test status bahavior on folder with tracked and ignored files behavior > +cat >expect <<\EOF > +? expect > +? output > +! dir/tracked_ignored/ignored_1.ign > +! dir/tracked_ignored/ignored_2.ign > +! tracked_ignored/ignored_1.ign > +! tracked_ignored/ignored_2.ign > +EOF I think our latest 'best style' is to include these heredocs into the test.
[PATCH v1 1/1] dir: teach status to show ignored directories
Teach Git to optionally show ignored directories when showing all untracked files. The git status command exposes the options to report ignored and/or untracked files. However, when reporting all untracked files (--untracked-files=all), all individual ignored files are reported as well. It is not currently possible to get the reporting behavior of the --ignored flag, while also reporting all untracked files. This change exposes a flag to report all untracked files while not showing individual files in ignored directories. Motivation: Our application (Visual Studio) needs all untracked files listed individually, but does not need all ignored files listed individually. Reporting all ignored files can affect the time it takes for status to run. For a representative repository, here are some measurements showing a large perf improvement for this scenario: | Command | Reported ignored entries | Time (s) | | --- | | | | 1 | 0| 1.3 | | 2 | 1024 | 4.2 | | 3 | 174904 | 7.5 | | 4 | 1046 | 1.6 | Commands: 1) status 2) status --ignored 3) status --ignored --untracked-files=all 4) status --ignored --untracked-files=all --show-ignored-directory This changes exposes a --show-ignored-directory flag to the git status command. This flag is utilized when running git status with the --ignored and --untracked-files options to not list ignored individual ignored files contained in directories that match an ignore pattern. Part of the perf improvement comes from the tweak to read_directory_recursive to stop scanning the file system after it encounters the first file. When a directory is ignored, all it needs to determine is if the directory is empty or not. The logic currently keeps scanning the file system until it finds an untracked file. However, as the directory is ignored, all the contained contents are also marked excluded. For ignored directories that contain a large number of files, this can take some time. Signed-off-by: Jameson Miller --- Documentation/git-status.txt | 5 + Documentation/technical/api-directory-listing.txt | 6 + builtin/commit.c | 4 + dir.c | 48 +- dir.h | 3 +- t/t7519-status-show-ignored-directory.sh | 189 ++ wt-status.c | 4 + wt-status.h | 1 + 8 files changed, 253 insertions(+), 7 deletions(-) create mode 100755 t/t7519-status-show-ignored-directory.sh diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt index d47f198f15..48ebe18571 100644 --- a/Documentation/git-status.txt +++ b/Documentation/git-status.txt @@ -100,6 +100,11 @@ configuration variable documented in linkgit:git-config[1]. --ignored:: Show ignored files as well. +--show-ignored-directory:: +Show directories that are ignored, instead of individual files. +Does not recurse into excluded directories when listing all +untracked files. + -z:: Terminate entries with NUL, instead of LF. This implies the `--porcelain=v1` output format if no other format is given. diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt index 6c77b4920c..a9816df44c 100644 --- a/Documentation/technical/api-directory-listing.txt +++ b/Documentation/technical/api-directory-listing.txt @@ -33,6 +33,12 @@ The notable options are: Similar to `DIR_SHOW_IGNORED`, but return ignored files in `ignored[]` in addition to untracked files in `entries[]`. +`DIR_SHOW_IGNORED_DIRECTORY`::: + + If this is set, non-empty directories that match an ignore pattern are + returned. The individual files contained in ignored directories are not + included. + `DIR_KEEP_UNTRACKED_CONTENTS`::: Only has meaning if `DIR_SHOW_IGNORED_TOO` is also set; if this is set, the diff --git a/builtin/commit.c b/builtin/commit.c index 8e93802511..9fcf77ae3a 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -1333,6 +1333,7 @@ static int git_status_config(const char *k, const char *v, void *cb) int cmd_status(int argc, const char **argv, const char *prefix) { + static int show_ignored_directory = 0; static struct wt_status s; int fd; struct object_id oid; @@ -1362,6 +1363,8 @@ int cmd_status(int argc, const char **argv, const char *prefix) N_("ignore changes to submodules, optional when: all, dirty, untracked. (Default: all)"), PARSE_OPT_OPTARG, NULL, (intptr_t)"all" }, OPT_COLUMN(0, "column", &s.colopts, N_("list untracked files in columns")), + OPT_BOOL(0, "show-ignored-directory", &show_i