Re: [PATCH v2] Make git selectively and conditionally ignore certain stat fields
Robin Rosenberg writes: > - Ursprungligt meddelande - > >> That configurability is a slipperly slope to drag us into giving >> users >> more complexity that does not help them very much, I suspect. >> >> Earlier somebody mentioned "size and mtime is often enough", so I >> think a single option core.looseStatInfo (substitute "loose" with >> short, minimum or whatever adjective that is more appropriate---I am >> not good at picking phrases, it sounds to me a way to more loosely >> define stat info cleanliness than we usually do) that makes us >> ignore all fields (regardless of their zero-ness) other than those >> two fields might not be a bad way to go. > > Would something like this be good? > > core.statinfo = > default = all fields > minimal = whole seconds of mtime and size > medium = seconds, nanos of mtime and size > nonzero = all non-zero fields > > -- robin If you mean to exclude ctime and other fields we already exclude as useless from your "all", that may make sense, but do we really need that much "flexibility", or do "more choices" just confuse users? I have this suspicion that it may be the latter. Wouldn't a single boolean that lets users choose between your "minimal" and "default" be sufficient? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] Make git selectively and conditionally ignore certain stat fields
- Ursprungligt meddelande - > That configurability is a slipperly slope to drag us into giving > users > more complexity that does not help them very much, I suspect. > > Earlier somebody mentioned "size and mtime is often enough", so I > think a single option core.looseStatInfo (substitute "loose" with > short, minimum or whatever adjective that is more appropriate---I am > not good at picking phrases, it sounds to me a way to more loosely > define stat info cleanliness than we usually do) that makes us > ignore all fields (regardless of their zero-ness) other than those > two fields might not be a bad way to go. Would something like this be good? core.statinfo = default = all fields minimal = whole seconds of mtime and size medium = seconds, nanos of mtime and size nonzero = all non-zero fields -- robin -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] Make git selectively and conditionally ignore certain stat fields
Junio C Hamano wrote: > Robin Rosenberg writes: > That configurability is a slipperly slope to drag us into giving users > more complexity that does not help them very much, I suspect. > > Earlier somebody mentioned "size and mtime is often enough", so I > think a single option core.looseStatInfo (substitute "loose" with > short, minimum or whatever adjective that is more appropriate---I am > not good at picking phrases, it sounds to me a way to more loosely > define stat info cleanliness than we usually do) that makes us > ignore all fields (regardless of their zero-ness) other than those > two fields might not be a bad way to go. At one point, I used to build (and test) the MSVC version of git on cygwin, which leads to exactly the same problem. So, this is not just an EGit/JGit vs c-git issue, although there can't be many people that will have this problem. (Mixing the MinGW and cygwin versions on the same repo will also have this problem). I had a patch which, essentially, did what you suggest above; ie ignore everything other than size and mtime, *including* ignoring the zero-ness in the index. (I just don't understand why you would think of doing otherwise!! ;-) ). As part of that patch, I also suppressed the "empty diff" output that used to be shown for stat-dirty files (that's been fixed now right?), otherwise using gitk was a pain. [BTW, given the "schizophrenic stat" functions on cygwin, you can have this problem with the cygwin version of git - all on it's lonesome!] I can't help with naming, BTW, since I called the config variable "core.ramsay-stat". :-P > > I do not offhand know if such a loose mode is too simple and make it > excessively risky, though. I suspect it would be fine ... *however*, I never sent my patch because I didn't think there would be many idiots^H^H^H^H^H^H pioneers like me! :-D ATB, Ramsay Jones -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] Make git selectively and conditionally ignore certain stat fields
Robin Rosenberg writes: >> I'd say a simplistic "ignore if zero is stored" or even "ignore this >> as one of the systems that shares this file writes crap in it" may >> be sufficient, and if this is a jGit specific issue, it might even >> make sense to introduce a single configuration variable with string >> "jgit" somewhere in its name and bypass the stat field comparison >> for known-problematic fields, instead of having the user know and >> list what stat fields need special attention. > > My first patch was something like that, just not using the word jgit. As > for what fields to ignore, it's something that can be configured by EGit > and documented on the EGit/JGit wiki. That configurability is a slipperly slope to drag us into giving users more complexity that does not help them very much, I suspect. Earlier somebody mentioned "size and mtime is often enough", so I think a single option core.looseStatInfo (substitute "loose" with short, minimum or whatever adjective that is more appropriate---I am not good at picking phrases, it sounds to me a way to more loosely define stat info cleanliness than we usually do) that makes us ignore all fields (regardless of their zero-ness) other than those two fields might not be a bad way to go. I do not offhand know if such a loose mode is too simple and make it excessively risky, though. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] Make git selectively and conditionally ignore certain stat fields
- Ursprungligt meddelande - > Robin Rosenberg writes: > > > Semantically they're somewhat different. My flags are for ignoring > > a value when it's not used as indicated by the value zero, while > > trustctime is for ignoring untrustworthy, non-zero, values. > > Yeah, I realized that after writing that message. > > > Another thing that I noticed, is that I probably wanto to be able > > to filter on the precision > > of timestamps. Again, this i JGit-related. Current JGit has > > milliseconds precision (max), whereas > > Git has down to nanosecond precision in timestamps. Newer JGits may > > get nanoseconds timestamps too, > > but on current Linux versions JGit gets only integral seconds > > regardless of file system. > > > > Would the names, milli, micro, nano be good for ignoring the tail > > when zero, or n1..n9 (obviously > > n2 would be ok too). nN = ignore all but first N nsec digits if > > they are zero)? > > It somehow starts to sound like over-engineering to solve a wrong > problem. > > I'd say a simplistic "ignore if zero is stored" or even "ignore this > as one of the systems that shares this file writes crap in it" may > be sufficient, and if this is a jGit specific issue, it might even > make sense to introduce a single configuration variable with string > "jgit" somewhere in its name and bypass the stat field comparison > for known-problematic fields, instead of having the user know and > list what stat fields need special attention. My first patch was something like that, just not using the word jgit. As for what fields to ignore, it's something that can be configured by EGit and documented on the EGit/JGit wiki. -- robin -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] Make git selectively and conditionally ignore certain stat fields
Am 1/15/2013 1:11, schrieb Junio C Hamano: > I'd say a simplistic "ignore if zero is stored" or even "ignore this > as one of the systems that shares this file writes crap in it" may > be sufficient, and if this is a jGit specific issue, it might even > make sense to introduce a single configuration variable with string > "jgit" somewhere in its name and bypass the stat field comparison > for known-problematic fields, instead of having the user know and > list what stat fields need special attention. It was my suggestion to have a list of names to ignore because the answer to this question > Is this "the user edits in eclipse and then runs 'git status' from the > terminal" problem? was "It is purely for performance in some situations" back then. But today, the answer is "Yes". With this new background, your suggestion to have just a single option that contains the token "jgit" may make more sense. (core.ignoreCygwinFSTricks may serve as a precedent.) The original patch was along this way, and the name contained "minimal", which I objected to. -- Hannes -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] Make git selectively and conditionally ignore certain stat fields
> Is this "the user edits in eclipse and then runs 'git status' from > the > terminal" problem? Yes. Of course not just status, but any command that validates the index. On Unix this is usually bearable, though slow, but on Windows I often see git status take minutes (yes large files...). -- robin -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] Make git selectively and conditionally ignore certain stat fields
Robin Rosenberg writes: > Semantically they're somewhat different. My flags are for ignoring > a value when it's not used as indicated by the value zero, while > trustctime is for ignoring untrustworthy, non-zero, values. Yeah, I realized that after writing that message. > Another thing that I noticed, is that I probably wanto to be able to filter > on the precision > of timestamps. Again, this i JGit-related. Current JGit has milliseconds > precision (max), whereas > Git has down to nanosecond precision in timestamps. Newer JGits may get > nanoseconds timestamps too, > but on current Linux versions JGit gets only integral seconds regardless of > file system. > > Would the names, milli, micro, nano be good for ignoring the tail when zero, > or n1..n9 (obviously > n2 would be ok too). nN = ignore all but first N nsec digits if they are > zero)? It somehow starts to sound like over-engineering to solve a wrong problem. I'd say a simplistic "ignore if zero is stored" or even "ignore this as one of the systems that shares this file writes crap in it" may be sufficient, and if this is a jGit specific issue, it might even make sense to introduce a single configuration variable with string "jgit" somewhere in its name and bypass the stat field comparison for known-problematic fields, instead of having the user know and list what stat fields need special attention. Is this "the user edits in eclipse and then runs 'git status' from the terminal" problem? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] Make git selectively and conditionally ignore certain stat fields
- Ursprungligt meddelande - > Robin Rosenberg writes: > > > diff --git a/read-cache.c b/read-cache.c > > index fda78bc..f7fe15d 100644 > > --- a/read-cache.c > > +++ b/read-cache.c > > @@ -197,8 +197,9 @@ static int ce_match_stat_basic(struct > > cache_entry *ce, struct stat *st) > > } > > if (ce->ce_mtime.sec != (unsigned int)st->st_mtime) > > changed |= MTIME_CHANGED; > > - if (trust_ctime && ce->ce_ctime.sec != (unsigned > > int)st->st_ctime) > > - changed |= CTIME_CHANGED; > > + if ((trust_ctime || > > ((check_nonzero_stat&CHECK_NONZERO_STAT_CTIME) && > > ce->ce_ctime.sec))) > > One SP is required on each side of a binary operator; please have > one after check_nonzero_stat and after the & after it. > > I wonder if we should lose the trust_ctime variable and use this > check_nonzero_stat bitset exclusively, provided that this were a > good direction to go? Semantically they're somewhat different. My flags are for ignoring a value when it's not used as indicated by the value zero, while trustctime is for ignoring untrustworthy, non-zero, values. >From 1ce4790bf5e: A new configuration variable 'core.trustctime' is introduced to allow ignoring st_ctime information when checking if paths in the working tree has changed, because there are situations where it produces too much false positives. Like when file system crawlers keep changing it when scanning and using the ctime for marking scanned files. (your second mail) >Also I am getting these: > >config.c: In function 'git_default_core_config': >config.c:571: error: passing argument 1 of 'git_config_string' from >incompatible pointer type >config.c:540: note: expected 'const char **' but argument is of type 'char **' >config.c:573: error: passing argument 1 of 'strtok' discards qualifiers from >pointer target type Different compilers have different defaults. I'm on OS X (mountain lion), or am I missing something? I do get a warning. Am I allowed to modify the value, like strtok does? Seems I missed the opportunity to use the copy rather then the original value. Another thing that I noticed, is that I probably wanto to be able to filter on the precision of timestamps. Again, this i JGit-related. Current JGit has milliseconds precision (max), whereas Git has down to nanosecond precision in timestamps. Newer JGits may get nanoseconds timestamps too, but on current Linux versions JGit gets only integral seconds regardless of file system. Would the names, milli, micro, nano be good for ignoring the tail when zero, or n1..n9 (obviously n2 would be ok too). nN = ignore all but first N nsec digits if they are zero)? -- robin -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] Make git selectively and conditionally ignore certain stat fields
Robin Rosenberg writes: > @@ -566,6 +566,31 @@ static int git_default_core_config(const char *var, > const char *value) > trust_ctime = git_config_bool(var, value); > return 0; > } > + if (!strcmp(var, "core.ignorezerostat")) { > + char *copy, *tok; > + git_config_string(©, "core.ignorezerostat", value); > + check_nonzero_stat = CHECK_NONZERO_STAT_MASK; > + tok = strtok(value, ","); > + while (tok) { > + if (strcasecmp(tok, "uid") == 0) > + check_nonzero_stat &= ~CHECK_NONZERO_STAT_UID; > + else if (strcasecmp(tok, "gid") == 0) > + check_nonzero_stat &= ~CHECK_NONZERO_STAT_GID; > + else if (strcasecmp(tok, "ctime") == 0) { > + check_nonzero_stat &= ~CHECK_NONZERO_STAT_CTIME; > + trust_ctime = 0; > + } else if (strcasecmp(tok, "ino") == 0) > + check_nonzero_stat &= ~CHECK_NONZERO_STAT_INO; > + else if (strcasecmp(tok, "dev") == 0) > + check_nonzero_stat &= ~CHECK_NONZERO_STAT_DEV; > + else > + die_bad_config(var); > + tok = strtok(NULL, ","); > + } > + if (check_nonzero_stat >= CHECK_NONZERO_STAT_MASK) > + die_bad_config(var); > + free(copy); > + } Also I am getting these: config.c: In function 'git_default_core_config': config.c:571: error: passing argument 1 of 'git_config_string' from incompatible pointer type config.c:540: note: expected 'const char **' but argument is of type 'char **' config.c:573: error: passing argument 1 of 'strtok' discards qualifiers from pointer target type -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] Make git selectively and conditionally ignore certain stat fields
Robin Rosenberg writes: > diff --git a/read-cache.c b/read-cache.c > index fda78bc..f7fe15d 100644 > --- a/read-cache.c > +++ b/read-cache.c > @@ -197,8 +197,9 @@ static int ce_match_stat_basic(struct cache_entry *ce, > struct stat *st) > } > if (ce->ce_mtime.sec != (unsigned int)st->st_mtime) > changed |= MTIME_CHANGED; > - if (trust_ctime && ce->ce_ctime.sec != (unsigned int)st->st_ctime) > - changed |= CTIME_CHANGED; > + if ((trust_ctime || ((check_nonzero_stat&CHECK_NONZERO_STAT_CTIME) && > ce->ce_ctime.sec))) One SP is required on each side of a binary operator; please have one after check_nonzero_stat and after the & after it. I wonder if we should lose the trust_ctime variable and use this check_nonzero_stat bitset exclusively, provided that this were a good direction to go? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2] Make git selectively and conditionally ignore certain stat fields
Specifically the fields uid, gid, ctime, ino and dev are set to zero by JGit. Any stat checking by git will then need to check content, which may be very slow, particularly on Windows. Since mtime and size is typically enough we should allow the user to tell git to avoid checking these fields if they are set to zero in the index. This change introduces a core.ignorezerostat config option where the user can list the fields to ignore using the names above. Signed-off-by: Robin Rosenberg --- Documentation/config.txt | 9 + cache.h | 8 config.c | 25 + environment.c| 1 + read-cache.c | 24 +++- 5 files changed, 58 insertions(+), 9 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index d5809e0..7f34c94 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -235,6 +235,15 @@ core.trustctime:: crawlers and some backup systems). See linkgit:git-update-index[1]. True by default. +core.ignorezerostat:: + Affects the interpretation of some fields in the index. If + unset has no effect. When set to a comma separated list of fields, + each of the fields in the index will be excluded from comparison with + working tree if the index value is zero. The following fields + are recognzed: `uid', `gid', `ctime', `ino' and `dev'. When ctime is ignored + the setting of 'core.trustctime' is overridden by by this config + value. + core.quotepath:: The commands that output paths (e.g. 'ls-files', 'diff'), when not given the `-z` option, will quote diff --git a/cache.h b/cache.h index c257953..524e49a 100644 --- a/cache.h +++ b/cache.h @@ -536,6 +536,14 @@ extern int delete_ref(const char *, const unsigned char *sha1, int delopt); /* Environment bits from configuration mechanism */ extern int trust_executable_bit; extern int trust_ctime; +extern int check_nonzero_stat; +#define CHECK_NONZERO_STAT_UID (1<<0) +#define CHECK_NONZERO_STAT_GID (1<<1) +#define CHECK_NONZERO_STAT_CTIME (1<<2) +#define CHECK_NONZERO_STAT_INO (1<<3) +#define CHECK_NONZERO_STAT_DEV (1<<4) +#define CHECK_NONZERO_STAT_MASK ((1<<5)-1) + extern int quote_path_fully; extern int has_symlinks; extern int minimum_abbrev, default_abbrev; diff --git a/config.c b/config.c index 7b444b6..79485cd 100644 --- a/config.c +++ b/config.c @@ -566,6 +566,31 @@ static int git_default_core_config(const char *var, const char *value) trust_ctime = git_config_bool(var, value); return 0; } + if (!strcmp(var, "core.ignorezerostat")) { + char *copy, *tok; + git_config_string(©, "core.ignorezerostat", value); + check_nonzero_stat = CHECK_NONZERO_STAT_MASK; + tok = strtok(value, ","); + while (tok) { + if (strcasecmp(tok, "uid") == 0) + check_nonzero_stat &= ~CHECK_NONZERO_STAT_UID; + else if (strcasecmp(tok, "gid") == 0) + check_nonzero_stat &= ~CHECK_NONZERO_STAT_GID; + else if (strcasecmp(tok, "ctime") == 0) { + check_nonzero_stat &= ~CHECK_NONZERO_STAT_CTIME; + trust_ctime = 0; + } else if (strcasecmp(tok, "ino") == 0) + check_nonzero_stat &= ~CHECK_NONZERO_STAT_INO; + else if (strcasecmp(tok, "dev") == 0) + check_nonzero_stat &= ~CHECK_NONZERO_STAT_DEV; + else + die_bad_config(var); + tok = strtok(NULL, ","); + } + if (check_nonzero_stat >= CHECK_NONZERO_STAT_MASK) + die_bad_config(var); + free(copy); + } if (!strcmp(var, "core.quotepath")) { quote_path_fully = git_config_bool(var, value); diff --git a/environment.c b/environment.c index 85edd7f..e90b52f 100644 --- a/environment.c +++ b/environment.c @@ -13,6 +13,7 @@ int trust_executable_bit = 1; int trust_ctime = 1; +int check_nonzero_stat = CHECK_NONZERO_STAT_MASK; int has_symlinks = 1; int minimum_abbrev = 4, default_abbrev = 7; int ignore_case; diff --git a/read-cache.c b/read-cache.c index fda78bc..f7fe15d 100644 --- a/read-cache.c +++ b/read-cache.c @@ -197,8 +197,9 @@ static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st) } if (ce->ce_mtime.sec != (unsigned int)st->st_mtime) changed |= MTIME_CHANGED; - if (trust_ctime && ce->ce_ctime.sec != (unsigned int)st->st_ctime) - changed |= CTIME_CHANGED; + if ((trust_ctime || ((check_nonzero_stat&CHECK_NONZERO_STAT_CTIME) && ce->ce_ctime.sec))) + if (ce->ce_ctime.sec