Re: [PATCH 00/17] Clean up how fetch_pack() handles the heads list
On 08/23/2012 10:31 PM, Jeff King wrote: On Thu, Aug 23, 2012 at 03:56:48PM -0400, Jeff King wrote: This code blames back to: commit 4bcb310c2539b66d535e87508d1b7a90fe29c083 Author: Alexandre Julliard julli...@winehq.org Date: Fri Nov 24 16:00:13 2006 +0100 fetch-pack: Do not fetch tags for shallow clones. A better fix may be to only fetch tags that point to commits that we are downloading, but git-clone doesn't have support for following tags. This will happen automatically on the next git-fetch though. So it is about making sure that git clone --depth=1 does not accidentally pull a single commit from v1.0, v1.1, v1.2, and so forth, losing the purpose of using --depth in the first place. These days it is largely irrelevant, since this code path is not followed by clone, and clone will automatically restrict its list of fetched refs to a single branch if --depth is used. I think part of the confusion of this code is that inside the loop over the refs it is sometimes checking aspects of the ref, and sometimes checking invariants of the loop (like args.fetch_all). Splitting it into separate loops makes it easier to see what is going on, like the patch below (on top of Michael's series). I'm not sure if it ends up being more readable, since the generic cut down this linked list function has to operate through callbacks with a void pointer. On the other hand, that function could also be used elsewhere. diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c index 90683ca..877cf38 100644 --- a/builtin/fetch-pack.c +++ b/builtin/fetch-pack.c @@ -521,51 +521,80 @@ static void mark_recent_complete_commits(unsigned long cutoff) } } -static void filter_refs(struct ref **refs, int *nr_heads, char **heads) +static void filter_refs_callback(struct ref **refs, + int (*want)(struct ref *, void *), + void *data) { - struct ref *newlist = NULL; - struct ref **newtail = newlist; + struct ref **tail = refs; struct ref *ref, *next; - int match_pos = 0, unmatched = 0; for (ref = *refs; ref; ref = next) { - int keep_ref = 0; next = ref-next; - if (!memcmp(ref-name, refs/, 5) - check_refname_format(ref-name, 0)) - ; /* trash */ - else if (args.fetch_all -(!args.depth || prefixcmp(ref-name, refs/tags/))) - keep_ref = 1; - else - while (match_pos *nr_heads) { - int cmp = strcmp(ref-name, heads[match_pos]); - if (cmp 0) { /* definitely do not have it */ - break; - } else if (cmp == 0) { /* definitely have it */ - free(heads[match_pos++]); - keep_ref = 1; - break; - } else { /* might have it; keep looking */ - heads[unmatched++] = heads[match_pos++]; - } - } - - if (keep_ref) { - *newtail = ref; - ref-next = NULL; - newtail = ref-next; - } else { + if (want(ref, data)) + tail = ref-next; + else { free(ref); + *tail = next; } } +} - /* copy any remaining unmatched heads: */ - while (match_pos *nr_heads) - heads[unmatched++] = heads[match_pos++]; - *nr_heads = unmatched; +static int ref_name_is_ok(struct ref *ref, void *data) +{ + return memcmp(ref-name, refs/, 5) || + !check_refname_format(ref-name, 0); +} + +static int ref_ok_for_shallow(struct ref *ref, void *data) +{ + return prefixcmp(ref-name, refs/tags/); +} - *refs = newlist; +struct filter_by_name_data { + char **heads; + int nr_heads; + int match_pos; + int unmatched; +}; + +static int want_ref_name(struct ref *ref, void *data) +{ + struct filter_by_name_data *f = data; + + while (f-match_pos f-nr_heads) { + int cmp = strcmp(ref-name, f-heads[f-match_pos]); + if (cmp 0) /* definitely do not have it */ + return 0; + else if (cmp == 0) { /* definitely have it */ + free(f-heads[f-match_pos++]); + return 1; + } else /* might have it; keep looking */ + f-heads[f-unmatched++] = f-heads[f-match_pos++]; + } + return 0; +} + +static void filter_refs(struct ref **refs, int *nr_heads, char **heads) +{ + struct filter_by_name_data f; +
Re: [PATCH 9/9] Add git-check-ignores
Hi there, Firstly, thanks for the quick feedback! On Sun, Sep 2, 2012 at 11:41 AM, Nguyen Thai Ngoc Duy pclo...@gmail.com wrote: On Sun, Sep 2, 2012 at 7:12 AM, Adam Spiers g...@adamspiers.org wrote: This works in a similar manner to git-check-attr. Some code was reused from add.c by refactoring out into pathspec.c. Thanks, comments from a quick glance. First of all, can we make it work (or share code) with .gitattributes? We may need to debug .gitattributes as well as .gitignore. A common command would be nice. I'm no expert on .gitattributes and check-attr, but AFAICS, all the opportunities to share code in the plumbing and front-end seem to be taken already, e.g. the directory traversal and path handling. The CLI argument parsing is necessarily different because check-attr requires a list of attributes as well as a list of files, and of course the output routines have to be different too. The only opportunity for code reuse which I saw but /didn't/ take was around the --stdin line parsing code which is duplicated between: check_attr_stdin_paths check_ignore_stdin_paths cmd_checkout_index cmd_update_index hash_stdin_paths I attempted to refactor these, but quickly realised that due to the lack of proper closures in C, the overheads and complexity incurred by performing such a refactoring probably outweighed the benefits, so I gave up on the idea. Having said that, I'm totally open to suggestions if you can spot other places where code could be reused :) +SYNOPSIS + +[verse] +'git check-ignore' pathname... +'git check-ignore' --stdin [-z] list-of-paths Also --quiet option, where check-ignore returns 0 if the given path is ignored, 1 otherwise? I considered that, but couldn't think of appropriate behaviour when multiple paths are given, so in the end I decided to remain consistent with check-attr, which always returns 0. But I'm happy to change it if you can think of a more useful behaviour. For example we could have a --count option which produces no output but has an exit status corresponding to the number of ignored files. - If many paths are given, then perhaps we could print ignored paths (no extra info). How is this different to git ls-files -i -o ? - Going to the next level, we could print path and the the location of the final exclude/include rule (file and line number). That's the current behaviour, and I believe it covers the most common use case. - For debugging, given one path, we print all the rules that are applied to it, which may help understand how/why it goes wrong. That would be nice, but I'm not sure it's a tremendously common use case. Could you think of a scenario in which it would be useful? I guess it could be done by adding a new DIR_DEBUG_IGNORED flag to dir_struct which would make the exclude matcher functions collect all matching patterns, rather than just returning the first one. This in turn would require another field for collecting all matched patterns. @@ -338,6 +338,7 @@ static void handle_internal_command(int argc, const char **argv) { bundle, cmd_bundle, RUN_SETUP_GENTLY }, { cat-file, cmd_cat_file, RUN_SETUP }, { check-attr, cmd_check_attr, RUN_SETUP }, + { check-ignore, cmd_check_ignore, RUN_SETUP | NEED_WORK_TREE }, { check-ref-format, cmd_check_ref_format }, { checkout, cmd_checkout, RUN_SETUP | NEED_WORK_TREE }, { checkout-index, cmd_checkout_index, I don't think we really need NEED_WORK_TREE here. .gitignore can be read from index only. I thought about that, but in the end I decided it probably didn't make sense, because none of the exclude matching routines match against the index - they all match against the working tree and core.excludesfile. This would also require changing the matching logic to honor the index, but I didn't see the benefit in doing that, since all operations which involve excludes (add, status, etc.) relate to a work tree. But as with all of the above, please don't hesitate to point out if I've missed something. You guys are the experts, not me ;-) Thanks again, Adam -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 6/9] For each exclude pattern, store information about where it came from
From: Adam Spiers g...@adamspiers.org Sent: Sunday, September 02, 2012 1:12 AM Subject: [PATCH 6/9] For each exclude pattern, store information about where it came from For exclude patterns read in from files, the filename is stored together with the corresponding line number (counting starting at 1). Is there a way to identify the config core.excludesfile if present? i.e. that it is from that config variable, rather than directory traversal. This came up in a recent http://stackoverflow.com/questions/12199150/effective-gitignore-and-clean-strategy/12205852#12205852 discussion. For exclude patterns provided on the command line, the sequence number is negative, with counting starting at -1, so for example the 2nd pattern provided via --exclude would be numbered -2. This allows any future consumers of that data to easily distinguish between exclude patterns from files vs. from the CLI. Signed-off-by: Adam Spiers g...@adamspiers.org --- builtin/clean.c| 2 +- builtin/ls-files.c | 3 ++- dir.c | 25 +++-- dir.h | 5 - 4 files changed, 26 insertions(+), 9 deletions(-) diff --git a/builtin/clean.c b/builtin/clean.c index 0c7b3d0..f618231 100644 --- a/builtin/clean.c +++ b/builtin/clean.c @@ -99,7 +99,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix) for (i = 0; i exclude_list.nr; i++) add_exclude(exclude_list.items[i].string, , 0, - dir.exclude_list[EXC_CMDL]); + dir.exclude_list[EXC_CMDL], --exclude option, -(i+1)); pathspec = get_pathspec(prefix, argv); diff --git a/builtin/ls-files.c b/builtin/ls-files.c index 31b3f2d..420ff40 100644 --- a/builtin/ls-files.c +++ b/builtin/ls-files.c @@ -35,6 +35,7 @@ static int error_unmatch; static char *ps_matched; static const char *with_tree; static int exc_given; +static int exclude_args = 0; static const char *tag_cached = ; static const char *tag_unmerged = ; @@ -423,7 +424,7 @@ static int option_parse_exclude(const struct option *opt, struct exclude_list *list = opt-value; exc_given = 1; - add_exclude(arg, , 0, list); + add_exclude(arg, , 0, list, --exclude option, --exclude_args); return 0; } diff --git a/dir.c b/dir.c index 3d438c3..ac8c838 100644 --- a/dir.c +++ b/dir.c @@ -311,7 +311,7 @@ static int no_wildcard(const char *string) } void add_exclude(const char *string, const char *base, - int baselen, struct exclude_list *el) + int baselen, struct exclude_list *el, const char *src, int srcpos) { struct exclude *x; size_t len; @@ -341,6 +341,8 @@ void add_exclude(const char *string, const char *base, x-base = base; x-baselen = baselen; x-flags = flags; + x-src = src; + x-srcpos = srcpos; if (!strchr(string, '/')) x-flags |= EXC_FLAG_NODIR; x-nowildcardlen = simple_length(string); @@ -393,7 +395,7 @@ int add_excludes_from_file_to_list(const char *fname, int check_index) { struct stat st; - int fd, i; + int fd, i, lineno = 1; size_t size = 0; char *buf, *entry; @@ -436,8 +438,9 @@ int add_excludes_from_file_to_list(const char *fname, if (buf[i] == '\n') { if (entry != buf + i entry[0] != '#') { buf[i - (i buf[i-1] == '\r')] = 0; - add_exclude(entry, base, baselen, el); + add_exclude(entry, base, baselen, el, fname, lineno); } + lineno++; entry = buf + i + 1; } } @@ -472,8 +475,10 @@ static void prep_exclude(struct dir_struct *dir, const char *base, int baselen) !strncmp(dir-basebuf, base, stk-baselen)) break; dir-exclude_stack = stk-prev; - while (stk-exclude_ix el-nr) - free(el-excludes[--el-nr]); + while (stk-exclude_ix el-nr) { + struct exclude *exclude = el-excludes[--el-nr]; + free(exclude); + } free(stk-filebuf); free(stk); } @@ -500,7 +505,15 @@ static void prep_exclude(struct dir_struct *dir, const char *base, int baselen) memcpy(dir-basebuf + current, base + current, stk-baselen - current); strcpy(dir-basebuf + stk-baselen, dir-exclude_per_dir); - add_excludes_from_file_to_list(dir-basebuf, + + /* dir-basebuf gets reused by the traversal, but we + * need fname to remain unchanged to ensure the src + * member of each struct exclude correctly back-references + * its source file. + */ + char *fname = strdup(dir-basebuf); + + add_excludes_from_file_to_list(fname, dir-basebuf, stk-baselen, stk-filebuf, el, 1); dir-exclude_stack = stk; diff --git a/dir.h b/dir.h index 1b4f9dc..81efee4 100644 --- a/dir.h +++ b/dir.h @@ -31,6 +31,9 @@ struct exclude_list { int baselen; int to_exclude; int flags; + const char *src; + int srcpos; /* counting starts from 1 for line numbers in ignore files, +and from -1 decrementing for patterns from CLI (--exclude) */ } **excludes; }; @@ -123,7 +126,7 @@ extern int add_excludes_from_file_to_list(const char *fname, const char *base, i char **buf_p, struct exclude_list *el, int check_index); extern void add_excludes_from_file(struct dir_struct *, const char *fname); extern void add_exclude(const char *string, const char *base,
Re: [PATCH v4] Thunderbird: fix appp.sh format problems
Johannes Sixt j...@kdbg.org writes: Am 31.08.2012 16:09, schrieb Marco Stornelli: +CCS=`perl -e 'local $/=undef; open FILE, $ENV{'PATCHTMP'}; $text=FILE; +close FILE; $addr = $1 if $text =~ /Cc: (.*?(,\n .*?)*)\n/s; $addr =~ s/\n//g; +print $addr;'` The quoting is broken in this line (sq within sq does not work). As you said later in the thread, perl lets you be loose and say $ENV{PATCHTMP} without quoting the string PATCHTMP, so it is not quite _broken_ per-se, but the above gives a false impression to readers that the author meant to feed perl open FILE, $ENV{'PATCHTMP'}; which is not happening, so at least it is misleading. Am I correct that you intend to treat continuation lines with this non-trivial perl script? As the above regexp seems to try to match Cc: marco, git, j6t, other recipient I think that indeed is the intent. All this processing gets tedious and unreadable. Let me suggest this pattern instead: # translate to Thunderbird language LANG_TO=To: LANG_SUBJ=Subject: LANG_CC=Cc: LF= # terminates the _previous_ line while read -r line do case $line in 'To: '*) printf ${LF}%s $LANG_TO ${line#To: } ;; 'Cc: '*) ...similar... 'Subject: '*) ...similar... ' '*) # continuation line printf %s $line ;; '') print ${LF}\n cat ;; esac LF='\n' done $PATCH $1 I think that is much more readable. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] Thunderbird: fix appp.sh format problems
Marco Stornelli marco.storne...@gmail.com writes: I also wonder what would happen if To: and Cc: in the input were split into continuation lines, but that was already present in the Do you mean To: mail1,.mailN\nCc: mail1,.mailN? No, I meant To: mail1,...\n mailN\n. But see my response to J6t's message. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 6/9] For each exclude pattern, store information about where it came from
Philip Oakley philipoak...@iee.org writes: Is there a way to identify the config core.excludesfile if present? i.e. that it is from that config variable, rather than directory traversal. If the code handles $GIT_DIR/info/exclude then that configuration would also be handled the same way, no? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] Thunderbird: fix appp.sh format problems
Marco Stornelli marco.storne...@gmail.com writes: Il 01/09/2012 15:59, Johannes Sixt ha scritto: Look how you write: perl -e '... $ENV{'PATCHTMP'} ...' That is, perl actually sees this script: ... $ENV{PATCHTMP} ... (no quotes around PATCHTMP). That my be perfectly valid perl, but is not what you intended. I don't understand what you mean when you say is not what you intended... When you wrote this: CCS=`perl -e 'local $/=undef; open FILE, $ENV{'PATCHTMP'};... which one of the following did you mean to feed Perl? (1) open FILE, $ENV{'PATCHTMP'}; (2) open FILE, $ENV{PATCHTMP}; The patch text makes it look as if you wanted to do (1), but what is fed to perl is (2), as Johannes points out. Saying: CCS=`perl -e 'local $/=undef; open FILE, $ENV{PATCHTMP};... would have been more natural if you really meant to do (2), don't you think? So what you wrote is at least misleading. But I think I agree with Johannes's rewrite of the loop, so this may be a moot point. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/9] new git check-ignore sub-command
Adam Spiers g...@adamspiers.org writes: I was browsing stackoverflow the other day and came across this question: http://stackoverflow.com/questions/12144633/which-gitignore-rule-is-ignoring-my-file/ A quick google revealed this thread from 2009: http://thread.gmane.org/gmane.comp.version-control.git/108671/focus=108815 where Junio and Jeff discussed the possibility of adding a new `git check-ignore' subcommand somewhat analogous to the existing `git check-attr', and suggested the beginnings of an implementation. It struck me that it might not be too hard to follow these ideas to their natural conclusion, so I decided it would make a fun project :-) Thanks. I wish there are more people like you ;-) As to styles, I spotted only three kinds of Huh?: * do not initialise statics to 0 or NULL, e.g. -static int exclude_args = 0; +static int exclude_args; * avoid unnnecessary braces {} around single statement blocks, e.g. -if (exclude) { +if (exclude) return exclude; -} * else should follow close brace '}' of if clause, e.g. if (...) { ... -} -else { +} else { ... For reviews on substance, please see other messages from me. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/9] Rename cryptic 'which' variable to more consistent name
Adam Spiers g...@adamspiers.org writes: 'el' is only *slightly* less cryptic, but is already used as the variable name for a struct exclude_list pointer in numerous other places, so this reduces the number of cryptic variable names in use by one :-) The name originally meant to mean to which element of the array dir_struct.exclude_list[] are we adding this entry? but I agree el that stands for ExcludeList would be a better name. Often we use el (or elem) for elements of an iterable we are iterating on in a loop, and the name of the iterable does not have to be EsomethingLsomething, by the way. Because no existing use of el in this file is of that kind, I do not think this change introduces new confusion to the code. Signed-off-by: Adam Spiers g...@adamspiers.org --- dir.c | 10 +- dir.h | 4 ++-- 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/dir.c b/dir.c index c9f341a..57a5d11 100644 --- a/dir.c +++ b/dir.c @@ -311,7 +311,7 @@ static int no_wildcard(const char *string) } void add_exclude(const char *string, const char *base, - int baselen, struct exclude_list *which) + int baselen, struct exclude_list *el) { struct exclude *x; size_t len; @@ -346,8 +346,8 @@ void add_exclude(const char *string, const char *base, x-nowildcardlen = simple_length(string); if (*string == '*' no_wildcard(string+1)) x-flags |= EXC_FLAG_ENDSWITH; - ALLOC_GROW(which-excludes, which-nr + 1, which-alloc); - which-excludes[which-nr++] = x; + ALLOC_GROW(el-excludes, el-nr + 1, el-alloc); + el-excludes[el-nr++] = x; } static void *read_skip_worktree_file_from_index(const char *path, size_t *size) @@ -389,7 +389,7 @@ int add_excludes_from_file_to_list(const char *fname, const char *base, int baselen, char **buf_p, -struct exclude_list *which, +struct exclude_list *el, int check_index) { struct stat st; @@ -436,7 +436,7 @@ int add_excludes_from_file_to_list(const char *fname, if (buf[i] == '\n') { if (entry != buf + i entry[0] != '#') { buf[i - (i buf[i-1] == '\r')] = 0; - add_exclude(entry, base, baselen, which); + add_exclude(entry, base, baselen, el); } entry = buf + i + 1; } diff --git a/dir.h b/dir.h index a226fbc..549a187 100644 --- a/dir.h +++ b/dir.h @@ -117,10 +117,10 @@ extern int path_excluded(struct path_exclude_check *, const char *, int namelen, extern int add_excludes_from_file_to_list(const char *fname, const char *base, int baselen, - char **buf_p, struct exclude_list *which, int check_index); + char **buf_p, struct exclude_list *el, int check_index); extern void add_excludes_from_file(struct dir_struct *, const char *fname); extern void add_exclude(const char *string, const char *base, - int baselen, struct exclude_list *which); + int baselen, struct exclude_list *el); extern void free_excludes(struct exclude_list *el); extern int file_exists(const char *); -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 9/9] Add git-check-ignores
Adam Spiers g...@adamspiers.org writes: +SYNOPSIS + +[verse] +'git check-ignore' pathname... +'git check-ignore' --stdin [-z] list-of-paths Also --quiet option, where check-ignore returns 0 if the given path is ignored, 1 otherwise? I agree that multiple paths are problematic. We could error out if multiple paths are given with --quiet until we figure out what the useful result would be in such a case, and still give a useful answer to callers that feed a single path, though. That may encourage suboptimal coding to casual Porcelain writers, i.e. it would allow for path in $paths do if git check-ignore -q $path then do something to $path fi done even though we would rather want to encourage git check-ignore --name-only $paths | while read path do do something to $path done But from lay-scriptors' point of view, being able to easily write a script (even though it may be inefficient) to do the job at hand is far better than having to give up writing one because the tool does not allow easy-and-stupid scripting, so it is not exactly a huge downside. - If many paths are given, then perhaps we could print ignored paths (no extra info). How is this different to git ls-files -i -o ? I personally think the parts of ls-files that deal with paths not in the index outlived its usefulness ;-) and users deserve to be given a better UI. - Going to the next level, we could print path and the the location of the final exclude/include rule (file and line number). That's the current behaviour, and I believe it covers the most common use case. Yes; I have a reservation on your output format, though. - For debugging, given one path, we print all the rules that are applied to it, which may help understand how/why it goes wrong. I do not think that would be terribly useful. Maybe for people who are learning how dir.c internally works, but not for people who are trying to improve the set of .gitignore files in their project. I thought about that, but in the end I decided it probably didn't make sense, because none of the exclude matching routines match against the index - they all match against the working tree and core.excludesfile. This would also require changing the matching logic to honor the index, but I didn't see the benefit in doing that, since all operations which involve excludes (add, status, etc.) relate to a work tree. The mechanism primarily is to see if a path in the working tree is a cruft or a valuable still to be added; I am OK with NEED_WORK_TREE; when we have a useful case to run this in a bare repository, we can lift it. As with the what to do with multiple paths and -q, it is better to start with feature set to cover only the known or easily anticipated use cases, rejecting the cases for which good semantics are not thought out. An alternative would be a code that operates sanely only for known or anticipated cases and do random things with irrational semantics in other cases, and people start relying on the irrational behaviour without realizing their input and the behaviour they are seeing are not something that the feature is designed to for, but whatever the code with loose precondition checking happens to do. We do not want to repeat that kind of mistake, which is hard to fix in future versions. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 9/9] Add git-check-ignores
Adam Spiers g...@adamspiers.org writes: +OPTIONS +--- +--stdin:: + Read file names from stdin instead of from the command-line. + +-z:: + Only meaningful with `--stdin`; paths are separated with a + NUL character instead of a linefeed character. On input, or on output, or both? The answer should be both, otherwise you cannot safely handle paths with funny character in your script, even if you wanted to. Which means that this cannot only be meaningful with --stdin, I think. +OUTPUT +-- + +The output is a series of lines of the form: + +path COLON SP type SP pattern SP source SP position LF + +path is the path of a file being queried, type is either +'excluded' or 'included' (for patterns prefixed with '!'), pattern +is the matching pattern, source is the pattern's source file (either +as an absolute path or relative to the repository root), and +position is the position of the pattern within that source. Let's step back a bit and think what this command is about. What is the reason why the user wants to run check-ignore $path in the first place? I think there are two (or three, depending on how you count). (1) You have one (or more) paths at hand. You want to know if it is (or some of them are) ignored, but you do not particularly care how they are ignored. Think of implementing your own git add as a script. (2) You have one or more paths that are ignored but do not want them to be, and want to find out why they are. For the former, your script may want to see the paths sifted into ignored bin and not-ignored bin, so git check-ignore [-z] --name-only $paths that gives you only the paths without any reason is more useful. You also may want the opposite (show only paths not ignored), but that can be computed easily by the script, so it is of lessor importance. For the latter, you are debugging the set of exclude sources and want to learn where the decision to exclude it comes from. For that kind of use, it would be more useful if the output mimicked error messages from the compilers and output from grep -n to show the source, e.g. .gitignore:13:/git-am git-am Emacs users can use M-x grepRETgit check-ignore -v git-amRET, see the output, and find the hit in its output (I would imagine vim would have a similar feature). The output format would be something like: source COLON linenum COLON pattern HT pathname I do not think you need excluded/included type as a separate item in the non -z output, as it should be clear from the pattern. Substitute cmdline (literally) as source for patterns obtained from the command line. I would also suggest to (1) make --name-only the default (i.e. no need to have the --name-only option); (2) give the version that mimicks grep -n when -v|--verbose is given; and (3) support --quiet; the command would exit with status 0 if _any_ of the paths given is ignored, or status 1 if none of the paths is ignored (or error out with die() when --quiet and multiple paths are given). Regarding -z (for script consumption), I do not object to the broken down format, e.g., check-ignore -z -v may give a sequence of pathname NUL type NUL pattern NUL source NUL position NUL while check-ignore -z would give a sequence of pathname NUL -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: diff/merge tool that ignores whitespace changes
Hi, Would that help ? git help diff [snip] --ignore-space-at-eol Ignore changes in whitespace at EOL. -b, --ignore-space-change Ignore changes in amount of whitespace. This ignores whitespace at line end, and considers all other sequences of one or more whitespace characters to be equivalent. -w, --ignore-all-space Ignore whitespace when comparing lines. This ignores differences even if one line has whitespace where the other line has none. That might be it :) Now I yet need to find out how to configure tig for it. By the way: anything similar for merge/rebase ? thx -- Mit freundlichen Grüßen / Kind regards Enrico Weigelt VNC - Virtual Network Consult GmbH Head Of Development Pariser Platz 4a, D-10117 Berlin Tel.: +49 (30) 3464615-20 Fax: +49 (30) 3464615-59 enrico.weig...@vnc.biz; www.vnc.de -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 6/9] For each exclude pattern, store information about where it came from
From: Junio C Hamano gits...@pobox.com Sent: Sunday, September 02, 2012 8:02 PM Philip Oakley philipoak...@iee.org writes: Is there a way to identify the config core.excludesfile if present? i.e. that it is from that config variable, rather than directory traversal. If the code handles $GIT_DIR/info/exclude then that configuration would also be handled the same way, no? Probably not. The $GIT_DIR/info/exclude is directly a path, while the core.excludesfile could point anywhere. This assumes the path to the relevant ignore file is shown. Given the suggested report format in the Documentation, this path could be reported as 'coreexclude', not just an 'exclude'. If I've understood the regular code correctly, the core.excludesfile is always at one end of the exclude struct so should be easy to check at that position. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] i18n.pathencoding
Torsten Bögershausen skrev 2012-09-01 08.11: Allow path names to be encoded in UTF-8 in the repository and checkout out as e.g. ISO-8859-1 in the working tree. Ack for attempting this. Did it myself if 2007, but times weren't ripe then, I guess. +i18n.pathEncoding:: + This option is only used by some implementations of git. + When git init sets core.supportspathencoding to true, + i18n.pathEncoding can be set to re-encode path names when + a working tree is checked out. + Path names may be e.g. encoded in ISO-8859-1 and are stored as + UTF-8 encoded in the repository. + When not set, the encoding of path names is the same in working tree + and the repository. If set, then core.precomposeunicode is ignored on Mac OS X. diff --git a/compat/reencode_pathname.c b/compat/reencode_pathname.c new file mode 100644 index 000..3bdc776 --- /dev/null +++ b/compat/reencode_pathname.c @@ -0,0 +1,441 @@ +/* + * Converts pathnames from one encoding into another. + * The pathnames are stored as UTF-8 in the repository, + * and might be checkout out as e.g. ISO-8859-1 in the working tree + * + * On MacOS X decomposed unicode is converted into precomposed unicode. , ignoring the setting of core.precomposeunicode. [...] + */ + +#define REENCODE_PATHNAME_C +#include cache.h +#include utf8.h +#include reencode_pathname.h + +#if defined(OLD_ICONV) || (defined(__sun__) !defined(_XPG6)) + typedef const char *iconv_ibp; +#else + typedef char *iconv_ibp; +#endif + +const static char *repo_path_encoding = UTF-8; + +static iconv_t iconv_open_or_die(const char *tocode, const char *fromcode) +{ + iconv_t my_iconv; + my_iconv = iconv_open(tocode, fromcode); join these two lines + if (my_iconv == (iconv_t) -1) + die_errno(_(iconv_open(%s,%s) failed), tocode, fromcode); + return my_iconv; +} + +static size_t has_non_ascii(const char *s, size_t maxlen, size_t *strlen_c) +{ + const uint8_t *ptr = (const uint8_t *)s; + size_t strlen_chars = 0; + size_t ret = 0; + + if (!ptr || !*ptr) + return 0; + + while (*ptr maxlen) { + if (*ptr 0x80) + ret++; + strlen_chars++; + ptr++; + maxlen--; + } + if (strlen_c) + *strlen_c = strlen_chars; + + return ret; +} + +#ifdef PRECOMPOSE_UNICODE +void probe_utf8_pathname_composition(char *path, int len) +{ + static const char *auml_nfc = \xc3\xa4; + static const char *auml_nfd = \x61\xcc\x88; + int output_fd; + if (precomposed_unicode != -1) + return; /* We found it defined in the global config, respect it */ a bland line here would be nice + strcpy(path + len, auml_nfc); + output_fd = open(path, O_CREAT|O_EXCL|O_RDWR, 0600); + if (output_fd = 0) { + close(output_fd); + strcpy(path + len, auml_nfd); + /* Indicate to the user, that we can configure it to true */ + if (!access(path, R_OK)) + git_config_set(core.precomposeunicode, false); + /* To be backward compatible, set precomposed_unicode to 0 */ + precomposed_unicode = 0; + strcpy(path + len, auml_nfc); + if (unlink(path)) + die_errno(_(failed to unlink '%s'), path); + } +} +#endif [...] +struct dirent_psx *renc_pn_readdir(RENC_FN_DIR *renc_pn_dir) +{ + struct dirent *res; + res = readdir(renc_pn_dir-dirp); + if (res) { + size_t namelenz = strlen(res-d_name) + 1; /* \0 */ + size_t new_len_needed = 0; + int ret_errno = errno; + + renc_pn_dir-dirent_utf8-d_ino= res-d_ino; + renc_pn_dir-dirent_utf8-d_type = res-d_type; + do { + if (new_len_needed renc_pn_dir-dirent_utf8-max_name_len) { indent [...] diff --git a/environment.c b/environment.c index 85edd7f..ba81575 100644 --- a/environment.c +++ b/environment.c @@ -59,6 +59,7 @@ int grafts_replace_parents = 1; int core_apply_sparse_checkout; int merge_log_config = -1; int precomposed_unicode = -1; /* see probe_utf8_pathname_composition() */ +const char *wt_path_encoding = NULL; indent struct startup_info *startup_info; unsigned long pack_size_limit_cfg; diff --git a/git-compat-util.h b/git-compat-util.h index 35b095e..877b060 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -153,13 +153,21 @@ #endif #endif -/* used on Mac OS X */ -#ifdef PRECOMPOSE_UNICODE -#include compat/precompose_utf8.h +#if defined(PATH_ENCODING) || defined(PRECOMPOSE_UNICODE) +#include compat/reencode_pathname.h #else -#define precompose_str(in,i_nfd2nfc) -#define precompose_argv(c,v) -#define probe_utf8_pathname_composition(a,b) +#define reencode_argv(c,v) +#endif + +/* needed for Mac OS X */ +#ifndef PRECOMPOSE_UNICODE +#define probe_utf8_pathname_composition(a,b); +#endif + +#ifndef PATH_ENCODING +#define str_worktree2repolen(in, insz) (NULL) +#define
Re: [PATCH 9/9] Add git-check-ignores
Junio C Hamano gits...@pobox.com writes: Let's step back a bit and think what this command is about. What is the reason why the user wants to run check-ignore $path in the first place? I think there are two (or three, depending on how you count). (1) You have one (or more) paths at hand. You want to know if it is (or some of them are) ignored, but you do not particularly care how they are ignored. Think of implementing your own git add as a script. (2) You have one or more paths that are ignored but do not want them to be, and want to find out why they are. A reason related to (2) is to find out why the paths you want to be excluded are included, so that you can fix it by disabling an entry in .gitignore that covers too widely, or by adding a new entry to override it. For that to work, the -v mode needs to show all paths that were given from the command line (or --stdin), to explain why each of them is ignored or not ignored. Hence, in addition: For the latter, you are debugging the set of exclude sources and want to learn where the decision to exclude it comes from. For that kind of use, it would be more useful if the output mimicked error messages from the compilers and output from grep -n to show the source, e.g. .gitignore:13:/git-am git-am we would need an entry that shows !include pattern in the output. A path that does not match any pattern anywhere in the exclude files is taken as included---I am not sure what is the best way to explain the reason why they are included. If we are going to show the entry that finally matched (either with negative or positive pattern) and decided the fate as the explanation for both excluded and included paths, perhaps not showing an entry for such a never matched path might be a good enough explanation. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Test failures in t4034
Ramsay Jones ram...@ramsay1.demon.co.uk writes: Yes, there was a net increase in the line count when I introduced die(), but the main program flow was less cluttered by error handling. The net result looked much better, so I thought it was worth it. What may not be too obvious, however, is that test-regex.c was written to be independent of git. That part I was very aware of actually; it it is a bit tricky to tell what the right thing to do, though. Your test itself needs to be pretty much portable without the portability help you would get git-compat-util.h, but the point of this kind of test is to tell if you want to define preprocessor macros that may affect the behaviour of such compatibility layer ;-) Given that I'm now building it as part of git, I should have simply #included git-compat-util.h and used the die() routine from libgit.a (since I'm now *relying* on test-regex being linked with libgit.a). OK. +int main(int argc, char **argv) +{ + char *pat = [^={} \t]+; + char *str = ={}\nfred; + regex_t r; + regmatch_t m[1]; + + if (regcomp(r, pat, REG_EXTENDED | REG_NEWLINE)) + die(failed regcomp() for pattern '%s', pat); + if (regexec(r, str, 1, m, 0)) + die(no match of pattern '%s' to string '%s', pat, str); + + /* http://sourceware.org/bugzilla/show_bug.cgi?id=3957 */ + if (m[0].rm_so == 3) /* matches '\n' when it should not */ + exit(1); This could be the third call site of die() that tells the user to build with NO_REGEX=1. Then cd t sh t0070-fundamental.sh -i -v would give that message directly to the user. Hmm, even without -i -v, it's *very* clear what is going on, but sure it wouldn't hurt either. (Also, I wanted to be able to distinguish an exit via die() from a test failed error return). OK. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 9/9] Add git-check-ignores
On Sun, Sep 2, 2012 at 9:50 PM, Adam Spiers g...@adamspiers.org wrote: I'm no expert on .gitattributes and check-attr, but AFAICS, all the opportunities to share code in the plumbing and front-end seem to be taken already, e.g. the directory traversal and path handling. The CLI argument parsing is necessarily different because check-attr requires a list of attributes as well as a list of files, and of course the output routines have to be different too. The only opportunity for code reuse which I saw but /didn't/ take was around the --stdin line parsing code which is duplicated between: check_attr_stdin_paths check_ignore_stdin_paths cmd_checkout_index cmd_update_index hash_stdin_paths I attempted to refactor these, but quickly realised that due to the lack of proper closures in C, the overheads and complexity incurred by performing such a refactoring probably outweighed the benefits, so I gave up on the idea. Having said that, I'm totally open to suggestions if you can spot other places where code could be reused :) Yeah. That was my impression too. I just hoped a new set of eyes might discover something ;) At lease we could prepare the output format that can be reused (maybe with little changes) for check-attr debugging if it comes later. Or make this command part of check-attr.. Also --quiet option, where check-ignore returns 0 if the given path is ignored, 1 otherwise? I considered that, but couldn't think of appropriate behaviour when multiple paths are given, so in the end I decided to remain consistent with check-attr, which always returns 0. But I'm happy to change it if you can think of a more useful behaviour. For example we could have a --count option which produces no output but has an exit status corresponding to the number of ignored files. We could take this opportunity to kill add --ignore-missing, which is basically .gitignore checking and it accepts multiple paths, I think. - If many paths are given, then perhaps we could print ignored paths (no extra info). How is this different to git ls-files -i -o ? I think ls-files requires real files on working directory, but check-ignore can deal with just non-existing paths. - For debugging, given one path, we print all the rules that are applied to it, which may help understand how/why it goes wrong. That would be nice, but I'm not sure it's a tremendously common use case. Could you think of a scenario in which it would be useful? I guess it could be done by adding a new DIR_DEBUG_IGNORED flag to dir_struct which would make the exclude matcher functions collect all matching patterns, rather than just returning the first one. This in turn would require another field for collecting all matched patterns. Mixing include/exclude ignore rules multiple times could be hard to figure out what goes wrong. But as we haven't seen an actual use case yet, just leave it out. I don't think we really need NEED_WORK_TREE here. .gitignore can be read from index only. I thought about that, but in the end I decided it probably didn't make sense, because none of the exclude matching routines match against the index - they all match against the working tree and core.excludesfile. This would also require changing the matching logic to honor the index, but I didn't see the benefit in doing that, since all operations which involve excludes (add, status, etc.) relate to a work tree. But as with all of the above, please don't hesitate to point out if I've missed something. You guys are the experts, not me ;-) Again I was thinking that check-ignore could work with imaginary paths and could be used by scripts. If a script just wants to check certain paths are excluded, it should not need to move to working directory (though it probably is in working directory most of the time). -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html