Re: [PATCH 0/2] git-p4: fix for handling of multiple depot paths
On 16/12/15 00:38, Sam Hocevar wrote: I'm actually surprised that the patch changes the order at all, since all it does is affect the decision (on a yes/no basis) to include a given file into a changelist. I'm going to have a look at that specific unit test, but of course as a user I'd prefer if the default behaviour could remain the same, unless it was actually a bug. We ask for changes in //depot/sub1/...@1,6 //depot/sub2/...@1,6' which gives us [4, 6, 3, 5]. The old code used to sort this list but this change removes the sort. Maybe putting the sort back would fix it? I'm not sure if my opinion as an outsider is of use, but since the perforce change number is monotonically increasing, my expectation as a user would be for them to be applied in order by the perforce change number. In answer to James' question, the test checks that the most recent change wins (i.e. applied in order). Luke -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7] blame: add support for --[no-]progress option
On Sat, Dec 12, 2015 at 7:51 PM, Edmundo Carmona Antoranz wrote: > --progress can't be used with --incremental or > porcelain formats. > > git-annotate inherits the option as well > > Helped-by: Eric Sunshine > Signed-off-by: Edmundo Carmona Antoranz > --- This version seems to address all the issues raised in my reviews of previous rounds, and the new code is more concise, well-formed, and easier to follow than earlier attempts. Thanks. For the sake of other reviewers, recent previous versions and reviews are here: v6: http://thread.gmane.org/gmane.comp.version-control.git/282305 v5: http://thread.gmane.org/gmane.comp.version-control.git/281677 > diff --git a/Documentation/blame-options.txt b/Documentation/blame-options.txt > index 760eab7..02cb684 100644 > --- a/Documentation/blame-options.txt > +++ b/Documentation/blame-options.txt > @@ -69,6 +69,13 @@ include::line-range-format.txt[] > iso format is used. For supported values, see the discussion > of the --date option at linkgit:git-log[1]. > > +--[no-]progress:: > + Progress status is reported on the standard error stream > + by default when it is attached to a terminal. This flag > + enables progress reporting even if not attached to a > + terminal. Can't use `--progress` together with `--porcelain` > + or `--incremental`. > + > -M||:: > Detect moved or copied lines within a file. When a commit > moves or copies a block of lines (e.g. the original file > diff --git a/Documentation/git-blame.txt b/Documentation/git-blame.txt > index e6e947c..ba54175 100644 > --- a/Documentation/git-blame.txt > +++ b/Documentation/git-blame.txt > @@ -10,7 +10,8 @@ SYNOPSIS > [verse] > 'git blame' [-c] [-b] [-l] [--root] [-t] [-f] [-n] [-s] [-e] [-p] [-w] > [--incremental] > [-L ] [-S ] [-M] [-C] [-C] [-C] [--since=] > - [--abbrev=] [ | --contents | --reverse ] [--] > > + [--progress] [--abbrev=] [ | --contents | > --reverse ] > + [--] > > DESCRIPTION > --- > diff --git a/builtin/blame.c b/builtin/blame.c > index 2afe828..e78ca09 100644 > --- a/builtin/blame.c > +++ b/builtin/blame.c > @@ -28,6 +28,7 @@ > #include "line-range.h" > #include "line-log.h" > #include "dir.h" > +#include "progress.h" > > static char blame_usage[] = N_("git blame [] [] [] > [--] "); > > @@ -50,6 +51,7 @@ static int incremental; > static int xdl_opts; > static int abbrev = -1; > static int no_whole_file_rename; > +static int show_progress; > > static struct date_mode blame_date_mode = { DATE_ISO8601 }; > static size_t blame_date_width; > @@ -127,6 +129,11 @@ struct origin { > char path[FLEX_ARRAY]; > }; > > +struct progress_info { > + struct progress *progress; > + int blamed_lines; > +}; > + > static int diff_hunks(mmfile_t *file_a, mmfile_t *file_b, long ctxlen, > xdl_emit_hunk_consume_func_t hunk_func, void *cb_data) > { > @@ -1746,7 +1753,8 @@ static int emit_one_suspect_detail(struct origin > *suspect, int repeat) > * The blame_entry is found to be guilty for the range. > * Show it in incremental output. > */ > -static void found_guilty_entry(struct blame_entry *ent) > +static void found_guilty_entry(struct blame_entry *ent, > + struct progress_info *pi) > { > if (incremental) { > struct origin *suspect = ent->suspect; > @@ -1758,6 +1766,8 @@ static void found_guilty_entry(struct blame_entry *ent) > write_filename_info(suspect->path); > maybe_flush_or_die(stdout, "stdout"); > } > + pi->blamed_lines += ent->num_lines; > + display_progress(pi->progress, pi->blamed_lines); > } > > /* > @@ -1768,6 +1778,11 @@ static void assign_blame(struct scoreboard *sb, int > opt) > { > struct rev_info *revs = sb->revs; > struct commit *commit = prio_queue_get(&sb->commits); > + struct progress_info pi = { NULL, 0 }; > + > + if (show_progress) > + pi.progress = start_progress_delay(_("Blaming lines"), > + sb->num_lines, 50, 1); > > while (commit) { > struct blame_entry *ent; > @@ -1809,7 +1824,7 @@ static void assign_blame(struct scoreboard *sb, int opt) > suspect->guilty = 1; > for (;;) { > struct blame_entry *next = ent->next; > - found_guilty_entry(ent); > + found_guilty_entry(ent, &pi); > if (next) { > ent = next; > continue; > @@ -1825,6 +1840,8 @@ static void assign_blame(struct scoreboard *sb, int opt) > if (DEBUG) /* sanity */ > sanity_check_refcnt(sb); > } > + > + stop_progress(&pi.progress
Re: Help debugging git-svn
Edmundo Carmona Antoranz wrote: > 1 [main] perl 5652 cygwin_exception::open_stackdumpfile: Dumping stack > trace to perl.exe.stackdump > > And then, in the file: > > Exception: STATUS_ACCESS_VIOLATION at rip=0048360C10C > rax=000601E4BFF8 rbx=5219E248 rcx=00060003A590 > rdx= rsi=A950 rdi=0004 > r8 = r9 = r10=0023 > r11=00048D78607A r12=0003 r13=06F54A98 > r14=000601E18030 r15=06F54AB0 > rbp=00054A88 rsp=0022B810 > program=C:\Program Files\Git\usr\bin\perl.exe, pid 5652, thread main > cs=0033 ds=002B es=002B fs=0053 gs=002B ss=002B Any chance you can reproduce this on a Linux system? I do not use non-Free systems and have no debugging experience there at all. > With my very flawed knowledge of perl I have seen that the process is > getting to Ra.pm around here: It could also be a bug in the SVN bindings or the port of Perl. Which versions are you running? I've also been wondering about the motivation of SVN developers to do a good job on maintaining their Perl bindings (given git-svn is likely the main user of them). We've certainly had problems in the past with SVN breaking and causing git-svn to segfault around 2011-2012; but it's been a while... -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] Malicously tampering git metadata?
On Tue, Dec 15, 2015 at 7:26 PM, Santiago Torres wrote: > Hello everyone, > > I'm Santiago, a PhD student at NYU doing research about secure software > development pipelines. We've been studying different aspects of Git > lately, (as it is an integral part of many projects) and we believe > we've found a vulnerabilty in the way Git structures/signs metadata. > > An attacker capable of performing as a Man in the Middle between a > GitHub server and a developer is able to trick such developer into > merging vulnerable commit objects, or omit security patches --- even if > all users sign all commit objects. Given that Git metadata is unsigned, > it can be modified to provide incorrect views of a repository to > downstream developers. > > An example of a malicious commit merge follows: > > 1) The attacker controlling or acting as the upstream server identifies > two branches: one in which the unsuspecting developer is working on, and > another in which a vulnerable piece of code is located. > > 2) Branch pointers are modified: the packed-refs file (or ref/heads/*) > is edited so that the master branch points to the vulnerable commit > object. Having performed the change, no additional configuration must be > made by the attacker, who now waits for an unsuspecting developer to > pull. > > 3) Once a developer pulls, he or she will be prompted to merge his code > with the new change-set (the vulnerable commit). This operation will > only resemble developer negligence. If no conflicts arise, the attack > will pass unsuspected. > > 4) The developer pushes to upstream. All the traffic can be re-routed > back to the original repository. The target branch now contains a > vulnerable piece of code. > > We have identified additional attack scenarios for modifying the > metadata that result in a incorrect state of the target repository, and > we are ready to disclose information about other variants of this attack > as well. > > We also designed a backwards-compatible defense mechanism to prevent > attacks based on Git metadata tampering. Also we implemented a proof of > concept of the scheme, and performed timing, stress and concurrency > tests; our results show that the overhead should be minimal, even in > large software repositories such as the Linux Kernel. > > We already approached people from CERT and GitHub regarding this attack > scenario, and we'd also like to hear your comments regarding this. This is what push certificates ought to solve. The server records all pushes and its signed certificates of pushes and by the difference of the refs (either in packed refs or as a loose ref) to the push certificate this tampering of the server can be detected. The push certs can however not be obtained via Git itself (they are just stored on the server for now for later inspection or similar), because to be really sure the client would need to learn about these push certificates out of band. The model there would be: * A vulnerable piece of software exists. * It get's fixed (and the fix is pushed with a signed push) * the MITM server doesn't show the fix (show the code from before fix) nor the push certificate thereof * client still pulls vulnerable code This model shows the distribution of push certs via the server itself may not be optimal. Thanks for researching on Git, Stefan > > Thanks! > -Santiago. > > P.S. We also elaborate more about this attack vector in this document: > https://drive.google.com/a/nyu.edu/file/d/0B2KBm0fULlS1RDR5UHVESjVua3M/view?usp=sharing > -- > To unsubscribe from this list: send the line "unsubscribe git" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (Dec 2015, #05; Tue, 15)
On 15.12.15 23:48, Junio C Hamano wrote: > * tb/ls-files-eol (2015-11-28) 2 commits > - convert.c: mark a file-local function static > - ls-files: Add eol diagnostics > > Add options to ls-files to help diagnose end-of-line problems. > > This latest round hasn't gotten any review yet. > > Waiting for review. The latest review are here: $gmane/281785 $gmane/282061 And a re-roll is planned the next weeks, sorry for delay. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/8] config: add core.untrackedCache
Junio C Hamano writes: > This is why the_index.has_untracked_cache is not just a simple "Do I > want to use this feature?" boolean configuration. The index also > stores the real data, and "Am I using this feature?" bit goes hand > in hand with that real data. Thinking that this is merely a boolean > configuration is the real source of the confusion, and introducing a > config that overrules what the user has stored in the index needs to > add complexity. > > The additional complexity may (or may not) be justifiable, but in > any case "all other things being equal, this is a config" feels like > a flawed observation. To put it another way, the "bit" in the index (i.e. the presence of the cached data) is "Am I using the feature now?". The effect of the feature has to (and is designed to) persist, as it is a cache and you do not want a stale cache to give you wrong answers. There is no "Do I want to use the feature?" preference, in other words. And I do not mind creating such a preference bit as a configuration. That is why I suggested such a configuration to cause the equivalent of "update-index --untracked-cache" when a new index is created from scratch (as opposed to the case where the previously created cache data is carried forward across read_cache() -> do things to the index -> write_cache() flow). Doing it that way will not have to involve additional complexity that comes from the desire that setting a single configuration on (or off) has to suddenly change the behaviour of an index file that is already using (or not using) the feature. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: (unknown)
David Greene writes: > - If new option --keep-redundant is specified, invoke cherry-pick with > --keep-redundant-commits. This came up in the past several weeks, I think; you would need to disable patch-equivalence based commit filtering if you really want to do a --keep-redundant that is reproducible and/or reliable. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/8] config: add core.untrackedCache
Jeff King writes: >> If the feature is something only those with really large repositories >> care about, is it a good trade-off to make everybody pay the runtime >> cost and make code more complex and fragile? I am not yet convinced. > > I'm not sure I understand the runtime and complexity costs of the config > option. Isn't it just: > > if (core_untracked_cache) { > /* do the thing */ > } > > and loading core_untracked_cache in git_default_core_config()? Versus: > > if (the_index.has_untracked_cache) { > /* do the thing */ > } The latter is pretty much so, but the former needs to be a bit more involved, e.g. more like: if (core_untracked_cache) { if (!the_index.has_untracked_cache) create_and_attach_uc(&the_index); /* do the thing */ } else { if (the_index.has_untracked_cache) destroy and remove untracked cache; } Otherwise, after adding the cache to the index, flipping the config off, doing things with the index and working tree and then filpping the config back on, the index would have a stale cache that does not know what happened to the working tree while config was off, and your "git status" will start throwing a wrong result. This is why the_index.has_untracked_cache is not just a simple "Do I want to use this feature?" boolean configuration. The index also stores the real data, and "Am I using this feature?" bit goes hand in hand with that real data. Thinking that this is merely a boolean configuration is the real source of the confusion, and introducing a config that overrules what the user has stored in the index needs to add complexity. The additional complexity may (or may not) be justifiable, but in any case "all other things being equal, this is a config" feels like a flawed observation. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Help debugging git-svn
Hello, Eric, Everybody! I need your help getting git-svn to clone a repository. I had already cloned it once but then a few months ago I discovered the authors map file and it's like the first time I did a checkout using git well, perhaps not that much, but close. Seeing the real names of people when using log, blame, etc is a major difference... so, back to my tale of sorts. The thing is that I have already tried to clone it _several_ times and it always breaks at one point or another (or rather, at one revision of a branch or another). I have come to think that it's more or less at the same revisions that it breaks but I'm not 100% certain. What I _think_ is the cause of most of the problems is that in our repo people have misplaced branches _inside_ other branches, at least for a few revisions before realizing their mistake and deleting it. So say I have a standard layout. trunk branches tags Then at one point I copy trunk to branches/a Later on I copy trunk to branches/b Later on I copy trunk to branches/b/c (instead of branches/c) And a few revisions later I realize my mistake and copy branches/b/c to branches/c and remove branches/b/c I infer this because I'm seeing that on one of the revisions where git svn usually breaks when fetching has the content of the project inside a directory, like, say I have directories A, B and C in the project and I'm seeing that git svn is fetching a revision where the all the paths of the files are prepended by a directory that looks like a branch, like this: branch_name/A/filea.txt branch_name/A/fileb.txt etc etc Instead of A/filea.txt A/fileb.txt So... throw some ideas around that... and then, could you tell a non-perl developer how to debug it? Perhaps increase verbosity? One of the errors I see often when fetching (my memory tells me that it's associated to the branch-in-branch problem but I'm not completely sure right now), looks like this: 1 [main] perl 5652 cygwin_exception::open_stackdumpfile: Dumping stack trace to perl.exe.stackdump And then, in the file: Exception: STATUS_ACCESS_VIOLATION at rip=0048360C10C rax=000601E4BFF8 rbx=5219E248 rcx=00060003A590 rdx= rsi=A950 rdi=0004 r8 = r9 = r10=0023 r11=00048D78607A r12=0003 r13=06F54A98 r14=000601E18030 r15=06F54AB0 rbp=00054A88 rsp=0022B810 program=C:\Program Files\Git\usr\bin\perl.exe, pid 5652, thread main cs=0033 ds=002B es=002B fs=0053 gs=002B ss=002B With my very flawed knowledge of perl I have seen that the process is getting to Ra.pm around here: our $AUTOLOAD; sub AUTOLOAD { my $class = ref($_[0]); $AUTOLOAD =~ s/^${class}::(SUPER::)?//; return if $AUTOLOAD =~ m/^[A-Z]/; my $self = shift; no strict 'refs'; my $method = $self->can("invoke_$AUTOLOAD") or die "no such method $AUTOLOAD"; no warnings 'uninitialized'; $method->(@$self, @_); } The value of $AUTOLOAD there is 'finish_report' but I don't know (or at least see) where $method->(@$self, @_) is going. Well... I think that's enough of a mess of a mail (sorry about that). Hope I was able to provide enough information to at least move a little bit forward. Thanks in advance. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC] Malicously tampering git metadata?
Hello everyone, I'm Santiago, a PhD student at NYU doing research about secure software development pipelines. We've been studying different aspects of Git lately, (as it is an integral part of many projects) and we believe we've found a vulnerabilty in the way Git structures/signs metadata. An attacker capable of performing as a Man in the Middle between a GitHub server and a developer is able to trick such developer into merging vulnerable commit objects, or omit security patches --- even if all users sign all commit objects. Given that Git metadata is unsigned, it can be modified to provide incorrect views of a repository to downstream developers. An example of a malicious commit merge follows: 1) The attacker controlling or acting as the upstream server identifies two branches: one in which the unsuspecting developer is working on, and another in which a vulnerable piece of code is located. 2) Branch pointers are modified: the packed-refs file (or ref/heads/*) is edited so that the master branch points to the vulnerable commit object. Having performed the change, no additional configuration must be made by the attacker, who now waits for an unsuspecting developer to pull. 3) Once a developer pulls, he or she will be prompted to merge his code with the new change-set (the vulnerable commit). This operation will only resemble developer negligence. If no conflicts arise, the attack will pass unsuspected. 4) The developer pushes to upstream. All the traffic can be re-routed back to the original repository. The target branch now contains a vulnerable piece of code. We have identified additional attack scenarios for modifying the metadata that result in a incorrect state of the target repository, and we are ready to disclose information about other variants of this attack as well. We also designed a backwards-compatible defense mechanism to prevent attacks based on Git metadata tampering. Also we implemented a proof of concept of the scheme, and performed timing, stress and concurrency tests; our results show that the overhead should be minimal, even in large software repositories such as the Linux Kernel. We already approached people from CERT and GitHub regarding this attack scenario, and we'd also like to hear your comments regarding this. Thanks! -Santiago. P.S. We also elaborate more about this attack vector in this document: https://drive.google.com/a/nyu.edu/file/d/0B2KBm0fULlS1RDR5UHVESjVua3M/view?usp=sharing -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Odd rebase behavior
Hi, The attached tests do not do what I expected them to do. I commented out the tests involving the new rebase empty commit behavior I just sent. The uncommented tests show the strange behavior. According to the rebase man page, rebase gathers commits as in "git log ..HEAD." However, that is not what happens in the tests below. Some of the commits disappear. The test basically does this: - Setup a master project and a subproject, merged via a subtree-like merge (this is how git-subtree does it). - Add some commits to the subproject directory after the subtree merge, to create some history not in the original subproject. - filter-branch --subdirectory-filter to extract commits from the subproject directory. - Rebase those commits back on to the original subproject repository. The above loses all commits made after the subproject is merged into the main project. Note that the rebase is a little wonky. filter-branch creates a disconnected graph and the rebase is invoked with =master. I'm not sure how rebase is supposed to operate in this case (if it is supported at all) but it definitely is not doing the master..HEAD thing. Replacing "master" with "--root" causes rebase to do the right thing. This seems like a bug to me, even with the strange on a disconnected graph. At the very least git should not silently lose commits. I can think of two ways this could be resolved: - Forbid this kind of operation and error our with a message (when and HEAD do not share ancestry) - Make it work as if --root were specified Thoughts? -David --->8--- #!/bin/sh test_description='git rebase tests for empty commits This test runs git rebase and tests handling of empty commits. ' . ./test-lib.sh addfile() { name=$1 echo $(basename ${name}) > ${name} ${git} add ${name} ${git} commit -m "Add $(basename ${name})" } check_equal() { test_debug 'echo' test_debug "echo \"check a:\" \"{$1}\"" test_debug "echo \" b:\" \"{$2}\"" if [ "$1" = "$2" ]; then return 0 else return 1 fi } last_commit_message() { git log --pretty=format:%s -1 } test_expect_success 'setup' ' test_commit README && mkdir files && cd files && git init && test_commit master1 && test_commit master2 && test_commit master3 && cd .. && test_debug "echo Add project master to master" && git fetch files master && git branch files-master FETCH_HEAD && test_debug "echo Add subtree master to master via subtree" && git read-tree --prefix=files_subtree files-master && git checkout -- files_subtree && tree=$(git write-tree) && head=$(git rev-parse HEAD) && rev=$(git rev-parse --verify files-master^0) && commit=$(git commit-tree -p ${head} -p ${rev} -m "Add subproject master" ${tree}) && git reset ${commit} && cd files_subtree && test_commit master4 && cd .. && test_commit files_subtree/master5 ' # Does not preserve master4 and master5. test_expect_success 'Rebase default' ' git checkout -b rebase-default master && git filter-branch --prune-empty -f --subdirectory-filter files_subtree && git commit -m "Empty commit" --allow-empty && git rebase -Xsubtree=files_subtree --preserve-merges --onto files-master master && check_equal "$(last_commit_message)" "files_subtree/master5" ' # Does not preserve master4, master5 and empty. test_expect_success 'Rebase --keep-empty' ' git checkout -b rebase-keep-empty master && git filter-branch --prune-empty -f --subdirectory-filter files_subtree && git commit -m "Empty commit" --allow-empty && git rebase -Xsubtree=files_subtree --keep-empty --preserve-merges --onto files-master master && check_equal "$(last_commit_message)" "Empty commit" ' # Does not preserve master4 and master5. #test_expect_success 'Rebase --keep-redundant' ' # git checkout -b rebase-keep-redundant master && # git filter-branch --prune-empty -f --subdirectory-filter files_subtree && # git commit -m "Empty commit" --allow-empty && # git rebase -Xsubtree=files_subtree --keep-redundant --preserve-merges --onto files-master master && # check_equal "$(last_commit_message)" "files_subtree/master5" #' # Does not preserve master4, master5 and empty. #test_expect_success 'Rebase --keep-empty --keep-redundant' ' # git checkout -b rebase-keep-empty-keep-redundant master && # git filter-branch --prune-empty -f --subdirectory-filter files_subtree && # git commit -m "Empty commit" --allow-empty && # git rebase -Xsubtree=files_subtree --keep-empty --keep-redundant --preserve-merges --onto files-master master && # check_equal "$(last_commit_message)" "Empty commit" #' test_done -- To unsub
[PATCH] Support rebase --keep-empty and --keep-redundant
From: "David A. Greene" Teach rebase how to invoke cherry-pick to keep empty commits. Add a new option --keep-redundant equivalent to cherry-pick's --keep-redundant-commits. With this option, rebase will preserve empty commits generated as a result of the merging process. Signed-off-by: David A. Greene --- git-rebase--interactive.sh | 11 +++- git-rebase.sh | 5 ++ t/t3427-rebase-empty.sh| 127 + 3 files changed, 142 insertions(+), 1 deletion(-) create mode 100755 t/t3427-rebase-empty.sh diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh index b938a6d..8466cb9 100644 --- a/git-rebase--interactive.sh +++ b/git-rebase--interactive.sh @@ -393,7 +393,16 @@ pick_one_preserving_merges () { echo "$sha1 $(git rev-parse HEAD^0)" >> "$rewritten_list" ;; *) - output eval git cherry-pick \ + cherry_keep_empty= + if test -n "$keep_empty"; then + cherry_keep_empty="--allow-empty" + fi + cherry_keep_redundant= + if test -n "$keep_redundant"; then + cherry_keep_redundant="--keep-redundant-commits" + fi + output eval git cherry-pick "$cherry_keep_empty" \ + "$cherry_keep_redundant" \ ${gpg_sign_opt:+$(git rev-parse --sq-quote "$gpg_sign_opt")} \ "$strategy_args" "$@" || die_with_patch $sha1 "Could not pick $sha1" diff --git a/git-rebase.sh b/git-rebase.sh index af7ba5f..1eae688 100755 --- a/git-rebase.sh +++ b/git-rebase.sh @@ -24,6 +24,7 @@ m,merge! use merging strategies to rebase i,interactive! let the user edit the list of commits to rebase x,exec=! add exec lines after each commit of the editable list k,keep-empty preserve empty commits during rebase +keep-redundant preserve redundant commits during rebase f,force-rebase!force rebase even if branch is up to date X,strategy-option=! pass the argument through to the merge strategy stat! display a diffstat of what changed upstream @@ -86,6 +87,7 @@ action= preserve_merges= autosquash= keep_empty= +keep_redundant= test "$(git config --bool rebase.autosquash)" = "true" && autosquash=t gpg_sign_opt= @@ -255,6 +257,9 @@ do --keep-empty) keep_empty=yes ;; + --keep-redundant) + keep_redundant=yes + ;; --preserve-merges) preserve_merges=t test -z "$interactive_rebase" && interactive_rebase=implied diff --git a/t/t3427-rebase-empty.sh b/t/t3427-rebase-empty.sh new file mode 100755 index 000..9e67e00 --- /dev/null +++ b/t/t3427-rebase-empty.sh @@ -0,0 +1,127 @@ +#!/bin/sh + +test_description='git rebase tests for empty commits + +This test runs git rebase and tests handling of empty commits. +' +. ./test-lib.sh + +addfile() { +name=$1 +echo $(basename ${name}) > ${name} +${git} add ${name} +${git} commit -m "Add $(basename ${name})" +} + +check_equal() +{ + test_debug 'echo' + test_debug "echo \"check a:\" \"{$1}\"" + test_debug "echo \" b:\" \"{$2}\"" + if [ "$1" = "$2" ]; then + return 0 + else + return 1 + fi +} + +last_commit_message() +{ + git log --pretty=format:%s -1 +} + +test_expect_success 'setup' ' + test_commit README && + mkdir files && + cd files && + git init && + test_commit master1 && + test_commit master2 && + test_commit master3 && + cd .. && + test_debug "echo Add project master to master" && + git fetch files master && + git branch files-master FETCH_HEAD && + test_debug "echo Add subtree master to master via subtree" && + git read-tree --prefix=files_subtree files-master && + git checkout -- files_subtree && + tree=$(git write-tree) && + head=$(git rev-parse HEAD) && + rev=$(git rev-parse --verify files-master^0) && + commit=$(git commit-tree -p ${head} -p ${rev} -m "Add subproject master" ${tree}) && + git reset ${commit} && + cd files_subtree && + test_commit master4 && + cd .. && + test_commit files_subtree/master5 +' + +# Does not preserve master4 and master5. +#test_expect_success 'Rebase default' ' +# git checkout -b rebase-default master && +# git filter-branch --prune-empty -f --subdirectory-filter files_subtree && +# git commit -m "Empty commit" --allow-empty && +# git rebase -Xsubtree=files_subtree --preserve-merges --onto files-master master && +# check_equal "$(last_commit_message)" "files_subtree/master5"
[no subject]
This patch isn't ready for prime-time yet but I wanted to get it out for some discussion. While cleaning up and enhancing git-subtree, I've come across the need to have rebase behave nicely in the case of empty and redundant commits. There's a case in pick_one_preserving_merges where git-cherry pick is used to process the rebase and if cherry-pick fails, the rebase aborts. This change does two things: - If --keep-empty is specified, invoke cherry-pick with --allow-empty. - If new option --keep-redundant is specified, invoke cherry-pick with --keep-redundant-commits. This allows the rebase to go forward without intrruption in the included tests. I will also need a third option that has cherry-pick ignore redundant commits and remove them from the history. Unfortunately, I can't make out exactly how to do that in commit.c, which is where I gather the cherry-pick stuff happens. I'll need some help with that if there's general agreement that this is a useful enhancement. During the course of developing this, I've encountered some strange rebase behavior. I'll send another message about that. I'd appreciate feedback on this direction and any help with the cherry-pick stuff. Thanks! -David -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/8] config: add core.untrackedCache
On Tue, Dec 15, 2015 at 03:03:14PM -0800, Junio C Hamano wrote: > The thing is, I do not necessarily view this as "configuration". > The way I see the feature is that you say "--untracked" when you > want the states of untracked paths be kept track of in the index, > just like you say "git add Makefile" when you want the state of > 'Makefile' be kept track of in the index. Either the index keeps > track of it, or it doesn't, based solely on user's request, and the > bit to tell us which is the case is already in the index, exactly > because that is part of the data that is kept track of in the index. I know this is a fairly subjective argument, but it feels quite weird for me for such a config to persist in the index and not be mentioned anywhere else. Is there any other user-specified configuration option for which: rm -f .git/index git read-tree HEAD will actually _lose_ information? It seems to me that all other things being equal, we should be in favor of a config option simply because it reduces the cognitive burden on the user: it's one fewer place they need to be aware that git is keeping persistent decisions. > If the feature is something only those with really large repositories > care about, is it a good trade-off to make everybody pay the runtime > cost and make code more complex and fragile? I am not yet convinced. I'm not sure I understand the runtime and complexity costs of the config option. Isn't it just: if (core_untracked_cache) { /* do the thing */ } and loading core_untracked_cache in git_default_core_config()? Versus: if (the_index.has_untracked_cache) { /* do the thing */ } I ask as somebody who hasn't followed the topic closely. I just don't see what is that different about this versus other config options. What am I missing? -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/8] config: add core.untrackedCache
On Wed, Dec 16, 2015 at 12:03 AM, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmason writes: > >> Of course hindsight is 20/20, but I think that given what's been >> covered in this thread it's been established that it's categorically >> better that if we introduce features like these that they be >> configured through the normal configuration facility rather than the >> configuration being sticky to the index. > > I doubt that any such thing has been established at all in this > thread. It may be true that you and perhaps Christian loudly > repeated it, but loudly repeating something and establishing > something as a fact are slightly different. > > The thing is, I do not necessarily view this as "configuration". > The way I see the feature is that you say "--untracked" when you > want the states of untracked paths be kept track of in the index. You probably know this, but the --untracked-cache has no bearing on what we actually keep track of, it's just an optimization for how efficiently we execute "git status" commands without the "-uno" option. We still produce the same output. > just like you say "git add Makefile" when you want the state of > 'Makefile' be kept track of in the index. Either the index keeps > track of it, or it doesn't, based solely on user's request, and the > bit to tell us which is the case is already in the index, exactly > because that is part of the data that is kept track of in the index. What I mean by "[we've] established that it's categorically better [to do this via git-config]" is that we can still do all that stuff, we can just also do more stuff now. >> Since the change in performance really isn't noticeable except on >> really large repositories, which are more likely to have someone >> involved watching the changelog on upgrades I think that's OK. > > Especially it is dubious to me that the trade-off you are making > with this design is a good one. In order to avoid paying a one-time > cost to run "update-index --untracked-cache" at sites that _do_ want > to use that feature (and after that, if you teach "git init" and > "git clone" to pay attention to the "give you the default" > configuration to run it for you, so that your users won't have to), It's not unreasonable to avoid the cost of running "update-index --untracked-cache", it's the difference between just adjusting /etc/gitconfig and continually having to traverse the entire / filesystem if you want to enable this feature on a system-wide basis. It should be easy to enable any Git feature via the configuration facility either on a --system, or --global or --local basis. > you are forcing all codepaths that makes any write to the index (not > just "init"-time) to make an extra check with the configuration all > the time for everybody, because you made the presence of the > untracked cache data in the index not usable as a sign that the user > wants to use that feature. Maybe I'm misunderstanding Christian's patches but don't we already parse the git configuration on any commands that update the index anyway? See git_default_core_config(). We already parse the git configuration to run "git status". > If the feature is something only those > with really large repositories care about, is it a good trade-off to > make everybody pay the runtime cost and make code more complex and > fragile? I am not yet convinced. I was arguing that only users with really large repositories would notice if we turned this off because the enabling facility had changed from per-index to config. But it doesn't follow that the expense of checking the git configuration which we're parsing anyway for the index-related commands makes things more complex & fragile. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFPR] updates for 2.7?
Junio C Hamano wrote: > Git 2.7-rc1 has just been tagged, and the remainder of the year will > be for the stabilization, fixes to brown-paper-bag bugs, reverts of > regressions, etc., but I haven't seen updates to the various > subsystems you guys maintain for some time. Please throw me "this > is a good time to pull and here are the updates" message in the > coming few weeks. Not much on the git-svn front. I've been meaning to split out further and make it lazy load for faster startup for a while... (+Cc Niluge kiwi) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (Dec 2015, #05; Tue, 15)
On Tue, Dec 15, 2015 at 03:44:42PM -0800, Junio C Hamano wrote: > Junio C Hamano writes: > > > There already was strbuf_getline_crlf(), and I wanted a new name to > > be conservative. > > When I re-read the series, I realize that the existing one had > exactly the same semantics as strbuf_gets(), so I think no risk > would come from reusing that name. Let me try redoing the series > when I find time ;-) Heh. I just made up that name, not realizing it existed. Looks like it is fairly recent as part of builtin-am, which explains why I hadn't seen it yet. I agree it looks like it does what you want. Upon reading your first message I thought "eh, did I totally misunderstand the intent of the series?", but it looks like we are on the same page. :) -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] git-p4: fix for handling of multiple depot paths
I'm actually surprised that the patch changes the order at all, since all it does is affect the decision (on a yes/no basis) to include a given file into a changelist. I'm going to have a look at that specific unit test, but of course as a user I'd prefer if the default behaviour could remain the same, unless it was actually a bug. -- Sam. On Tue, Dec 15, 2015, James Farwell wrote: > I'm not sure if my opinion as an outsider is of use, but since the perforce > change number is monotonically increasing, my expectation as a user would be > for them to be applied in order by the perforce change number. :) > > - James > > > From: Luke Diamand > Sent: Monday, December 14, 2015 3:09 PM > To: Junio C Hamano > Cc: Git Users; James Farwell; Lars Schneider; Eric Sunshine; Sam Hocevar > Subject: Re: [PATCH 0/2] git-p4: fix for handling of multiple depot paths > > Sorry - I've just run the tests, and this change causes one of the > test cases in t9800-git-p4-basic.sh to fail. > > It looks like the test case makes an assumption about who wins if two > P4 depots have changes to files that end up in the same place, and > this change reverses the order. It may actually be fine, but it needs > to be thought about a bit. > > Sam - do you have any thoughts on this? > > Thanks > Luke > > > > > > > On 14 December 2015 at 22:06, Junio C Hamano wrote: > > Luke Diamand writes: > > > >> On 14 December 2015 at 19:16, Junio C Hamano wrote: > >>> Luke Diamand writes: > >>> > Having just fixed this, I've now just spotted that Sam Hocevar's fix > to reduce the number of P4 transactions also fixes it: > > https://urldefense.proofpoint.com/v2/url?u=https-3A__www.mail-2Darchive.com_git-2540vger.kernel.org_msg81880.html&d=BQIBaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=wkCayFhpIBdAOEa7tZDTcd1weqwtiFMEIQTL-WQPwC4&m=q8dsOAHvUiDzzPNGRAfMMrcXstxNlI-v7I_03uEL1e8&s=C8wVLMC-iU7We0r36sxOuu920ZjZYdpy7ysNi_5PYv8&e= > > That seems like a cleaner fix. > >>> > >>> Hmm, do you mean I should ignore this series and take the other one, > >>> take only 1/2 from this for tests and then both patches in the other > >>> one, or something else? > >> > >> The second of those (take only 1/2 from this for tests, and then both > >> from the other) seems like the way to go. > > > > OK. Should I consider the two patches from Sam "Reviewed-by" you? echo "creationism" | tr -d "holy godly goal" -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] Introduce grep threads param
On Tue, Dec 15, 2015 at 3:06 PM, Junio C Hamano wrote: > Victor Leschuk writes: >> Subject: Re: [PATCH 1/2] Introduce grep threads param > > I'll retitle this to something like > > grep: add --threads= option and grep.threads configuration > > while queuing (which I did for v7 earlier). > > I think [2/2] and also moving the code to disable threading when > show-in-pager mode should be separate "preparatory clean-up" patches > before this main patch. I'll push out what I think this topic > should be on 'pu' later today (with fixups suggested above squashed > in); please check them and see what you think. I read over what was pushed to 'pu' and noticed a couple problems. First, the 'online_cpus() == 1' check, which was removed in patch 1/3, accidentally creeps back in with patch 3/3. >> +grep.threads:: >> + Number of grep worker threads, use it to tune up performance on >> + your machines. Leave it unset (or set to 0) for default behavior, >> + which is using 8 threads for all systems. >> + Default behavior may change in future versions >> + to better suit hardware and circumstances. > > The last sentence is too noisy. Perhaps drop it and phrase it like > this instead? > > grep.threads:: > Number of grep worker threads to use. If unset (or set to 0), > to 0), 8 threads are used by default (for now). Second, the stray "to 0)," on the second line needs to be dropped. Other than that, the series looks reasonable. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCHv2 0/7] Rerolling sb/submodule-parallel-fetch for the time after 2.7
with the interdiff below: diff --git a/git-compat-util.h b/git-compat-util.h index 87456a3..8e39867 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -723,7 +723,6 @@ extern void *xmmap(void *start, size_t length, int prot, int flags, int fd, off_ extern void *xmmap_gently(void *start, size_t length, int prot, int flags, int fd, off_t offset); extern int xopen(const char *path, int flags, ...); extern ssize_t xread(int fd, void *buf, size_t len); -extern ssize_t xread_nonblock(int fd, void *buf, size_t len); extern ssize_t xwrite(int fd, const void *buf, size_t len); extern ssize_t xpread(int fd, void *buf, size_t len, off_t offset); extern int xdup(int fd); diff --git a/strbuf.c b/strbuf.c index b552a13..38686ff 100644 --- a/strbuf.c +++ b/strbuf.c @@ -389,7 +389,7 @@ ssize_t strbuf_read_once(struct strbuf *sb, int fd, size_t hint) ssize_t cnt; strbuf_grow(sb, hint ? hint : 8192); - cnt = xread_nonblock(fd, sb->buf + sb->len, sb->alloc - sb->len - 1); + cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1); if (cnt > 0) strbuf_setlen(sb, sb->len + cnt); return cnt; diff --git a/strbuf.h b/strbuf.h index c3e5980..2bf90e7 100644 --- a/strbuf.h +++ b/strbuf.h @@ -367,10 +367,10 @@ extern size_t strbuf_fread(struct strbuf *, size_t, FILE *); extern ssize_t strbuf_read(struct strbuf *, int fd, size_t hint); /** - * Returns the number of new bytes appended to the sb. - * Negative return value signals there was an error returned from - * underlying read(2), in which case the caller should check errno. - * e.g. errno == EAGAIN when the read may have blocked. + * Read the contents of a given file descriptor partially by using only one + * attempt of xread. The third argument can be used to give a hint about the + * file size, to avoid reallocs. Returns the number of new bytes appended to + * the sb. */ extern ssize_t strbuf_read_once(struct strbuf *, int fd, size_t hint); diff --git a/wrapper.c b/wrapper.c index f71237c..1770efa 100644 --- a/wrapper.c +++ b/wrapper.c @@ -243,7 +243,14 @@ ssize_t xread(int fd, void *buf, size_t len) struct pollfd pfd; pfd.events = POLLIN; pfd.fd = fd; - /* We deliberately ignore the return value */ + /* +* it is OK if this poll() failed; we +* want to leave this infinite loop +* only when read() returns with +* success, or an expected failure, +* which would be checked by the next +* call to read(2). +*/ poll(&pfd, 1, -1); } } @@ -252,28 +259,6 @@ ssize_t xread(int fd, void *buf, size_t len) } /* - * xread_nonblock() is the same a read(), but it automatically restarts read() - * interrupted operations (EINTR). xread_nonblock() DOES NOT GUARANTEE that - * "len" bytes is read. EWOULDBLOCK is turned into EAGAIN. - */ -ssize_t xread_nonblock(int fd, void *buf, size_t len) -{ - ssize_t nr; - if (len > MAX_IO_SIZE) - len = MAX_IO_SIZE; - while (1) { - nr = read(fd, buf, len); - if (nr < 0) { - if (errno == EINTR) - continue; - if (errno == EWOULDBLOCK) - errno = EAGAIN; - } - return nr; - } -} - -/* * xwrite() is the same a write(), but it automatically restarts write() * operations with a recoverable error (EAGAIN and EINTR). xwrite() DOES NOT * GUARANTEE that "len" bytes is written even if the operation is successful. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCHv2 0/7] Rerolling sb/submodule-parallel-fetch for the time after 2.7
I am sending out a new version for replacing sb/submodule-parallel-fetch for the time after the 2.7 release. * Dropped the patch, which introduces xread_nonblock * strbuf_read_once uses xread now. This is safe as we poll before using strbuf_read_once, so we know we won't stall. * have the commit message reworded for "run-command: add an asynchronous parallel child processor" with Johannes' suggestion. Thanks, Stefan Jonathan Nieder (1): submodule.c: write "Fetching submodule " to stderr Stefan Beller (6): xread: poll on non blocking fds strbuf: add strbuf_read_once to read without blocking sigchain: add command to pop all common signals run-command: add an asynchronous parallel child processor fetch_populated_submodules: use new parallel job processing submodules: allow parallel fetching, add tests and documentation Documentation/fetch-options.txt | 7 + builtin/fetch.c | 6 +- builtin/pull.c | 6 + run-command.c | 335 run-command.h | 80 ++ sigchain.c | 9 ++ sigchain.h | 1 + strbuf.c| 11 ++ strbuf.h| 8 + submodule.c | 141 +++-- submodule.h | 2 +- t/t0061-run-command.sh | 53 +++ t/t5526-fetch-submodules.sh | 71 ++--- test-run-command.c | 55 ++- wrapper.c | 20 ++- 15 files changed, 731 insertions(+), 74 deletions(-) -- 2.6.4.443.ge094245.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCHv2 6/7] fetch_populated_submodules: use new parallel job processing
In a later patch we enable parallel processing of submodules, this only adds the possibility for it. So this change should not change any user facing behavior. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- submodule.c | 142 +--- 1 file changed, 98 insertions(+), 44 deletions(-) diff --git a/submodule.c b/submodule.c index 8386477..6a2d786 100644 --- a/submodule.c +++ b/submodule.c @@ -12,6 +12,7 @@ #include "sha1-array.h" #include "argv-array.h" #include "blob.h" +#include "thread-utils.h" static int config_fetch_recurse_submodules = RECURSE_SUBMODULES_ON_DEMAND; static struct string_list changed_submodule_paths; @@ -610,37 +611,28 @@ static void calculate_changed_submodule_paths(void) initialized_fetch_ref_tips = 0; } -int fetch_populated_submodules(const struct argv_array *options, - const char *prefix, int command_line_option, - int quiet) +struct submodule_parallel_fetch { + int count; + struct argv_array args; + const char *work_tree; + const char *prefix; + int command_line_option; + int quiet; + int result; +}; +#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0} + +static int get_next_submodule(struct child_process *cp, + struct strbuf *err, void *data, void **task_cb) { - int i, result = 0; - struct child_process cp = CHILD_PROCESS_INIT; - struct argv_array argv = ARGV_ARRAY_INIT; - const char *work_tree = get_git_work_tree(); - if (!work_tree) - goto out; - - if (read_cache() < 0) - die("index file corrupt"); - - argv_array_push(&argv, "fetch"); - for (i = 0; i < options->argc; i++) - argv_array_push(&argv, options->argv[i]); - argv_array_push(&argv, "--recurse-submodules-default"); - /* default value, "--submodule-prefix" and its value are added later */ - - cp.env = local_repo_env; - cp.git_cmd = 1; - cp.no_stdin = 1; - - calculate_changed_submodule_paths(); + int ret = 0; + struct submodule_parallel_fetch *spf = data; - for (i = 0; i < active_nr; i++) { + for ( ; spf->count < active_nr; spf->count++) { struct strbuf submodule_path = STRBUF_INIT; struct strbuf submodule_git_dir = STRBUF_INIT; struct strbuf submodule_prefix = STRBUF_INIT; - const struct cache_entry *ce = active_cache[i]; + const struct cache_entry *ce = active_cache[spf->count]; const char *git_dir, *default_argv; const struct submodule *submodule; @@ -652,7 +644,7 @@ int fetch_populated_submodules(const struct argv_array *options, submodule = submodule_from_name(null_sha1, ce->name); default_argv = "yes"; - if (command_line_option == RECURSE_SUBMODULES_DEFAULT) { + if (spf->command_line_option == RECURSE_SUBMODULES_DEFAULT) { if (submodule && submodule->fetch_recurse != RECURSE_SUBMODULES_NONE) { @@ -675,40 +667,102 @@ int fetch_populated_submodules(const struct argv_array *options, default_argv = "on-demand"; } } - } else if (command_line_option == RECURSE_SUBMODULES_ON_DEMAND) { + } else if (spf->command_line_option == RECURSE_SUBMODULES_ON_DEMAND) { if (!unsorted_string_list_lookup(&changed_submodule_paths, ce->name)) continue; default_argv = "on-demand"; } - strbuf_addf(&submodule_path, "%s/%s", work_tree, ce->name); + strbuf_addf(&submodule_path, "%s/%s", spf->work_tree, ce->name); strbuf_addf(&submodule_git_dir, "%s/.git", submodule_path.buf); - strbuf_addf(&submodule_prefix, "%s%s/", prefix, ce->name); + strbuf_addf(&submodule_prefix, "%s%s/", spf->prefix, ce->name); git_dir = read_gitfile(submodule_git_dir.buf); if (!git_dir) git_dir = submodule_git_dir.buf; if (is_directory(git_dir)) { - if (!quiet) - fprintf(stderr, "Fetching submodule %s%s\n", prefix, ce->name); - cp.dir = submodule_path.buf; - argv_array_push(&argv, default_argv); - argv_array_push(&argv, "--submodule-prefix"); - argv_array_push(&argv, submodule_prefix.buf); - cp.argv = argv.argv; - if (run_command(&cp)) - result = 1; -
[PATCHv2 4/7] sigchain: add command to pop all common signals
The new method removes all common signal handlers that were installed by sigchain_push. CC: Jeff King Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- sigchain.c | 9 + sigchain.h | 1 + 2 files changed, 10 insertions(+) diff --git a/sigchain.c b/sigchain.c index faa375d..2ac43bb 100644 --- a/sigchain.c +++ b/sigchain.c @@ -50,3 +50,12 @@ void sigchain_push_common(sigchain_fun f) sigchain_push(SIGQUIT, f); sigchain_push(SIGPIPE, f); } + +void sigchain_pop_common(void) +{ + sigchain_pop(SIGPIPE); + sigchain_pop(SIGQUIT); + sigchain_pop(SIGTERM); + sigchain_pop(SIGHUP); + sigchain_pop(SIGINT); +} diff --git a/sigchain.h b/sigchain.h index 618083b..138b20f 100644 --- a/sigchain.h +++ b/sigchain.h @@ -7,5 +7,6 @@ int sigchain_push(int sig, sigchain_fun f); int sigchain_pop(int sig); void sigchain_push_common(sigchain_fun f); +void sigchain_pop_common(void); #endif /* SIGCHAIN_H */ -- 2.6.4.443.ge094245.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCHv2 3/7] strbuf: add strbuf_read_once to read without blocking
The new call will read from a file descriptor into a strbuf once. The underlying call xread is just run once. xread only reattempts reading in case of EINTR, which makes it suitable to use for a nonblocking read. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- strbuf.c | 11 +++ strbuf.h | 8 2 files changed, 19 insertions(+) diff --git a/strbuf.c b/strbuf.c index d76f0ae..38686ff 100644 --- a/strbuf.c +++ b/strbuf.c @@ -384,6 +384,17 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint) return sb->len - oldlen; } +ssize_t strbuf_read_once(struct strbuf *sb, int fd, size_t hint) +{ + ssize_t cnt; + + strbuf_grow(sb, hint ? hint : 8192); + cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1); + if (cnt > 0) + strbuf_setlen(sb, sb->len + cnt); + return cnt; +} + #define STRBUF_MAXLINK (2*PATH_MAX) int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint) diff --git a/strbuf.h b/strbuf.h index 7123fca..2bf90e7 100644 --- a/strbuf.h +++ b/strbuf.h @@ -367,6 +367,14 @@ extern size_t strbuf_fread(struct strbuf *, size_t, FILE *); extern ssize_t strbuf_read(struct strbuf *, int fd, size_t hint); /** + * Read the contents of a given file descriptor partially by using only one + * attempt of xread. The third argument can be used to give a hint about the + * file size, to avoid reallocs. Returns the number of new bytes appended to + * the sb. + */ +extern ssize_t strbuf_read_once(struct strbuf *, int fd, size_t hint); + +/** * Read the contents of a file, specified by its path. The third argument * can be used to give a hint about the file size, to avoid reallocs. */ -- 2.6.4.443.ge094245.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCHv2 7/7] submodules: allow parallel fetching, add tests and documentation
This enables the work of the previous patches. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- Documentation/fetch-options.txt | 7 +++ builtin/fetch.c | 6 +- builtin/pull.c | 6 ++ submodule.c | 3 +-- submodule.h | 2 +- t/t5526-fetch-submodules.sh | 20 6 files changed, 40 insertions(+), 4 deletions(-) diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt index 45583d8..6b109f6 100644 --- a/Documentation/fetch-options.txt +++ b/Documentation/fetch-options.txt @@ -100,6 +100,13 @@ ifndef::git-pull[] reference to a commit that isn't already in the local submodule clone. +-j:: +--jobs=:: + Number of parallel children to be used for fetching submodules. + Each will fetch from different submodules, such that fetching many + submodules will be faster. By default submodules will be fetched + one at a time. + --no-recurse-submodules:: Disable recursive fetching of submodules (this has the same effect as using the '--recurse-submodules=no' option). diff --git a/builtin/fetch.c b/builtin/fetch.c index c85f347..586840d 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -37,6 +37,7 @@ static int prune = -1; /* unspecified */ static int all, append, dry_run, force, keep, multiple, update_head_ok, verbosity; static int progress = -1, recurse_submodules = RECURSE_SUBMODULES_DEFAULT; static int tags = TAGS_DEFAULT, unshallow, update_shallow; +static int max_children = 1; static const char *depth; static const char *upload_pack; static struct strbuf default_rla = STRBUF_INIT; @@ -99,6 +100,8 @@ static struct option builtin_fetch_options[] = { N_("fetch all tags and associated objects"), TAGS_SET), OPT_SET_INT('n', NULL, &tags, N_("do not fetch all tags (--no-tags)"), TAGS_UNSET), + OPT_INTEGER('j', "jobs", &max_children, + N_("number of submodules fetched in parallel")), OPT_BOOL('p', "prune", &prune, N_("prune remote-tracking branches no longer on remote")), { OPTION_CALLBACK, 0, "recurse-submodules", NULL, N_("on-demand"), @@ -1213,7 +1216,8 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) result = fetch_populated_submodules(&options, submodule_prefix, recurse_submodules, - verbosity < 0); + verbosity < 0, + max_children); argv_array_clear(&options); } diff --git a/builtin/pull.c b/builtin/pull.c index 5145fc6..9e3c738 100644 --- a/builtin/pull.c +++ b/builtin/pull.c @@ -95,6 +95,7 @@ static int opt_force; static char *opt_tags; static char *opt_prune; static char *opt_recurse_submodules; +static char *max_children; static int opt_dry_run; static char *opt_keep; static char *opt_depth; @@ -178,6 +179,9 @@ static struct option pull_options[] = { N_("on-demand"), N_("control recursive fetching of submodules"), PARSE_OPT_OPTARG), + OPT_PASSTHRU('j', "jobs", &max_children, N_("n"), + N_("number of submodules pulled in parallel"), + PARSE_OPT_OPTARG), OPT_BOOL(0, "dry-run", &opt_dry_run, N_("dry run")), OPT_PASSTHRU('k', "keep", &opt_keep, NULL, @@ -525,6 +529,8 @@ static int run_fetch(const char *repo, const char **refspecs) argv_array_push(&args, opt_prune); if (opt_recurse_submodules) argv_array_push(&args, opt_recurse_submodules); + if (max_children) + argv_array_push(&args, max_children); if (opt_dry_run) argv_array_push(&args, "--dry-run"); if (opt_keep) diff --git a/submodule.c b/submodule.c index 6a2d786..0b48734 100644 --- a/submodule.c +++ b/submodule.c @@ -729,10 +729,9 @@ static int fetch_finish(int retvalue, struct child_process *cp, int fetch_populated_submodules(const struct argv_array *options, const char *prefix, int command_line_option, - int quiet) + int quiet, int max_parallel_jobs) { int i; - int max_parallel_jobs = 1; struct submodule_parallel_fetch spf = SPF_INIT; spf.work_tree = get_git_work_tree(); diff --git a/submodule.h b/submodule.h index 5507c3d..cbc0003 100644 --- a/submodule.h +++ b/submodule.h @@ -31,7 +31,7 @@ void set_config_fetch_recurse_submodules(int value); void check_for_new_submodule_commits(unsigned char new_sha1[20]); int fetch_populated_submodules(const struct argv_array *options,
[PATCHv2 5/7] run-command: add an asynchronous parallel child processor
This allows to run external commands in parallel with ordered output on stderr. If we run external commands in parallel we cannot pipe the output directly to the our stdout/err as it would mix up. So each process's output will flow through a pipe, which we buffer. One subprocess can be directly piped to out stdout/err for a low latency feedback to the user. Example: Let's assume we have 5 submodules A,B,C,D,E and each fetch takes a different amount of time as the different submodules vary in size, then the output of fetches in sequential order might look like this: time --> output: |---A---| |-B-| |---C---| |-D-| |-E-| When we schedule these submodules into maximal two parallel processes, a schedule and sample output over time may look like this: process 1: |---A---| |-D-| |-E-| process 2: |-B-| |---C---| output:|---A---|B|---C---|DE So A will be perceived as it would run normally in the single child version. As B has finished by the time A is done, we can dump its whole progress buffer on stderr, such that it looks like it finished in no time. Once that is done, C is determined to be the visible child and its progress will be reported in real time. So this way of output is really good for human consumption, as it only changes the timing, not the actual output. For machine consumption the output needs to be prepared in the tasks, by either having a prefix per line or per block to indicate whose tasks output is displayed, because the output order may not follow the original sequential ordering: |A| |--B--| |-C-| will be scheduled to be all parallel: process 1: |A| process 2: |--B--| process 3: |-C-| output:|A|CB This happens because C finished before B did, so it will be queued for output before B. To detect when a child has finished executing, we check interleaved with other actions (such as checking the liveliness of children or starting new processes) whether the stderr pipe still exists. Once a child closed its stderr stream, we assume it is terminating very soon, and use `finish_command()` from the single external process execution interface to collect the exit status. By maintaining the strong assumption of stderr being open until the very end of a child process, we can avoid other hassle such as an implementation using `waitpid(-1)`, which is not implemented in Windows. Signed-off-by: Stefan Beller --- run-command.c | 335 + run-command.h | 80 t/t0061-run-command.sh | 53 test-run-command.c | 55 +++- 4 files changed, 522 insertions(+), 1 deletion(-) diff --git a/run-command.c b/run-command.c index 13fa452..51fd72c 100644 --- a/run-command.c +++ b/run-command.c @@ -3,6 +3,8 @@ #include "exec_cmd.h" #include "sigchain.h" #include "argv-array.h" +#include "thread-utils.h" +#include "strbuf.h" void child_process_init(struct child_process *child) { @@ -865,3 +867,336 @@ int capture_command(struct child_process *cmd, struct strbuf *buf, size_t hint) close(cmd->out); return finish_command(cmd); } + +enum child_state { + GIT_CP_FREE, + GIT_CP_WORKING, + GIT_CP_WAIT_CLEANUP, +}; + +struct parallel_processes { + void *data; + + int max_processes; + int nr_processes; + + get_next_task_fn get_next_task; + start_failure_fn start_failure; + task_finished_fn task_finished; + + struct { + enum child_state state; + struct child_process process; + struct strbuf err; + void *data; + } *children; + /* +* The struct pollfd is logically part of *children, +* but the system call expects it as its own array. +*/ + struct pollfd *pfd; + + unsigned shutdown : 1; + + int output_owner; + struct strbuf buffered_output; /* of finished children */ +}; + +static int default_start_failure(struct child_process *cp, +struct strbuf *err, +void *pp_cb, +void *pp_task_cb) +{ + int i; + + strbuf_addstr(err, "Starting a child failed:"); + for (i = 0; cp->argv[i]; i++) + strbuf_addf(err, " %s", cp->argv[i]); + + return 0; +} + +static int default_task_finished(int result, +struct child_process *cp, +struct strbuf *err, +void *pp_cb, +void *pp_task_cb) +{ + int i; + + if (!result) + return 0; + + strbuf_addf(err, "A child failed with return code %d:", result); + for (i = 0; cp->argv[i]; i++) + strbuf_addf(err, " %s", cp->argv[i]); + + return 0; +} + +static void kill_children(struct parallel_processes *pp, int signo) +{ + int i, n = pp->max_processes;
[PATCHv2 1/7] submodule.c: write "Fetching submodule " to stderr
From: Jonathan Nieder The "Pushing submodule " progress output correctly goes to stderr, but "Fetching submodule " is going to stdout by mistake. Fix it to write to stderr. Noticed while trying to implement a parallel submodule fetch. When this particular output line went to a different file descriptor, it was buffered separately, resulting in wrongly interleaved output if we copied it to the terminal naively. Signed-off-by: Jonathan Nieder Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- submodule.c | 2 +- t/t5526-fetch-submodules.sh | 51 +++-- 2 files changed, 27 insertions(+), 26 deletions(-) diff --git a/submodule.c b/submodule.c index 14e7624..8386477 100644 --- a/submodule.c +++ b/submodule.c @@ -689,7 +689,7 @@ int fetch_populated_submodules(const struct argv_array *options, git_dir = submodule_git_dir.buf; if (is_directory(git_dir)) { if (!quiet) - printf("Fetching submodule %s%s\n", prefix, ce->name); + fprintf(stderr, "Fetching submodule %s%s\n", prefix, ce->name); cp.dir = submodule_path.buf; argv_array_push(&argv, default_argv); argv_array_push(&argv, "--submodule-prefix"); diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh index a4532b0..17759b14 100755 --- a/t/t5526-fetch-submodules.sh +++ b/t/t5526-fetch-submodules.sh @@ -16,7 +16,8 @@ add_upstream_commit() { git add subfile && git commit -m new subfile && head2=$(git rev-parse --short HEAD) && - echo "From $pwd/submodule" > ../expect.err && + echo "Fetching submodule submodule" > ../expect.err && + echo "From $pwd/submodule" >> ../expect.err && echo " $head1..$head2 master -> origin/master" >> ../expect.err ) && ( @@ -27,6 +28,7 @@ add_upstream_commit() { git add deepsubfile && git commit -m new deepsubfile && head2=$(git rev-parse --short HEAD) && + echo "Fetching submodule submodule/subdir/deepsubmodule" >> ../expect.err echo "From $pwd/deepsubmodule" >> ../expect.err && echo " $head1..$head2 master -> origin/master" >> ../expect.err ) @@ -56,9 +58,7 @@ test_expect_success setup ' ( cd downstream && git submodule update --init --recursive - ) && - echo "Fetching submodule submodule" > expect.out && - echo "Fetching submodule submodule/subdir/deepsubmodule" >> expect.out + ) ' test_expect_success "fetch --recurse-submodules recurses into submodules" ' @@ -67,7 +67,7 @@ test_expect_success "fetch --recurse-submodules recurses into submodules" ' cd downstream && git fetch --recurse-submodules >../actual.out 2>../actual.err ) && - test_i18ncmp expect.out actual.out && + test_must_be_empty actual.out && test_i18ncmp expect.err actual.err ' @@ -96,7 +96,7 @@ test_expect_success "using fetchRecurseSubmodules=true in .gitmodules recurses i git config -f .gitmodules submodule.submodule.fetchRecurseSubmodules true && git fetch >../actual.out 2>../actual.err ) && - test_i18ncmp expect.out actual.out && + test_must_be_empty actual.out && test_i18ncmp expect.err actual.err ' @@ -127,7 +127,7 @@ test_expect_success "--recurse-submodules overrides fetchRecurseSubmodules setti git config --unset -f .gitmodules submodule.submodule.fetchRecurseSubmodules && git config --unset submodule.submodule.fetchRecurseSubmodules ) && - test_i18ncmp expect.out actual.out && + test_must_be_empty actual.out && test_i18ncmp expect.err actual.err ' @@ -146,7 +146,7 @@ test_expect_success "--dry-run propagates to submodules" ' cd downstream && git fetch --recurse-submodules --dry-run >../actual.out 2>../actual.err ) && - test_i18ncmp expect.out actual.out && + test_must_be_empty actual.out && test_i18ncmp expect.err actual.err ' @@ -155,7 +155,7 @@ test_expect_success "Without --dry-run propagates to submodules" ' cd downstream && git fetch --recurse-submodules >../actual.out 2>../actual.err ) && - test_i18ncmp expect.out actual.out && + test_must_be_empty actual.out && test_i18ncmp expect.err actual.err ' @@ -166,7 +166,7 @@ test_expect_success "recurseSubmodules=true propagates into submodules" ' git config fetch.recurseSubmodules true git fetch >../actual.out 2>../actual.err ) && - test_i18ncmp expect.out ac
[PATCHv2 2/7] xread: poll on non blocking fds
The man page of read(2) says: EAGAIN The file descriptor fd refers to a file other than a socket and has been marked nonblocking (O_NONBLOCK), and the read would block. EAGAIN or EWOULDBLOCK The file descriptor fd refers to a socket and has been marked nonblocking (O_NONBLOCK), and the read would block. POSIX.1-2001 allows either error to be returned for this case, and does not require these constants to have the same value, so a portable application should check for both possibilities. If we get an EAGAIN or EWOULDBLOCK the fd must have set O_NONBLOCK. As the intent of xread is to read as much as possible either until the fd is EOF or an actual error occurs, we can ease the feeder of the fd by not spinning the whole time, but rather wait for it politely by not busy waiting. We should not care if the call to poll failed, as we're in an infinite loop and can only get out with the correct read(). Signed-off-by: Stefan Beller Acked-by: Johannes Sixt Signed-off-by: Junio C Hamano --- wrapper.c | 20 ++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/wrapper.c b/wrapper.c index 6fcaa4d..1770efa 100644 --- a/wrapper.c +++ b/wrapper.c @@ -236,8 +236,24 @@ ssize_t xread(int fd, void *buf, size_t len) len = MAX_IO_SIZE; while (1) { nr = read(fd, buf, len); - if ((nr < 0) && (errno == EAGAIN || errno == EINTR)) - continue; + if (nr < 0) { + if (errno == EINTR) + continue; + if (errno == EAGAIN || errno == EWOULDBLOCK) { + struct pollfd pfd; + pfd.events = POLLIN; + pfd.fd = fd; + /* +* it is OK if this poll() failed; we +* want to leave this infinite loop +* only when read() returns with +* success, or an expected failure, +* which would be checked by the next +* call to read(2). +*/ + poll(&pfd, 1, -1); + } + } return nr; } } -- 2.6.4.443.ge094245.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] blame: display a more helpful error message if the file was deleted
`git blame 22414770 generate-cmdlist.perl` currently results in: fatal: cannot stat path '22414770': No such file or directory This patch changes the error message to: fatal: ambiguous argument 'generate-cmdlist.perl': unknown revision or path not in the working tree. Use '--' to separate paths from revisions, like this: 'git [...] -- [...]'" That way, the user knows to rewrite the command as `git blame 22414770 -- generate-cmdlist.perl`. Signed-off-by: Alex Henrie --- builtin/blame.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/builtin/blame.c b/builtin/blame.c index 1df13cf..f070272 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -2683,8 +2683,6 @@ parse_done: argv[argc - 1] = "--"; setup_work_tree(); - if (!file_exists(path)) - die_errno("cannot stat path '%s'", path); } revs.disable_stdin = 1; -- 2.6.4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (Dec 2015, #05; Tue, 15)
Junio C Hamano writes: > There already was strbuf_getline_crlf(), and I wanted a new name to > be conservative. When I re-read the series, I realize that the existing one had exactly the same semantics as strbuf_gets(), so I think no risk would come from reusing that name. Let me try redoing the series when I find time ;-) Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (Dec 2015, #05; Tue, 15)
Jeff King writes: >> * jc/strbuf-gets (2015-10-28) 17 commits >> [...] >> >> Teach codepaths that communicate with users by reading text files >> to be more lenient to editors that write CRLF-terminated lines. >> Note that this is only about communication with Git, like feeding >> list of object names from the standard input instead of from the >> command line, and does not involve files in the working tree. >> >> Waiting for review. > > I like the intent here, but I was a little disappointed that we end up > with two almost identical strbuf functions. But even if the ultimate > endgame is to drop back to one, the conservative route is to keep them > both until all new code paths have "opted in" to the new behavior. > However, I found the naming confusing: it was not at all clear to me > which of strbuf_gets and strbuf_getline did the CRLF-munging. Perhaps > it would be more obvious if the new one was strbuf_getline_crlf or > something. I dunno. There already was strbuf_getline_crlf(), and I wanted a new name to be conservative. It was modelled after gets() (not fgets()) that removes the trailing line terminator, but there may have been better names. I dunno, either. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (Dec 2015, #05; Tue, 15)
Jeff King writes: > On Tue, Dec 15, 2015 at 02:48:42PM -0800, Junio C Hamano wrote: > >> * dk/gc-more-wo-pack (2015-11-24) 3 commits >> - gc: Clean garbage .bitmap files from pack dir >> - t5304: Add test for .bitmap garbage files >> - prepare_packed_git(): find more garbage >> >> Follow-on to dk/gc-idx-wo-pack topic, to clean up stale >> .bitmap and .keep files. >> >> Waiting for review. > > I just read through and made some comments. > > Note that I think there was a re-roll of the first patch after I picked > up the original: > > http://article.gmane.org/gmane.comp.version-control.git/281759 > > Hopefully Doug will re-post the whole series after taking a look at my > comments. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (Dec 2015, #05; Tue, 15)
On Tue, Dec 15, 2015 at 02:48:42PM -0800, Junio C Hamano wrote: > * dk/gc-more-wo-pack (2015-11-24) 3 commits > - gc: Clean garbage .bitmap files from pack dir > - t5304: Add test for .bitmap garbage files > - prepare_packed_git(): find more garbage > > Follow-on to dk/gc-idx-wo-pack topic, to clean up stale > .bitmap and .keep files. > > Waiting for review. I just read through and made some comments. Note that I think there was a re-roll of the first patch after I picked up the original: http://article.gmane.org/gmane.comp.version-control.git/281759 Hopefully Doug will re-post the whole series after taking a look at my comments. > * jc/strbuf-gets (2015-10-28) 17 commits > [...] > > Teach codepaths that communicate with users by reading text files > to be more lenient to editors that write CRLF-terminated lines. > Note that this is only about communication with Git, like feeding > list of object names from the standard input instead of from the > command line, and does not involve files in the working tree. > > Waiting for review. I like the intent here, but I was a little disappointed that we end up with two almost identical strbuf functions. But even if the ultimate endgame is to drop back to one, the conservative route is to keep them both until all new code paths have "opted in" to the new behavior. However, I found the naming confusing: it was not at all clear to me which of strbuf_gets and strbuf_getline did the CRLF-munging. Perhaps it would be more obvious if the new one was strbuf_getline_crlf or something. I dunno. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] prepare_packed_git(): find more garbage
On Tue, Dec 15, 2015 at 06:09:57PM -0500, Jeff King wrote: > > @@ -1225,6 +1225,15 @@ static void report_helper(const struct string_list > > *list, > [...] > If I understand this function correctly, we're just trying to > get rid of "boring" cases that do not need to be reported. BTW, I wondered if this should perhaps just be calling bits_to_msg() and seeing if it returns NULL. It seems like the logic for which cases are "interesting" ends up duplicated. But maybe I am missing something. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] gc: Clean garbage .bitmap files from pack dir
On Fri, Nov 13, 2015 at 04:46:27PM -0800, Doug Kelly wrote: > Similar to cleaning up excess .idx files, clean any garbage .bitmap > files that are not otherwise associated with any .idx/.pack files. > > Signed-off-by: Doug Kelly > --- > builtin/gc.c | 12 ++-- > t/t5304-prune.sh | 2 +- > 2 files changed, 11 insertions(+), 3 deletions(-) > > diff --git a/builtin/gc.c b/builtin/gc.c > index c583aad..7ddf071 100644 > --- a/builtin/gc.c > +++ b/builtin/gc.c > @@ -58,8 +58,16 @@ static void clean_pack_garbage(void) > > static void report_pack_garbage(unsigned seen_bits, const char *path) > { > - if (seen_bits == PACKDIR_FILE_IDX) > - string_list_append(&pack_garbage, path); > + if (seen_bits & PACKDIR_FILE_IDX || > + seen_bits & PACKDIR_FILE_BITMAP) { So here we're relying on report_helper to have culled the boring cases, right? (Sorry if that is totally obvious; I'm mostly just thinking out loud). That makes sense, then. > + const char *dot = strrchr(path, '.'); > + if (dot) { > + int baselen = dot - path + 1; > + if (!strcmp(path+baselen, "idx") || > + !strcmp(path+baselen, "bitmap")) > + string_list_append(&pack_garbage, path); > + } > + } I was confused at first why we couldn't just pass "path" here. But it's because we will get a garbage report for each related file, and we want to keep some of them (like .keep). Which I guess makes sense. I wonder if this would be simpler to read as just: if (ends_with(path, ".idx") || ends_with(path, ".bitmap")) string_list_append(&pack_garbage, path); Technically it is less efficient because we will compute strlen(path) twice, but that seems like premature optimization (not to mention that ends_with is an inline, so a good compiler can probably optimize out the second call anyway). > -test_expect_failure 'clean pack garbage with gc' ' > +test_expect_success 'clean pack garbage with gc' ' > test_when_finished "rm -f .git/objects/pack/fake*" && > test_when_finished "rm -f .git/objects/pack/foo*" && > : >.git/objects/pack/foo.keep && Should we be checking at the end of this test that "*.keep" didn't get blown away? It might be nice to just test_cmp the results of "ls" on the pack directory to confirm exactly what got deleted and what didn't. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (Dec 2015, #05; Tue, 15)
On Tue, Dec 15, 2015 at 2:48 PM, Junio C Hamano wrote: > > * sb/submodule-parallel-fetch (2015-12-14) 8 commits > - submodules: allow parallel fetching, add tests and documentation > - fetch_populated_submodules: use new parallel job processing > - run-command: add an asynchronous parallel child processor > - sigchain: add command to pop all common signals > - strbuf: add strbuf_read_once to read without blocking > - xread_nonblock: add functionality to read from fds without blocking > - xread: poll on non blocking fds > - submodule.c: write "Fetching submodule " to stderr > (this branch is used by sb/submodule-parallel-update.) > > Add a framework to spawn a group of processes in parallel, and use > it to run "git fetch --recurse-submodules" in parallel. > > Rerolled and this seems to be a lot cleaner. The merge of the > earlier one to 'next' has been reverted. > > Will merge to 'next' after a few days. I plan on resending after the recent discussion, dropping xread_nonblock and only have strbuf_read_once in there. > > * sb/submodule-parallel-update (2015-12-14) 8 commits > - clone: allow an explicit argument for parallel submodule clones > - submodule update: expose parallelism to the user > - git submodule update: have a dedicated helper for cloning > - fetching submodules: respect `submodule.fetchJobs` config option > - submodule-config: introduce parse_generic_submodule_config > - submodule-config: remove name_and_item_from_var > - submodule-config: drop check against NULL > - submodule-config: keep update strategy around > (this branch uses sb/submodule-parallel-fetch.) > > Builds on top of the "fetch --recurse-submodules" work to introduce > parallel downloading into multiple submodules for "submodule update". > > Needs review. A while ago it had some review (specially bike shedding on the option name), I did not change it substantially. I hope we don't get into bike shedding the option name again. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] prepare_packed_git(): find more garbage
On Thu, Nov 26, 2015 at 12:15:29AM -0600, Doug Kelly wrote: > diff --git a/builtin/count-objects.c b/builtin/count-objects.c > index ba92919..5197b57 100644 > --- a/builtin/count-objects.c > +++ b/builtin/count-objects.c > @@ -17,19 +17,15 @@ static off_t loose_size; > > static const char *bits_to_msg(unsigned seen_bits) > { > - switch (seen_bits) { > - case 0: > - return "no corresponding .idx or .pack"; > - case PACKDIR_FILE_GARBAGE: > + if (seen_bits == PACKDIR_FILE_GARBAGE) > return "garbage found"; It seems weird to use "==" on a bitfield. I think it is the case now that we would never see GARBAGE alongside anything else, but I wonder if we should future-proof that as: if (seen_bits & PACKDIR_FILE_GARBAGE) Specifically, I am wondering what would happen if we had "foo.pack" and "foo.bogus", where we do not know about the latter at all. > - case PACKDIR_FILE_PACK: > + else if (seen_bits & PACKDIR_FILE_PACK && !(seen_bits & > PACKDIR_FILE_IDX)) > return "no corresponding .idx"; > - case PACKDIR_FILE_IDX: > + else if (seen_bits & PACKDIR_FILE_IDX && !(seen_bits & > PACKDIR_FILE_PACK)) > return "no corresponding .pack"; > - case PACKDIR_FILE_PACK|PACKDIR_FILE_IDX: > - default: > - return NULL; > - } > + else if (seen_bits == 0 || !(seen_bits & > (PACKDIR_FILE_IDX|PACKDIR_FILE_PACK))) > + return "no corresponding .idx or .pack"; > + return NULL; This bottom conditional is interesting. I understand the second half: we saw something pack-like, but there is not matching .idx or .pack at all (if we saw one but not the other, we would have caught it above). But when will we get an empty seen_bits? What did we see that triggered this function, but didn't trigger a bit (even GARBAGE)? I don't mind if the answer is "nothing, this is future-proofing", but am mostly curious. > diff --git a/sha1_file.c b/sha1_file.c > index 3d56746..5f939e4 100644 > --- a/sha1_file.c > +++ b/sha1_file.c > @@ -1225,6 +1225,15 @@ static void report_helper(const struct string_list > *list, > if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX)) > return; > > + if (seen_bits == > (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX|PACKDIR_FILE_BITMAP)) > + return; > + > + if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX|PACKDIR_FILE_KEEP)) > + return; > + > + if (seen_bits == > (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX|PACKDIR_FILE_BITMAP|PACKDIR_FILE_KEEP)) > + return; > + It seems like we're enumerating a lot of cases here that will explode if we get even one more file type (e.g., we add "pack-XXX.foo" in the future). If I understand this function correctly, we're just trying to get rid of "boring" cases that do not need to be reported. Isn't any case that has both a pack and an idx boring (no matter if it has a .bitmap or .keep)? IOW, can these four conditionals just become: unsigned pack_with_idx = PACKDIR_FILE_PACK | PACKDIR_FILE_IDX; if ((seen_bits & pack_with_idx) == pack_with_idx) return; ? -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFPR] updates for 2.7?
Git 2.7-rc1 has just been tagged, and the remainder of the year will be for the stabilization, fixes to brown-paper-bag bugs, reverts of regressions, etc., but I haven't seen updates to the various subsystems you guys maintain for some time. Please throw me "this is a good time to pull and here are the updates" message in the coming few weeks. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/8] config: add core.untrackedCache
Ævar Arnfjörð Bjarmason writes: > Of course hindsight is 20/20, but I think that given what's been > covered in this thread it's been established that it's categorically > better that if we introduce features like these that they be > configured through the normal configuration facility rather than the > configuration being sticky to the index. I doubt that any such thing has been established at all in this thread. It may be true that you and perhaps Christian loudly repeated it, but loudly repeating something and establishing something as a fact are slightly different. The thing is, I do not necessarily view this as "configuration". The way I see the feature is that you say "--untracked" when you want the states of untracked paths be kept track of in the index, just like you say "git add Makefile" when you want the state of 'Makefile' be kept track of in the index. Either the index keeps track of it, or it doesn't, based solely on user's request, and the bit to tell us which is the case is already in the index, exactly because that is part of the data that is kept track of in the index. > Since the change in performance really isn't noticeable except on > really large repositories, which are more likely to have someone > involved watching the changelog on upgrades I think that's OK. Especially it is dubious to me that the trade-off you are making with this design is a good one. In order to avoid paying a one-time cost to run "update-index --untracked-cache" at sites that _do_ want to use that feature (and after that, if you teach "git init" and "git clone" to pay attention to the "give you the default" configuration to run it for you, so that your users won't have to), you are forcing all codepaths that makes any write to the index (not just "init"-time) to make an extra check with the configuration all the time for everybody, because you made the presence of the untracked cache data in the index not usable as a sign that the user wants to use that feature. If the feature is something only those with really large repositories care about, is it a good trade-off to make everybody pay the runtime cost and make code more complex and fragile? I am not yet convinced. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] git-p4: fix for handling of multiple depot paths
I'm not sure if my opinion as an outsider is of use, but since the perforce change number is monotonically increasing, my expectation as a user would be for them to be applied in order by the perforce change number. :) - James From: Luke Diamand Sent: Monday, December 14, 2015 3:09 PM To: Junio C Hamano Cc: Git Users; James Farwell; Lars Schneider; Eric Sunshine; Sam Hocevar Subject: Re: [PATCH 0/2] git-p4: fix for handling of multiple depot paths Sorry - I've just run the tests, and this change causes one of the test cases in t9800-git-p4-basic.sh to fail. It looks like the test case makes an assumption about who wins if two P4 depots have changes to files that end up in the same place, and this change reverses the order. It may actually be fine, but it needs to be thought about a bit. Sam - do you have any thoughts on this? Thanks Luke On 14 December 2015 at 22:06, Junio C Hamano wrote: > Luke Diamand writes: > >> On 14 December 2015 at 19:16, Junio C Hamano wrote: >>> Luke Diamand writes: >>> Having just fixed this, I've now just spotted that Sam Hocevar's fix to reduce the number of P4 transactions also fixes it: https://urldefense.proofpoint.com/v2/url?u=https-3A__www.mail-2Darchive.com_git-2540vger.kernel.org_msg81880.html&d=BQIBaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=wkCayFhpIBdAOEa7tZDTcd1weqwtiFMEIQTL-WQPwC4&m=q8dsOAHvUiDzzPNGRAfMMrcXstxNlI-v7I_03uEL1e8&s=C8wVLMC-iU7We0r36sxOuu920ZjZYdpy7ysNi_5PYv8&e= That seems like a cleaner fix. >>> >>> Hmm, do you mean I should ignore this series and take the other one, >>> take only 1/2 from this for tests and then both patches in the other >>> one, or something else? >> >> The second of those (take only 1/2 from this for tests, and then both >> from the other) seems like the way to go. > > OK. Should I consider the two patches from Sam "Reviewed-by" you?-- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
What's cooking in git.git (Dec 2015, #05; Tue, 15)
Here are the topics that have been cooking. Commits prefixed with '-' are only in 'pu' (proposed updates) while commits prefixed with '+' are in 'next'. The ones marked with '.' do not appear in any of the integration branches, but I am still holding onto them. 2.7-rc1 has been tagged. I notice that quite a few topics have been in "waiting for review" state without getting anybody helping the review process, leaving them in 'pu'--they will not be part of 2.7 anyway, but the sooner they are reviewed, in the more polished shape they can start in the cycle after 2.7 gets tagged. On the other hand, I've marked a handful of topics below as "Will discard". They were all dormant after waiting for updates for quite a long time; interested people may want to help resurrect them. You can find the changes described here in the integration branches of the repositories listed at http://git-blame.blogspot.com/p/git-public-repositories.html -- [New Topics] * cc/untracked (2015-12-15) 10 commits - dir: do not use untracked cache ident anymore - t7063: add tests for core.untrackedCache - config: add core.untrackedCache - dir: free untracked cache when removing it - dir: add remove_untracked_cache() - dir: add add_untracked_cache() - update-index: move 'uc' var declaration - update-index: add untracked cache notifications - update-index: add --test-untracked-cache - update-index: use enum for untracked cache options Update the untracked cache subsystem and change its primary UI from "git update-index" to "git config". Needs review. * ep/make-phoney (2015-12-15) 1 commit - Makefile: add missing phony target A slight update to the Makefile. Needs review. Comments? -- [Graduated to "master"] * ep/ident-with-getaddrinfo (2015-12-14) 1 commit (merged to 'next' on 2015-12-15 at 2c19123) + ident: fix undefined variable when NO_IPV6 is set A fix-up for a recent topic. * jk/prune-mtime (2015-08-12) 1 commit (merged to 'next' on 2015-12-14 at 4fba3f3) + prune: close directory earlier during loose-object directory traversal The helper used to iterate over loose object directories to prune stale objects did not closedir() immediately when it is done with a directory--a callback such as the one used for "git prune" may want to do rmdir(), but it would fail on open directory on platforms such as WinXP. * jk/send-email-complete-aliases (2015-12-14) 1 commit (merged to 'next' on 2015-12-15 at 6978ff4) + completion: fix completing unstuck email alias arguments A fix-up for a recent topic. * ls/p4-keep-empty-commits (2015-12-10) 1 commit (merged to 'next' on 2015-12-11 at 1827062) + git-p4: add option to keep empty commits "git p4" used to import Perforce CLs that touch only paths outside the client spec as empty commits. It has been corrected to ignore them instead, with a new configuration git-p4.keepEmptyCommits as a backward compatibility knob. -- [Stalled] * kf/http-proxy-auth-methods (2015-11-04) 3 commits . SQUASH??? . http: use credential API to handle proxy authentication . http: allow selection of proxy authentication method New http.proxyAuthMethod configuration variable can be used to specify what authentication method to use, as a way to work around proxies that do not give error response expected by libcurl when CURLAUTH_ANY is used. Also, the codepath for proxy authentication has been taught to use credential API to store the authentication material in user's keyrings. I ejected this from pu for the moment, as it conflicts with the pt/http-socks-proxy topic. That is now in master, so it can be re-rolled on top. Anybody wants to help rerolling this? Otherwise will discard. * nd/ita-cleanup (2015-09-06) 6 commits - grep: make it clear i-t-a entries are ignored - checkout(-index): do not checkout i-t-a entries - apply: make sure check_preimage() does not leave empty file on error - apply: fix adding new files on i-t-a entries - add and use a convenience macro ce_intent_to_add() - blame: remove obsolete comment Paths that have been told the index about with "add -N" are not yet in the index, but various commands behaved as if they already are. Some commits need better explanation. Becoming tired of waiting for a reroll. Will discard. * mg/httpd-tests-update-for-apache-2.4 (2015-04-08) 2 commits - t/lib-git-svn: check same httpd module dirs as lib-httpd - t/lib-httpd: load mod_unixd This is the first two commits in a three-patch series $gmane/266962 Becoming tired of waiting for a reroll. with updated log message ($gmane/268061). Will discard. * wp/sha1-name-negative-match (2015-06-08) 2 commits - sha1_name.c: introduce '^{/!-}' notation - test for '!' handling in rev-parse's named commits Introduce "branch^{/!-}" notation to name a commit reachable from branch that d
[ANNOUNCE] Git v2.7.0-rc1
A release candidate Git v2.7.0-rc1 is now available for testing at the usual places. The tarballs are found at: https://www.kernel.org/pub/software/scm/git/testing/ The following public repositories all have a copy of the 'v2.7.0-rc1' tag and the 'master' branch that the tag points at: url = https://kernel.googlesource.com/pub/scm/git/git url = git://repo.or.cz/alt-git.git url = git://git.sourceforge.jp/gitroot/git-core/git.git url = git://git-core.git.sourceforge.net/gitroot/git-core/git-core url = https://github.com/gitster/git Git 2.7 Release Notes (draft) = Updates since v2.6 -- UI, Workflows & Features * "git remote" learned "get-url" subcommand to show the URL for a given remote name used for fetching and pushing. * There was no way to defeat a configured rebase.autostash variable from the command line, as "git rebase --no-autostash" was missing. * "git log --date=local" used to only show the normal (default) format in the local timezone. The command learned to take 'local' as an instruction to use the local timezone with other formats, * The refs used during a "git bisect" session is now per-worktree so that independent bisect sessions can be done in different worktrees created with "git worktree add". * Users who are too busy to type three extra keystrokes to ask for "git stash show -p" can now set stash.showPatch configuration varible to true to always see the actual patch, not just the list of paths affected with feel for the extent of damage via diffstat. * "quiltimport" allows to specify the series file by honoring the $QUILT_SERIES environment and also --series command line option. * The use of 'good/bad' in "git bisect" made it confusing to use when hunting for a state change that is not a regression (e.g. bugfix). The command learned 'old/new' and then allows the end user to say e.g. "bisect start --term-old=fast --term-new=slow" to find a performance regression. * "git interpret-trailers" can now run outside of a Git repository. * "git p4" learned to reencode the pathname it uses to communicate with the p4 depot with a new option. * Give progress meter to "git filter-branch". * Allow a later "!/abc/def" to override an earlier "/abc" that appears in the same .gitignore file to make it easier to express "everything in /abc directory is ignored, except for ...". * Teach "git p4" to send large blobs outside the repository by talking to Git LFS. * Prepare for Git on-disk repository representation to undergo backward incompatible changes by introducing a new repository format version "1", with an extension mechanism. * "git worktree" learned a "list" subcommand. * "git clone --dissociate" learned that it can be used even when "--reference" was not used at the same time. * "git blame" learnt to take "--first-parent" and "--reverse" at the same time when it makes sense. * "git checkout" did not follow the usual "--[no-]progress" convention and implemented only "--quiet" that is essentially a superset of "--no-progress". Extend the command to support the usual "--[no-]progress". * The semantics of tranfer.hideRefs configuration variable have been extended to work better with the ref "namespace" feature that lets you throw unrelated bunches of repositories in a single physical repository and virtually serve them as separate ones. * send-email config variables whose values are pathnames now go through the ~username/ expansion. * bash completion learnt to TAB-complete recipient addresses given to send-email. * The credential-cache daemon can be told to ignore SIGHUP to work around issue when running Git from inside emacs. Performance, Internal Implementation, Development Support etc. * The infrastructure to rewrite "git submodule" in C is being built incrementally. Let's polish these early parts well enough and make them graduate to 'next' and 'master', so that the more involved follow-up can start cooking on a solid ground. * Some features from "git tag -l" and "git branch -l" have been made available to "git for-each-ref" so that eventually the unified implementation can be shared across all three. The version merged to the 'master' branch earlier had a performance regression in "tag --contains", which has since been corrected. * Because "test_when_finished" in our test framework queues the clean-up tasks to be done in a shell variable, it should not be used inside a subshell. Add a mechanism to allow 'bash' to catch such uses, and fix the ones that were found. * The debugging infrastructure for pkt-line based communication has been improved to mark the side-band communication specifically. * Update "git branch" that list existing branches, using the ref-filter API that is shared with "g
Re: [PATCH 7/8] config: add core.untrackedCache
On Tue, Dec 15, 2015 at 8:40 PM, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmason writes: > I still have a problem with the approach from "design cleanliness" > point of view[...] > > In any case I think we already have agreed to disagree on this > point, so there is no use discussing it any longer from my side. I > am not closing the door to this series, but I am not convinced, > either. At least not yet. In general the fantastic thing about the git configuration facility is that it provides both systems administrators and normal users with what they want. It's possible to configure things system-wide and override those on a user or repository basis. Of course hindsight is 20/20, but I think that given what's been covered in this thread it's been established that it's categorically better that if we introduce features like these that they be configured through the normal configuration facility rather than the configuration being sticky to the index. It gives you everything that the per-index configuration gives you and more. So assuming that's the case, how do we migrate something that's configured via the index towards being configured through git-config? I think there's no general answer to that, but in this case the worst case scenario with accepting this series as-is is that we downgrade some users who've opted in to it to pre-v2.5.0 "git status" performance. Since the change in performance really isn't noticeable except on really large repositories, which are more likely to have someone involved watching the changelog on upgrades I think that's OK. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] Introduce grep threads param
Hello Junio, On 12/15/2015 11:06 PM, Junio C Hamano wrote: Victor Leschuk writes: Subject: Re: [PATCH 1/2] Introduce grep threads param I'll retitle this to something like grep: add --threads= option and grep.threads configuration while queuing (which I did for v7 earlier). Ok, thanks. "git grep" can now be configured (or told from the command line) how many threads to use when searching in the working tree files. Signed-off-by: Victor Leschuk --- ... +grep.threads:: + Number of grep worker threads. "Number of grep worker threads to use"? Accepted, will change. + See `grep.threads` in linkgit:git-grep[1] for more information. ... +grep.threads:: + Number of grep worker threads, use it to tune up performance on + your machines. Leave it unset (or set to 0) for default behavior, + which is using 8 threads for all systems. + Default behavior may change in future versions + to better suit hardware and circumstances. The last sentence is too noisy. Perhaps drop it and phrase it like this instead? grep.threads:: Number of grep worker threads to use. If unset (or set to 0), to 0), 8 threads are used by default (for now). Agree. diff --git a/builtin/grep.c b/builtin/grep.c index 4229cae..e9aebab 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -24,11 +24,11 @@ static char const * const grep_usage[] = { NULL }; -static int use_threads = 1; +#define GREP_NUM_THREADS_DEFAULT 8 +static int num_threads = 0; Please do not initialize static to 0 (or NULL). Ok. @@ -267,6 +270,12 @@ static int grep_cmd_config(const char *var, const char *value, void *cb) int st = grep_config(var, value, cb); if (git_color_default_config(var, value, cb) < 0) st = -1; + + if (!strcmp(var, "grep.threads")) { + /* Sanity check of value will be perfomed later */ Hmm, is that a good design? A user may hear "invalid number of threads specified (-4)" later, but if that came from "grep.threads", especially when the user did not say "--threads=-4" from the command line, would she know to check her configuration file? If she had "grep.threads=Yes" in her configuration, we would helpfully tell her that 'Yes' given to grep.threads is not a valid integer. Shouldn't we do the same for '-4' given to grep.threads, too? if (!strcmp(var, "grep.threads")) { num_threads = git_config_int(var, value); if (num_threads < 0) die(_("invalid number of threads specified (%d) for %s"), num_threads, var); } perhaps. When I was doing this I looked at other code and saw that exact message "Invalid number of threads..." is used in other parts of git and is present in 'po' translations. Thus I decided to use exactly the same message in order not to create numerous almost similar localizations (which we should do if we use two different messages in different places). Do you think we need to create two more different messages (and translations, I can prepare Russian and French) and create two checks for cmd and config? @@ -817,14 +827,23 @@ int cmd_grep(int argc, const char **argv, const char *prefix) } #ifndef NO_PTHREADS - if (list.nr || cached || online_cpus() == 1) - use_threads = 0; + if (list.nr || cached || online_cpus() == 1 || show_in_pager) { + /* Can not multi-thread object lookup */ + num_threads = 0; Removing 'use_threads = 0' from an earlier part and moving the check to show_in_pager is a good idea, but it invalidates this comment. The earlier three (actually two and a half) are "cannot" cases, i.e. the object layer is not easily threaded without locking, and when you have a single core, you do not truly run multiple operations at the same time, but as [PATCH 2/2] does, threading in "grep" is not about CPU alone, so that is why I am demoting it to just a half ;-). But show_in_pager is "we do not want to", I think. In any case, this comment and "User didn't specify" below are not telling the reader something very much useful. You probably should remove them. Ok, will remove comments. + } + else if (num_threads == 0) { + /* User didn't specify value, or just wants default behavior */ + num_threads = GREP_NUM_THREADS_DEFAULT; + } + else if (num_threads < 0) { + die(_("invalid number of threads specified (%d)"), num_threads); + } Many unnecessary braces. I put braces to make code look more unified. I had to put braces here: + else if (num_threads == 0) { + /* User didn't specify value, or just wants default behavior */ In order to be able to place long comment above the line. And code like: if (a) do_smth(); else if (b) { do_smth1(); do_smth2(); } else do_smth3(); Looks rather ugly to me.
Re: [PATCH 1/2] Introduce grep threads param
Victor Leschuk writes: > Subject: Re: [PATCH 1/2] Introduce grep threads param I'll retitle this to something like grep: add --threads= option and grep.threads configuration while queuing (which I did for v7 earlier). > "git grep" can now be configured (or told from the command line) > how many threads to use when searching in the working tree files. > > Signed-off-by: Victor Leschuk > --- > ... > +grep.threads:: > + Number of grep worker threads. "Number of grep worker threads to use"? > + See `grep.threads` in linkgit:git-grep[1] for more information. > ... > +grep.threads:: > + Number of grep worker threads, use it to tune up performance on > + your machines. Leave it unset (or set to 0) for default behavior, > + which is using 8 threads for all systems. > + Default behavior may change in future versions > + to better suit hardware and circumstances. The last sentence is too noisy. Perhaps drop it and phrase it like this instead? grep.threads:: Number of grep worker threads to use. If unset (or set to 0), to 0), 8 threads are used by default (for now). > diff --git a/builtin/grep.c b/builtin/grep.c > index 4229cae..e9aebab 100644 > --- a/builtin/grep.c > +++ b/builtin/grep.c > @@ -24,11 +24,11 @@ static char const * const grep_usage[] = { > NULL > }; > > -static int use_threads = 1; > +#define GREP_NUM_THREADS_DEFAULT 8 > +static int num_threads = 0; Please do not initialize static to 0 (or NULL). > @@ -267,6 +270,12 @@ static int grep_cmd_config(const char *var, const char > *value, void *cb) > int st = grep_config(var, value, cb); > if (git_color_default_config(var, value, cb) < 0) > st = -1; > + > + if (!strcmp(var, "grep.threads")) { > + /* Sanity check of value will be perfomed later */ Hmm, is that a good design? A user may hear "invalid number of threads specified (-4)" later, but if that came from "grep.threads", especially when the user did not say "--threads=-4" from the command line, would she know to check her configuration file? If she had "grep.threads=Yes" in her configuration, we would helpfully tell her that 'Yes' given to grep.threads is not a valid integer. Shouldn't we do the same for '-4' given to grep.threads, too? if (!strcmp(var, "grep.threads")) { num_threads = git_config_int(var, value); if (num_threads < 0) die(_("invalid number of threads specified (%d) for %s"), num_threads, var); } perhaps. > @@ -817,14 +827,23 @@ int cmd_grep(int argc, const char **argv, const char > *prefix) > } > > #ifndef NO_PTHREADS > - if (list.nr || cached || online_cpus() == 1) > - use_threads = 0; > + if (list.nr || cached || online_cpus() == 1 || show_in_pager) { > + /* Can not multi-thread object lookup */ > + num_threads = 0; Removing 'use_threads = 0' from an earlier part and moving the check to show_in_pager is a good idea, but it invalidates this comment. The earlier three (actually two and a half) are "cannot" cases, i.e. the object layer is not easily threaded without locking, and when you have a single core, you do not truly run multiple operations at the same time, but as [PATCH 2/2] does, threading in "grep" is not about CPU alone, so that is why I am demoting it to just a half ;-). But show_in_pager is "we do not want to", I think. In any case, this comment and "User didn't specify" below are not telling the reader something very much useful. You probably should remove them. > + } > + else if (num_threads == 0) { > + /* User didn't specify value, or just wants default behavior */ > + num_threads = GREP_NUM_THREADS_DEFAULT; > + } > + else if (num_threads < 0) { > + die(_("invalid number of threads specified (%d)"), num_threads); > + } Many unnecessary braces. I think [2/2] and also moving the code to disable threading when show-in-pager mode should be separate "preparatory clean-up" patches before this main patch. I'll push out what I think this topic should be on 'pu' later today (with fixups suggested above squashed in); please check them and see what you think. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 10/10] dir: do not use untracked cache ident anymore
Christian Couder writes: > +/* > + * We used to save the location of the work tree and the kernel version, > + * but it was not a good idea, so we now just save an empty string. > + */ I do agree that storing the kernel version (or hostname or whatever specific to the machine) was not a good idea. I however suspect that you must save and check the location of the working tree, though, for correctness. If you use one GIT_DIR and GIT_WORK_TREE to do "git add" or whatever, and then use the same GIT_DIR but a different GIT_WORK_TREE, you should be able to notice that a directory D in the old GIT_WORK_TREE whose modification time you recorded is different from the directory D with the same name but in the new GIT_WORK_TREE, no? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/8] config: add core.untrackedCache
Ævar Arnfjörð Bjarmason writes: > The way the "config decides" patch series deals with this is that if > you have the UC information in the index and the configuration is set > to core.untrackedCache=false the UC will be removed from the index. > > Otherwise you would indeed easily end up with a stale cache,... Yeah, that's "correctness" thing; it goes without saying that it would be unacceptable if the series did not get this right. I still have a problem with the approach from "design cleanliness" point of view, primarily that the index already has a bit to tell us if the user already said that she wants to use the feature, but because you want to make config win, the code needs to always read the config to allow it to disable the feature in the index, just in case the data that is already in the index says otherwise, and the code has to keep doing that even after the data is removed [*1*]. Unfortunately, I do not think you can solve the "design cleanliness" problem unless you give up "we want /etc/gitconfig override". In any case I think we already have agreed to disagree on this point, so there is no use discussing it any longer from my side. I am not closing the door to this series, but I am not convinced, either. At least not yet. > Once this series is applied and git is shipped with it, existing > users that have set "git update-index --untracked-cache" will have > their UC feature disabled. Well, the fix to that is merely just a one-shot thing away, so it would not be too much of a hassle, no [*2*]? So I do not think it would be such a big issue. > We *could* make even that use-case work by detecting the legacy marker > for the UC in the index (the uname info), then we'd do a one-time "git > config --local core.untrackedCache true" and remove the marker. I do not think we want to go there---that would mean you would need to revamp the repository format version because the old tools would be now unusable on the index/config combo your version mucked with. [Footnote] *1* I also do not want to see that design pattern imitated and used in other parts of the system, and this gives a precedent for people to copy. *2* Yet, those who are broken by this series may say "it is unacceptable that we have to survey all existing repositories and selectively add the configuration to the ones that have enabled the feature before updating.", the same way you complain against the "index already knows" approach. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 07/10] dir: free untracked cache when removing it
Christian Couder writes: > Signed-off-by: Christian Couder > --- > dir.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/dir.c b/dir.c > index ffc0286..3b83cc0 100644 > --- a/dir.c > +++ b/dir.c > @@ -1954,6 +1954,7 @@ void add_untracked_cache(void) > > void remove_untracked_cache(void) > { > + free_untracked_cache(the_index.untracked); > the_index.untracked = NULL; > the_index.cache_changed |= UNTRACKED_CHANGED; > } Up to this point the series makes sense (again, I am not saying the remainder does not ;-)). But shouldn't this step, as a bugfix, appear a lot earlier in the series? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/8] config: add core.untrackedCache
On Tue, Dec 15, 2015 at 10:49 AM, Torsten Bögershausen wrote: > On 15.12.15 10:34, Christian Couder wrote: >> On Mon, Dec 14, 2015 at 10:30 PM, Junio C Hamano wrote: >>> Junio C Hamano writes: >>> >>> The primary reason why I do not like your "configuration decides" is >>> it will be a huge source of confusions and bugs. Imagine what >>> should happen in this sequence, and when should a stale cached >>> information be discarded? >>> >>> - the configuration is set to 'yes'. >>> - the index is updated and written by various commands. >>> - more work is done in the working tree without updating the index. >>> - the configuration is set to 'no'. >>> - more work is done in the working tree without updating the index. >>> - the configuration is set to 'yes'. >>> - more work is done in the working tree without updating the index. >>> - somebody asks "what untracked paths are there?" >> > >> As far as I understand the UC just stores the mtime of the directories >> in the working tree to avoid the need of lstat'ing all the files in >> the directories. > > This is what I understand: > UC stores the mtime of the directories in the working tree to avoid the need > opendir() readdir() closedir() to find new, yet untracked, files. > (including sub-directories) I think you are probably right too. In the v2 patch series I just sent, there is: +This feature works by recording the mtime of the working tree +directories and then omitting reading directories and stat calls +against files in those directories whose mtime hasn't changed. I hope it is better. Thanks, Christian. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 06/10] dir: add remove_untracked_cache()
Factor out code into remove_untracked_cache(), which will be used in a later commit. Signed-off-by: Christian Couder --- builtin/update-index.c | 3 +-- dir.c | 6 ++ dir.h | 1 + 3 files changed, 8 insertions(+), 2 deletions(-) diff --git a/builtin/update-index.c b/builtin/update-index.c index 5f009cf..4ca6d94 100644 --- a/builtin/update-index.c +++ b/builtin/update-index.c @@ -1127,8 +1127,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) if (verbose) printf(_("Untracked cache enabled\n")); } else if (untracked_cache == UC_DISABLE && the_index.untracked) { - the_index.untracked = NULL; - the_index.cache_changed |= UNTRACKED_CHANGED; + remove_untracked_cache(); if (verbose) printf(_("Untracked cache disabled\n")); } diff --git a/dir.c b/dir.c index 0f7e293..ffc0286 100644 --- a/dir.c +++ b/dir.c @@ -1952,6 +1952,12 @@ void add_untracked_cache(void) the_index.cache_changed |= UNTRACKED_CHANGED; } +void remove_untracked_cache(void) +{ + the_index.untracked = NULL; + the_index.cache_changed |= UNTRACKED_CHANGED; +} + static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *dir, int base_len, const struct pathspec *pathspec) diff --git a/dir.h b/dir.h index ee94c76..3e5114d 100644 --- a/dir.h +++ b/dir.h @@ -309,4 +309,5 @@ struct untracked_cache *read_untracked_extension(const void *data, unsigned long void write_untracked_extension(struct strbuf *out, struct untracked_cache *untracked); void add_untracked_ident(struct untracked_cache *); void add_untracked_cache(void); +void remove_untracked_cache(void); #endif -- 2.6.3.479.g8eb29d4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 08/10] config: add core.untrackedCache
When we know that mtime is fully supported by the environment, we might want the untracked cache to be always used by default without any mtime test or kernel version check being performed. Also when we know that mtime is not supported by the environment, for example because the repo is shared over a network file system, then we might want 'git update-index --untracked-cache' to fail immediately instead of performing tests (because it might work on some systems using the repo over the network file system but not others). The normal way to achieve the above in Git is to use a config variable. That's why this patch introduces "core.untrackedCache". To keep things simple, this variable is a bool which default to false. When "git status" is run, it now adds or removes the untracked cache in the index to respect the value of this variable. The job of `git update-index --[no-|force-]untracked-cache` was to add or remove the untracked cache from the index. This was a kind of configuration because this was persistant across git commands. To make this kind of configuration compatible with the new config variable, the simple thing to do, and what this patch does, is to make `git update-index --[no-|force-]untracked-cache` set or unset this config option. This new behavior is a backward incompatible change, but that is deliberate. The untracked cache feature has been experimental and is very unlikely used by beginners. When people will upgrade, this will remove any untracked cache they used unless they set "core.untrackedCache" before upgrading. This should be stated in the release notes. Also `--untracked-cache` used to check that the underlying operating system and file system change `st_mtime` field of a directory if files are added or deleted in that directory. But those tests take a long time and there is now `--test-untracked-cache` to perform them. That's why, to be more consistent with other git commands, this patch prevents `--untracked-cache` to perform tests, so that after this patch there is no difference any more between `--untracked-cache` and `--force-untracked-cache`. All the changes to `--[no-|force-]untracked-cache` make it possible to deprecate those options in the future. Signed-off-by: Christian Couder Signed-off-by: Ævar Arnfjörð Bjarmason --- Documentation/config.txt | 7 Documentation/git-update-index.txt | 58 +++--- builtin/update-index.c | 25 --- cache.h| 1 + config.c | 4 +++ contrib/completion/git-completion.bash | 1 + dir.c | 2 +- environment.c | 1 + t/t7063-status-untracked-cache.sh | 4 +-- wt-status.c| 9 ++ 10 files changed, 85 insertions(+), 27 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 2d06b11..94820eb 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -308,6 +308,13 @@ core.trustctime:: crawlers and some backup systems). See linkgit:git-update-index[1]. True by default. +core.untrackedCache:: + Determines if untracked cache will be enabled. Using + 'git update-index --[no-|force-]untracked-cache' will set + this variable. Before setting it to true, you should check + that mtime is working properly on your system. + See linkgit:git-update-index[1]. False by default. + core.checkStat:: Determines which stat fields to match between the index and work tree. The user can set this to 'default' or diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt index 0ff7396..cd02de4 100644 --- a/Documentation/git-update-index.txt +++ b/Documentation/git-update-index.txt @@ -173,24 +173,29 @@ may not support it yet. --untracked-cache:: --no-untracked-cache:: - Enable or disable untracked cache extension. This could speed - up for commands that involve determining untracked files such - as `git status`. The underlying operating system and file - system must change `st_mtime` field of a directory if files - are added or deleted in that directory. + Enable or disable untracked cache extension. Please use + `--test-untracked-cache` before enabling it. ++ +These options are mostly aliases for setting the `core.untrackedCache` +configuration variable to 'true' or 'false' in the local config file +(see linkgit:git-config[1]). You can equivalently just set those +configuration values directly. These options are just provided for +backwards compatibility with the older versions of Git where this was +the only way to enable or disable the untracked cache extension. --test-untracked-cache:: Only perform tests on the working directory to make sure untracked cache can be used. You have to manually enable - untracked cache usin
[PATCH v2 10/10] dir: do not use untracked cache ident anymore
A previous commit disabled the check to see if the untracked cache ident field was matching the current environment. So this field is now useless and we can remove some related code. We don't remove the ident field from "struct untracked_cache" as it would break compatibility with old indexes that already have an untracked cache with this field. Signed-off-by: Christian Couder --- dir.c | 38 +- dir.h | 2 +- 2 files changed, 6 insertions(+), 34 deletions(-) diff --git a/dir.c b/dir.c index 0b07ba7..94fba2a 100644 --- a/dir.c +++ b/dir.c @@ -1904,36 +1904,13 @@ static int treat_leading_path(struct dir_struct *dir, return rc; } -static const char *get_ident_string(void) -{ - static struct strbuf sb = STRBUF_INIT; - struct utsname uts; - - if (sb.len) - return sb.buf; - if (uname(&uts) < 0) - die_errno(_("failed to get kernel name and information")); - strbuf_addf(&sb, "Location %s, system %s %s %s", get_git_work_tree(), - uts.sysname, uts.release, uts.version); - return sb.buf; -} - -static int ident_in_untracked(const struct untracked_cache *uc) -{ - const char *end = uc->ident.buf + uc->ident.len; - const char *p = uc->ident.buf; - - for (p = uc->ident.buf; p < end; p += strlen(p) + 1) - if (!strcmp(p, get_ident_string())) - return 1; - return 0; -} - +/* + * We used to save the location of the work tree and the kernel version, + * but it was not a good idea, so we now just save an empty string. + */ void add_untracked_ident(struct untracked_cache *uc) { - if (ident_in_untracked(uc)) - return; - strbuf_addstr(&uc->ident, get_ident_string()); + strbuf_addstr(&uc->ident, ""); /* this strbuf contains a list of strings, save NUL too */ strbuf_addch(&uc->ident, 0); } @@ -2015,11 +1992,6 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d if (dir->exclude_list_group[EXC_CMDL].nr) return NULL; - if (use_untracked_cache != 1 && !ident_in_untracked(dir->untracked)) { - warning(_("Untracked cache is disabled on this system.")); - return NULL; - } - if (!dir->untracked->root) { const int len = sizeof(*dir->untracked->root); dir->untracked->root = xmalloc(len); diff --git a/dir.h b/dir.h index 3e5114d..1935b76 100644 --- a/dir.h +++ b/dir.h @@ -127,7 +127,7 @@ struct untracked_cache { struct sha1_stat ss_info_exclude; struct sha1_stat ss_excludes_file; const char *exclude_per_dir; - struct strbuf ident; + struct strbuf ident; /* unused now */ /* * dir_struct#flags must match dir_flags or the untracked * cache is ignored. -- 2.6.3.479.g8eb29d4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 04/10] update-index: move 'uc' var declaration
Signed-off-by: Christian Couder --- builtin/update-index.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/builtin/update-index.c b/builtin/update-index.c index e84674f..fffad79 100644 --- a/builtin/update-index.c +++ b/builtin/update-index.c @@ -1116,8 +1116,6 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) the_index.cache_changed |= SOMETHING_CHANGED; } if (untracked_cache > UC_DISABLE) { - struct untracked_cache *uc; - if (untracked_cache < UC_FORCE) { setup_work_tree(); if (!test_if_untracked_cache_is_supported()) @@ -1126,7 +1124,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) return 0; } if (!the_index.untracked) { - uc = xcalloc(1, sizeof(*uc)); + struct untracked_cache *uc = xcalloc(1, sizeof(*uc)); strbuf_init(&uc->ident, 100); uc->exclude_per_dir = ".gitignore"; /* should be the same flags used by git-status */ -- 2.6.3.479.g8eb29d4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 09/10] t7063: add tests for core.untrackedCache
Signed-off-by: Christian Couder --- t/t7063-status-untracked-cache.sh | 48 +-- 1 file changed, 46 insertions(+), 2 deletions(-) diff --git a/t/t7063-status-untracked-cache.sh b/t/t7063-status-untracked-cache.sh index 253160a..f0af41c 100755 --- a/t/t7063-status-untracked-cache.sh +++ b/t/t7063-status-untracked-cache.sh @@ -18,6 +18,10 @@ if ! test_have_prereq UNTRACKED_CACHE; then test_done fi +test_expect_success 'core.untrackedCache is unset' ' + test_must_fail git config --get core.untrackedCache +' + test_expect_success 'setup' ' git init worktree && cd worktree && @@ -28,6 +32,11 @@ test_expect_success 'setup' ' git update-index --untracked-cache ' +test_expect_success 'core.untrackedCache is true' ' + UC=$(git config core.untrackedCache) && + test "$UC" = "true" +' + test_expect_success 'untracked cache is empty' ' test-dump-untracked-cache >../actual && cat >../expect <../actual && - cat >../expect <../expect-from-test-dump <../actual && + echo "no untracked cache" >../expect && + test_cmp ../expect ../actual +' + +test_expect_success 'git status does not change anything' ' + git status && + test-dump-untracked-cache >../actual && + test_cmp ../expect ../actual && + UC=$(git config core.untrackedCache) && + test "$UC" = "false" +' + +test_expect_success 'setting core.untrackedCache and using git status creates the cache' ' + git config core.untrackedCache true && + test-dump-untracked-cache >../actual && + test_cmp ../expect ../actual && + git status && + test-dump-untracked-cache >../actual && + test_cmp ../expect-from-test-dump ../actual +' + +test_expect_success 'unsetting core.untrackedCache and using git status removes the cache' ' + git config --unset core.untrackedCache && + test-dump-untracked-cache >../actual && + test_cmp ../expect-from-test-dump ../actual && + git status && + test-dump-untracked-cache >../actual && + test_cmp ../expect ../actual +' + test_done -- 2.6.3.479.g8eb29d4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/8] config: add core.untrackedCache
On Tue, Dec 15, 2015 at 11:02 AM, Duy Nguyen wrote: > On Tue, Dec 15, 2015 at 4:34 PM, Christian Couder > wrote: >> On Mon, Dec 14, 2015 at 10:30 PM, Junio C Hamano wrote: >>> Junio C Hamano writes: >>> >>> The primary reason why I do not like your "configuration decides" is >>> it will be a huge source of confusions and bugs. Imagine what >>> should happen in this sequence, and when should a stale cached >>> information be discarded? >>> >>> - the configuration is set to 'yes'. >>> - the index is updated and written by various commands. >>> - more work is done in the working tree without updating the index. >>> - the configuration is set to 'no'. >>> - more work is done in the working tree without updating the index. >>> - the configuration is set to 'yes'. >>> - more work is done in the working tree without updating the index. >>> - somebody asks "what untracked paths are there?" >> >> As far as I understand the UC just stores the mtime of the directories >> in the working tree to avoid the need of lstat'ing all the files in >> the directories. >> >> When somebody asks "what untracked paths are there", if the UC has not >> been discarded when the configuration was set to no, then git will >> just ask for the mtimes of the directories in the working tree and >> compare them with what is in the UC. >> >> I don't see how it can create confusion and bugs, as the work done in >> the working tree should anyway have changed the mtime of the >> directories where work has been done. > > Any operation that can add or remove an entry from the index may lead > to UC update. For example, if file "foo" is tracked, then the user > does "git rm --cached foo", "foo" may become either untracked or > ignored. So if you disable UC while doing this removal, then re-enable > UC again, "git-status" may show incorrect output. So, as long as UC > extension exists in the index, it must be updated, or it may become > outdated and useless. When UC is disabled, it is removed from the index, and when it is (re-)enabled, it is recreated, so I don't think that your example applies to the code in this patch. Thanks anyway for this insight, Christian. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 03/10] update-index: add untracked cache notifications
Attempting to flip the untracked-cache feature on for a random index file with cd /random/unrelated/place git --git-dir=/somewhere/else/.git update-index --untracked-cache would not work as you might expect. Because flipping the feature on in the index also records the location of the corresponding working tree (/random/unrelated/place in the above example), when the index is subsequently used to keep track of files in the working tree in /somewhere/else, the feature is disabled. With this patch "git update-index --[test-]untracked-cache" tells the user in which directory tests are performed. This makes it easy to spot any problem. Also in verbose mode, let's tell the user when the cache is enabled or disabled. Signed-off-by: Christian Couder --- builtin/update-index.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/builtin/update-index.c b/builtin/update-index.c index e747a6c..e84674f 100644 --- a/builtin/update-index.c +++ b/builtin/update-index.c @@ -130,7 +130,7 @@ static int test_if_untracked_cache_is_supported(void) if (!mkdtemp(mtime_dir.buf)) die_errno("Could not make temporary directory"); - fprintf(stderr, _("Testing ")); + fprintf(stderr, _("Testing mtime in '%s' "), xgetcwd()); atexit(remove_test_directory); xstat_mtime_dir(&st); fill_stat_data(&base, &st); @@ -1135,9 +1135,13 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) } add_untracked_ident(the_index.untracked); the_index.cache_changed |= UNTRACKED_CHANGED; + if (verbose) + printf(_("Untracked cache enabled\n")); } else if (untracked_cache == UC_DISABLE && the_index.untracked) { the_index.untracked = NULL; the_index.cache_changed |= UNTRACKED_CHANGED; + if (verbose) + printf(_("Untracked cache disabled\n")); } if (active_cache_changed) { -- 2.6.3.479.g8eb29d4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 05/10] dir: add add_untracked_cache()
Factor out code into add_untracked_cache(), which will be used in a later commit. Signed-off-by: Christian Couder --- builtin/update-index.c | 11 +-- dir.c | 14 ++ dir.h | 1 + 3 files changed, 16 insertions(+), 10 deletions(-) diff --git a/builtin/update-index.c b/builtin/update-index.c index fffad79..5f009cf 100644 --- a/builtin/update-index.c +++ b/builtin/update-index.c @@ -1123,16 +1123,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) if (untracked_cache == UC_TEST) return 0; } - if (!the_index.untracked) { - struct untracked_cache *uc = xcalloc(1, sizeof(*uc)); - strbuf_init(&uc->ident, 100); - uc->exclude_per_dir = ".gitignore"; - /* should be the same flags used by git-status */ - uc->dir_flags = DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES; - the_index.untracked = uc; - } - add_untracked_ident(the_index.untracked); - the_index.cache_changed |= UNTRACKED_CHANGED; + add_untracked_cache(); if (verbose) printf(_("Untracked cache enabled\n")); } else if (untracked_cache == UC_DISABLE && the_index.untracked) { diff --git a/dir.c b/dir.c index d2a8f06..0f7e293 100644 --- a/dir.c +++ b/dir.c @@ -1938,6 +1938,20 @@ void add_untracked_ident(struct untracked_cache *uc) strbuf_addch(&uc->ident, 0); } +void add_untracked_cache(void) +{ + if (!the_index.untracked) { + struct untracked_cache *uc = xcalloc(1, sizeof(*uc)); + strbuf_init(&uc->ident, 100); + uc->exclude_per_dir = ".gitignore"; + /* should be the same flags used by git-status */ + uc->dir_flags = DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES; + the_index.untracked = uc; + } + add_untracked_ident(the_index.untracked); + the_index.cache_changed |= UNTRACKED_CHANGED; +} + static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *dir, int base_len, const struct pathspec *pathspec) diff --git a/dir.h b/dir.h index 7b5855d..ee94c76 100644 --- a/dir.h +++ b/dir.h @@ -308,4 +308,5 @@ void free_untracked_cache(struct untracked_cache *); struct untracked_cache *read_untracked_extension(const void *data, unsigned long sz); void write_untracked_extension(struct strbuf *out, struct untracked_cache *untracked); void add_untracked_ident(struct untracked_cache *); +void add_untracked_cache(void); #endif -- 2.6.3.479.g8eb29d4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 07/10] dir: free untracked cache when removing it
Signed-off-by: Christian Couder --- dir.c | 1 + 1 file changed, 1 insertion(+) diff --git a/dir.c b/dir.c index ffc0286..3b83cc0 100644 --- a/dir.c +++ b/dir.c @@ -1954,6 +1954,7 @@ void add_untracked_cache(void) void remove_untracked_cache(void) { + free_untracked_cache(the_index.untracked); the_index.untracked = NULL; the_index.cache_changed |= UNTRACKED_CHANGED; } -- 2.6.3.479.g8eb29d4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 02/10] update-index: add --test-untracked-cache
It is nice to just be able to test if untracked cache is supported without enabling it. Signed-off-by: Christian Couder --- Documentation/git-update-index.txt | 9 - builtin/update-index.c | 5 + 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt index 3df9c26..0ff7396 100644 --- a/Documentation/git-update-index.txt +++ b/Documentation/git-update-index.txt @@ -17,7 +17,7 @@ SYNOPSIS [--[no-]assume-unchanged] [--[no-]skip-worktree] [--ignore-submodules] -[--[no-|force-]untracked-cache] +[--[no-|test-|force-]untracked-cache] [--really-refresh] [--unresolve] [--again | -g] [--info-only] [--index-info] [-z] [--stdin] [--index-version ] @@ -179,6 +179,13 @@ may not support it yet. system must change `st_mtime` field of a directory if files are added or deleted in that directory. +--test-untracked-cache:: + Only perform tests on the working directory to make sure + untracked cache can be used. You have to manually enable + untracked cache using `--force-untracked-cache` (or + `--untracked-cache` but this will run the tests again) + afterwards if you really want to use it. + --force-untracked-cache:: For safety, `--untracked-cache` performs tests on the working directory to make sure untracked cache can be used. These diff --git a/builtin/update-index.c b/builtin/update-index.c index 2430a68..e747a6c 100644 --- a/builtin/update-index.c +++ b/builtin/update-index.c @@ -40,6 +40,7 @@ enum uc_mode { UC_UNSPECIFIED = -1, UC_DISABLE = 0, UC_ENABLE, + UC_TEST, UC_FORCE }; @@ -1004,6 +1005,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) N_("enable or disable split index")), OPT_BOOL(0, "untracked-cache", &untracked_cache, N_("enable/disable untracked cache")), + OPT_SET_INT(0, "test-untracked-cache", &untracked_cache, + N_("test if the filesystem supports untracked cache"), UC_TEST), OPT_SET_INT(0, "force-untracked-cache", &untracked_cache, N_("enable untracked cache without testing the filesystem"), UC_FORCE), OPT_END() @@ -1119,6 +1122,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) setup_work_tree(); if (!test_if_untracked_cache_is_supported()) return 1; + if (untracked_cache == UC_TEST) + return 0; } if (!the_index.untracked) { uc = xcalloc(1, sizeof(*uc)); -- 2.6.3.479.g8eb29d4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 01/10] update-index: use enum for untracked cache options
Signed-off-by: Christian Couder --- builtin/update-index.c | 18 +- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/builtin/update-index.c b/builtin/update-index.c index 7431938..2430a68 100644 --- a/builtin/update-index.c +++ b/builtin/update-index.c @@ -35,6 +35,14 @@ static int mark_skip_worktree_only; #define UNMARK_FLAG 2 static struct strbuf mtime_dir = STRBUF_INIT; +/* Untracked cache mode */ +enum uc_mode { + UC_UNSPECIFIED = -1, + UC_DISABLE = 0, + UC_ENABLE, + UC_FORCE +}; + __attribute__((format (printf, 1, 2))) static void report(const char *fmt, ...) { @@ -902,7 +910,7 @@ static int reupdate_callback(struct parse_opt_ctx_t *ctx, int cmd_update_index(int argc, const char **argv, const char *prefix) { int newfd, entries, has_errors = 0, line_termination = '\n'; - int untracked_cache = -1; + enum uc_mode untracked_cache = UC_UNSPECIFIED; int read_from_stdin = 0; int prefix_length = prefix ? strlen(prefix) : 0; int preferred_index_format = 0; @@ -997,7 +1005,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) OPT_BOOL(0, "untracked-cache", &untracked_cache, N_("enable/disable untracked cache")), OPT_SET_INT(0, "force-untracked-cache", &untracked_cache, - N_("enable untracked cache without testing the filesystem"), 2), + N_("enable untracked cache without testing the filesystem"), UC_FORCE), OPT_END() }; @@ -1104,10 +1112,10 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) the_index.split_index = NULL; the_index.cache_changed |= SOMETHING_CHANGED; } - if (untracked_cache > 0) { + if (untracked_cache > UC_DISABLE) { struct untracked_cache *uc; - if (untracked_cache < 2) { + if (untracked_cache < UC_FORCE) { setup_work_tree(); if (!test_if_untracked_cache_is_supported()) return 1; @@ -1122,7 +1130,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) } add_untracked_ident(the_index.untracked); the_index.cache_changed |= UNTRACKED_CHANGED; - } else if (!untracked_cache && the_index.untracked) { + } else if (untracked_cache == UC_DISABLE && the_index.untracked) { the_index.untracked = NULL; the_index.cache_changed |= UNTRACKED_CHANGED; } -- 2.6.3.479.g8eb29d4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 00/10] Untracked cache improvements
Here is a new version of a patch series to improve the untracked cache feature. This v2 still implements core.untrackedCache as a simple bool config variable. When it's true, Git should always try to use the untracked cache, and when false, Git should never use it. Patchs 1/10 to 3/10 add some features that are missing. Patch 3/10 has been moved after the two other patches and has been changed a bit according to Duy's and Junio's suggestions. In patch 2/10 the enum names have been changed as discussed with Junio. Patchs 4/10, 5/10 and 6/10, which have not been changed, are some refactoring to prepare for patch 8/10 which implements core.untrackedCache. Patch 7/10 is a small bug fix suggested by Junio. Patch 8/10, which adds core.untrackedCache, contains many documentation and commit message improvements, some by AEvar. Patch 9/10 has not been changed. Patch 10/10 is new and removes code that is now useless. So the changes compared to v1 are mostly small updates, and patchs 7/10 and 10/10. The patch series is also available there: https://github.com/chriscool/git/tree/uc-notifs25 Thanks to the reviewers and helpers. Christian Couder (10): update-index: use enum for untracked cache options update-index: add --test-untracked-cache update-index: add untracked cache notifications update-index: move 'uc' var declaration dir: add add_untracked_cache() dir: add remove_untracked_cache() dir: free untracked cache when removing it config: add core.untrackedCache t7063: add tests for core.untrackedCache dir: do not use untracked cache ident anymore Documentation/config.txt | 7 Documentation/git-update-index.txt | 61 -- builtin/update-index.c | 54 +- cache.h| 1 + config.c | 4 +++ contrib/completion/git-completion.bash | 1 + dir.c | 53 + dir.h | 4 ++- environment.c | 1 + t/t7063-status-untracked-cache.sh | 52 ++--- wt-status.c| 9 + 11 files changed, 178 insertions(+), 69 deletions(-) -- 2.6.3.479.g8eb29d4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] Introduce grep threads param
"git grep" can now be configured (or told from the command line) how many threads to use when searching in the working tree files. Signed-off-by: Victor Leschuk --- Documentation/config.txt | 4 +++ Documentation/git-grep.txt | 12 + builtin/grep.c | 49 +++--- contrib/completion/git-completion.bash | 1 + 4 files changed, 51 insertions(+), 15 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 2d06b11..cbf4071 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1450,6 +1450,10 @@ grep.extendedRegexp:: option is ignored when the 'grep.patternType' option is set to a value other than 'default'. +grep.threads:: + Number of grep worker threads. + See `grep.threads` in linkgit:git-grep[1] for more information. + gpg.program:: Use this custom program instead of "gpg" found on $PATH when making or verifying a PGP signature. The program must support the diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt index 4a44d6d..25e6dc5 100644 --- a/Documentation/git-grep.txt +++ b/Documentation/git-grep.txt @@ -23,6 +23,7 @@ SYNOPSIS [--break] [--heading] [-p | --show-function] [-A ] [-B ] [-C ] [-W | --function-context] + [--threads ] [-f ] [-e] [--and|--or|--not|(|)|-e ...] [ [--[no-]exclude-standard] [--cached | --no-index | --untracked] | ...] @@ -53,6 +54,13 @@ grep.extendedRegexp:: option is ignored when the 'grep.patternType' option is set to a value other than 'default'. +grep.threads:: + Number of grep worker threads, use it to tune up performance on + your machines. Leave it unset (or set to 0) for default behavior, + which is using 8 threads for all systems. + Default behavior may change in future versions + to better suit hardware and circumstances. + grep.fullName:: If set to true, enable '--full-name' option by default. @@ -227,6 +235,10 @@ OPTIONS effectively showing the whole function in which the match was found. +--threads :: + Number of grep worker threads. + See `grep.threads` in 'CONFIGURATION' for more information. + -f :: Read patterns from , one per line. diff --git a/builtin/grep.c b/builtin/grep.c index 4229cae..e9aebab 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -24,11 +24,11 @@ static char const * const grep_usage[] = { NULL }; -static int use_threads = 1; +#define GREP_NUM_THREADS_DEFAULT 8 +static int num_threads = 0; #ifndef NO_PTHREADS -#define THREADS 8 -static pthread_t threads[THREADS]; +static pthread_t *threads; /* We use one producer thread and THREADS consumer * threads. The producer adds struct work_items to 'todo' and the @@ -63,13 +63,13 @@ static pthread_mutex_t grep_mutex; static inline void grep_lock(void) { - if (use_threads) + if (num_threads) pthread_mutex_lock(&grep_mutex); } static inline void grep_unlock(void) { - if (use_threads) + if (num_threads) pthread_mutex_unlock(&grep_mutex); } @@ -206,7 +206,8 @@ static void start_threads(struct grep_opt *opt) strbuf_init(&todo[i].out, 0); } - for (i = 0; i < ARRAY_SIZE(threads); i++) { + threads = xcalloc(num_threads, sizeof(*threads)); + for (i = 0; i < num_threads; i++) { int err; struct grep_opt *o = grep_opt_dup(opt); o->output = strbuf_out; @@ -238,12 +239,14 @@ static int wait_all(void) pthread_cond_broadcast(&cond_add); grep_unlock(); - for (i = 0; i < ARRAY_SIZE(threads); i++) { + for (i = 0; i < num_threads; i++) { void *h; pthread_join(threads[i], &h); hit |= (int) (intptr_t) h; } + free(threads); + pthread_mutex_destroy(&grep_mutex); pthread_mutex_destroy(&grep_read_mutex); pthread_mutex_destroy(&grep_attr_mutex); @@ -267,6 +270,12 @@ static int grep_cmd_config(const char *var, const char *value, void *cb) int st = grep_config(var, value, cb); if (git_color_default_config(var, value, cb) < 0) st = -1; + + if (!strcmp(var, "grep.threads")) { + /* Sanity check of value will be perfomed later */ + num_threads = git_config_int(var, value); + } + return st; } @@ -294,7 +303,7 @@ static int grep_sha1(struct grep_opt *opt, const unsigned char *sha1, } #ifndef NO_PTHREADS - if (use_threads) { + if (num_threads) { add_work(opt, GREP_SOURCE_SHA1, pathbuf.buf, path, sha1); strbuf_release(&pathbuf); return 0; @@ -323,7 +332,7 @@ static int grep_file(struct grep_opt *opt, const char *filename)
[PATCH v8 0/2] Add git-grep threads param
Introducing v8 of git-grep threads param patch. Patch is now split in 2 parts - 1/2 is actually renewed v6 version (see list of changes below), 2/2 removes dependency on online_cpus() - as we discussed with Eric this is rather significant change in default behavior and should be placed into separate patch. Here is list of changes since v6 ($gmane/281160): * Fixed broken t7811: moved all threads_num setup to 1 place (for -O option it was in wrong place) * Fixed 'invalid number of threads' message so that it could be translated * Got rid of grep_threads_config() - its too trivial to be separate function * Fixed xcalloc() args (sizeof(pthread_t) -> sizeof(*threads)) to correspond to general git style * Improved commit message (in 2/2) to explain why online_cpus() is now not used in threads_num setup * The full param documentation was moved into single place (grep.threads description in git-grep.txt) and is referenced from other places. Also made few language improvements in documentation. * Style improvements: splitted too long lines Victor Leschuk (2): "git grep" can now be configured (or told from the command line) how many threads to use when searching in the working tree files. Number of threads now doesn't depend on online_cpus(), e.g. if specific number is not configured GREP_NUM_THREADS_DEFAULT (8) threads will be used even on 1-core CPU. Documentation/config.txt | 4 +++ Documentation/git-grep.txt | 12 + builtin/grep.c | 49 +++--- contrib/completion/git-completion.bash | 1 + 4 files changed, 51 insertions(+), 15 deletions(-) -- 2.6.3.369.g3e7f205.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] Get rid of online_cpus() when determining grep threads num
Number of threads now doesn't depend on online_cpus(), e.g. if specific number is not configured GREP_NUM_THREADS_DEFAULT (8) threads will be used even on 1-core CPU. Reason: multithreading can improve performance even on single core machines as IO is also a major factor here. Using multiple threads can significantly boost grep performance when working on slow filesystems (or repo isn't cached) or through network (for example repo is located on NFS). Signed-off-by: Victor Leschuk --- builtin/grep.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/builtin/grep.c b/builtin/grep.c index e9aebab..1315905 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -827,7 +827,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix) } #ifndef NO_PTHREADS - if (list.nr || cached || online_cpus() == 1 || show_in_pager) { + if (list.nr || cached || show_in_pager) { /* Can not multi-thread object lookup */ num_threads = 0; } -- 2.6.3.369.g3e7f205.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCHv3] Makefile: add missing phony target
Add some missing phony target to Makefile. Signed-off-by: Elia Pinto Helped-by: Matthieu Moy --- This is the third version of this patch. Compared to the previous I have added only the missing phony target as suggested by Matthieu Moy Makefile | 9 + 1 file changed, 9 insertions(+) diff --git a/Makefile b/Makefile index fd19b54..fc2f1ab 100644 --- a/Makefile +++ b/Makefile @@ -2025,6 +2025,7 @@ $(VCSSVN_LIB): $(VCSSVN_OBJS) export DEFAULT_EDITOR DEFAULT_PAGER +.PHONY: doc man html info pdf doc: $(MAKE) -C Documentation all @@ -2068,6 +2069,7 @@ po/git.pot: $(GENERATED_H) FORCE $(LOCALIZED_PERL) mv $@+ $@ +.PHONY: pot pot: po/git.pot POFILES := $(wildcard po/*.po) @@ -2277,6 +2279,7 @@ mergetools_instdir_SQ = $(subst ','\'',$(mergetools_instdir)) install_bindir_programs := $(patsubst %,%$X,$(BINDIR_PROGRAMS_NEED_X)) $(BINDIR_PROGRAMS_NO_X) +.PHONY: profile-install profile-fast-install profile-install: profile $(MAKE) install @@ -2343,6 +2346,8 @@ endif done && \ ./check_bindir "z$$bindir" "z$$execdir" "$$bindir/git-add$X" +.PHONY: install-gitweb install-doc install-man install-html install-info install-pdf +.PHONY: quick-install-doc quick-install-man quick-install-html install-gitweb: $(MAKE) -C gitweb install @@ -2402,6 +2407,7 @@ rpm: dist htmldocs = git-htmldocs-$(GIT_VERSION) manpages = git-manpages-$(GIT_VERSION) +.PHONY: dist-doc distclean dist-doc: $(RM) -r .doc-tmp-dir mkdir .doc-tmp-dir @@ -2470,6 +2476,8 @@ ALL_COMMANDS += git ALL_COMMANDS += gitk ALL_COMMANDS += gitweb ALL_COMMANDS += git-gui git-citool + +.PHONY: check-docs check-docs:: @(for v in $(ALL_COMMANDS); \ do \ @@ -2514,6 +2522,7 @@ check-builtins:: ### Test suite coverage testing # .PHONY: coverage coverage-clean coverage-compile coverage-test coverage-report +.PHONY: coverage-untested-functions cover_db cover_db_html .PHONY: coverage-clean-results coverage: -- 2.5.0 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Announcing Git for Windows 2.6.4
Hi Lars, [since I really do not like to repeat myself, I will re-Cc: the lists] On Tue, 15 Dec 2015, Lars Schneider wrote: > Junio always prefixes his announcements with "[ANNOUNCE] " in the > subject. Would that be an option for you announcements, too? If not then > I just need to update my filters :-) I actually meant to mention that I worked really hard on a script to automate all the steps of the Git for Windows releases: https://github.com/git-for-windows/build-extra/blob/master/please.sh This includes the generation of the announcement mail, of course. So your wish was addressed by: https://github.com/git-for-windows/build-extra/commit/bf79be45 Please feel free to mention your future wishes in the form of Pull Requests ;-) Ciao, Dscho -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/8] config: add core.untrackedCache
On Tue, Dec 15, 2015 at 2:04 PM, Ævar Arnfjörð Bjarmason wrote: > On Mon, Dec 14, 2015 at 8:44 PM, Junio C Hamano wrote: > > I'm replying to & quoting from two E-Mails of yours at once here for > clarity & less noise. I'm working wich Christian on getting this > integrated, and we both thought it would be good to have some fresh > input on the matter from me. > >> Christian Couder writes: > >>> If you want only some repos to use the UC, you will set >>> core.untrackedCache in the repo config. Then after cloning such a >>> repo, you will copy the config file, and this will not be enough to >>> enable the UC. >> >> Surely. "Does this index file keeps track of the untracked files' >> states?" is a property of the index. Cloning does not propagate the >> configuration and copying or not copying is irrelevant. If you want >> to enable, running "update-index --untracked-cache" is a way to do >> so. I cannot see what's so hard about it. >> >>> And if you have set core.untrackedCache in the global config when you >>> clone, UC is enabled, but if you have just set it in the repo config >>> after the clone, it is not enabled. >> >> That's fine. In your patch series, if you set it in the global, you >> will get the cache in the new one. With the cleaned-up semantics I >> suggested, the same thing will happen. >> >> And with the cleaned-up semantics, the configuration is *ONLY* used >> to give the *DEFAULT* before other things happen, i.e. creation of >> the index file for the first time. Because the configuration is >> only the default, an explicit "update-index --[no-]untracked-cache" >> will defeat it, just like any other config/option interaction. > > As you know Christian is working on this for Booking.com to integrate > features we find useful into git.git in such a way that we don't have > to maintain some internal fork of Git. > > What we're trying to do, and what a lot of other big deployments of > Git elsewhere would also find useful, is to ship a default sensible > configuration for all users on the system in /etc/gitconfig. > > I'd like to be able to easily enable some feature that aids Git > performance globally on our thousands of machines and for our hundreds > of users by just tweaking something in puppet to change > /etc/gitconfig, and more importantly if that change ends up being bad > reverting that config in /etc/gitconfig should undo the change. > > It's an unacceptable level of complexity for system-level automation > to have to scour the filesystem for existing Git repositories and run > "git update-index" on each of them, that's why we're submitting > patches to make this a config option, so we can simply flip a flag in > /etc/gitconfig. > > It's also unacceptable to have the config simply provide the default > which'll be frozen either at clone time or after an initial "git > status". > > Let's say I ship a /etc/gitconfig that says "new clones should use the > untracked cache". Now I roll that out across our fleet of machines and > it turns out the morning after that the feature doesn't work properly > for whatever reason. If it's just a "default until clone or status" > type of thing even if I revert the configuration a lot of users & > their repositories in the wild will still be broken, and will have to > be manually fixed. Which again leads to the scouring the filesystem > problem. > > So that gives some more context for why we're pushing for this change. > I believe this feature breaks no existing use-case and just supports > new ones, and I think that your objections to it are based on a simple > misunderstanding as will become apparent if you read on below. > >> The biggest issue I had with your patch series, IIRC, is that >> configuration will defeat the command line option. > > I think it's a moot point to focus on configuration v.s. command-line > option. The important question is whether or not this feature can > still be configured on a repo-local basis with this series as before. > That's still the case since --local git configuration overrides > --global and --system, so users who want to enable/disable this > per-repo still can. > >>> Shouldn't it be nice if they could just enable core.untrackedCache in >>> the global config files without having to also cd into every repo and >>> use "git update-index --untracked-cache" there? >> >> NO. It is bad to change the behaviour behind users' back. > > I'm not quite sure what the objection here is exactly. If you're a > normal user you can enable/disable this per-repo just like you can > now, and enable/disable it for all your repos in ~/.gitconfig. > > If you mean that the user's configuration shouldn't be changed by the > global config in /etc/gitconfig I do think that's a moot point. If > you're a user on a system where I have root and I want to change your > Git configuration I'm going to be able to do that whatever the > mechanism is. > > That's indeed that's what we're doing to enable this at Booking.com > currently, we
Re: [PATCH 7/8] config: add core.untrackedCache
On Mon, Dec 14, 2015 at 8:44 PM, Junio C Hamano wrote: I'm replying to & quoting from two E-Mails of yours at once here for clarity & less noise. I'm working wich Christian on getting this integrated, and we both thought it would be good to have some fresh input on the matter from me. > Christian Couder writes: >> If you want only some repos to use the UC, you will set >> core.untrackedCache in the repo config. Then after cloning such a >> repo, you will copy the config file, and this will not be enough to >> enable the UC. > > Surely. "Does this index file keeps track of the untracked files' > states?" is a property of the index. Cloning does not propagate the > configuration and copying or not copying is irrelevant. If you want > to enable, running "update-index --untracked-cache" is a way to do > so. I cannot see what's so hard about it. > >> And if you have set core.untrackedCache in the global config when you >> clone, UC is enabled, but if you have just set it in the repo config >> after the clone, it is not enabled. > > That's fine. In your patch series, if you set it in the global, you > will get the cache in the new one. With the cleaned-up semantics I > suggested, the same thing will happen. > > And with the cleaned-up semantics, the configuration is *ONLY* used > to give the *DEFAULT* before other things happen, i.e. creation of > the index file for the first time. Because the configuration is > only the default, an explicit "update-index --[no-]untracked-cache" > will defeat it, just like any other config/option interaction. As you know Christian is working on this for Booking.com to integrate features we find useful into git.git in such a way that we don't have to maintain some internal fork of Git. What we're trying to do, and what a lot of other big deployments of Git elsewhere would also find useful, is to ship a default sensible configuration for all users on the system in /etc/gitconfig. I'd like to be able to easily enable some feature that aids Git performance globally on our thousands of machines and for our hundreds of users by just tweaking something in puppet to change /etc/gitconfig, and more importantly if that change ends up being bad reverting that config in /etc/gitconfig should undo the change. It's an unacceptable level of complexity for system-level automation to have to scour the filesystem for existing Git repositories and run "git update-index" on each of them, that's why we're submitting patches to make this a config option, so we can simply flip a flag in /etc/gitconfig. It's also unacceptable to have the config simply provide the default which'll be frozen either at clone time or after an initial "git status". Let's say I ship a /etc/gitconfig that says "new clones should use the untracked cache". Now I roll that out across our fleet of machines and it turns out the morning after that the feature doesn't work properly for whatever reason. If it's just a "default until clone or status" type of thing even if I revert the configuration a lot of users & their repositories in the wild will still be broken, and will have to be manually fixed. Which again leads to the scouring the filesystem problem. So that gives some more context for why we're pushing for this change. I believe this feature breaks no existing use-case and just supports new ones, and I think that your objections to it are based on a simple misunderstanding as will become apparent if you read on below. > The biggest issue I had with your patch series, IIRC, is that > configuration will defeat the command line option. I think it's a moot point to focus on configuration v.s. command-line option. The important question is whether or not this feature can still be configured on a repo-local basis with this series as before. That's still the case since --local git configuration overrides --global and --system, so users who want to enable/disable this per-repo still can. >> Shouldn't it be nice if they could just enable core.untrackedCache in >> the global config files without having to also cd into every repo and >> use "git update-index --untracked-cache" there? > > NO. It is bad to change the behaviour behind users' back. I'm not quite sure what the objection here is exactly. If you're a normal user you can enable/disable this per-repo just like you can now, and enable/disable it for all your repos in ~/.gitconfig. If you mean that the user's configuration shouldn't be changed by the global config in /etc/gitconfig I do think that's a moot point. If you're a user on a system where I have root and I want to change your Git configuration I'm going to be able to do that whatever the mechanism is. That's indeed that's what we're doing to enable this at Booking.com currently, we run a job to find some limited set of common checkouts and run "git update-index" for users as root. The problem with that is that it's needlessly complex, hence this series. But in case you mean disabling the config
Re: Where does http.sslcainfo get set in Windows (2.6.3)?
Hi, On Mon, 14 Dec 2015, Lars Schneider wrote: > try to look here: > C:\Users\All Users\Git\config The location should be C:\ProgramData\Git\config for the Git configuration shared between all users and all Git implementations on Windows. Maybe you are running Windows XP, where C:\ProgramData does not exist, and where %ALLUSERSPROFILE%\Application Data\Git\config plays that role. Ciao, Johannes -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: update index mtime etc metadata
On Tue, Dec 15, 2015 at 3:44 AM, Joey Hess wrote: > Is there any available plumbing that can change the mtime etc metadata > that is recorded in the index for a file, to user-provided values? Or, > to force the current file stat metadata to be updated in the index? I don't think there is a way. We probably should improve ls-files and update-index to examine and update basically everything in the index.. But so far, nothing yet. > I know, git update-index --refresh, but I have a case where that's too > expensive. I'm using smudge filters; I know that the cleaned version of > the file will be unchanged from what's in the index now and only the > stat metadata will change, and so I want to avoid > git update-index --refresh running the clean filter, which can > be quite expensive for a large file. > > At the moment I don't see a way to do it other than using eg libgit2 to > update the appropriate fields in the index structure. Yeah. I see libgit2 has a haskell binding, probably best for you. -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Announcing Git for Windows 2.6.4
Dear Git users, It is my pleasure to announce that Git for Windows 2.6.4 is available from: https://git-for-windows.github.io/ Changes since Git for Windows v2.6.3 (November 10th 2015) New Features ??? Comes with Git v2.6.4. ??? Also available as .tar.bz2 packages (you need an MSys2/ Cygwin-compatible unpacker to recreate the symbolic links correctly). Bug Fixes ??? Git for Windows v2.6.3's installer failed to elevate privileges automatically (reported three times, making it a charm), and as a consequence Git for Windows 2.6.3 was frequently installed per-user by mistake ??? The bug where SHELL_PATH had spaces and that was reported multiple times has been fixed. ??? An additional work-around from upstream Git for SHELL_PATH containing spaces (fixing problems with interactive rebase's exec command has been applied. Filename | SHA-256 | --- Git-2.6.4-64-bit.exe | 2deab47a0e4a212576f6822db532269d8a69ec47345f946da655e04b2437ae44 Git-2.6.4-32-bit.exe | c2c27746010f3b09949fb8435ac8a8d1496aeb6fb26fc6ac787ed09686cffcaa PortableGit-2.6.4-64-bit.7z.exe | 42190f01e78c68e0b4485cd31f8296af392bcaf82a7291f21918f4412e90db8c PortableGit-2.6.4-32-bit.7z.exe | 5b0fcea5262db1a7676b5cd21df717cd02f1835a270d7f94d1537028462e61df Git-2.6.4-64-bit.tar.bz2 | d2d143fe5b9a6517ebec5afcb3183a809f3b7ceda9abf2c1c76203accf65e877 Git-2.6.4-32-bit.tar.bz2 | 71296bef22154077f8f5acda61b791fbbecce872f4d2c550b61ca82f58eb8c90 Ciao, Johannes -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/8] config: add core.untrackedCache
On Tue, Dec 15, 2015 at 4:34 PM, Christian Couder wrote: > On Mon, Dec 14, 2015 at 10:30 PM, Junio C Hamano wrote: >> Junio C Hamano writes: >> >> The primary reason why I do not like your "configuration decides" is >> it will be a huge source of confusions and bugs. Imagine what >> should happen in this sequence, and when should a stale cached >> information be discarded? >> >> - the configuration is set to 'yes'. >> - the index is updated and written by various commands. >> - more work is done in the working tree without updating the index. >> - the configuration is set to 'no'. >> - more work is done in the working tree without updating the index. >> - the configuration is set to 'yes'. >> - more work is done in the working tree without updating the index. >> - somebody asks "what untracked paths are there?" > > As far as I understand the UC just stores the mtime of the directories > in the working tree to avoid the need of lstat'ing all the files in > the directories. > > When somebody asks "what untracked paths are there", if the UC has not > been discarded when the configuration was set to no, then git will > just ask for the mtimes of the directories in the working tree and > compare them with what is in the UC. > > I don't see how it can create confusion and bugs, as the work done in > the working tree should anyway have changed the mtime of the > directories where work has been done. Any operation that can add or remove an entry from the index may lead to UC update. For example, if file "foo" is tracked, then the user does "git rm --cached foo", "foo" may become either untracked or ignored. So if you disable UC while doing this removal, then re-enable UC again, "git-status" may show incorrect output. So, as long as UC extension exists in the index, it must be updated, or it may become outdated and useless. -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/8] config: add core.untrackedCache
On 15.12.15 10:34, Christian Couder wrote: > On Mon, Dec 14, 2015 at 10:30 PM, Junio C Hamano wrote: >> Junio C Hamano writes: >> >> The primary reason why I do not like your "configuration decides" is >> it will be a huge source of confusions and bugs. Imagine what >> should happen in this sequence, and when should a stale cached >> information be discarded? >> >> - the configuration is set to 'yes'. >> - the index is updated and written by various commands. >> - more work is done in the working tree without updating the index. >> - the configuration is set to 'no'. >> - more work is done in the working tree without updating the index. >> - the configuration is set to 'yes'. >> - more work is done in the working tree without updating the index. >> - somebody asks "what untracked paths are there?" > > As far as I understand the UC just stores the mtime of the directories > in the working tree to avoid the need of lstat'ing all the files in > the directories. This is what I understand: UC stores the mtime of the directories in the working tree to avoid the need opendir() readdir() closedir() to find new, yet untracked, files. (including sub-directories) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/8] config: add core.untrackedCache
On Mon, Dec 14, 2015 at 10:30 PM, Junio C Hamano wrote: > Junio C Hamano writes: > > The primary reason why I do not like your "configuration decides" is > it will be a huge source of confusions and bugs. Imagine what > should happen in this sequence, and when should a stale cached > information be discarded? > > - the configuration is set to 'yes'. > - the index is updated and written by various commands. > - more work is done in the working tree without updating the index. > - the configuration is set to 'no'. > - more work is done in the working tree without updating the index. > - the configuration is set to 'yes'. > - more work is done in the working tree without updating the index. > - somebody asks "what untracked paths are there?" As far as I understand the UC just stores the mtime of the directories in the working tree to avoid the need of lstat'ing all the files in the directories. When somebody asks "what untracked paths are there", if the UC has not been discarded when the configuration was set to no, then git will just ask for the mtimes of the directories in the working tree and compare them with what is in the UC. I don't see how it can create confusion and bugs, as the work done in the working tree should anyway have changed the mtime of the directories where work has been done. Maybe as the UC has not been updated for a long time, there will be a lot of mtimes that are different, so there will not be a big speed up or it could be even slower than if the UC was not used, but that's all. > In the "configuration decides" world, I am not sure how a sane > implementation efficiently invalidates the cache as needed, without > the config subsystem having intimate knowledge with the untracked > cache (so far, the config subsystem is merely a key-value store and > does not care _what_ it stores; you would want to invalidate the > cache in the index when somebody sets the variable to 'no', which > means the config subsystem needs to know that this variable is > special). In the current patch series and in the one I am preparing and will probably send soon, this hunk takes care of removing or addind the UC if needed: diff --git a/wt-status.c b/wt-status.c index 435fc28..3e0fe02 100644 --- a/wt-status.c +++ b/wt-status.c @@ -586,6 +586,15 @@ static void wt_status_collect_untracked(struct wt_status *s) dir.flags |= DIR_SHOW_IGNORED_TOO; else dir.untracked = the_index.untracked; + + if (!dir.untracked && use_untracked_cache == 1) { + add_untracked_cache(); + dir.untracked = the_index.untracked; + } else if (dir.untracked && use_untracked_cache == 0) { + remove_untracked_cache(); + dir.untracked = NULL; + } + setup_standard_excludes(&dir); fill_directory(&dir, &s->pathspec); So when the config option is changed, the UC is removed or recreated only the next time "git status" (and maybe also "git commit" and a few other commands) is called. And anyway I don't think people will change the UC config very often. Maybe they will play with it a bit when they discover it, but afterwards they will just set it or not and be done with it. So I think it is not worth trying to optimize what happens when the config is set or unset. We should just make sure that it works and it is well documented. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] push: add '-d' as shorthand for '--delete'
On Mon, Dec 14, 2015 at 11:18:18AM -0800, Junio C Hamano wrote: > Patrick Steinhardt writes: > > > It is only possible to delete branches on remotes by specifying > > the long '--delete' flag. > > Not really. "git push origin :unnecessary-branch" should just work > with out "--delete" or "-d". Well, sure, didn't think about this when phrasing the commit message. Still I think my point stands that it is more convenient for users to also have the '-d' shorthand, as is also in use for branch deletion in `git-branch`. I'll resend this patch with a corrected message. Patrick signature.asc Description: Digital signature