What's cooking in git.git (Sep 2017, #03; Fri, 15)
Here are the topics that have been cooking. Commits prefixed with '-' are only in 'pu' (proposed updates) while commits prefixed with '+' are in 'next'. The ones marked with '.' do not appear in any of the integration branches, but I am still holding onto them.

We are at week #6 of this cycle. It seems that people had a productive week while I was away, which I am reasonably happy about ;-) Quite a few topics that have been in 'master' are now also merged to 'maint', so perhaps I should tag 2.14.2 soonish.

You can find the changes described here in the integration branches of the repositories listed at http://git-blame.blogspot.com/p/git-public-repositories.html

--------------------------------------------------
[New Topics]

* cc/subprocess-handshake-missing-capabilities (2017-09-11) 1 commit
 - subprocess: loudly die when subprocess asks for an unsupported capability

 Finishing touches to a topic already in 'master'.
 Will merge to 'next'.

* bw/protocol-v1 (2017-09-14) 8 commits
 - i5700: add interop test for protocol transition
 - http: tell server that the client understands v1
 - connect: tell server that the client understands v1
 - connect: teach client to recognize v1 server response
 - upload-pack, receive-pack: introduce protocol version 1
 - daemon: recognize hidden request arguments
 - protocol: introduce protocol extention mechanisms
 - pkt-line: add packet_write function

 A new mechanism to upgrade the wire protocol in place is proposed, and
 it has been demonstrated to work with older versions of Git without
 harming them.

* ez/doc-duplicated-words-fix (2017-09-14) 1 commit
 - doc: fix minor typos (extra/duplicated words)

 Typofix.
 Will merge to 'next'.
* jk/write-in-full-fix (2017-09-14) 8 commits
 - read_pack_header: handle signed/unsigned comparison in read result
 - config: flip return value of store_write_*()
 - notes-merge: use ssize_t for write_in_full() return value
 - pkt-line: check write_in_full() errors against "< 0"
 - convert less-trivial versions of "write_in_full() != len"
 - avoid "write_in_full(fd, buf, len) != len" pattern
 - get-tar-commit-id: check write_in_full() return against 0
 - config: avoid "write_in_full(fd, buf, len) < len" pattern

 Many codepaths did not diagnose write failures correctly when disks
 go full, due to their misuse of the write_in_full() helper function;
 they have been corrected.
 Will merge to 'next'.

* jn/per-repo-object-store-fixes (2017-09-14) 3 commits
 - replace-objects: evaluate replacement refs without using the object store
 - push, fetch: error out for submodule entries not pointing to commits
 - pack: make packed_git_mru global a value instead of a pointer

 Step #0 of a planned & larger series.
 Will merge to 'next'.

* ks/commit-do-not-touch-cut-line (2017-09-15) 1 commit
 - commit-template: change a message to be more intuitive

 The explanation of the cut-line in the commit log editor has been
 slightly tweaked.
 Will merge to 'next'.

* ks/help-alias-label (2017-09-14) 1 commit
 - help: change a message to be more precise

 "git help co" now says "co is aliased to ...", not "git co is".
 Will merge to 'next'.
* mh/mmap-packed-refs (2017-09-14) 20 commits
 - packed-backend.c: rename a bunch of things and update comments
 - mmapped_ref_iterator: inline into `packed_ref_iterator`
 - ref_cache: remove support for storing peeled values
 - packed_ref_store: get rid of the `ref_cache` entirely
 - ref_store: implement `refs_peel_ref()` generically
 - packed_read_raw_ref(): read the reference from the mmapped buffer
 - packed_ref_iterator_begin(): iterate using `mmapped_ref_iterator`
 - read_packed_refs(): ensure that references are ordered when read
 - packed_ref_cache: keep the `packed-refs` file open and mmapped
 - mmapped_ref_iterator_advance(): no peeled value for broken refs
 - mmapped_ref_iterator: add iterator over a packed-refs file
 - packed_ref_cache: remember the file-wide peeling state
 - read_packed_refs(): read references with minimal copying
 - read_packed_refs(): make parsing of the header line more robust
 - read_packed_refs(): only check for a header at the top of the file
 - read_packed_refs(): use mmap to read the `packed-refs` file
 - die_unterminated_line(), die_invalid_line(): new functions
 - packed_ref_cache: add a backlink to the associated `packed_ref_store`
 - prefix_ref_iterator: break when we leave the prefix
 - ref_iterator: keep track of whether the iterator output is ordered
 (this branch uses mh/packed-ref-transactions.)

 Operations that do not touch the (majority of the) packed refs have
 been optimized by making accesses to the packed-refs file lazy; we no
 longer pre-parse everything, and accessing a single ref in packed-refs
 no longer touches the majority of irrelevant refs either.
 Expecting a rework to also work on Windows.

* nm/imap-send-with-curl (2017-09-15) 4 commits
 - imap-send: use curl by default when possible
 - imap_send: setup_curl: retreive credentials if not set
Re: git diff --name-status for deleted files
Gene Thomas writes:

> Junio,
> Thanks for your reply. So git is essentially doing a
> "git commit" when I "git rm".

No. You'd probably need to read a bit more on Git; unlike other systems like CVS and SVN, where you only have two states (i.e. committed contents vs. files on the filesystem), we have three states (i.e. the index in addition to the above two). "git add file" and "git rm file" make the index match what's on the filesystem wrt "file". They never touch committed contents, which is what the "git commit" command is about.
RE: git diff --name-status for deleted files
Junio,

Thanks for your reply. So git is essentially doing a "git commit" when I "git rm".

Gene.

-----Original Message-----
From: Junio C Hamano [mailto:gits...@pobox.com]
Sent: Friday, 15 September 2017 2:58 PM
To: Gene Thomas
Cc: git@vger.kernel.org
Subject: Re: git diff --name-status for deleted files

Gene Thomas writes:
> Hello,
> "git diff -name-status" is useful to list the files one
> has changed but it does not list files that one has
> deleted with "git rm". It would be really handy if it
> did. I am using git 2.9.3 on Ubuntu Linux 16.10.

With or without the --name-status option, "git diff" compares the contents you have in the index and in your working tree. After you modify the contents of a file, i.e.

	edit file
	git add file

you would not see that file as edited, exactly because the file on the filesystem is identical to what you added to the index with "git add".

Your example works exactly the same way. Instead of modifying the contents of a file, removing the presence of the file and recording that fact in the index (i.e. removing the path from the index, too) is done with "git rm", so after running

	git rm file

your working tree would lack "file" and so would your index. Hence you wouldn't see "git diff" report any difference on "file".

Perhaps you wanted "git diff HEAD", which is a way to compare the contents you have in the tip commit with the paths in your working tree through the index?
Re: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.
Kevin Willford writes:

> 1. Does this statement, "I only care about the files in this
> sparse checkout, and do not concern me with anything else", mean
> that git should not change files outside the sparse-checkout whether
> that be in the working directory or in the index? Or does that only
> apply to the working directory, while the index version of files can
> change to whatever the git command would do without using
> sparse? For example if I am doing a 'git reset HEAD~1' should the
> version in the index of files outside the sparse not be changed or
> change to the HEAD~1 version with the skip-worktree bit on?

My understanding of the purpose of the "sparse checkout" thing is that the user still wants to create a correct whole-tree commit even though the user does not have the whole-tree checkout. The object names for blobs recorded in the index that are meant to be included in the next commit MUST be the same as those that would be in the index when the "sparse" feature is not in use. "reset HEAD~1" should match the index entries to the tree entries in HEAD~1. So, the latter, I would think, among your two alternatives.

IOW, after "git reset HEAD~", if you drop the skip-worktree bit from all index entries, "git diff --cached HEAD" must say "there are no changes".

The only difference between "sparse" and normal would be that, because the "sparse" user does not intend to change anything outside the "sparse" area, those paths outside her "sparse" area would not materialize on the filesystem. For the next "write-tree" out of the index to still write the correct tree out, the entries outside her "sparse" area in the index MUST match the tree of the commit she started working from.

> 2. How will this work with other git commands like merge, rebase,
> cherry-pick, etc.?
> 3. What about when there is a merge conflict with a file that is outside
> the sparse checkout?
I would say, rather than forbidding such a merge, it should let her see and deal with the conflict by dropping the "this is outside the sparse area, so do not bother materializing it to the filesystem" bit, but tell her loudly what it did ("I checked out a half-merged file outside your sparse-checkout area because you'd need it while resolving the conflict").

By doing things that way, the user can decide if she wants to go ahead and complete the merge, even if the conflict is outside the area she is currently interested in, or postpone the merge and continue working on what she has been working on inside the narrowed-down area first.

I do not have a strong opinion on whether the sparse-checkout configuration file should be adjusted to match when the command must tentatively bust the sparse checkout area; I'd imagine it can be argued either way.

Note that "sparse" is not my itch, and I would not be surprised if those who designed it want it to work differently from my knee-jerk reaction in the previous two paragraphs, and I may even find such an alternative solution preferable. But it is highly unlikely that any sensible solution would violate the basic premise, i.e. "the indexed contents will stay the same as the case without any 'sparse', so the next write-tree will do the right thing".
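The skip-worktree bit that the discussion above revolves around can be poked at directly with plumbing commands. Here is a small sketch (throwaway repository; the file names "inside" and "outside" are invented, and this only demonstrates the bit itself, not git's sparse-checkout machinery):

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
git init -q
echo in >inside
echo out >outside
git add inside outside
git -c user.email=a@b.c -c user.name=A commit -q -m base

# Mark "outside" the way a sparse checkout would: the index entry
# stays (so write-tree still produces the full tree), but git is told
# not to look at the working-tree copy.
git update-index --skip-worktree outside

# 'H' marks a normal cached entry, 'S' a skip-worktree entry.
flags=$(git ls-files -v)
```

Dropping the bit again with `git update-index --no-skip-worktree outside` restores the normal 'H' state.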
[PATCH v3] commit-template: change a message to be more intuitive
It's not good to use the phrase 'do not touch' to convey the information that the cut-line should not be modified or removed, as it could be mis-interpreted by a person who doesn't know that 'touch' can mean 'tamper with'. Further, it could make translations a little difficult, as the phrase might not have the intended meaning in a few languages (for which translations don't exist yet) when translated as such. So, use intuitive terms in the sentence. Replacing the word 'touch' with other terms introduced the possibility of the following sentence being mis-interpreted, so change the terms in that too.

Signed-off-by: Kaartic Sivaraam
---
Changes in v3:

- changed the wording of the second sentence, as there seemed to be a
  magical 'or else' connecting the two lines.

I didn't expect in the least that this would go up to v3. In case anybody finds something wrong with this change too, it's a lot better to drop this altogether than go for a v4.

 wt-status.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/wt-status.c b/wt-status.c
index 77c27c511..23e87e74d 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -934,7 +934,7 @@ size_t wt_status_locate_end(const char *s, size_t len)
 
 void wt_status_add_cut_line(FILE *fp)
 {
-	const char *explanation = _("Do not touch the line above.\nEverything below will be removed.");
+	const char *explanation = _("Do not modify or remove the line above.\nEverything below it will be ignored.");
 	struct strbuf buf = STRBUF_INIT;
 
 	fprintf(fp, "%c %s", comment_line_char, cut_line);
-- 
2.14.1.1006.g90ad9a07c
Re: [PATCH v4 3/4] imap_send: setup_curl: retreive credentials if not set in config file
Nicolas Morey-Chaisemartin writes:

> +	if (cred.username)
> +		if (res == CURLE_OK)
> +			credential_approve(&cred);
> +#if LIBCURL_VERSION_NUM >= 0x070d01
> +		else if (res == CURLE_LOGIN_DENIED)
> +#else
> +		else
> +#endif
> +			credential_reject(&cred);
> +
> +	credential_clear(&cred);
> +

As my copy of GCC seemed to be worried about readers getting confused by the if/else cascade, I'd place an extra pair of braces around this, i.e.

	if (cred.username) {
		if (res == CURLE_OK)
			credential_approve(&cred);
		else /* or "else if DENIED" */
			credential_reject(&cred);
	}
	credential_clear(&cred);

while queuing this patch.
Re: [PATCH v4 3/4] imap_send: setup_curl: retreive credentials if not set in config file
Nicolas Morey-Chaisemartin writes:

> +	if (cred.username)
> +		if (res == CURLE_OK)
> +			credential_approve(&cred);
> +#if LIBCURL_VERSION_NUM >= 0x070d01
> +		else if (res == CURLE_LOGIN_DENIED)

A slight tangent. This is in line with the way in which we do conditional compilation to work with different versions of libcurl, but we recently had a discussion on modernizing this version-based conditional compilation into feature-based checks in another topic. We may want to switch to

	#if defined(CURLE_LOGIN_DENIED)
	...

(cf. https://public-inbox.org/git/cover.1502462884.git@jupiterrise.com/ and the entire thread).

No need to change _this_ patch in this series, but something to keep in mind when planning a future follow-up to clean things up.

Thanks.
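The feature-based pattern suggested above can be sketched without pulling in curl at all. In this toy example, the standard errno constant EPIPE stands in for a symbol such as CURLE_LOGIN_DENIED; the function name `classify` and the strings are invented for illustration:

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Feature-based conditional compilation: probe for the symbol itself
 * rather than the library version that introduced it.  With curl you
 * would write "#if defined(CURLE_LOGIN_DENIED)" instead of comparing
 * LIBCURL_VERSION_NUM against 0x070d01.
 */
static const char *classify(int err)
{
#if defined(EPIPE)
	if (err == EPIPE)
		return "specific error";
#endif
	return "generic error";
}
```

The preprocessor test degrades gracefully: on a platform lacking the symbol, the specific branch simply compiles out.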
Re: [PATCH 00/20] Read `packed-refs` using mmap()
On 09/14/2017 10:23 PM, Johannes Schindelin wrote:
> On Wed, 13 Sep 2017, Michael Haggerty wrote:
>
>> * `mmap()` the whole file rather than `read()`ing it.
>
> On Windows, a memory-mapped file cannot be renamed. As a consequence, the
> following tests fail on `pu`:
>
> [...]

Thanks for your quick investigation.

> This diff:
>
> -- snip --
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index f9c5e771bdd..ee8b3603624 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -1306,13 +1308,13 @@ static int packed_transaction_finish(struct ref_store *ref_store,
> 	char *packed_refs_path;
>
> 	packed_refs_path = get_locked_file_path(&refs->lock);
> +	clear_snapshot(refs);
> 	if (rename_tempfile(&refs->tempfile, packed_refs_path)) {
> 		strbuf_addf(err, "error replacing %s: %s",
> 			    refs->path, strerror(errno));
> 		goto cleanup;
> 	}
>
> -	clear_snapshot(refs);
> 	ret = 0;
>
> cleanup:
> -- snap --
>
> reduces the failed tests to
>
> t1410-reflog.counts.sh
> t3210-pack-refs.counts.sh
> t3211-peel-ref.counts.sh
> t5505-remote.counts.sh
> t5510-fetch.counts.sh
> t5600-clone-fail-cleanup.counts.sh

That change is a strict improvement on all systems; I'll integrate it into the series.

> However, much as I tried to wrap my head around it, I could not even start
> to come up with a solution for e.g. the t1410 regression. The failure
> happens in `git branch -d one/two`.
>
> The culprit: there is yet another unreleased snapshot while we try to
> finish the transaction.
>
> It is the snapshot of the main_worktree_ref_store as acquired by
> add_head_info() (which is called from get_main_worktree(), which in turn
> was called from get_worktrees(), in turn having been called from
> find_shared_symref() that was called from delete_branches()), and it seems
> to me that there was never any plan to have that released, including its
> snapshot.
Yeah, the idea was that the default situation would be that whenever a `packed-refs` file needs to be read, it would be kept mmapped for the life of the program (or until the file was detected to have been changed). This was meant to be a replacement for the explicit `ref_cache`. So any long-lived git process could be expected to have the `packed-refs` file (or even multiple `packed-refs` files, in the case of submodules) mmapped. That's obviously not going to work on Windows.

> [...]
> Do you have any idea how to ensure that all mmap()ed packed refs are
> released before attempting to finish a transaction?

[And from your other email:]

> This is only one example, as I figure that multiple worktrees could cause
> even more ref_stores to have unreleased snapshots of the packed-refs file.
>
> I imagine that something like close_all_packs() is needed
> ("clear_all_snapshots()"?).

Yes, I wasn't really familiar with `close_all_packs()`, but it sounds like it solves a similar problem.

Within a single process, we could make this work via an arduous process of figuring out what functions might want to overwrite the `packed-refs` file, and making sure that nobody up the call stack is iterating over references during those function calls. If that condition is met, then calling `clear_snapshot()` earlier, as you propose above, would decrease the reference count to zero, causing the old snapshot to be freed, and allowing the rename to succeed. We could even do

	if (!clear_snapshot(refs))
		BUG("packed-refs snapshot is still in use");

, analogously to `close_all_packs()`, to help find code that violates the condition.

Similarly, it wouldn't be allowed to invoke a subprocess or hook function while iterating over references, and one would have to clear any current snapshots before doing so. It might even make sense to do that at the same time as `close_all_packs()`, for example in a new function `close_all_unnecessary_files()`.

But what about unrelated processes?
Anybody reading `packed-refs` would block anybody who wants to modify it, for example to delete a reference or pack the references.

I think this is worse than the packfile situation. Packfiles all have unique names, so they never need to be overwritten; at worst they are deleted. And the deletion is never critical to correctness. If you can't delete a packfile, the only consequence is that it hangs around until the next GC. However, the `packed-refs` file always has the same name, and overwriting it is sometimes critical for correctness. So it sounds to me like even *readers* mustn't keep the file open or mmapped longer than necessary!

(This makes me wonder, doesn't `rename_tempfile()` for the `packed-refs` file *already* fail sometimes on Windows due to a concurrent reader? Shouldn't it retry if it gets whatever error indicates that the failure was due to the file being open?)

Anyway, if all of my reasoning is correct, it seems that the inescapable conclusion is that having the `packed-refs` file
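The snapshot-lifetime bookkeeping discussed in this thread can be modeled in a few lines. This is a toy sketch, not the actual packed-backend.c code; the struct, field, and function names are invented, with `clear_snapshot()` mirroring the proposed "only safe to rename when nobody holds a reference" check:

```c
#include <assert.h>

/*
 * Toy model: a reference-counted snapshot of the packed-refs file that
 * must be fully released before the backing file may be overwritten.
 */
struct snapshot {
	int referrers;	/* e.g. active ref iterators holding the mmap */
};

static void acquire_snapshot(struct snapshot *s)
{
	s->referrers++;
}

static void release_snapshot(struct snapshot *s)
{
	s->referrers--;
}

/*
 * Returns 1 when nobody holds the snapshot, i.e. the rename_tempfile()
 * over the old file would be safe (on Windows, a live mapping blocks it).
 */
static int clear_snapshot(struct snapshot *s)
{
	return s->referrers == 0;
}
```

A caller about to overwrite the file would do `if (!clear_snapshot(refs)) BUG(...)`, as suggested above, to flag code paths that still hold a reference.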
Re: git diff --name-status for deleted files
Gene Thomas writes:

> Hello,
> "git diff -name-status" is useful to list the files one
> has changed but it does not list files that one has
> deleted with "git rm". It would be really handy if it
> did. I am using git 2.9.3 on Ubuntu Linux 16.10.

With or without the --name-status option, "git diff" compares the contents you have in the index and in your working tree. After you modify the contents of a file, i.e.

	edit file
	git add file

you would not see that file as edited, exactly because the file on the filesystem is identical to what you added to the index with "git add".

Your example works exactly the same way. Instead of modifying the contents of a file, removing the presence of the file and recording that fact in the index (i.e. removing the path from the index, too) is done with "git rm", so after running

	git rm file

your working tree would lack "file" and so would your index. Hence you wouldn't see "git diff" report any difference on "file".

Perhaps you wanted "git diff HEAD", which is a way to compare the contents you have in the tip commit with the paths in your working tree through the index?
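The behavior described above can be reproduced in a throwaway repository (a sketch; it assumes `git` is on PATH, and the file name is invented):

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
git init -q
echo content >file
git add file
git -c user.email=a@b.c -c user.name=A commit -q -m add-file

git rm -q file

# Index and working tree now agree (both lack "file"), so plain
# "git diff" reports nothing...
plain=$(git diff --name-status)

# ...while "git diff HEAD" compares against the tip commit and shows
# the staged deletion as "D	file".
against_head=$(git diff --name-status HEAD)
```

`git diff --cached` (index vs. HEAD) would show the same deletion, which is another way to see what "git rm" staged.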
Git 2.14.1: t6500: error during test on musl libc
Hi there,

While bumping Git's version for our Linux distribution to 2.14.1, I've run into a new test failure in t6500-gc.sh. This is the output of the failing test with debug=t verbose=t:

expecting success:
	# make sure we run a background auto-gc
	test_commit make-pack &&
	git repack &&
	test_config gc.autopacklimit 1 &&
	test_config gc.autodetach true &&

	# create a ref whose loose presence we can use to detect a pack-refs run
	git update-ref refs/heads/should-be-loose HEAD &&
	test_path_is_file .git/refs/heads/should-be-loose &&

	# now fake a concurrent gc that holds the lock; we can use our
	# shell pid so that it looks valid.
	hostname=$(hostname || echo unknown) &&
	printf "$$ %s" "$hostname" >.git/gc.pid &&

	# our gc should exit zero without doing anything
	run_and_wait_for_auto_gc &&
	test_path_is_file .git/refs/heads/should-be-loose

[master 28ecdda] make-pack
 Author: A U Thor
 1 file changed, 1 insertion(+)
 create mode 100644 make-pack.t
Counting objects: 3, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), done.
Total 3 (delta 0), reused 0 (delta 0)
Auto packing the repository in background for optimum performance.
See "git help gc" for manual housekeeping.
File .git/refs/heads/should-be-loose doesn't exist.
not ok 8 - background auto gc respects lock for all operations
# failed 1 among 8 test(s)
1..8

I admit I am mostly blind with the Git gc system. Should I use strace on the git-gc process at the end? How would I accomplish that? Is there a better way of debugging this error further?

Core system stats:

	Intel x86_64 E3-1280 v3 @ 3.60 GHz
	musl libc 1.1.16+20
	git 2.14.1, vanilla except for a patch to an earlier test due to
	musl's inability to cope with EUC-JP
	bash 4.3.48(1)-release

Thank you very much.

All the best,
--arw

A. Wilcox (awilfox)
Project Lead, Adélie Linux
http://adelielinux.org
Re: [PATCH 2/3] merge-base: return fork-point outside reflog
Michael J Gruber writes:

> In fact, per documentation "--fork-point" looks at the reflog in
> addition to doing the usual walk from the tip. The original design
> description in d96855ff51 ("merge-base: teach "--fork-point" mode",
> 2013-10-23) describes this as computing from a virtual merge-base of all
> the historical tips of refname. They may or may not all be present in
> the reflog (think pruning, non-ff fetching, fast forwarding etc.),
> so filtering by the current contents of the reflog is potentially
> harmful, and it does not seem to fulfill any purpose in the original
> design.

Let me think aloud, using the picture from the log message from that commit.

	                   o---B1
	                  /
	---o---o---B2---o---o---o---Base
	        \
	         B3
	          \
	           Derived

where the current tip of the "base" branch is at Base, but an earlier fetch observed that its tip used to be B3 and then B2 and then B1 before getting to the current commit, and the branch being rebased on top of the latest "base" is based on commit B3.

So the logic tries to find a merge base between "Derived" and a virtual merge commit across Base, B1, B2 and B3. And it finds B3.

If for some reason we didn't have B3 in the reflog, then wouldn't the merge base computation between Derived and a virtual merge commit across Base, B1 and B2 (but not B3, because we no longer know about it due to its lack in the reflog) find the 'o' that is the parent of B2 and B3? Wouldn't that lead to both B3 and Derived being replayed when the user of the fork-point option rebases the history of Derived?

Perhaps that is the best we could do with a pruned reflog that lacks B3, but if that is the case, I wonder if it may be better to fail the request, saying that we cannot find the fork-point (because, well, your reflog no longer has sufficient information), than silently give a wrong result; and in this case, we can tell between a correct result (i.e. the merge base is one of the commits we still know was at the tip) and a wrong one (i.e.
the merge base is not any of the commits in the reflog).

If we declare --fork-point is a best-effort option that may give an answer no better than without the option, then I think this patch is OK, but that diminishes the value of the option as a building block, I am afraid. Callers that are more careful could ask merge-base with --fork-point (and happily use the result, knowing that it is indeed a commit that used to be at the tip), or fall back to the result merge-base without --fork-point gives (because you could do no better) and deal with duplicates that may happen due to the imprecise determination. With this change, these callers will get a result from a call to "merge-base --fork-point" that may or may not be definite, and they cannot tell.

For lazy users, making the option itself fall back may be simpler to use, and is certainly a valid stance to take when implementing a convenience option to a Porcelain command, but I do not know if it is a good idea to throw "merge-base --fork-point" into that category.

>
> Remove the filtering and add a test for an out-of-reflog merge base.
>
> Reported-by: Ekelhart Jakob
> Signed-off-by: Michael J Gruber
> ---
>  builtin/merge-base.c  | 18 +++---
>  t/t6010-merge-base.sh |  8
>  2 files changed, 11 insertions(+), 15 deletions(-)
>
> diff --git a/builtin/merge-base.c b/builtin/merge-base.c
> index 6dbd167d3b..926a7615ea 100644
> --- a/builtin/merge-base.c
> +++ b/builtin/merge-base.c
> @@ -186,23 +186,11 @@ static int handle_fork_point(int argc, const char **argv)
>  	 * There should be one and only one merge base, when we found
>  	 * a common ancestor among reflog entries.
>  	 */
> -	if (!bases || bases->next) {
> +	if (!bases || bases->next)
>  		ret = 1;
> -		goto cleanup_return;
> -	}
> -
> -	/* And the found one must be one of the reflog entries */
> -	for (i = 0; i < revs.nr; i++)
> -		if (&bases->item->object == &revs.commit[i]->object)
> -			break; /* found */
> -	if (revs.nr <= i) {
> -		ret = 1; /* not found */
> -		goto cleanup_return;
> -	}
> -
> -	printf("%s\n", oid_to_hex(&bases->item->object.oid));
> +	else
> +		printf("%s\n", oid_to_hex(&bases->item->object.oid));
>
> -cleanup_return:
>  	free_commit_list(bases);
>  	return ret;
>  }
> diff --git a/t/t6010-merge-base.sh b/t/t6010-merge-base.sh
> index 17fffd7998..850463d4f2 100755
> --- a/t/t6010-merge-base.sh
> +++ b/t/t6010-merge-base.sh
> @@ -267,6 +267,14 @@ test_expect_success '--fork-point works with empty reflog' '
>  	test_cmp expect actual
>  '
>
> +test_expect_success '--fork-point works with merge-base outside reflog' '
> +	git -c core.logallrefupdates=false checkout no-reflog &&
> +	git -c
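The B3-in-the-reflog scenario discussed above can be approximated in a scratch repository. This is a sketch with invented branch and commit names; it relies on reflogs being enabled, which is the default for non-bare repositories:

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
git init -q
g() { git -c user.email=a@b.c -c user.name=A "$@"; }
base=$(git symbolic-ref --short HEAD)

g commit -q --allow-empty -m O
O=$(git rev-parse HEAD)
g commit -q --allow-empty -m B3		# old tip of the base branch
B3=$(git rev-parse HEAD)

git checkout -q -b derived
g commit -q --allow-empty -m derived-work

# Rewind the base branch past B3 and build a new history on top,
# leaving B3 recorded only in the base branch's reflog.
git checkout -q "$base"
git reset -q --hard "$O"
g commit -q --allow-empty -m new-base

fork=$(git merge-base --fork-point "$base" derived)	# finds B3 via the reflog
plain=$(git merge-base "$base" derived)			# only sees O
```

If the reflog entry for B3 were pruned, --fork-point would lose exactly the information this thread is arguing about.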
Re: [PATCH 0/4] Fixes from the per-repository object store series
Jonathan Nieder writes:

> This is a continuation of the series at [1]. That was part 1 ---
> you can think of this as part 0, since it contains the simplest and
> least controversial part of the series --- each patch in this series
> is a bugfix in its own right.
>
> Patch 1 should be familiar if you reviewed the series [1]. It is
> unchanged from the patch there, except to note Peff's ack.
>
> Patches 2-4 have not yet visited the list but are fixes (or, in the
> case of patch 4, cleanups) noticed as part of the same process of
> teaching object store code to handle multiple repositories.
>
> We hope that splitting these out should make them easier to review
> and for people to benefit from these fixes without having to wait
> for the rest of the series to settle.

One thing that is unclear is whether you are tentatively withdrawing the longer series with this update, but I'd assume that is the case, because these two obviously conflict with each other, and most of what these four patches do is a subset of what the larger series you sent earlier does.

Thanks.
Re: [PATCH v2] commit-template: change a message to be more intuitive
Jeff King writes:

>   # >8
>   # Do not modify or remove the line above.
>   # Everything below will be ignored.
>
> (I might say "everything below it" to be more precise).
>
> I dunno. We are deep in bikeshed territory here, and I admit I don't
> overly care that much in the first place.

So should I drop this topic altogether? My inclination is to declare that "everything below it will be ignored" is the best version and stop reading further updates to this thread ;-)
Re: [PATCH v2] commit-template: change a message to be more intuitive
On Thu, 2017-09-14 at 09:36 +0200, Michael J Gruber wrote:
>
> Also, given all the translations that we have, it seems somewhat strange
> to try and foresee and work around possible misunderstandings of commonly
> used English phrases.

In case you're trying to say that "translators have successfully translated the phrase without misunderstanding for many languages", then I have to say that I foresaw the difficulty of translating it into other languages for which a translation does not exist yet (specifically, my mother tongue). If that's not useful because it might only be a rare case, then this patch isn't either.

--
Kaartic
Re: commit-msg hook does not run on merge with --no-ff option
On Tue, 2017-09-12 at 13:24 -0500, Joseph Dunne wrote:
> Sorry I don't understand your question. The commit-msg hook runs
> properly in all cases except when I perform a merge with the --no-ff
> option enabled.

It's working just as the documentation says it does (emphasis mine),

	This hook is invoked by **git commit**, and can be bypassed with the
	--no-verify option. It takes a single parameter, the name of the file
	that holds the proposed commit log message. Exiting with a non-zero
	status causes the git commit to abort.

It says that the 'commit-msg' hook is invoked only for a "commit" (it's not a 'MERGE-msg' hook, you see; that doesn't exist anyway). In case you see the hook getting invoked for a merge, then that's an issue, I guess. For what kind of merges do you see the 'commit-msg' hook getting invoked?

--
Kaartic
Dearest/Käraste
Dearest, I am Mrs. Asana Hajraf and I am married to Mr. Hassan Hajraf from kuwait for 19 years without a child and my husband died in 2014. I am contacting you to let you know my desire to donate the sum of $4.5 million to charities in your country which I inherited from my late husband. Due to my illness of cancer were confirmed out of just eight months, so it is my desire to see that this money is invested to any organization of your choice and distributed each year among the charity organizations, motherless babies home, schools and support for the men and women homeless or what you may consider to be the benefit of those less fortunate. I do not want this fund to be invested in a manner without God. As soon as I receive your reply confirming your acceptance of the work I tell you, I will give all relevant information to authorize the release and transfer the money to you as my duly appointed representative. Thanks, Mrs Asana Hajraf.
Re: [PATCH 7/7] config: flip return value of store_write_*()
On Wed, Sep 13, 2017 at 02:25:28PM -0700, Jonathan Nieder wrote: > > Let's flip them to follow the usual write() conventions and > > update all callers. As these are local to config.c, it's > > unlikely that we'd have new callers in any topics in flight > > (which would be silently broken by our change). But just to > > be on the safe side, let's rename them to just > > write_section() and write_pairs(). That also accentuates > > their relationship with write(). > > > > Signed-off-by: Jeff King> > The caller only cares if it succeeded, right? Could this return > the customary 0 vs -1 instead of the number of bytes written? Yes, it could. I went with "follow the conventions of write()" because these are used in a big chain of write() calls (well, really write_in_full). But given the current callers, it does not matter either way. Thanks for reviewing the series, and sorry if my comments have been a bit terse. I'm trying to clear my pile before going out of town for a few days (which I admit may have contributed to my desire for you to prepare patches on top). But either way, don't expect a re-roll until next week at the earliest. -Peff
Re: [PATCH 6/7] notes-merge: use ssize_t for write_in_full() return value
On Wed, Sep 13, 2017 at 02:20:35PM -0700, Jonathan Nieder wrote: > > --- a/notes-merge.c > > +++ b/notes-merge.c > > @@ -302,7 +302,7 @@ static void write_buf_to_worktree(const struct > > object_id *obj, > > fd = xopen(path, O_WRONLY | O_EXCL | O_CREAT, 0666); > > > > while (size > 0) { > > - long ret = write_in_full(fd, buf, size); > > + ssize_t ret = write_in_full(fd, buf, size); > > if (ret < 0) { > > /* Ignore epipe */ > > if (errno == EPIPE) > > break; > > die_errno("notes-merge"); > > } else if (!ret) { > > die("notes-merge: disk full?"); > > } > > These three lines are dead code. How about the following, e.g. for > squashing in? Thanks, I didn't notice that. I'd actually prefer it as a separate patch, since it needs to be explained separately. -Peff
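For readers following along, the loop being patched is essentially re-implementing what the helper already promises. A minimal sketch of the idea (an illustration under assumed semantics, not git's actual write_in_full() code; the name `write_all` is invented here): a full-write wrapper retries partial writes internally and reports every failure as -1, so callers only ever need the `ret < 0` check.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Sketch of a write_in_full()-style helper: keep write()ing until
 * all `count` bytes are out.  Any failure (including a surprising
 * zero-byte write) is reported as -1, so the caller's error check
 * is simply `ret < 0`.
 */
static ssize_t write_all(int fd, const void *buf, size_t count)
{
	const char *p = buf;
	size_t remaining = count;

	while (remaining > 0) {
		ssize_t written = write(fd, p, remaining);

		if (written < 0) {
			if (errno == EINTR)
				continue; /* retry interrupted writes */
			return -1;
		}
		if (!written) {
			errno = ENOSPC; /* treat a stuck fd as "disk full" */
			return -1;
		}
		p += written;
		remaining -= written;
	}
	return count; /* success is always the full count */
}
```

Under this contract the caller's EPIPE special case becomes a `ret < 0` branch that inspects errno, and an `else if (!ret)` arm can never trigger — which is exactly the dead code the review points out.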
Re: [PATCH 3/7] avoid "write_in_full(fd, buf, len) != len" pattern
On Wed, Sep 13, 2017 at 02:14:30PM -0700, Jonathan Nieder wrote: > >I really wish every "write_in_full()" user would just > >check against "<0" now, but this fixes the nasty and > >stupid ones. > > Ok, you convinced me. > > Should we add a comment to cache.h as well encouraging this? I'd be OK with a comment, though I don't know that it's strictly necessary. It looks like most of it was just cargo-culted, so removing the offending examples is sufficient. > > [1] A careful reader may notice there is one way that > > write_in_full() can return a different value. If we ask > > write() to write N bytes and get a return value that is > > _larger_ than N, we could return a larger total. But > > besides the fact that this would imply a totally broken > > version of write(), it would already invoke undefined > > behavior. Our internal remaining counter is an unsigned > > size_t, which means that subtracting too many bytes will > > wrap it around to a very large number. So we'll instantly > > begin reading off the end of the buffer, trying to write > > gigabytes (or petabytes) of data. > > This footnote just leaves me more confused, since as you mention, > write() never would return a value greater than N. Are you saying we > need to defend against a broken platform where that isn't true? No, I'm saying that my claim that write_in_full() can only return two values (-1 and the original length) is not strictly true. But that it doesn't matter in practice. I don't think we need to defend against such a broken platform, but I didn't want anybody reading the claim to say "aha, you forgot this case". It is a case that does not matter. -Peff
Re: [PATCH 2/7] get-tar-commit-id: check write_in_full() return against 0
On Wed, Sep 13, 2017 at 02:09:27PM -0700, Jonathan Nieder wrote: > > We ask to write 41 bytes and make sure that the return value > > is at least 41. This is the same "dangerous" pattern that > > was fixed in the prior commit (wherein a negative return > > value is promoted to unsigned), though it is not dangerous > > here because our "41" is a constant, not an unsigned > > variable. > > > > But we should convert it anyway to avoid modeling a > > dangerous construct. > > > > Signed-off-by: Jeff King > > --- > > builtin/get-tar-commit-id.c | 3 +-- > > 1 file changed, 1 insertion(+), 2 deletions(-) > > I kind of disagree with calling this dangerous (and I think that is > what you alluded to above by putting it in quotes), but I like the > postimage more than the preimage. Right, this instance is fine, but the pattern of using "<" is not. If you swapped out "41" for: size_t len = 41; then it would be a bug. Which I think would surprise most people. > The variable 'n' could be eliminated to simplify this further. I > realize that would go against the spirit of this patch, but (1) it's > on-topic for the patch, since it is another ssize_t vs constant > comparison and (2) as mentioned elsewhere in this thread, it's a very > common idiom with read_in_full. If we want to eliminate it then we > could introduce a separate helper to distinguish between > read_this_much_i_mean_it and read_this_much_or_to_eof. Yes, I noticed that, too, after you brought up read_in_full() as a potential source of problems. But I would rather deal with read_in_full() separately on top. Can you do it as a separate patch on top (possibly with other read_in_full() cleanups, though I think this is the only "<" case that exists)? -Peff
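To see concretely why the pattern is dangerous only with an unsigned length, here is a small illustration (not git code; the function names are invented): the usual arithmetic conversions turn the signed -1 error return into SIZE_MAX before `<` is evaluated.

```c
#include <stddef.h>
#include <sys/types.h>

/*
 * Pitfall: comparing a signed error return against an unsigned
 * length converts -1 to a huge unsigned value, so the
 * "did we write less than len?" test never fires on failure.
 */
static int saw_error_bad(ssize_t ret, size_t len)
{
	/* -Wsign-compare: -1 becomes SIZE_MAX here, so this is 0 on error */
	return ret < len;
}

static int saw_error_good(ssize_t ret)
{
	return ret < 0; /* immune to the width and signedness of len */
}
```

With `ret = -1` and `size_t len = 41`, `saw_error_bad()` returns 0 while `saw_error_good()` returns 1 — the silent-failure mode this series removes. With a literal `41` instead of a `size_t` variable, the constant is signed and the comparison happens to be safe, which is why this instance was "dangerous" only in quotes.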
Re: [PATCH v2] config: avoid "write_in_full(fd, buf, len) < len" pattern
On Thu, Sep 14, 2017 at 12:31:38AM +0100, Ramsay Jones wrote: > > Hmm, about three or four years ago, I spent two or three evenings > > getting git to compile -Wextra clean. I remember the signed/unsigned > > issue was the cause of a large portion of the warnings issued by > > the compiler. I was surprised that it took such a relatively short > > time to do. However, it affected quite a large portion of the code, so > > I didn't think Junio would even consider applying it. Also, I only used > > gcc and was anticipating having additional warnings on clang (but I > > didn't get around to trying). > > > > Maybe I should give it another go. :-D > > For example, I remember the patch given below reduced the number > of warnings quite a bit (because it's an inline function in a > header file). > > I just tried it again tonight; the current master branch has 3532 > warnings when compiled with -Wextra, 1409 of which are -Wsign-compare > warnings. After applying the patch below, those numbers are reduced > by 344 to 3188/1065. I'd love it if we were clean on -Wextra. My big question is how many contortions we'd have to go through in the code. I don't mind at all if we're actually cleaning as we go (e.g., using types of the correct signedness, or preventing possible funny interactions). I'm just worried it will turn into a bunch of noisy casts. The patch you showed seems like an improvement to the code, but I don't know if that would be the case for all of them. :) -Peff
Re: [idea] File history tracking hints
From: "Johannes Schindelin" Hi Philip, On Mon, 11 Sep 2017, Philip Oakley wrote: From: "Pavel Kretov" > Hi all, > > Excuse me if the topic I'm going to raise here has been already > discussed > on the mailing list, forums, or IRC, but I couldn't find anything > related. > > > The problem: > > Git, being "a stupid content tracker", doesn't try to keep an eye on > operations which happens to individual files; things like file renames > aren't recorded during commit, but heuristically detected later. > > Unfortunately, the heuristic can only deal with simple file renames > with > no substantial content changes; it's helpless when you: > > - rename a file and change its content significantly; > - split single file into several files; > - merge several files into another; > - copy entire file from another commit, and do other things like these. > > However, if we're able to preserve this information, it's possible > not only to do more accurate 'git blame', but also merge revisions with > fewer conflicts. > > > The proposal: > > The idea is to let user give hints about what was changed during > the commit. 
For example, if user did a rename which wasn't > automatically > detected, he would append something like the following to his commit > message: > >Tracking-hints: rename dev-vcs/git/git-1.0.ebuild -> > dev-vcs/git/git-2.0.ebuild > > or (if full paths of affected files can be unambiguously omitted): > >Tracking-hints: rename git-1.0.ebuild -> git-2.0.ebuild > > There may be other hint types: > >Tracking-hint: recreate LICENSE.txt >Tracking-hint: split main.c -> main.c cmdline.c >Tracking-hint: merge linalg.py <- vector.py matrix.py > > or even something like this: > >Tracking-hint: copy json.py <- > libs/json.py@4db88291251151d8c5c8e4f20430fa4def2cb2ed > > If file transformation cannot be described by a single tracking hint, > it > shall > be possible to specify a sequence of hints at once: > >Tracking-hint: >split Utils.java -> AppHelpers.java StringHelpers.java >recreate Utils.java > > Note that in the above example the order of operations really matters, > so > both lines have to reside in one 'Tracking-hint' block. > > * * * > > How do you think, is this idea worth implementing? > Any other thoughts on this? > > -- Pavel Kretov. Maybe use the "interpret-trailers" methods for standardising your hints locally (in your team / workplace) to see how it goes and flesh out what works and what doesn't. Trying to decide, a-priori, what are the right hints is likely to be the hard part. I think this adds a very valuable insight to this discussion: the current state of Git's rename handling is based on the idea that you either record the renames, or you detect them. Like, there is either "on" or "off". No middle ground. However, if you understand that there is also the possibility of hints that can help any erroneous rename detection (and *everybody* who seriously worked on a massive code base has seen that rename detection fail in the most inopportune ways [*1*]), then you are on to something. 
So I totally like the idea of introducing hints, possibly as trailers in the commit message (or as refs/notes/rename/* or whatever) that can be picked up by Git versions that know about them, and can be ignored by Git versions that insist on the rename detection du jour. With a config option to control the behavior, maybe, too. Ciao, Dscho Footnote *1*: Just to name a couple of examples from my personal experience, off the top of my head: - license boiler plates often let Git detect renames/copies where there are none, - even something as trivial as moving Java classes (and their dependent classes) between packages changes every line referring to said packages, causing Git's rename detection to go for a drink instead of doing its job, - indentation changes overwhelm Git's rename detection, - when rename detection would matter most, like, really a lot, to lift the burden of the human beings in front of the computer poring over hundreds of thousands of files moved from one directory tree to another, that's exactly when Git's rename detection says that there are too many files, here are my union rights, I am going home, good luck to you. In light of such experiences, I have to admit that the notion that the rename detection can always be improved in hindsight adds quite a bit of insult to injury for those developers who are bitten by it. Your list made me think that the hints should be directed toward what may be considered existing solutions for those specific awkward cases. So the hints could be (by type): - template;licence;boiler-plate;standard;reference :: copy - word-rename - regex for word substitution changes (e.g. which chars are within 'Word-_0`) - regex for white-space changes (i.e. which chars are considered whitespace.) - move-dir path/glob spec - move-file path/glob spec
Re: [PATCH 1/2] test-lib: group system specific FIFO tests by system
Hi Michael, On Thu, 14 Sep 2017, Michael J Gruber wrote: > test-lib determines whether a file-system supports FIFOs and needs to do > special casing for CYGWIN and MINGW. This separates those system-specific > settings from those at a more central place. > > Set mkfifo() to false in the central system-specific place so that the > same test works everywhere. The mkfifo() emulation of Cygwin seems to work, no? I think it works even in MSYS2, but not in MINGW. So maybe this patch should affect only the MINGW arm? Ciao, Dscho
Re: [PATCH 00/20] Read `packed-refs` using mmap()
Hi Michael, On Thu, 14 Sep 2017, Michael Haggerty wrote: > On 09/13/2017 07:15 PM, Michael Haggerty wrote: > > [...] > > * `mmap()` the whole file rather than `read()`ing it. > > [...] > > Apparently this doesn't work on Windows, because the `snapshot` is > keeping the `packed-refs` file open too long, so the new file can't be > renamed on top of it. Indeed, I sent you a mail about it before I checked for new mails ;-) > I didn't realize that this is even allowed, but TIL that you can close a > file while keeping it mmapped. Does that technique work on Windows? If > so, I'll change v2 to do it as sketched below. I do not know whether that technique works on Windows, but it would not solve the problem *entirely*, as I outlined in my reply: in delete_branches(), for example, the main_worktree_ref_store is implicitly getting its snapshot initialized, and nothing seems to release it. This is only one example, as I figure that multiple worktrees could cause even more ref_stores to have unreleased snapshots of the packed-refs file. I imagine that something like close_all_packs() is needed ("clear_all_snapshots()"?). Ciao, Dscho
git diff --name-status for deleted files
Hello, "git diff --name-status" is useful to list the files one has changed but it does not list files that one has deleted with "git rm". It would be really handy if it did. I am using git 2.9.3 on Ubuntu Linux 16.10. Yours Sincerely, Gene Thomas.
Re: RFC v3: Another proposed hash function transition plan
Hi Jonathan, On Thu, 14 Sep 2017, Jonathan Nieder wrote: > Johannes Schindelin wrote: > > On Wed, 13 Sep 2017, Jonathan Nieder wrote: > > >> [3] https://www.imperialviolet.org/2017/05/31/skipsha3.html, > > > > I had read this short after it was published, and had missed the updates. > > One link in particular caught my eye: > > > > https://eprint.iacr.org/2012/476 > > > > Essentially, the authors demonstrate that using SIMD technology can speed > > up computation by factor 2 for longer messages (2kB being considered > > "long" already). It is a little bit unclear to me from a cursory look > > whether their fast algorithm computes SHA-256, or something similar. > > The latter: that paper is about a variant on SHA-256 called SHA-256x4 > (or SHA-256x16 to take advantage of newer instructions). It's a > different hash function. This is what I was alluding to at [1]. Thanks for the explanation! > > As the author of that paper is also known to have contributed to OpenSSL, > > I had a quick look and it would appear that a comment in > > crypto/sha/asm/sha256-mb-x86_64.pl speaking about "lanes" suggests that > > OpenSSL uses the ideas from the paper, even if b783858654 (x86_64 assembly > > pack: add multi-block AES-NI, SHA1 and SHA256., 2013-10-03) does not talk > > about the paper specifically. > > > > The numbers shown in > > https://github.com/openssl/openssl/blob/master/crypto/sha/asm/keccak1600-x86_64.pl#L28 > > and in > > https://github.com/openssl/openssl/blob/master/crypto/sha/asm/sha256-mb-x86_64.pl#L17 > > > > are sufficiently satisfying. > > This one is about actual SHA-256, but computing the hash of multiple > streams in a single funtion call. The paper to read is [2]. We could > probably take advantage of it for e.g. bulk-checkin and index-pack. > Most other code paths that compute hashes wouldn't be able to benefit > from it. Again, thanks for the explanation. 
Ciao, Dscho > [1] > https://public-inbox.org/git/20170616212414.gc133...@aiede.mtv.corp.google.com/ > [2] https://eprint.iacr.org/2012/371 >
git fast-export/import bug with -M -C
The commands should be self-explanatory. 0.2.0~20 is the first commit where the reconstructed repository diverges; that commit had a simultaneous copy and edit of one file. It seems that copy/rename detection, enabled with -M -C, is confused by this. I reproduced it with git 2.14 next @ 8fa685d. git clone https://github.com/googlecartographer/cartographer_ros mkdir copy && cd copy && git init (cd ../cartographer_ros; git fast-export --all --date-order -M -C) | git fast-import git rev-parse 0.2.0~20 #2237e1d0a974977fbcb0737dd1fb5876a2b8e29d git rev-parse 0.2.0~21 #cd1276a99ccffcc491d0b2e50296ec04347ba5f2 cd ../cartographer_ros git rev-parse 0.2.0~20 #9d5b221ed41783b15c84bc90b71527194b8d9a49 git rev-parse 0.2.0~21 #cd1276a99ccffcc491d0b2e50296ec04347ba5f2
Re: RFC v3: Another proposed hash function transition plan
Hi, On Thu, 14 Sep 2017, demerphq wrote: > On 14 September 2017 at 17:23, Johannes Schindelin wrote: > > > > SHA-256 has been hammered on a lot more than SHA3-256. > > Last year that was even more true of SHA1 than it is true of SHA-256 > today. I hope you are not deliberately trying to annoy me. I say that because you seemed to be interested enough in cryptography to know that the known attacks on SHA-256 *today* are unlikely to extend to Git's use case, whereas the known attacks on SHA-1 *in 2005* were already raising doubts. So while SHA-1 has been hammered on for longer than SHA-256, the latter came out a lot less scathed than the former. Besides, you are totally missing the point here that the choice is *not* between SHA-1 and SHA-256, but between SHA-256 and SHA3-256. After all, we would not consider any hash algorithm with known problems (as far as Git's usage is concerned). The amount of scrutiny with which the algorithm was investigated would only be a deciding factor among the remaining choices, yes? In any case, don't trust me on cryptography (just like I do not trust you on that matter). Trust the cryptographers. I contacted some of my colleagues who are responsible for crypto, and the two who seem to disagree on pretty much everything agreed on this one thing: that SHA-256 would be a good choice for Git (and one of them suggested that it would be much better than SHA3-256, because SHA-256 saw more cryptanalysis). Ciao, Johannes
Re: [PATCH 00/20] Read `packed-refs` using mmap()
Hi Michael, On Wed, 13 Sep 2017, Michael Haggerty wrote: > * `mmap()` the whole file rather than `read()`ing it. On Windows, a memory-mapped file cannot be renamed. As a consequence, the following tests fail on `pu`: t1400-update-ref.sh t1404-update-ref-errors.sh t1405-main-ref-store.sh t1408-packed-refs.sh t1410-reflog.sh t1430-bad-ref-name.sh t1450-fsck.sh t1507-rev-parse-upstream.sh t2018-checkout-branch.sh t2020-checkout-detach.sh t2024-checkout-dwim.sh t3200-branch.sh t3204-branch-name-interpretation.sh t3210-pack-refs.sh t3211-peel-ref.sh t3306-notes-prune.sh t3400-rebase.sh t3404-rebase-interactive.sh t3420-rebase-autostash.sh t3903-stash.sh t3905-stash-include-untracked.sh t4151-am-abort.sh t5304-prune.sh t5312-prune-corruption.sh t5400-send-pack.sh t5404-tracking-branches.sh t5500-fetch-pack.sh t5505-remote.sh t5510-fetch.sh t5514-fetch-multiple.sh t5515-fetch-merge-logic.sh t5516-fetch-push.sh t5520-pull.sh t5533-push-cas.sh t5572-pull-submodule.sh t5600-clone-fail-cleanup.sh t5607-clone-bundle.sh t6016-rev-list-graph-simplify-history.sh t6030-bisect-porcelain.sh t6032-merge-large-rename.sh t6040-tracking-info.sh t6050-replace.sh t6500-gc.sh t6501-freshen-objects.sh t7003-filter-branch.sh t7004-tag.sh t7102-reset.sh t7201-co.sh t7406-submodule-update.sh t7504-commit-msg-hook.sh t7701-repack-unpack-unreachable.sh t9300-fast-import.sh t9902-completion.sh t9903-bash-prompt.sh This diff:
-- snip --
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index f9c5e771bdd..ee8b3603624 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1306,13 +1308,13 @@ static int packed_transaction_finish(struct ref_store *ref_store,
 	char *packed_refs_path;
 
 	packed_refs_path = get_locked_file_path(&refs->lock);
+	clear_snapshot(refs);
 	if (rename_tempfile(&refs->tempfile, packed_refs_path)) {
 		strbuf_addf(err, "error replacing %s: %s", refs->path, strerror(errno));
 		goto cleanup;
 	}
-	clear_snapshot(refs);
 	ret = 0;
 cleanup:
-- snap --
reduces the failed tests to
t1410-reflog.counts.sh t3210-pack-refs.counts.sh t3211-peel-ref.counts.sh t5505-remote.counts.sh t5510-fetch.counts.sh t5600-clone-fail-cleanup.counts.sh However, much as I tried to wrap my head around it, I could not even start to come up with a solution for e.g. the t1410 regression. The failure happens in `git branch -d one/two`. The culprit: there is yet another unreleased snapshot while we try to finish the transaction. It is the snapshot of the main_worktree_ref_store as acquired by add_head_info() (which is called from get_main_worktree(), which in turn was called from get_worktrees(), in turn having been called from find_shared_symref() that was called from delete_branches()), and it seems to me that there was never any plan to have that released, including its snapshot. And the snapshot gets initialized upon add_head_info() calling refs_resolve_ref_unsafe(). Further down in the delete_branches() function, however, we call delete_ref() which in turn wants to overwrite the packed-refs file via an atomic rename. That rename fails now because there is still that main worktree's ref_store that has the snapshot mmap()ed . I imagine that the rest of the test failures stems from the same root cause. Do you have any idea how to ensure that all mmap()ed packed refs are released before attempting to finish a transaction? Ciao, Dscho
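On the question raised in this thread — whether you can close a file while keeping it mmapped — POSIX explicitly allows it: only munmap() releases the mapping, and closing the descriptor does not invalidate it. A small sketch of the technique (POSIX-only, illustrative; the helper name `map_whole_file` is invented here, this is not the actual packed-backend code):

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * mmap() a whole file read-only and close the descriptor right away.
 * Per POSIX, the mapping remains valid until munmap().  Windows is
 * the odd one out: it additionally refuses to let a still-mapped
 * file be replaced via rename, which is what breaks the tests
 * listed above.
 */
static void *map_whole_file(const char *path, size_t *len)
{
	struct stat st;
	void *map;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return NULL;
	if (fstat(fd, &st) < 0 || st.st_size == 0) {
		close(fd);
		return NULL;
	}
	*len = (size_t)st.st_size;
	map = mmap(NULL, *len, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd); /* fine on POSIX: the mapping survives the close */
	return map == MAP_FAILED ? NULL : map;
}
```

Note that even if this works on Windows, it would not by itself fix the failures described above: any long-lived snapshot still pins the old file contents, so something like the proposed "clear_all_snapshots()" would still be needed before the rename.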
Re: RFC v3: Another proposed hash function transition plan
Johannes Schindelin wrote: > On Wed, 13 Sep 2017, Jonathan Nieder wrote: >> As a side note, I am probably misreading, but I found this set of >> paragraphs a bit condescending. It sounds to me like you are saying >> "You are making the wrong choice of hash function and everything else >> you are describing is irrelevant when compared to that monumental >> mistake. Please stop working on things I don't consider important". >> With that reading it is quite demotivating to read. > > I am sorry you read it that way. I did not feel condescending when I wrote > that mail, I felt annoyed by the side track, and anxious. In my mind, the > transition is too important for side tracking, and I worry that we are not > fast enough (imagine what would happen if a better attack was discovered > that is not as easily detected as the one we know about?). Thanks for clarifying. That makes sense. [...] > As to *my* opinion: after reading https://goo.gl/gh2Mzc (is it really > correct that its last update has been on March 6th?), my only concern is > really that it still talks about SHA3-256 when I think that the > performance benefits of SHA-256 (think: "Git at scale", and also hardware > support) really make the latter a better choice. > > In order to be "ironed out", I think we need to talk about the > implementation detail "Translation table". This is important. It needs to > be *fast*. > > Speaking of *fast*, I could imagine that it would make sense to store the > SHA-1 objects on disk, still, instead of converting them on the fly. I am > not sure whether this is something we need to define in the document, > though, as it may very well be premature optimization; Maybe mention that > we could do this if necessary? > > Apart from that, I would *love* to see this document as The Official Plan > that I can Show To The Manager so that I can ask to Allocate Time. Sounds promising! Thanks much for this feedback. This is very helpful for knowing what v4 of the doc needs. 
The discussion of the translation table in [1] didn't make it to the doc. You're right that it needs to. Caching SHA-1 objects (and the pros and cons involved) makes sense to mention in an "ideas for future work" section. An implementation plan with well-defined pieces for people to take on and estimates of how much work each involves may be useful for Showing To The Manager. So I'll include a sketch of that for reviewers to poke holes in, too. Another thing the doc doesn't currently describe is how Git protocol would work. That's worth sketching in a "future work" section as well. Sorry it has been taking so long to get this out. I think we should have something ready to send on Monday. Thanks, Jonathan [1] https://public-inbox.org/git/CAJo=hJtoX9=aylhhpujs7fuev9ciz_mnpnephuz8whui6g9...@mail.gmail.com/
Re: RFC v3: Another proposed hash function transition plan
Hi Linus, On Wed, 13 Sep 2017, Linus Torvalds wrote: > On Wed, Sep 13, 2017 at 6:43 AM, demerphq wrote: > > > > SHA3 however uses a completely different design where it mixes a 1088 > > bit block into a 1600 bit state, for a leverage of 2:3, and the excess > > is *preserved between each block*. > > Yes. And considering that the SHA1 attack was actually predicated on > the fact that each block was independent (no extra state between), I > do think SHA3 is a better model. > > So I'd rather see SHA3-256 than SHA256. SHA-256 got much more cryptanalysis than SHA3-256, and apart from the length-extension problem that does not affect Git's usage, there are no known weaknesses so far. It would seem that the experts I talked to were much more concerned about that amount of attention than the particulars of the algorithm. My impression was that the new features of SHA3 were less studied than the well-known features of SHA2, and that the new-ness of SHA3 is not necessarily a good thing. You will have to deal with the fact that I trust the crypto experts' opinion on this a lot more than your opinion. Sure, you learned from the fact that you had been warned about SHA-1 already seeing theoretical attacks in 2005 and still choosing to hard-wire it into Git. And yet, you are still no more of a cryptography expert than I am. Ciao, Dscho
Re: RFC v3: Another proposed hash function transition plan
Hi, Johannes Schindelin wrote: > On Wed, 13 Sep 2017, Jonathan Nieder wrote: >> [3] https://www.imperialviolet.org/2017/05/31/skipsha3.html, > > I had read this shortly after it was published, and had missed the updates. > One link in particular caught my eye: > > https://eprint.iacr.org/2012/476 > > Essentially, the authors demonstrate that using SIMD technology can speed > up computation by factor 2 for longer messages (2kB being considered > "long" already). It is a little bit unclear to me from a cursory look > whether their fast algorithm computes SHA-256, or something similar. The latter: that paper is about a variant on SHA-256 called SHA-256x4 (or SHA-256x16 to take advantage of newer instructions). It's a different hash function. This is what I was alluding to at [1]. > As the author of that paper is also known to have contributed to OpenSSL, > I had a quick look and it would appear that a comment in > crypto/sha/asm/sha256-mb-x86_64.pl speaking about "lanes" suggests that > OpenSSL uses the ideas from the paper, even if b783858654 (x86_64 assembly > pack: add multi-block AES-NI, SHA1 and SHA256., 2013-10-03) does not talk > about the paper specifically. > > The numbers shown in > https://github.com/openssl/openssl/blob/master/crypto/sha/asm/keccak1600-x86_64.pl#L28 > and in > https://github.com/openssl/openssl/blob/master/crypto/sha/asm/sha256-mb-x86_64.pl#L17 > > are sufficiently satisfying. This one is about actual SHA-256, but computing the hash of multiple streams in a single function call. The paper to read is [2]. We could probably take advantage of it for e.g. bulk-checkin and index-pack. Most other code paths that compute hashes wouldn't be able to benefit from it. Thanks, Jonathan [1] https://public-inbox.org/git/20170616212414.gc133...@aiede.mtv.corp.google.com/ [2] https://eprint.iacr.org/2012/371
Re: RFC v3: Another proposed hash function transition plan
Hi Jonathan, On Wed, 13 Sep 2017, Jonathan Nieder wrote: > [3] https://www.imperialviolet.org/2017/05/31/skipsha3.html, I had read this shortly after it was published, and had missed the updates. One link in particular caught my eye: https://eprint.iacr.org/2012/476 Essentially, the authors demonstrate that using SIMD technology can speed up computation by factor 2 for longer messages (2kB being considered "long" already). It is a little bit unclear to me from a cursory look whether their fast algorithm computes SHA-256, or something similar. As the author of that paper is also known to have contributed to OpenSSL, I had a quick look and it would appear that a comment in crypto/sha/asm/sha256-mb-x86_64.pl speaking about "lanes" suggests that OpenSSL uses the ideas from the paper, even if b783858654 (x86_64 assembly pack: add multi-block AES-NI, SHA1 and SHA256., 2013-10-03) does not talk about the paper specifically. The numbers shown in https://github.com/openssl/openssl/blob/master/crypto/sha/asm/keccak1600-x86_64.pl#L28 and in https://github.com/openssl/openssl/blob/master/crypto/sha/asm/sha256-mb-x86_64.pl#L17 are sufficiently satisfying. Ciao, Dscho
Re: [PATCH v2] test-lib: don't use ulimit in test prerequisites on cygwin
Ramsay Jones wrote: > On cygwin (and MinGW), the 'ulimit' built-in bash command does not have > the desired effect of limiting the resources of new processes, at least > for the stack and file descriptors. However, it always returns success > and leads to several test prerequisites being erroneously set to true. > > Add a check for cygwin and MinGW to the prerequisite expressions, using > a 'test_have_prereq !MINGW,!CYGWIN' clause, to guard against using ulimit. > This affects the prerequisite expressions for the ULIMIT_STACK_SIZE, > CMDLINE_LIMIT and ULIMIT_FILE_DESCRIPTORS prerequisites. > > Signed-off-by: Ramsay Jones > --- > t/t1400-update-ref.sh | 5 - > t/t6120-describe.sh | 1 - > t/t7004-tag.sh | 1 - > t/test-lib.sh | 10 -- > 4 files changed, 12 insertions(+), 5 deletions(-) Reviewed-by: Jonathan Nieder Thank you.
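For background on what these prerequisites expect: on POSIX systems the shell's ulimit builtin is essentially a wrapper around setrlimit(), whose effect is directly observable. A sketch (illustrative and POSIX-only; the helper name is invented, and it assumes descriptors 0-2 are open) of the RLIMIT_NOFILE effect that the Cygwin/MinGW bash builtin reports success for but does not actually deliver:

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Lower the soft file-descriptor limit to `n`, check whether a new
 * open() is really refused with EMFILE, then restore the old limit.
 * Returns 1 if the limit was enforced, 0 if not, -1 on setup error.
 * This is the effect `ulimit -n` should have; on Cygwin/MinGW the
 * builtin returns success without it, so prerequisites keyed off
 * ulimit's exit code were being set wrongly.
 */
static int open_blocked_at_limit(rlim_t n)
{
	struct rlimit old, rl;
	int fd, blocked;

	if (getrlimit(RLIMIT_NOFILE, &old) < 0)
		return -1;
	rl = old;
	rl.rlim_cur = n;
	if (setrlimit(RLIMIT_NOFILE, &rl) < 0)
		return -1;
	fd = open("/dev/null", O_RDONLY);
	blocked = (fd < 0 && errno == EMFILE);
	if (fd >= 0)
		close(fd);
	setrlimit(RLIMIT_NOFILE, &old); /* restore for the rest of the process */
	return blocked;
}
```

This also suggests an alternative to platform lists: a prerequisite could probe the effect itself (run ulimit in a subshell and check that the limited operation really fails) rather than trust ulimit's exit status, at the cost of an extra test process.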
Re: [PATCH v5 25/40] external-odb: add 'get_direct' support
On Thu, 14 Sep 2017 10:39:35 +0200 Christian Couder wrote: > From the following email: > > https://public-inbox.org/git/20170804145113.5ceaf...@twelve2.svl.corp.google.com/ > > it looks like his work is fundamentally about changing the rules of > connectivity checks. Objects are split between "homegrown" objects and > "imported" objects which are in separate pack files. Then references > to imported objects are not checked during connectivity check. > > I think changing connectivity rules is not necessary to make something > like external odb work. For example when fetching a pack that refers > to objects that are in an external odb, if access to this external odb > has been configured, then the connectivity check will pass as the > missing objects in the pack will be seen as already part of the repo. There are still some nuances. For example, if an external ODB provides both a tree and a blob that the tree references, do we fetch the tree in order to call "have" on all its blobs, or do we trust the ODB that if it has the tree, it has all the other objects? In my design, I do the latter, but in the general case where we have multiple ODBs, we might have to do the former. (And if we do the former, it seems to me that the connectivity check must be performed "online" - that is, with the ODBs being able to provide "get".) (Also, my work extends all the way to fetch/clone [1], but admittedly I have been taking it a step at a time and recently have only been discussing how the local repo should handle the missing object situation.) [1] https://public-inbox.org/git/cover.1499800530.git.jonathanta...@google.com/ > Yeah, if some commands like fsck are used, then possibly all the > objects will have to be requested from the external odb, as it may not > be possible to fully check all the objects, especially the blobs, > without accessing all their data. But I think this is a problem that > could be dealt with in different ways. 
For example we could develop > specific options in fsck so that it doesn't check the sha1 of objects > that are marked with some specific attributes, or that are stored in > external odbs, or that are bigger than some size. The hard part is in dealing with missing commits and trees, I think, not blobs.
Re: Rebase & submodules
On Thu, Sep 14, 2017 at 10:57 AM, Nicolas Morey-Chaisemartin wrote: > Without changing your workflow too much, If you mean to imply that you have other recommendations if I'm willing to change my workflow, then please by all means share them. I'm very interested. I'm not too hooked on my workflow. > simply add an annotated tag to your branch before your rebase. > This way the SHA1 will always exist. Unless you want to clean up at some > point (branch merged?) and then you can simply delete all those old tags. This is definitely the best idea so far, although the maintenance overhead of this could be high for long-lived branches with frequent rebases. Maybe with bigger workflow changes there are other solutions?
Re: [PATCH 2/2] test-lib: ulimit does not limit on CYGWIN and MINGW
On 14/09/17 15:52, Michael J Gruber wrote: > ulimit succeeds (by return value) but does not limit on some systems. > > Set ulimit() to false on these systems so that we do not rely on its > output nor effect. As an intended side-effect, ulimit based > prerequisites are set correctly (to "not-have") on these systems. > > Reported-by: Ramsay Jones > Reported-by: Adam Dinwoodie > Reported-by: Johannes Schindelin > Signed-off-by: Michael J Gruber > --- > This is independent of my series, but should best go before so that no > ulimit based test is run on CYGWIN and MINGW. > > It follows the basic assumption that a tool like ulimit is either > present and functional or not present; and that we work around by > defines or such when that assumption is broken. > (Alternatively, we could set ULIMIT_LIMITS or so and depend on that.) Heh, this was my first suggestion, if you recall, but I decided to go a different way ... ;-) Also, Johannes made a good suggestion, which led to a new version of my patch (which could easily be extended to cover the FIFO). I don't have a strong preference for either approach (but I would have to test your patches, which I haven't done yet), so I would be happy to see either applied. ATB, Ramsay Jones
[PATCH v2] test-lib: don't use ulimit in test prerequisites on cygwin
On cygwin (and MinGW), the 'ulimit' built-in bash command does not have the desired effect of limiting the resources of new processes, at least for the stack and file descriptors. However, it always returns success and leads to several test prerequisites being erroneously set to true. Add a check for cygwin and MinGW to the prerequisite expressions, using a 'test_have_prereq !MINGW,!CYGWIN' clause, to guard against using ulimit. This affects the prerequisite expressions for the ULIMIT_STACK_SIZE, CMDLINE_LIMIT and ULIMIT_FILE_DESCRIPTORS prerequisites. Signed-off-by: Ramsay Jones --- Hi Junio, Johannes suggested the improvement to this patch (namely using the 'test_have_prereq !MINGW,!CYGWIN' sub-expression), which makes it somewhat more compact and obvious. Thanks Johannes! ATB, Ramsay Jones t/t1400-update-ref.sh | 5 - t/t6120-describe.sh | 1 - t/t7004-tag.sh | 1 - t/test-lib.sh | 10 -- 4 files changed, 12 insertions(+), 5 deletions(-) diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh index dc98b4bc6..664a3a4e4 100755 --- a/t/t1400-update-ref.sh +++ b/t/t1400-update-ref.sh @@ -1253,7 +1253,10 @@ run_with_limited_open_files () { (ulimit -n 32 && "$@") } -test_lazy_prereq ULIMIT_FILE_DESCRIPTORS 'run_with_limited_open_files true' +test_lazy_prereq ULIMIT_FILE_DESCRIPTORS ' + test_have_prereq !MINGW,!CYGWIN && + run_with_limited_open_files true +' test_expect_success ULIMIT_FILE_DESCRIPTORS 'large transaction creating branches does not burst open file limit' ' ( diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh index dd6dd9df9..3d45dc295 100755 --- a/t/t6120-describe.sh +++ b/t/t6120-describe.sh @@ -279,7 +279,6 @@ test_expect_success 'describe ignoring a borken submodule' ' grep broken out ' -# we require ulimit, this excludes Windows test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' ' i=1 && while test $i -lt 8000 diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh index 5bf5ace56..b545c33f8 100755 --- a/t/t7004-tag.sh +++ b/t/t7004-tag.sh
@@ -1863,7 +1863,6 @@ test_expect_success 'version sort with very long prerelease suffix' ' git tag -l --sort=version:refname ' -# we require ulimit, this excludes Windows test_expect_success ULIMIT_STACK_SIZE '--contains and --no-contains work in a deep repo' ' >expect && i=1 && diff --git a/t/test-lib.sh b/t/test-lib.sh index 83f5d3dd2..32f9b2001 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -1170,13 +1170,19 @@ run_with_limited_cmdline () { (ulimit -s 128 && "$@") } -test_lazy_prereq CMDLINE_LIMIT 'run_with_limited_cmdline true' +test_lazy_prereq CMDLINE_LIMIT ' + test_have_prereq !MINGW,!CYGWIN && + run_with_limited_cmdline true +' run_with_limited_stack () { (ulimit -s 128 && "$@") } -test_lazy_prereq ULIMIT_STACK_SIZE 'run_with_limited_stack true' +test_lazy_prereq ULIMIT_STACK_SIZE ' + test_have_prereq !MINGW,!CYGWIN && + run_with_limited_stack true +' build_option () { git version --build-options | -- 2.14.0
[PATCH v2] commit-tree: do not complete line in -F input
"git commit-tree -F <file>", unlike "cat <file> | git commit-tree" (i.e. feeding the same contents from the standard input), added a missing final newline when the input ended in an incomplete line. Correct this inconsistency by leaving the incomplete line as-is, as erring on the side of not touching the input is preferable and expected for a plumbing command like "commit-tree". Signed-off-by: Ross Kabus --- Fixed up commit message with feedback from Junio. builtin/commit-tree.c | 1 - 1 file changed, 1 deletion(-) diff --git a/builtin/commit-tree.c b/builtin/commit-tree.c index 19e898fa4..2177251e2 100644 --- a/builtin/commit-tree.c +++ b/builtin/commit-tree.c @@ -102,7 +102,6 @@ int cmd_commit_tree(int argc, const char **argv, const char *prefix) if (fd && close(fd)) die_errno("git commit-tree: failed to close '%s'", argv[i]); - strbuf_complete_line(&buffer); continue; } -- 2.13.1.windows.2
Re: RFC v3: Another proposed hash function transition plan
On 09/14, Johannes Schindelin wrote: > Hi Jonathan, > > On Wed, 13 Sep 2017, Jonathan Nieder wrote: > > > As a side note, I am probably misreading, but I found this set of > > paragraphs a bit condescending. It sounds to me like you are saying > > "You are making the wrong choice of hash function and everything else > > you are describing is irrelevant when compared to that monumental > > mistake. Please stop working on things I don't consider important". > > With that reading it is quite demotivating to read. > > I am sorry you read it that way. I did not feel condescending when I wrote > that mail, I felt annoyed by the side track, and anxious. In my mind, the > transition is too important for side tracking, and I worry that we are not > fast enough (imagine what would happen if a better attack was discovered > that is not as easily detected as the one we know about?). > > > An alternative reading is that you are saying that the transition plan > > described in this thread is not ironed out. Can you spell that out > > more? What particular aspect of the transition plan (which is of > > course orthogonal to the choice of hash function) are you discontent > > with? > > My impression from reading Junio's mail was that he does not consider the > transition plan ironed out yet, and that he wants to spend time on > discussing generation numbers right now. > > I was particularly frightened by the suggestion to "reboot" [*1*]. > Hopefully I misunderstand and he meant "finishing touches" instead. > > As to *my* opinion: after reading https://goo.gl/gh2Mzc (is it really > correct that its last update has been on March 6th?), my only concern is > really that it still talks about SHA3-256 when I think that the > performance benefits of SHA-256 (think: "Git at scale", and also hardware > support) really make the latter a better choice. > > In order to be "ironed out", I think we need to talk about the > implementation detail "Translation table". This is important.
It needs to > be *fast*. Agreed. When that document was written it was hand-waved as an implementation detail, but we should probably start ironing out those details soon so that we have a concrete plan in place. > > Speaking of *fast*, I could imagine that it would make sense to store the > SHA-1 objects on disk, still, instead of converting them on the fly. I am > not sure whether this is something we need to define in the document, > though, as it may very well be premature optimization; Maybe mention that > we could do this if necessary? > > Apart from that, I would *love* to see this document as The Official Plan > that I can Show To The Manager so that I can ask to Allocate Time. Speaking of having a concrete plan, we discussed in the office the other day about finally converting the doc into a Documentation patch. That was always our intention but after writing up the doc we got busy working on other projects. Getting it in as a patch (with a more concrete road map) is probably the next step we'd need to take. I do want to echo what Jonathan has said in other parts of this thread, that the transition plan itself doesn't depend on which hash function we end up going with in the end. I fully expect that for the transition plan to succeed that we'll have infrastructure for dropping in different hash functions so that we can do some sort of benchmarking before selecting one to use. This would also give us the ability to more easily transition to another hash function when the time comes. -- Brandon Williams
Re: [PATCH 00/20] Read `packed-refs` using mmap()
On 09/13/2017 07:15 PM, Michael Haggerty wrote: > [...] > * `mmap()` the whole file rather than `read()`ing it. > [...] Apparently this doesn't work on Windows, because the `snapshot` is keeping the `packed-refs` file open too long, so the new file can't be renamed on top of it. I didn't realize that this is even allowed, but TIL that you can close a file while keeping it mmapped. Does that technique work on Windows? If so, I'll change v2 to do it as sketched below. Michael diff --git a/refs/packed-backend.c b/refs/packed-backend.c index 8235ac8506..95c1cd2a27 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -35,11 +35,8 @@ struct snapshot { */ struct packed_ref_store *refs; - /* -* The file descriptor of the `packed-refs` file (open in -* read-only mode), or -1 if it is not open. -*/ - int fd; + /* Is the `packed-refs` file currently mmapped? */ + int mmapped; /* * The contents of the `packed-refs` file. If the file was @@ -135,12 +132,11 @@ static void acquire_snapshot(struct snapshot *snapshot) */ static void clear_snapshot_buffer(struct snapshot *snapshot) { - if (snapshot->fd >= 0) { + if (snapshot->mmapped) { if (munmap(snapshot->buf, snapshot->eof - snapshot->buf)) die_errno("error ummapping packed-refs file %s", snapshot->refs->path); - close(snapshot->fd); - snapshot->fd = -1; + snapshot->mmapped = 0; } else { free(snapshot->buf); } @@ -525,6 +521,7 @@ static const char *find_reference_location(struct snapshot *snapshot, static struct snapshot *create_snapshot(struct packed_ref_store *refs) { struct snapshot *snapshot = xcalloc(1, sizeof(*snapshot)); + int fd; struct stat st; size_t size; int sorted = 0; @@ -533,8 +530,8 @@ static struct snapshot *create_snapshot(struct packed_ref_store *refs) acquire_snapshot(snapshot); snapshot->peeled = PEELED_NONE; - snapshot->fd = open(refs->path, O_RDONLY); - if (snapshot->fd < 0) { + fd = open(refs->path, O_RDONLY); + if (fd < 0) { if (errno == ENOENT) { /* * This is OK; it just means that no @@ 
-549,15 +546,16 @@ static struct snapshot *create_snapshot(struct packed_ref_store *refs) } } - stat_validity_update(&snapshot->validity, snapshot->fd); + stat_validity_update(&snapshot->validity, fd); - if (fstat(snapshot->fd, &st) < 0) + if (fstat(fd, &st) < 0) die_errno("couldn't stat %s", refs->path); size = xsize_t(st.st_size); - snapshot->buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, - snapshot->fd, 0); + snapshot->buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0); snapshot->eof = snapshot->buf + size; + snapshot->mmapped = 1; + close(fd); /* If the file has a header line, process it: */ if (snapshot->buf < snapshot->eof && *snapshot->buf == '#') {
Re: Rebase & submodules
On 14/09/2017 17:39, Robert Dailey wrote: > So I often will have a submodule that points to one of my own forks, > because I will have work done on a feature branch that hasn't been > merged upstream yet. Assuming this merge takes a long time to get > approved, I will occasionally rebase my topic branch to keep things up > to date and clean. > > However, any previous commits in my parent repository that refer to a > SHA1 prior to the rebase will now be pointing to dangling/orphaned > commits, which means I wouldn't be able to go back to that commit in > history and do `git submodule update` since that checkout will fail. > > One obvious solution to this is: Don't rebase. But, this could result > in a lot of merging between the upstream and my fork which would make > things not only ugly, but prevent me from making a pull request that > makes sense to the upstream repository (They'd likely request a rebase > in order to accept the PR, which I wouldn't be able to do for the > reasons outlined above). > > Are there any other workflows that would support this kind of model better? Without changing your workflow too much, simply add an annotated tag to your branch before your rebase. This way the SHA1 will always exist. Unless you want to clean up at some point (branch merged?) and then you can simply delete all those old tags. Nicolas
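The annotated-tag suggestion above boils down to pinning the pre-rebase tip so it stays reachable. A self-contained sketch in a scratch repository (tag name, identity, and dates are illustrative):

```shell
# Inside the submodule: before rewriting history, pin the current tip
# with an annotated tag; old superproject commits that recorded this
# SHA1 can still check it out later.
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git -c user.name=me -c user.email=me@example.com commit -q --allow-empty -m base
git -c user.name=me -c user.email=me@example.com commit -q --allow-empty -m topic
old_tip=$(git rev-parse HEAD)
git -c user.name=me -c user.email=me@example.com tag -a -m "tip before rebase" pre-rebase-2017-09-15
# ...rebase/rewrite happens here; the old tip remains reachable:
git rev-parse pre-rebase-2017-09-15^{commit}
# later, once the branch is merged: git tag -d pre-rebase-2017-09-15
```

Since tags are refs, the tagged commit is protected from `git gc`, which is what makes `git submodule update` on old superproject commits keep working.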
Re: RFC v3: Another proposed hash function transition plan
On 14 September 2017 at 17:23, Johannes Schindelin wrote: > Hi Junio, > > On Thu, 14 Sep 2017, Junio C Hamano wrote: > >> Jonathan Nieder writes: >> >> > In other words, a long lifetime for the hash absolutely is a design >> > goal. Coping well with an unexpectedly short lifetime for the hash is >> > also a design goal. >> > >> > If the hash function lasts 10 years then I am happy. >> >> Absolutely. When two functions have similar expected remaining life >> and are equally widely supported, then faster is better than slower. >> Otherwise our primary goal when picking the function from candidates >> should be to optimize for its remaining life and wider availability. > > SHA-256 has been hammered on a lot more than SHA3-256. Last year that was even more true of SHA1 than it is true of SHA-256 today. Anyway, Yves -- perl -Mre=debug -e "/just|another|perl|hacker/"
Rebase & submodules
So I often will have a submodule that points to one of my own forks, because I will have work done on a feature branch that hasn't been merged upstream yet. Assuming this merge takes a long time to get approved, I will occasionally rebase my topic branch to keep things up to date and clean. However, any previous commits in my parent repository that refer to a SHA1 prior to the rebase will now be pointing to dangling/orphaned commits, which means I wouldn't be able to go back to that commit in history and do `git submodule update` since that checkout will fail. One obvious solution to this is: Don't rebase. But, this could result in a lot of merging between the upstream and my fork which would make things not only ugly, but prevent me from making a pull request that makes sense to the upstream repository (They'd likely request a rebase in order to accept the PR, which I wouldn't be able to do for the reasons outlined above). Are there any other workflows that would support this kind of model better?
Re: RFC v3: Another proposed hash function transition plan
Hi Junio, On Thu, 14 Sep 2017, Junio C Hamano wrote: > Jonathan Nieder writes: > > > In other words, a long lifetime for the hash absolutely is a design > > goal. Coping well with an unexpectedly short lifetime for the hash is > > also a design goal. > > > > If the hash function lasts 10 years then I am happy. > > Absolutely. When two functions have similar expected remaining life > and are equally widely supported, then faster is better than slower. > Otherwise our primary goal when picking the function from candidates > should be to optimize for its remaining life and wider availability. SHA-256 has been hammered on a lot more than SHA3-256. That would be a strong point in favor of SHA2. Ciao, Dscho
[PATCH 2/2] test-lib: ulimit does not limit on CYGWIN and MINGW
ulimit succeeds (by return value) but does not limit on some systems. Set ulimit() to false on these systems so that we do not rely on its output nor effect. As an intended side-effect, ulimit based prerequisites are set correctly (to "not-have") on these systems. Reported-by: Ramsay Jones Reported-by: Adam Dinwoodie Reported-by: Johannes Schindelin Signed-off-by: Michael J Gruber --- This is independent of my series, but should best go before so that no ulimit based test is run on CYGWIN and MINGW. It follows the basic assumption that a tool like ulimit is either present and functional or not present; and that we work around by defines or such when that assumption is broken. (Alternatively, we could set ULIMIT_LIMITS or so and depend on that.) t/test-lib.sh | 8 1 file changed, 8 insertions(+) diff --git a/t/test-lib.sh b/t/test-lib.sh index b8a0b05102..f6ac60d487 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -998,6 +998,10 @@ case $uname_s in mkfifo () { false } + # ulimit succeeds but does not limit + ulimit () { + false + } # no POSIX permissions # backslashes in pathspec are converted to '/' # exec does not inherit the PID @@ -1012,6 +1016,10 @@ case $uname_s in mkfifo () { false } + # ulimit succeeds but does not limit + ulimit () { + false + } test_set_prereq POSIXPERM test_set_prereq EXECKEEPSPID test_set_prereq CYGWIN -- 2.14.1.712.gda4591c8a2
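The patch's mechanism in miniature: a shell function named after a builtin shadows it, including inside the `( ulimit ... && cmd )` subshells the prerequisites use, so the subshell now fails fast instead of silently not limiting. A standalone sketch (variable name is made up):

```shell
# Shadow the ulimit builtin with a function that always fails, as the
# patch does on CYGWIN/MINGW where the builtin "succeeds" but has no
# effect; the prerequisite-style subshell then reports no-ulimit.
ulimit () { false; }
if (ulimit -n 32 && true); then result=limited; else result=no-ulimit; fi
echo "$result"
```

This is why redefining the function in test-lib.sh is enough to flip every ulimit-based lazy prerequisite to "not-have" without touching the individual tests.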
[PATCH 1/2] test-lib: group system specific FIFO tests by system
test-lib determines whether a file-system supports FIFOs and needs to do special casing for CYGWIN and MINGW. This separates those system-specific settings from those at a more central place. Set mkfifo() to false in the central system-specific place so that the same test works everywhere. Signed-off-by: Michael J Gruber --- t/test-lib.sh | 17 + 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/t/test-lib.sh b/t/test-lib.sh index 5fbd8d4a90..b8a0b05102 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -994,6 +994,10 @@ case $uname_s in pwd () { builtin pwd -W } + # no FIFOs + mkfifo () { + false + } # no POSIX permissions # backslashes in pathspec are converted to '/' # exec does not inherit the PID @@ -1004,6 +1008,10 @@ case $uname_s in GIT_TEST_CMP=mingw_test_cmp ;; *CYGWIN*) + # no FIFOs + mkfifo () { + false + } test_set_prereq POSIXPERM test_set_prereq EXECKEEPSPID test_set_prereq CYGWIN @@ -1062,14 +1070,7 @@ test_i18ngrep () { test_lazy_prereq PIPE ' # test whether the filesystem supports FIFOs - case $(uname -s) in - CYGWIN*|MINGW*) - false - ;; - *) - rm -f testfifo && mkfifo testfifo - ;; - esac + rm -f testfifo && mkfifo testfifo ' test_lazy_prereq SYMLINKS ' -- 2.14.1.712.gda4591c8a2
Re: [PATCH v2] commit-template: change a message to be more intuitive
On Thu, Sep 14, 2017 at 07:47:51PM +0530, Kaartic Sivaraam wrote: > > > - const char *explanation = _("Do not touch the line above.\nEverything > > > below will be removed."); > > > + const char *explanation = _("Do not modify or remove the line > > > above.\nEverything below will be removed."); > > I don't want to complicate things. But now - due to the repeated usage > > of "remove" - these two sentences seem to be connected by an invisible > > "or else" ("will" vs. "would" not withstanding). i.e. "or else > > everything below will be removed, too", which is wrong. Before, they > > were separated more clearly. > > At least I don't think they are connected by an invisible "or else" because > they appear on different lines, You could switch out the second "removed" for "ignored", which I think has the same meaning but avoids repeating the same word. So: # >8 # Do not modify or remove the line above. # Everything below will be ignored. (I might say "everything below it" to be more precise). I dunno. We are deep in bikeshed territory here, and I admit I don't overly care that much in the first place. -Peff
Re: [PATCH 0/3] merge-base --fork-point fixes
On Thu, Sep 14, 2017 at 03:15:17PM +0200, Michael J Gruber wrote: > merge-base --fork-point does not quite work as advertised when the > reflog is empty or partial. This series brings it in line with the > documentation and, hopefully, with the original intent. > > Michael J Gruber (3): > t6010: test actual test output > merge-base: return fork-point outside reflog > merge-base: find fork-point outside partial reflog Patches 1 and 3 look sane to me. Your argument in patch 2 looks plausible to me, but I'm not familiar enough with the mechanics of fork-point to have a solid opinion. -Peff
Re: [PATCH 3/3] merge-base: find fork-point outside partial reflog
On Thu, Sep 14, 2017 at 03:15:20PM +0200, Michael J Gruber wrote: > In fork-point mode, merge-base adds the commit at refname to a list of > candidates to walk from only when the reflog is empty. Therefore, it > fails to find merge bases that are descendants of the most recent reflog > entry. > > Add the commit at the tip unconditionally so that a merge base on the > history of refname is found in any case, independent of the state of the > reflog. OK, so the reflog we are interested in here is one where the tip commit is not mentioned. I'm not sure how one gets into that state, but I agree it makes sense to act as if it's there. -Peff
Re: [PATCH 1/3] t6010: test actual test output
On Thu, Sep 14, 2017 at 03:15:18PM +0200, Michael J Gruber wrote: > 4f21454b55 ("merge-base: handle --fork-point without reflog", > 2016-10-12) introduced a fix for merge-base --fork-point without reflog > and a test. While that test is fine, it did not update expected nor actual > output and tested that of the previous test instead. Oops. The use of the existing "expect3" was intentional (the idea being that we are repeating the identical step from the previous test with the slight tweak of not having a reflog). But obviously failing to write to "actual" at all was a mistake. That said, I'm OK with reiterating the expected output. -Peff
RE: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.
> From: Junio C Hamano [mailto:gits...@pobox.com] > Sent: Wednesday, September 13, 2017 4:18 PM > > Jacob Keller writes: > > > By definition if you do a sparse checkout, you're telling git "I only > > care about the files in this sparse checkout, and do not concern me > > with anything else"... So the proposed fix is "since git cleared the > > skip-worktree flag, we should actually also copy the file out again." > > but I think the real problem is that we're clearing skip worktree to > > begin with? > > As this 100% agrees with what I've been saying, I do not have > anything to add to the discussion at the moment, so I'll stop > speaking now but will continue to monitor the thread so that I may > hear a new argument and datapoint that would make me change my mind. > > Thanks for a healthy discussion. Hi Junio, Thanks for the feedback. I will give this implementation a try. I still have the following unanswered questions, if you wouldn't mind answering. 1. Does this statement, "I only care about the files in this sparse checkout, and do not concern me with anything else", mean that git should not change files outside the sparse-checkout whether that be in the working directory or in the index? Or does that only apply to the working directory, while the index version of files can change to whatever the git command would do without using sparse? For example if I am doing a 'git reset HEAD~1' should the version in the index of files outside the sparse not be changed or change to the HEAD~1 version with the skip-worktree bit on? 2. How will this work with other git commands like merge, rebase, cherry-pick, etc.? 3. What about when there is a merge conflict with a file that is outside the sparse checkout? Should there not be a conflict because git shouldn't be trying to merge the file since it is outside the sparse checkout area? Should the conflicted file get written outside the sparse checkout so the user can resolve it? Thanks, Kevin
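The skip-worktree bit under discussion can be observed directly with `git ls-files -v`, which marks such index entries with 'S'. A scratch-repo sketch (file name and identity are made up):

```shell
# Mark a tracked file skip-worktree, as a sparse checkout does for
# paths outside the sparse area, and show how the index records it.
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
echo content >outside.txt
git add outside.txt
git -c user.name=me -c user.email=me@example.com commit -q -m add
git update-index --skip-worktree outside.txt
git ls-files -v outside.txt
flag=$(git ls-files -v outside.txt | cut -c1)
```

The debate in the thread is precisely about when commands like reset are allowed to clear this bit again.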
Re: [PATCH v2] commit-template: change a message to be more intuitive
On Thursday 14 September 2017 01:06 PM, Michael J Gruber wrote: Kaartic Sivaraam venit, vidit, dixit 13.09.2017 15:05: void wt_status_add_cut_line(FILE *fp) { - const char *explanation = _("Do not touch the line above.\nEverything below will be removed."); + const char *explanation = _("Do not modify or remove the line above.\nEverything below will be removed."); I don't want to complicate things. But now - due to the repeated usage of "remove" - these two sentences seem to be connected by an invisible "or else" ("will" vs. "would" not withstanding). i.e. "or else everything below will be removed, too", which is wrong. Before, they were separated more clearly. At least I don't think they are connected by an invisible "or else" because they appear on different lines, # # >8 # Do not modify or remove the line above. # Everything below will be removed. diff --git a/hook-checks/hooks/applypatch-msg b/hook-checks/hooks/applypatch-msg new file mode 100755 index 000..a5d7b84 --- /dev/null +++ b/hook-checks/hooks/applypatch-msg Also, given all the translations that we have, it seems somewhat strange to try and foresee and workaround possible misunderstandings of commonly used English phrases. I thought the translators wouldn't mind this! If they do, then it's better to drop this patch. --- Kaartic
Re: [PATCH 0/3] merge-base --fork-point fixes
Hi Michael, On Thu, 14 Sep 2017, Michael J Gruber wrote: > merge-base --fork-point does not quite work as advertised when the > reflog is empty or partial. This series brings it in line with the > documentation and, hopefully, with the original intent. Thank you so much for working on this. I recently tried to use --fork-point in a script of mine and it failed in this surprising way, too. So here's hoping for this patch series being accepted quickly (and sorry in advance for not having time to review...). Ciao, Dscho
[PATCH 0/3] merge-base --fork-point fixes
merge-base --fork-point does not quite work as advertised when the reflog is empty or partial. This series brings it in line with the documentation and, hopefully, with the original intent. Michael J Gruber (3): t6010: test actual test output merge-base: return fork-point outside reflog merge-base: find fork-point outside partial reflog builtin/merge-base.c | 20 t/t6010-merge-base.sh | 21 +++-- 2 files changed, 23 insertions(+), 18 deletions(-) -- 2.14.1.712.gda4591c8a2
[PATCH 1/3] t6010: test actual test output
4f21454b55 ("merge-base: handle --fork-point without reflog", 2016-10-12) introduced a fix for merge-base --fork-point without reflog and a test. While that test is fine, it did not update expected nor actual output and tested that of the previous test instead. Fix the test to test the merge-base output, not just its return code. Signed-off-by: Michael J Gruber --- t/t6010-merge-base.sh | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/t/t6010-merge-base.sh b/t/t6010-merge-base.sh index 31db7b5f91..17fffd7998 100755 --- a/t/t6010-merge-base.sh +++ b/t/t6010-merge-base.sh @@ -262,8 +262,9 @@ test_expect_success 'using reflog to find the fork point' ' test_expect_success '--fork-point works with empty reflog' ' git -c core.logallrefupdates=false branch no-reflog base && - git merge-base --fork-point no-reflog derived && - test_cmp expect3 actual + git rev-parse base >expect && + git merge-base --fork-point no-reflog derived >actual && + test_cmp expect actual ' test_expect_success 'merge-base --octopus --all for complex tree' ' -- 2.14.1.712.gda4591c8a2
[PATCH 3/3] merge-base: find fork-point outside partial reflog
In fork-point mode, merge-base adds the commit at refname to a list of candidates to walk from only when the reflog is empty. Therefore, it fails to find merge bases that are descendants of the most recent reflog entry. Add the commit at the tip unconditionally so that a merge base on the history of refname is found in any case, independent of the state of the reflog. Signed-off-by: Michael J Gruber --- builtin/merge-base.c | 2 +- t/t6010-merge-base.sh | 8 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/builtin/merge-base.c b/builtin/merge-base.c index 926a7615ea..b5b4cf7eac 100644 --- a/builtin/merge-base.c +++ b/builtin/merge-base.c @@ -174,7 +174,7 @@ static int handle_fork_point(int argc, const char **argv) revs.initial = 1; for_each_reflog_ent(refname, collect_one_reflog_ent, &revs); - if (!revs.nr && !get_oid(refname, &oid)) + if (!get_oid(refname, &oid)) add_one_commit(&oid, &revs); for (i = 0; i < revs.nr; i++) diff --git a/t/t6010-merge-base.sh b/t/t6010-merge-base.sh index 850463d4f2..78342896c7 100755 --- a/t/t6010-merge-base.sh +++ b/t/t6010-merge-base.sh @@ -275,6 +275,14 @@ test_expect_success '--fork-point works with merge-base outside reflog' ' test_cmp expect actual ' +test_expect_success '--fork-point works with merge-base outside partial reflog' ' + git -c core.logallrefupdates=true branch partial-reflog base && + git rev-parse no-reflog >.git/refs/heads/partial-reflog && + git rev-parse no-reflog >expect && + git merge-base --fork-point partial-reflog no-reflog >actual && + test_cmp expect actual +' + test_expect_success 'merge-base --octopus --all for complex tree' ' # Best common ancestor for JE, JAA and JDD is JC # JE -- 2.14.1.712.gda4591c8a2
[PATCH 2/3] merge-base: return fork-point outside reflog
4f21454b55 ("merge-base: handle --fork-point without reflog", 2016-10-12) fixed the case without reflog, but only partially: The original code checks whether the merge base candidates are from the list that we started with, which was the list of reflog entries before 4f21454b55 and the list of reflog entries - or the ref itself if empty - after. The test from 4f21454b55 tested in a situation where the merge base candidate equalled the commit at refname, so it was on this (1 item) list by accident. In fact, per documentation "--fork-point" looks at the reflog in addition to doing the usual walk from the tip. The original design description in d96855ff51 ("merge-base: teach "--fork-point" mode", 2013-10-23) describes this as computing from a virtual merge-base of all the historical tips of refname. They may or may not all be present in the reflog (think pruning, non-ff fetching, fast forwarding etc.), so filtering by the current contents of the reflog is potentially harmful, and it does not seem to fulfill any purpose in the original design. Remove the filtering and add a test for an out-of-reflog merge base. Reported-by: Ekelhart Jakob Signed-off-by: Michael J Gruber --- builtin/merge-base.c | 18 +++--- t/t6010-merge-base.sh | 8 2 files changed, 11 insertions(+), 15 deletions(-) diff --git a/builtin/merge-base.c b/builtin/merge-base.c index 6dbd167d3b..926a7615ea 100644 --- a/builtin/merge-base.c +++ b/builtin/merge-base.c @@ -186,23 +186,11 @@ static int handle_fork_point(int argc, const char **argv) * There should be one and only one merge base, when we found * a common ancestor among reflog entries.
*/ - if (!bases || bases->next) { + if (!bases || bases->next) ret = 1; - goto cleanup_return; - } - - /* And the found one must be one of the reflog entries */ - for (i = 0; i < revs.nr; i++) - if (&bases->item->object == &revs.commit[i]->object) - break; /* found */ - if (revs.nr <= i) { - ret = 1; /* not found */ - goto cleanup_return; - } - - printf("%s\n", oid_to_hex(&bases->item->object.oid)); + else + printf("%s\n", oid_to_hex(&bases->item->object.oid)); -cleanup_return: free_commit_list(bases); return ret; } diff --git a/t/t6010-merge-base.sh b/t/t6010-merge-base.sh index 17fffd7998..850463d4f2 100755 --- a/t/t6010-merge-base.sh +++ b/t/t6010-merge-base.sh @@ -267,6 +267,14 @@ test_expect_success '--fork-point works with empty reflog' ' test_cmp expect actual ' +test_expect_success '--fork-point works with merge-base outside reflog' ' + git -c core.logallrefupdates=false checkout no-reflog && + git -c core.logallrefupdates=false commit --allow-empty -m "Commit outside reflogs" && + git rev-parse base >expect && + git merge-base --fork-point no-reflog derived >actual && + test_cmp expect actual +' + test_expect_success 'merge-base --octopus --all for complex tree' ' # Best common ancestor for JE, JAA and JDD is JC # JE -- 2.14.1.712.gda4591c8a2
Re: [PATCH 2/4] push, fetch: error out for submodule entries not pointing to commits
On Tue, Sep 12, 2017 at 10:30:27AM -0700, Jonathan Nieder wrote:
> From: Stefan Beller
>
> The check_has_commit helper resolves a submodule entry to a
> commit when validating its existence. As a side effect this means it
> tolerates a submodule entry pointing to a tag, which is not a valid
> submodule entry that git commands would know how to cope with.
>
> Tighten the check to require an actual commit, not a tag pointing to a
> commit.
>
> Also improve the error handling when a submodule entry points to a
> non-commit (e.g., a blob) to error out instead of warning and
> pretending the pointed-to object doesn't exist.
>
> Signed-off-by: Stefan Beller
> Signed-off-by: Jonathan Nieder

Looks good to me.

Cheers
Heiko
Re: RFC v3: Another proposed hash function transition plan
Hi Jonathan,

On Wed, 13 Sep 2017, Jonathan Nieder wrote:

> As a side note, I am probably misreading, but I found this set of
> paragraphs a bit condescending. It sounds to me like you are saying
> "You are making the wrong choice of hash function and everything else
> you are describing is irrelevant when compared to that monumental
> mistake. Please stop working on things I don't consider important".
> With that reading it is quite demotivating to read.

I am sorry you read it that way. I did not feel condescending when I
wrote that mail, I felt annoyed by the side track, and anxious. In my
mind, the transition is too important for side tracking, and I worry
that we are not fast enough (imagine what would happen if a better
attack was discovered that is not as easily detected as the one we know
about?).

> An alternative reading is that you are saying that the transition plan
> described in this thread is not ironed out. Can you spell that out
> more? What particular aspect of the transition plan (which is of
> course orthogonal to the choice of hash function) are you discontent
> with?

My impression from reading Junio's mail was that he does not consider
the transition plan ironed out yet, and that he wants to spend time on
discussing generation numbers right now. I was particularly frightened
by the suggestion to "reboot" [*1*]. Hopefully I misunderstand and he
meant "finishing touches" instead.

As to *my* opinion: after reading https://goo.gl/gh2Mzc (is it really
correct that its last update has been on March 6th?), my only concern
is really that it still talks about SHA3-256 when I think that the
performance benefits of SHA-256 (think: "Git at scale", and also
hardware support) really make the latter a better choice.

In order to be "ironed out", I think we need to talk about the
implementation detail "Translation table". This is important. It needs
to be *fast*.

Speaking of *fast*, I could imagine that it would make sense to store
the SHA-1 objects on disk, still, instead of converting them on the
fly. I am not sure whether this is something we need to define in the
document, though, as it may very well be premature optimization; maybe
mention that we could do this if necessary?

Apart from that, I would *love* to see this document as The Official
Plan that I can Show To The Manager so that I can ask to Allocate Time.

Ciao,
Dscho

Footnote *1*:
https://public-inbox.org/git/xmqqa828733s@gitster.mtv.corp.google.com/
Re: how to remove from history just *one* version of a file/dir?
On Thu, Sep 14, 2017 at 07:32:11AM -0400, Robert P. J. Day wrote:

> [is this the right place to ask questions about git usage? or is
> there a different forum where one can submit possibly embarrassingly
> silly questions?]

No, this is the right place for embarrassing questions. :)

> say, early on, one commits a sizable directory of content, call it
> /mydir. that directory sits there for a while until it becomes obvious
> it's out of date and worthless and should never have been committed.
> the obvious solution would seem to be:
>
>   $ git filter-branch --tree-filter 'rm -rf /mydir' HEAD
>
> correct?

That would work, though note that using an --index-filter would be more
efficient (since it avoids checking out each tree as it walks the
history).

> however, say one version of that directory was committed early on,
> then later tossed for being useless with "git rm", and subsequently
> replaced by newer content under exactly the same name. now i'd like to
> go back and delete the history related to that early version of
> /mydir, but not the second.

Makes sense as a goal.

> obviously, i can't use the above command as it would delete both
> versions. so it appears the solution would be a trivial application of
> the "--commit-filter" option:
>
>   git filter-branch --commit-filter '
>     if [ "$GIT_COMMIT" = "" ] ; then
>       skip_commit "$@";
>     else
>       git commit-tree "$@";
>     fi' HEAD
>
> where is the commit that introduced the first version of
> /mydir. do i have that right? is there a simpler way to do this?

No, this won't work. Filter-branch is not walking the history and
applying the changes to each commit, like rebase does. It's literally
operating on each commit object, and recall that each commit object
points to a tree that is a snapshot of the repository contents.

So if you skip a commit, that commit itself goes away. But the commit
after it (which didn't touch the unwanted contents) will still mention
those contents in its tree.
I think you want to stick with a --tree-filter (or an --index-filter),
but just selectively decide when to do the deletion. For example, if
you can tell the difference between the two states based on the
presence of some file, then perhaps:

  git filter-branch --prune-empty --index-filter '
    if git rev-parse --verify :dir/sentinel >/dev/null 2>&1
    then
      git rm --cached -rf dir
    fi
  ' HEAD

The "--prune-empty" is optional, but will drop commits that become
empty because they _only_ touched that directory.

We use ":dir/sentinel" to see if the entry is in the index, because the
index filter won't have the tree checked out. Likewise, we need to use
"rm --cached" to just touch the index.

-Peff
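The `:path` syntax Peff relies on can be probed on its own. A small sketch (repo and paths invented) showing why it is the right test inside an --index-filter, which only ever sees the index:

```shell
#!/bin/sh
# Sketch: ":dir/sentinel" asks the *index* (stage 0) for that path, so
# it works even when no worktree is checked out. Paths are made up.
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q repo && cd repo
git config user.email you@example.com
git config user.name You
mkdir dir
echo marker >dir/sentinel
git add dir/sentinel
# Present in the index: rev-parse succeeds quietly.
git rev-parse --verify :dir/sentinel >/dev/null 2>&1 && echo present
# Absent from the index: rev-parse fails, so the filter would skip the rm.
git rev-parse --verify :dir/other >/dev/null 2>&1 || echo absent
```

The success/failure of `git rev-parse --verify :path` is exactly the condition the filter script branches on.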
Re: should "git tag" option of "-a" conflict with "-s", "-u"?
On Thu, Sep 14, 2017 at 06:08:10AM -0400, Robert P. J. Day wrote:

> just noticed in "man git-tag":
>
>   1) "-a" explicitly claims to create an unsigned tag
>   2) both "-s" and "-u" both claim to create signed tags
>
> should that mean that it should be an error to try to use "-a" in
> combination with either of "-u" or "-s"? the synopsis does suggest
> that you can use only one of those at a time:
>
>   git tag [-a | -s | -u ] ...
>
> but i tested with (from memory) using both "-a" and "-s", and tag
> actually tried to do it, but failed simply because i had no key.
> clearly, though, it was making the attempt.
>
> can someone clarify this?

I think the documentation could be a little more clear. "-a" just asks
to create a tag object, rather than a "lightweight" tag. Using "-s"
asks for a signature, which requires having a tag object, and thus
implies "-a".

So there's no point in using the two together (since "-s" already
implies "-a"), but it also doesn't hurt to do so.

-Peff
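The tag-object vs. lightweight-ref distinction Peff describes can be seen directly with plumbing; a small sketch (tag names invented, no GPG key required):

```shell
#!/bin/sh
# Sketch: a plain "git tag" makes a lightweight ref pointing straight
# at the commit; "-a" creates an actual tag object. Names are made up.
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q repo && cd repo
git config user.email you@example.com
git config user.name You
git commit -q --allow-empty -m one
git tag light                  # lightweight: just a ref to the commit
git tag -a -m note annotated   # annotated: a real tag object
light_type=$(git cat-file -t "$(git rev-parse light)")
annot_type=$(git cat-file -t "$(git rev-parse annotated)")
echo "light=$light_type annotated=$annot_type"
```

`-s` would add a signature inside that tag object, which is why it implies `-a`.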
how to remove from history just *one* version of a file/dir?
[is this the right place to ask questions about git usage? or is there
a different forum where one can submit possibly embarrassingly silly
questions?]

i've been perusing "git filter-branch", and i'm curious if i have the
right idea about how to very selectively get rid of some useless
history.

say, early on, one commits a sizable directory of content, call it
/mydir. that directory sits there for a while until it becomes obvious
it's out of date and worthless and should never have been committed.
the obvious solution would seem to be:

  $ git filter-branch --tree-filter 'rm -rf /mydir' HEAD

correct? however, say one version of that directory was committed early
on, then later tossed for being useless with "git rm", and subsequently
replaced by newer content under exactly the same name. now i'd like to
go back and delete the history related to that early version of /mydir,
but not the second.

obviously, i can't use the above command as it would delete both
versions. so it appears the solution would be a trivial application of
the "--commit-filter" option:

  git filter-branch --commit-filter '
    if [ "$GIT_COMMIT" = "" ] ; then
      skip_commit "$@";
    else
      git commit-tree "$@";
    fi' HEAD

where is the commit that introduced the first version of /mydir. do i
have that right? is there a simpler way to do this?

rday

-- 
Robert P. J. Day Ottawa, Ontario, CANADA
http://crashcourse.ca
Twitter: http://twitter.com/rpjday
LinkedIn: http://ca.linkedin.com/in/rpjday
Re: [PATCH v5 10/40] Add initial external odb support
On Thu, Aug 3, 2017 at 9:34 PM, Junio C Hamanowrote: > Christian Couder writes: > >> diff --git a/external-odb.h b/external-odb.h >> new file mode 100644 >> index 00..9989490c9e >> --- /dev/null >> +++ b/external-odb.h >> @@ -0,0 +1,8 @@ >> +#ifndef EXTERNAL_ODB_H >> +#define EXTERNAL_ODB_H >> + >> +const char *external_odb_root(void); >> +int external_odb_has_object(const unsigned char *sha1); >> +int external_odb_get_object(const unsigned char *sha1); > > Even though ancient codebase of ours deliberately omitted them, I > think our recent trend is to explicitly spell "extern " in headers. > >> diff --git a/odb-helper.h b/odb-helper.h >> new file mode 100644 >> index 00..5800661704 >> --- /dev/null >> +++ b/odb-helper.h > > Likewise. Ok, I am adding "extern " to the headers.
should "git tag" option of "-a" conflict with "-s", "-u"?
just noticed in "man git-tag": 1) "-a" explicitly claims to create an unsigned tag 2) both "-s" and "-u" both claim to create signed tags should that mean that it should be an error to try to use "-a" in combination with either of "-u" or "-s"? the synopsis does suggest that you can use only one of those at a time: git tag [-a | -s | -u ] ... but i tested with (from memory) using both "-a" and "-s", and tag actually tried to do it, but failed simply because i had no key. clearly, though, it was making the attempt. can someone clarify this? rday -- Robert P. J. Day Ottawa, Ontario, CANADA http://crashcourse.ca Twitter: http://twitter.com/rpjday LinkedIn: http://ca.linkedin.com/in/rpjday
Re: [PATCH v5 14/40] external-odb: accept only blobs for now
On Thu, Aug 3, 2017 at 9:52 PM, Junio C Hamanowrote: > Christian Couder writes: > >> The mechanism to decide which blobs should be sent to which >> external object database will be very simple for now. >> If the external odb helper support any "put_*" instruction >> all the new blobs will be sent to it. >> >> Signed-off-by: Christian Couder >> --- >> external-odb.c | 4 >> 1 file changed, 4 insertions(+) >> >> diff --git a/external-odb.c b/external-odb.c >> index 82fac702e8..a4f8c72e1c 100644 >> --- a/external-odb.c >> +++ b/external-odb.c >> @@ -124,6 +124,10 @@ int external_odb_put_object(const void *buf, size_t len, >> { >> struct odb_helper *o; >> >> + /* For now accept only blobs */ >> + if (strcmp(type, "blob")) >> + return 1; >> + > > I somehow doubt that a policy decision like this should be made at > this layer. Shouldn't it be encoded in the capability the other > side supports, or determined at runtime per each individual object > when a "put" is attempted (i.e. allow the other side to say "You > tell me that you want me to store an object of type X and size Y; > I cannot do that, sorry"). I agree that it would be conceptually better to be able to support other kind of objects in external odb, but realistically most use cases for 'get_*' and 'put_*' instructions are for storing/retrieving blobs as other kind of objects are in specific formats that are well supported by the current object store. I also agree that it would be a nice feature if external odb could decide by themselves which objects they accept and I really want to leave the door open to a future improvement implementing that using the capability mechanism or perhaps another mechanism. For now though in "[PATCH v5 34/40] external-odb: use 'odb=magic' attribute to mark odb blobs" the attribute system is used to decide which blobs are put into which external odb as it is probably good enough for many use cases. It is also simple to implement in Git and makes helpers simpler to implement.
Re: [PATCH v5 13/40] external odb: add 'put_raw_obj' support
On Thu, Aug 3, 2017 at 9:50 PM, Junio C Hamanowrote: > Christian Couder writes: > >> Add support for a 'put_raw_obj' capability/instruction to send new >> objects to an external odb. Objects will be sent as they are (in >> their 'raw' format). They will not be converted to Git objects. >> >> For now any new Git object (blob, tree, commit, ...) would be sent >> if 'put_raw_obj' is supported by an odb helper. This is not a great >> default, but let's leave it to following commits to tweak that. >> >> Signed-off-by: Christian Couder >> --- > > I thought in an earlier step that I saw this thing initialized in > the codepath that adds alternate object stores, which are read-only > places we "borrow" from. Being able to write into it is good, but > conceptually it no longer feels correct to initialize it from the > alternate object database initialization codepath. > > Another way to say it is that an object store, whether it is local > or external, is not "alt" if it will result in storing new objects > we locally create. It's just an extension of our local object > store. 
I guess you are talking about the following code in "[PATCH v5 10/40]
Add initial external odb support":

+void prepare_external_alt_odb(void)
+{
+	static int linked_external;
+	const char *path;
+
+	if (linked_external)
+		return;
+
+	path = external_odb_root();
+	if (!access(path, F_OK)) {
+		link_alt_odb_entry(path, NULL, 0, "");
+		linked_external = 1;
+	}
+}
+
 void prepare_alt_odb(void)
 {
 	const char *alt;
@@ -650,6 +666,7 @@ void prepare_alt_odb(void)
 	link_alt_odb_entries(alt, strlen(alt), PATH_SEP, NULL, 0);
 
 	read_info_alternates(get_object_directory(), 0);
+	prepare_external_alt_odb();
 }

Would it be ok if I do the following:

- rename prepare_external_alt_odb() to just prepare_external_odb(), as
  this would avoid confusion between alt_odbs and external odbs
- remove the call to prepare_external_odb() in prepare_alt_odb()
- add a prepare_alt_and_external_odb() that just calls
  prepare_alt_odb() and then prepare_external_odb()
- replace all the calls to prepare_alt_odb() with calls to
  prepare_alt_and_external_odb()

?
Re: [PATCH v5 25/40] external-odb: add 'get_direct' support
On Thu, Aug 3, 2017 at 11:40 PM, Junio C Hamano wrote:
> Christian Couder writes:
>
>> This implements the 'get_direct' capability/instruction that makes
>> it possible for external odb helper scripts to pass blobs to Git
>> by directly writing them as loose objects files.
>
> I am not sure if the assumption is made clear in this series, but I
> am (perhaps incorrectly) guessing that it is assumed that the
> intended use of this feature is to offload access to large blobs
> by not including them in the initial clone.

Yeah, it could be used for that, but that's not the only interesting
use case. It could also be used for example if the working tree
contains a huge number of blobs and it is better to download only the
blobs that are needed when they are needed.

In fact the code for 'get_direct' was taken from Ben Peart's
"read-object" patch series (actually from an earlier version of this
patch series):

https://public-inbox.org/git/20170714132651.170708-1-benpe...@microsoft.com/

> So from that point of
> view, I think it makes tons of sense to let the external helper to
> directly populate the database bypassing Git (i.e. instead of
> feeding data stream and have Git store it) like this "direct" method
> does.
>
> How does this compare with (and how well does this work with) what
> Jonathan Tan is doing recently?

From the following email:

https://public-inbox.org/git/20170804145113.5ceaf...@twelve2.svl.corp.google.com/

it looks like his work is fundamentally about changing the rules of
connectivity checks. Objects are split between "homegrown" objects and
"imported" objects which are in separate pack files. Then references to
imported objects are not checked during connectivity check.

I think changing connectivity rules is not necessary to make something
like external odb work.

For example when fetching a pack that refers to objects that are in an
external odb, if access to this external odb has been configured, then
the connectivity check will pass, as the missing objects in the pack
will be seen as already part of the repo.

Yeah, if some commands like fsck are used, then possibly all the
objects will have to be requested from the external odb, as it may not
be possible to fully check all the objects, especially the blobs,
without accessing all their data. But I think this is a problem that
could be dealt with in different ways. For example we could develop
specific options in fsck so that it doesn't check the sha1 of objects
that are marked with some specific attributes, or that are stored in
external odbs, or that are bigger than some size.
Re: [PATCH] test-lib: don't use ulimit in test prerequisites on cygwin
Jonathan Nieder venit, vidit, dixit 13.09.2017 21:20: > Ramsay Jones wrote: > >> On cygwin (and MinGW), the 'ulimit' built-in bash command does not have >> the desired effect of limiting the resources of new processes, at least >> for the stack and file descriptors. However, it always returns success >> and leads to several test prerequisites being erroneously set to true. >> >> Add a check for cygwin and MinGW to the prerequisite expressions, using >> 'uname -s', and return false instead of (indirectly) using 'ulimit'. >> This affects the prerequisite expressions for the ULIMIT_STACK_SIZE, >> CMDLINE_LIMIT and ULIMIT_FILE_DESCRIPTORS prerequisites. >> >> Signed-off-by: Ramsay Jones>> --- >> t/t1400-update-ref.sh | 11 ++- >> t/t6120-describe.sh | 1 - >> t/t7004-tag.sh| 1 - >> t/test-lib.sh | 22 -- >> 4 files changed, 30 insertions(+), 5 deletions(-) > > Reviewed-by: Jonathan Nieder > > Thanks. > > An alternative would be to do some more explicit feature-based test > like using "ulimit" to set a limit and then reading back the limit in > a separate process, but that feels like overkill. It's still not clear whether these limits would be in effect or just reported. Rather than checking uname -s in several places, I think we should just define ulimit to be false in one place, or rather set the prerequisite there. Michael
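The "set a limit, read it back from a separate process" feature test Jonathan mentions is easy to sketch. On Linux this reports the lowered soft limit; the cygwin/MinGW problem is precisely that `ulimit` succeeds without actually constraining anything:

```shell
#!/bin/sh
# Sketch of a feature-based ulimit probe: lower the fd soft limit in a
# subshell, then ask a *separate* shell process what limit it sees.
# The value 64 is an arbitrary choice for illustration.
limit=$( (ulimit -n 64; sh -c 'ulimit -n') )
echo "soft fd limit seen by child: $limit"
```

If the child reports the lowered value, the prerequisite could be set based on observed behavior rather than on `uname -s`, which is the alternative the thread deems overkill.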
Re: merge-base not working as expected when base is ahead
Ekelhart Jakob venit, vidit, dixit 13.09.2017 17:07:
> Dear Git,
>
> git merge-base --fork-point "master" not working if master is already
> newer then my current branch.
> Very oddly it seems to work whenever you had the expected commit
> checked out previously - what made it very tricky to detect this
> problem.
>
> Example:
> - Clone "https://github.com/jekelhart/GitInfoTry"
> - Switch to branch "v1.0.0"
> - git merge-base --fork-point "master"
>   - or: git merge-base --fork-point "origin/master"
> - expected result: fork point "fbb1db34c6317a6e8b319c1ec261e97ca1672c22"
> - but result is empty
>
> In the repo where we created this example tree in the first place the
> command returned the expected fork point. If you clone it new and
> fresh it does not return any result anymore.
>
> Works, however, on branch "v2.0.0". Assumption: because "master" is
> older.(?)
> I think it works locally because the command uses the reflog in
> addition(!), however, it should work without the local reflog as well.
> (since the history was not modified in any way)

The documentation lies, unfortunately. It claims that in fork-point
mode, "git merge-base" "also" looks at the reflog. In fact, the code
explicitly *discards* any merge bases that it finds but which are not
in the reflog. (The merge base that you find for v2.0.0 is in the
reflog because it's the checked out HEAD.)

Removing the corresponding check gives the merge base for v1.0.0 that
you expect, and git still passes all tests. But I'll look at the
history of these lines before I submit a patch.

Michael
[PATCH v4 0/4] imap-send: Fix and enable curl by default
Changes since v3: - Fix return code in patch #1 - Reword patch#4 Nicolas Morey-Chaisemartin (4): imap-send: return with error if curl failed imap-send: add wrapper to get server credentials if needed imap_send: setup_curl: retreive credentials if not set in config file imap-send: use curl by default when possible imap-send.c | 60 1 file changed, 40 insertions(+), 20 deletions(-) -- 2.14.1.461.g503560879
[PATCH v4 1/4] imap-send: return with error if curl failed
curl_append_msgs_to_imap always returned 0, whether curl failed or not.
Return a proper status so git imap-send will exit with an error code if
something wrong happened.

Signed-off-by: Nicolas Morey-Chaisemartin
---
 imap-send.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/imap-send.c b/imap-send.c
index b2d0b849bb..b5e332420a 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -1490,7 +1490,7 @@ static int curl_append_msgs_to_imap(struct imap_server_conf *server,
 	curl_easy_cleanup(curl);
 	curl_global_cleanup();
 
-	return 0;
+	return res != CURLE_OK;
 }
 #endif
--
2.14.1.461.g503560879
[PATCH v4 3/4] imap_send: setup_curl: retreive credentials if not set in config file
Up to this point, the curl mode only supported getting the username and
password from the gitconfig file while the legacy mode could also fetch
them using the credential API.

Signed-off-by: Nicolas Morey-Chaisemartin
---
 imap-send.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/imap-send.c b/imap-send.c
index 1b8fbbd545..7e39993d95 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -1398,7 +1398,7 @@ static int append_msgs_to_imap(struct imap_server_conf *server,
 }
 
 #ifdef USE_CURL_FOR_IMAP_SEND
-static CURL *setup_curl(struct imap_server_conf *srvc)
+static CURL *setup_curl(struct imap_server_conf *srvc, struct credential *cred)
 {
 	CURL *curl;
 	struct strbuf path = STRBUF_INIT;
@@ -1411,6 +1411,7 @@ static CURL *setup_curl(struct imap_server_conf *srvc)
 	if (!curl)
 		die("curl_easy_init failed");
 
+	server_fill_credential(&server, cred);
 	curl_easy_setopt(curl, CURLOPT_USERNAME, server.user);
 	curl_easy_setopt(curl, CURLOPT_PASSWORD, server.pass);
 
@@ -1460,8 +1461,9 @@ static int curl_append_msgs_to_imap(struct imap_server_conf *server,
 	struct buffer msgbuf = { STRBUF_INIT, 0 };
 	CURL *curl;
 	CURLcode res = CURLE_OK;
+	struct credential cred = CREDENTIAL_INIT;
 
-	curl = setup_curl(server);
+	curl = setup_curl(server, &cred);
 	curl_easy_setopt(curl, CURLOPT_READDATA, &msgbuf);
 
 	fprintf(stderr, "sending %d message%s\n", total, (total != 1) ? "s" : "");
@@ -1496,6 +1498,18 @@ static int curl_append_msgs_to_imap(struct imap_server_conf *server,
 	curl_easy_cleanup(curl);
 	curl_global_cleanup();
 
+	if (cred.username)
+		if (res == CURLE_OK)
+			credential_approve(&cred);
+#if LIBCURL_VERSION_NUM >= 0x070d01
+		else if (res == CURLE_LOGIN_DENIED)
+#else
+		else
+#endif
+			credential_reject(&cred);
+
+	credential_clear(&cred);
+
 	return res != CURLE_OK;
 }
 #endif
--
2.14.1.461.g503560879
[PATCH v4 4/4] imap-send: use curl by default when possible
Set curl as the runtime default when it is available. When linked
against older curl versions (< 7_34_0) or without curl, use the legacy
imap implementation.

The goal is to validate feature parity between the legacy and the curl
implementation, deprecate the legacy implementation later on and, in
the long term, hopefully drop it altogether.

Signed-off-by: Nicolas Morey-Chaisemartin
---
 imap-send.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/imap-send.c b/imap-send.c
index 7e39993d95..af1e1576bd 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -35,11 +35,11 @@ typedef void *SSL;
 #include "http.h"
 #endif
 
-#if defined(USE_CURL_FOR_IMAP_SEND) && defined(NO_OPENSSL)
-/* only available option */
+#if defined(USE_CURL_FOR_IMAP_SEND)
+/* Always default to curl if it's available. */
 #define USE_CURL_DEFAULT 1
 #else
-/* strictly opt in */
+/* We don't have curl, so continue to use the historical implementation */
 #define USE_CURL_DEFAULT 0
 #endif
--
2.14.1.461.g503560879
[PATCH v4 2/4] imap-send: add wrapper to get server credentials if needed
Signed-off-by: Nicolas Morey-Chaisemartin
---
 imap-send.c | 34 ++++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/imap-send.c b/imap-send.c
index b5e332420a..1b8fbbd545 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -926,6 +926,25 @@ static int auth_cram_md5(struct imap_store *ctx, struct imap_cmd *cmd, const cha
 	return 0;
 }
 
+static void server_fill_credential(struct imap_server_conf *srvc, struct credential *cred)
+{
+	if (srvc->user && srvc->pass)
+		return;
+
+	cred->protocol = xstrdup(srvc->use_ssl ? "imaps" : "imap");
+	cred->host = xstrdup(srvc->host);
+
+	cred->username = xstrdup_or_null(srvc->user);
+	cred->password = xstrdup_or_null(srvc->pass);
+
+	credential_fill(cred);
+
+	if (!srvc->user)
+		srvc->user = xstrdup(cred->username);
+	if (!srvc->pass)
+		srvc->pass = xstrdup(cred->password);
+}
+
 static struct imap_store *imap_open_store(struct imap_server_conf *srvc, char *folder)
 {
 	struct credential cred = CREDENTIAL_INIT;
@@ -1078,20 +1097,7 @@ static struct imap_store *imap_open_store(struct imap_server_conf *srvc, char *f
 	}
 #endif
 	imap_info("Logging in...\n");
-	if (!srvc->user || !srvc->pass) {
-		cred.protocol = xstrdup(srvc->use_ssl ? "imaps" : "imap");
-		cred.host = xstrdup(srvc->host);
-
-		cred.username = xstrdup_or_null(srvc->user);
-		cred.password = xstrdup_or_null(srvc->pass);
-
-		credential_fill(&cred);
-
-		if (!srvc->user)
-			srvc->user = xstrdup(cred.username);
-		if (!srvc->pass)
-			srvc->pass = xstrdup(cred.password);
-	}
+	server_fill_credential(srvc, &cred);
 
 	if (srvc->auth_method) {
 		struct imap_cmd_cb cb;
--
2.14.1.461.g503560879
Re: [PATCH v5 19/40] lib-httpd: add upload.sh
On Thu, Aug 3, 2017 at 10:07 PM, Junio C Hamano wrote:
> Christian Couder writes:
>
>> +OLDIFS="$IFS"
>> +IFS='&'
>> +set -- $QUERY_STRING
>> +IFS="$OLDIFS"
>> +
>> +while test $# -gt 0
>> +do
>> +    key=${1%=*}
>> +    val=${1#*=}
>
> When you see that ${V%X*} and ${V#*X} appear in a pair for the same
> variable V and same delimiter X, it almost always indicates a bug
> waiting to happen.
>
> What's the definition of "key" here? A member of known set of short
> tokens, all of which consists only of alphanumeric, or something?

Yeah, the key can be only "sha1", "type", "size" or "delete" as can be
seen later in the code.

> Even if you do not currently plan to deal with a value with '=' in
> it, it may be prudent to double '%' above (and do not double '#').

Yeah I agree. Thanks for spotting this!

> Style: indent your shell script with tabs.

Sure.

>> +case "$key" in
>> +    "sha1") sha1="$val" ;;
>> +    "type") type="$val" ;;
>> +    "size") size="$val" ;;
>> +    "delete") delete=1 ;;
>> +    *) echo >&2 "unknown key '$key'" ;;
>> +esac
>
> Indent your shell script with tabs; case/esac and the labels used
> for each case arms all align at the same column.

Yeah, it will be properly indented in the version I will send soon.
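Junio's point about the `%`/`#` pairing can be shown with a runnable sketch: `${1%%=*}` strips the *longest* `=*` suffix (yielding everything before the first `=`), while `${1#*=}` strips the shortest `*=` prefix, so a value containing `=` survives intact. The QUERY_STRING value below is made up:

```shell
#!/bin/sh
# Sketch of the CGI query-string loop under review, with the suggested
# doubled-'%' fix applied. The sample QUERY_STRING is invented.
QUERY_STRING='sha1=1234abcd&type=blob&size=7&delete'
OLDIFS="$IFS"
IFS='&'
set -- $QUERY_STRING
IFS="$OLDIFS"
while test $# -gt 0
do
	key=${1%%=*}	# longest-match strip: key never swallows a '='
	val=${1#*=}
	case "$key" in
	"sha1") sha1="$val" ;;
	"type") type="$val" ;;
	"size") size="$val" ;;
	"delete") delete=1 ;;
	*) echo >&2 "unknown key '$key'" ;;
	esac
	shift
done
echo "sha1=$sha1 type=$type size=$size delete=$delete"
```

With the single `%` of the original, a hypothetical pair like `sha1=a=b` would have produced key `sha1=a`, which is the bug Junio was pointing at.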
Re: [PATCH v2] commit-template: change a message to be more intuitive
Kaartic Sivaraam venit, vidit, dixit 13.09.2017 15:05: > It's not good to use the phrase 'do not touch' to convey the information > that the cut-line should not be modified or removed as it could possibly > be mis-interpreted by a person who doesn't know that the word 'touch' has > the meaning of 'tamper with'. Further, it could make translations a little > difficult as it might not have the intended meaning in a few languages when > translated as such. > > So, use more intuitive terms in the sentence. > > Signed-off-by: Kaartic Sivaraam> --- > wt-status.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/wt-status.c b/wt-status.c > index 77c27c51134d2..be53579760ee7 100644 > --- a/wt-status.c > +++ b/wt-status.c > @@ -934,7 +934,7 @@ size_t wt_status_locate_end(const char *s, size_t len) > > void wt_status_add_cut_line(FILE *fp) > { > - const char *explanation = _("Do not touch the line above.\nEverything > below will be removed."); > + const char *explanation = _("Do not modify or remove the line > above.\nEverything below will be removed."); I don't want to complicate things. But now - due to the repeated usage of "remove" - these two sentences seem to be connected by an invisible "or else" ("will" vs. "would" not withstanding). i.e. "or else everything below will be removed, too", which is wrong. Before, they were separated more clearly. Also, given all the translations that we have, it seems somewhat strange to try and foresee and workaround possible misunderstandings of commonly used English phrases. > struct strbuf buf = STRBUF_INIT; > > fprintf(fp, "%c %s", comment_line_char, cut_line); > > -- > https://github.com/git/git/pull/401 >
Re: [PATCH v5 11/40] odb-helper: add odb_helper_init() to send 'init' instruction
On Sun, Sep 10, 2017 at 2:12 PM, Lars Schneider wrote:
>
>> On 03 Aug 2017, at 10:18, Christian Couder wrote:
>>
>> +static void parse_capabilities(char *cap_buf,
>> +			       unsigned int *supported_capabilities,
>> +			       const char *process_name)
>> +{
>> +	struct string_list cap_list = STRING_LIST_INIT_NODUP;
>> +
>> +	string_list_split_in_place(&cap_list, cap_buf, '=', 1);
>> +
>> +	if (cap_list.nr == 2 && !strcmp(cap_list.items[0].string, "capability")) {
>> +		const char *cap_name = cap_list.items[1].string;
>> +
>> +		if (!strcmp(cap_name, "get_git_obj")) {
>> +			*supported_capabilities |= ODB_HELPER_CAP_GET_GIT_OBJ;
>> +		} else if (!strcmp(cap_name, "get_raw_obj")) {
>> +			*supported_capabilities |= ODB_HELPER_CAP_GET_RAW_OBJ;
>> +		} else if (!strcmp(cap_name, "get_direct")) {
>> +			*supported_capabilities |= ODB_HELPER_CAP_GET_DIRECT;
>> +		} else if (!strcmp(cap_name, "put_git_obj")) {
>> +			*supported_capabilities |= ODB_HELPER_CAP_PUT_GIT_OBJ;
>> +		} else if (!strcmp(cap_name, "put_raw_obj")) {
>> +			*supported_capabilities |= ODB_HELPER_CAP_PUT_RAW_OBJ;
>> +		} else if (!strcmp(cap_name, "put_direct")) {
>> +			*supported_capabilities |= ODB_HELPER_CAP_PUT_DIRECT;
>> +		} else if (!strcmp(cap_name, "have")) {
>> +			*supported_capabilities |= ODB_HELPER_CAP_HAVE;
>> +		} else {
>> +			warning("external process '%s' requested unsupported read-object capability '%s'",
>> +				process_name, cap_name);
>> +		}
>
> In 1514c8ed ("convert: refactor capabilities negotiation", 2017-06-30)
> I introduced a simpler version of the capabilities negotiation. Maybe
> useful for you here, too?

Yeah, actually there is also fa64a2fdbe (sub-process: refactor
handshake to common function, 2017-07-26) that Jonathan Tan wrote on
top of your changes and that adds subprocess_handshake().
So the current code is using it like that:

static int start_object_process_fn(struct subprocess_entry *subprocess)
{
	static int versions[] = {1, 0};
	static struct subprocess_capability capabilities[] = {
		{ "get_git_obj", ODB_HELPER_CAP_GET_GIT_OBJ },
		{ "get_raw_obj", ODB_HELPER_CAP_GET_RAW_OBJ },
		{ "get_direct",  ODB_HELPER_CAP_GET_DIRECT },
		{ "put_git_obj", ODB_HELPER_CAP_PUT_GIT_OBJ },
		{ "put_raw_obj", ODB_HELPER_CAP_PUT_RAW_OBJ },
		{ "put_direct",  ODB_HELPER_CAP_PUT_DIRECT },
		{ "have",        ODB_HELPER_CAP_HAVE },
		{ NULL, 0 }
	};
	struct object_process *entry = (struct object_process *)subprocess;

	return subprocess_handshake(subprocess, "git-read-object", versions,
				    NULL, capabilities,
				    &entry->supported_capabilities);
}
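The negotiation itself boils down to the helper advertising `capability=<name>` lines and the parent OR-ing a flag bit for each name it recognizes, warning on the rest. A toy shell rendition of that loop (the advertised names mirror the table above, but the flag values and the bogus `frobnicate` capability are invented):

```shell
#!/bin/sh
# Toy rendition of capability negotiation: one "capability=<name>" line
# per advertised capability; known names OR a bit into a mask, unknown
# ones only produce a warning. Bit values are made up for illustration.
advertised="capability=get_raw_obj
capability=put_raw_obj
capability=frobnicate"
caps=0
for line in $advertised
do
	case "$line" in
	capability=get_raw_obj) caps=$((caps | 1)) ;;
	capability=put_raw_obj) caps=$((caps | 2)) ;;
	capability=have)        caps=$((caps | 4)) ;;
	capability=*) echo >&2 "unsupported capability '${line#capability=}'" ;;
	esac
done
echo "caps bitmask: $caps"
```

In the real code the table-driven `subprocess_handshake()` replaces the hand-rolled if/else chain, which is exactly the simplification Lars was pointing at.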
Re: [PATCH v5 12/40] t0400: add 'put_raw_obj' instruction to odb-helper script
On Sun, Sep 10, 2017 at 2:12 PM, Lars Schneider wrote:
>
>> On 03 Aug 2017, at 10:18, Christian Couder wrote:
>>
>> To properly test passing objects from Git to an external odb
>> we need an odb-helper script that supports a 'put'
>> capability/instruction.
>>
>> For now we will support only sending raw blobs, so the
>> supported capability/instruction will be 'put_raw_obj'.
>
> What other kind of blobs do you imagine could we send?

As for the get instructions, there could be 'put_git_obj' to send blobs in the Git format and 'put_direct' to have the helper read directly from the Git object store.

>> +put_raw_obj)
>> +	sha1="$2"
>> +	size="$3"
>> +	kind="$4"
>> +	writen=$(git hash-object -w -t "$kind" --stdin)
>> +	test "$writen" = "$sha1" || die "bad sha1 passed '$sha1' vs writen '$writen'"
>
> Typo? Should it be "written"?

Yeah, thanks. It's fixed in the current version.
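(Aside for readers: the `test "$writen" = "$sha1"` check above works because a blob's object name is fully determined by its content — it is the SHA-1 of the header "blob <size>" plus a NUL byte, followed by the raw bytes. This can be verified without git at all, assuming `sha1sum` is available:)

```shell
# A git blob id is sha1("blob <size>\0<content>").
# "hello\n" is 6 bytes, so the hashed stream is "blob 6\0hello\n".
oid=$(printf 'blob 6\0hello\n' | sha1sum | cut -d' ' -f1)
echo "$oid"   # same id that `printf 'hello\n' | git hash-object --stdin` prints
```

This is exactly the property the odb-helper relies on: re-hashing the received content with `git hash-object` must reproduce the id the other side claimed, or the transfer was corrupt.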
Re: [PATCH v5 00/40] Add initial experimental external ODB support
On Sun, Sep 10, 2017 at 2:30 PM, Lars Schneider wrote:
>
>> On 03 Aug 2017, at 10:18, Christian Couder wrote:
>>
>> ...
>>
>> * The "helpers" (registered commands)
>>
>> Each helper manages access to one external ODB.
>>
>> There are 2 different modes for helpers:
>>
>> - Helpers configured using "odb.<name>.scriptCommand" are
>>   launched each time Git wants to communicate with the
>>   external ODB. This is called "script mode".
>>
>> - Helpers configured using "odb.<name>.subprocessCommand" are
>>   launched once as a sub-process (using sub-process.h), and
>>   Git communicates with them using packet lines. This is called
>>   "process mode".
>
> I am curious, why would we support two modes? Wouldn't that increase
> the maintenance cost? Wouldn't the subprocess command be superior?
> I imagine the script mode eases testing, right?!

The script mode makes it much easier to write some helpers. For example, as shown in t0430 at the end of the patch series, a helper for a restartable bundle-based clone could basically be something like:

case "$1" in
init)
	ref_hash=$(git rev-parse refs/odbs/magic/bundle) ||
		die "couldn't find refs/odbs/magic/bundle"
	GIT_NO_EXTERNAL_ODB=1 git cat-file blob "$ref_hash" >bundle_info ||
		die "couldn't get blob $ref_hash"
	bundle_url=$(sed -e 's/bundle url: //' bundle_info)
	curl "$bundle_url" -o bundle_file ||
		die "curl '$bundle_url' failed"
	GIT_NO_EXTERNAL_ODB=1 git bundle unbundle bundle_file >unbundling_info ||
		die "unbundling 'bundle_file' failed"
	;;

>> These odb refs point to a blob that is stored in the Git
>> repository and contain information about the blob stored in the
>> external odb. This information can be specific to the external odb.
>> The repos can then share this information using commands like:
>>
>> `git fetch origin "refs/odbs/<odbname>/*:refs/odbs/<odbname>/*"`
>>
>> At the end of the current patch series, "git clone" is taught an
>> "--initial-refspec" option that asks it to first fetch some specified
>> refs.
>> This is used in the tests to fetch the odb refs first.
>>
>> This way a single "git clone" command can set up a repo using the
>> external ODB mechanism, as long as the right helper is installed on the
>> machine and as long as the following options are used:
>>
>> - "--initial-refspec <refspec>" to fetch the odb refspec
>> - "-c odb.<name>.command=<command>" to configure the helper
>
> The "odb" config could, of course, go into the global git config.

Sure.

> The odb refspec is optional, right?

Using "--initial-refspec <refspec>" is optional. There will be more information in the documentation about this option in the next version of the series.

> I have the impression there are a number of topics on the list
> that tackle the "many/big objects in a Git repo" problem. Is
> there a write up about the status of them, how they relate
> to each other, and what the current problems are?
> I found the following but it looks abandoned:
> https://github.com/jrn/git-large-repositories

Yeah, it could be interesting to discuss all these topics together. On the other hand, people working on existing patch series, like me, have to work on them and post new versions, as just discussing the topics is not enough to move things forward. Anyway, Junio and Jonathan Tan also asked me questions about how my work relates to Jonathan's, so I will hopefully reply to them soon...
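(Aside, not from the series: to make the script-mode flow concrete, here is a hypothetical skeleton of such a helper. The instruction names follow the ones discussed in this thread; the on-disk storage path and the function wrapper are made up for illustration — a real helper would be a standalone script that Git runs once per operation with the instruction as its first argument.)

```shell
# Hypothetical script-mode odb-helper skeleton: Git invokes the script
# once per operation, passing the instruction name as $1.
# Wrapped in a function here so it can be exercised without Git.
helper() {
	case "$1" in
	init)
		# advertise the capabilities this helper supports
		echo "capability=get_raw_obj"
		echo "capability=put_raw_obj"
		;;
	get_raw_obj)
		# $2 is the object id; emit the raw content on stdout
		cat "/var/odb/$2"
		;;
	put_raw_obj)
		# $2 is the object id; store stdin under that name
		cat >"/var/odb/$2"
		;;
	*)
		echo "unknown instruction '$1'" >&2
		return 1
		;;
	esac
}

helper init
```

Compared with process mode, each invocation pays a fork/exec, but there is no handshake, no pkt-line framing, and no long-lived state to manage, which is why script mode is attractive for simple or experimental helpers.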