What's cooking in git.git (Sep 2017, #03; Fri, 15)

2017-09-14 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

We are at week #6 of this cycle.  It seems that people had a
productive week while I was away, which I am reasonably happy about
;-)  Quite a few topics that have been in 'master' are now also
merged to 'maint', so perhaps I should tag 2.14.2 soonish.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[New Topics]

* cc/subprocess-handshake-missing-capabilities (2017-09-11) 1 commit
 - subprocess: loudly die when subprocess asks for an unsupported capability

 Finishing touches to a topic already in 'master'.

 Will merge to 'next'.


* bw/protocol-v1 (2017-09-14) 8 commits
 - i5700: add interop test for protocol transition
 - http: tell server that the client understands v1
 - connect: tell server that the client understands v1
 - connect: teach client to recognize v1 server response
 - upload-pack, receive-pack: introduce protocol version 1
 - daemon: recognize hidden request arguments
 - protocol: introduce protocol extension mechanisms
 - pkt-line: add packet_write function

 A new mechanism to upgrade the wire protocol in place is proposed
 and demonstrated to work with older versions of Git without
 harming them.


* ez/doc-duplicated-words-fix (2017-09-14) 1 commit
 - doc: fix minor typos (extra/duplicated words)

 Typofix.

 Will merge to 'next'.


* jk/write-in-full-fix (2017-09-14) 8 commits
 - read_pack_header: handle signed/unsigned comparison in read result
 - config: flip return value of store_write_*()
 - notes-merge: use ssize_t for write_in_full() return value
 - pkt-line: check write_in_full() errors against "< 0"
 - convert less-trivial versions of "write_in_full() != len"
 - avoid "write_in_full(fd, buf, len) != len" pattern
 - get-tar-commit-id: check write_in_full() return against 0
 - config: avoid "write_in_full(fd, buf, len) < len" pattern

 Many codepaths did not diagnose write failures correctly when disks
 go full, due to their misuse of the write_in_full() helper
 function; these have been corrected.

 Will merge to 'next'.


* jn/per-repo-object-store-fixes (2017-09-14) 3 commits
 - replace-objects: evaluate replacement refs without using the object store
 - push, fetch: error out for submodule entries not pointing to commits
 - pack: make packed_git_mru global a value instead of a pointer

 Step #0 of a planned & larger series.

 Will merge to 'next'.


* ks/commit-do-not-touch-cut-line (2017-09-15) 1 commit
 - commit-template: change a message to be more intuitive

 The explanation of the cut-line in the commit log editor has been
 slightly tweaked.

 Will merge to 'next'.


* ks/help-alias-label (2017-09-14) 1 commit
 - help: change a message to be more precise

 "git help co" now says "co is aliased to ...", not "git co is".

 Will merge to 'next'.


* mh/mmap-packed-refs (2017-09-14) 20 commits
 - packed-backend.c: rename a bunch of things and update comments
 - mmapped_ref_iterator: inline into `packed_ref_iterator`
 - ref_cache: remove support for storing peeled values
 - packed_ref_store: get rid of the `ref_cache` entirely
 - ref_store: implement `refs_peel_ref()` generically
 - packed_read_raw_ref(): read the reference from the mmapped buffer
 - packed_ref_iterator_begin(): iterate using `mmapped_ref_iterator`
 - read_packed_refs(): ensure that references are ordered when read
 - packed_ref_cache: keep the `packed-refs` file open and mmapped
 - mmapped_ref_iterator_advance(): no peeled value for broken refs
 - mmapped_ref_iterator: add iterator over a packed-refs file
 - packed_ref_cache: remember the file-wide peeling state
 - read_packed_refs(): read references with minimal copying
 - read_packed_refs(): make parsing of the header line more robust
 - read_packed_refs(): only check for a header at the top of the file
 - read_packed_refs(): use mmap to read the `packed-refs` file
 - die_unterminated_line(), die_invalid_line(): new functions
 - packed_ref_cache: add a backlink to the associated `packed_ref_store`
 - prefix_ref_iterator: break when we leave the prefix
 - ref_iterator: keep track of whether the iterator output is ordered
 (this branch uses mh/packed-ref-transactions.)

 Operations that do not touch the (majority of the) packed refs have
 been optimized by making accesses to the packed-refs file lazy; we
 no longer pre-parse everything, and accessing a single ref in the
 packed-refs file no longer touches the majority of irrelevant refs
 either.

 Expecting a rework to also work on Windows.


* nm/imap-send-with-curl (2017-09-15) 4 commits
 - imap-send: use curl by default when possible
 - imap_send: setup_curl: retrieve credentials if not set

Re: git diff --name-status for deleted files

2017-09-14 Thread Junio C Hamano
Gene Thomas  writes:

> Junio,
>Thanks for your reply. So git is essentially doing a
>"git commit" when I "git rm".

No. You'd probably need to read a bit more on Git; unlike other
systems like CVS and SVN, where you only have two states
(i.e. committed contents vs files on the filesystem), we have three
states (i.e. the index in addition to the above two).

"git add file" and "git rm file" make the index match what's on the
filesystem wrt "file".  They never touch committed contents, which
"git commit" command is about.
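To make the three states concrete, here is a throwaway-repository sketch (not part of the original mail; the paths and the inline user.name/user.email settings are purely illustrative):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q
echo v1 >file
git add file
git -c user.email=you@example.com -c user.name=you commit -q -m v1

echo v2 >file                    # worktree now differs from the index
git diff --name-only             # prints "file" (worktree vs index)
git add file                     # index updated to match the worktree
git diff --name-only             # prints nothing
git diff --cached --name-only    # prints "file" (index vs committed HEAD)
```

"git add" (and "git rm") only moves state between the worktree and the index; the committed contents change only when "git commit" is run.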



RE: git diff --name-status for deleted files

2017-09-14 Thread Gene Thomas
Junio,
   Thanks for your reply. So git is essentially doing a "git commit" 
when I "git rm". 

Gene.

-Original Message-
From: Junio C Hamano [mailto:gits...@pobox.com] 
Sent: Friday, 15 September 2017 2:58 PM
To: Gene Thomas 
Cc: git@vger.kernel.org
Subject: Re: git diff --name-status for deleted files

Gene Thomas  writes:

> Hello,
>   "git diff -name-status" is useful to list the files one
>   has changed but it does not list file that one has
>   deleted with "git rm". It would be really handy if it
>   did. I am using git 2.9.3 on Ubuntu Linux 16.10.

With or without --name-status option, "git diff" compares between the contents 
you have in the index and in your working tree.  After you modify contents of a 
file, i.e.

edit file
git add file

you would not see that file edited exactly because the file on the filesystem 
is identical to what you added to the index with "git add".

Your example works exactly the same way.  Instead of modifying the contents of 
a file, removing the presence of the file and recording that fact to the index 
(i.e. removing the path from the index, too) is done with "git rm", so after 
running

git rm file

your working tree would lack "file" and so would your index.  Hence you 
wouldn't see "git diff" report any difference on "file".

Perhaps you wanted "git diff HEAD", which is a way to compare between the 
contents you have in the tip commit and the paths in your working tree thru the 
index?
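The behaviour described above can be reproduced in a scratch repository (an illustration added here, not from the original mail; identity settings are placeholders):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q
echo hello >file
git add file
git -c user.email=you@example.com -c user.name=you commit -q -m add-file

git rm -q file
git diff --name-status           # prints nothing: index and worktree agree
git diff --name-status HEAD      # prints "D	file": HEAD still has the path
```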


Re: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-14 Thread Junio C Hamano
Kevin Willford  writes:

> 1. Does this statement, "I only care about the files in this
> sparse checkout, and do not concern me with anything else", mean
> that git should not change files outside the sparse-checkout whether
> that be in the working directory or in the index?  Or does that only
> apply to the working directory and the index version of files can
> change to whatever git the git command would do without using
> sparse?  For example if I am doing a 'git reset HEAD~1'  should the
> version in the index of files outside the sparse not be changed or
> change to the HEAD~1 version with the skip-worktree bit on?

My understanding of the purpose of the "sparse checkout" thing is
that the user still wants to create a correct whole-tree commit even
when the user does not have the whole-tree checkout.  The object names for
blobs recorded in the index that are meant to be included in the
next commit MUST be the same as those that would be in the index
when the "sparse" feature is not in use.  "reset HEAD~1" should
match the index entries to the tree entries in HEAD~1.  So, the
latter, I would think, among your two alternatives.

IOW, after "git reset HEAD~", if you drop the skip-worktree bit from
all index entries, "git diff --cached HEAD" must say "there are no
changes".
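That invariant can be sketched directly with the skip-worktree bit (illustrative commands, not from the original mail; "outside" stands in for a path outside the sparse area):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q
echo in >inside && echo out >outside
git add inside outside
git -c user.email=you@example.com -c user.name=you commit -q -m one

# pretend "outside" is excluded from the sparse checkout
git update-index --skip-worktree outside
git ls-files -t outside              # prints "S outside"

# drop the bit again; the index entry still matches HEAD
git update-index --no-skip-worktree outside
git diff --cached --quiet HEAD       # exits 0, i.e. "no changes"
```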

The only difference between the "sparse" and normal would be that,
because the "sparse" user does not intend to change anything outside
the "sparse" area, these paths outside her "sparse" area would not
materialize on the filesystem.  For the next "write-tree" out of the
index to still write the correct tree out, the entries outside her
"sparse" area in the index MUST match the tree of the commit she
started working from.

> 2. How will this work with other git commands like merge, rebase,
> cherry-pick, etc.?  
> 3. What about when there is a merge conflict with a file that is outside
> the sparse checkout?

I would say, rather than forbidding such a merge, it should let her
see and deal with the conflict by dropping the "this is outside the
sparse area, so do not bother materializing it to the filesystem"
bit, but tell her loudly what it did ("I checked out a half-merged
file outside your sparse-checkout area because you'd need it while
resolving the conflict").  By doing things that way, the user can
decide if she wants to go ahead and complete the merge, even if the
conflict is outside the area she is currently interested in, or
postpone the merge and continue working on what she has been working
on inside the narrowed-down area first.

I do not have a strong opinion whether the sparse-checkout
configuration file should be adjusted to match when the command must
tentatively bust the sparse checkout area; I'd imagine it can be
argued either way.

Note that "sparse" is not my itch, and I would not be surprised if
those who designed it want it to work differently from my
knee-jerk reaction in the previous two paragraphs; I may even
find such an alternative solution preferable.

But it is highly unlikely that any sensible solution would violate
the basic premise, i.e. "the indexed contents will stay the same as
in the case without any 'sparse', so the next write-tree will do the
right thing".


[PATCH v3] commit-template: change a message to be more intuitive

2017-09-14 Thread Kaartic Sivaraam
It's not good to use the phrase 'do not touch' to convey that the
cut-line should not be modified or removed, as it could be
misinterpreted by a person who doesn't know that the word 'touch' can
mean 'tamper with'. Further, it could make translations a little
difficult, as the phrase might not carry the intended meaning in a few
languages (for which translations don't exist yet) when translated as
such.

So, use more intuitive terms in the sentence. Replacing the word
'touch' with other terms introduced the possibility of the following
sentence being misinterpreted, so change its wording too.

Signed-off-by: Kaartic Sivaraam 
---
 Changes in v3:
 
- changed the wording of the second sentence, as there seemed to be a
  magical 'or else' connecting the two lines.
 
 I didn't expect in the least that this would go up to v3. In case anybody finds
 something wrong with this change too, it's a lot better to drop this altogether
 than to go for a v4.

 wt-status.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/wt-status.c b/wt-status.c
index 77c27c511..23e87e74d 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -934,7 +934,7 @@ size_t wt_status_locate_end(const char *s, size_t len)
 
 void wt_status_add_cut_line(FILE *fp)
 {
-	const char *explanation = _("Do not touch the line above.\nEverything below will be removed.");
+	const char *explanation = _("Do not modify or remove the line above.\nEverything below it will be ignored.");
 	struct strbuf buf = STRBUF_INIT;
 
 	fprintf(fp, "%c %s", comment_line_char, cut_line);
-- 
2.14.1.1006.g90ad9a07c



Re: [PATCH v4 3/4] imap_send: setup_curl: retrieve credentials if not set in config file

2017-09-14 Thread Junio C Hamano
Nicolas Morey-Chaisemartin  writes:

>  
> + if (cred.username)
> + if (res == CURLE_OK)
> + credential_approve(&cred);
> +#if LIBCURL_VERSION_NUM >= 0x070d01
> + else if (res == CURLE_LOGIN_DENIED)
> +#else
> + else
> +#endif
> + credential_reject(&cred);
> +
> + credential_clear(&cred);
> +

As my copy of GCC seemed to be worried about readers getting
confused by the if/else cascade, I'd place an extra pair of braces
around this, i.e.

	if (cred.username) {
		if (res == CURLE_OK)
			credential_approve(&cred);
		else /* or "else if DENIED" */
			credential_reject(&cred);
	}
	credential_clear(&cred);

while queuing this patch.



Re: [PATCH v4 3/4] imap_send: setup_curl: retrieve credentials if not set in config file

2017-09-14 Thread Junio C Hamano
Nicolas Morey-Chaisemartin  writes:

> + if (cred.username)
> + if (res == CURLE_OK)
> + credential_approve(&cred);
> +#if LIBCURL_VERSION_NUM >= 0x070d01
> + else if (res == CURLE_LOGIN_DENIED)

A slight tangent.  This is in line with the way in which we do
conditional compilation to work with different versions of libCurl,
but we recently had discussion on modernizing these version based
conditional compilation to use feature based one in another topic.
We may want to switch to

#if defined(CURLE_LOGIN_DENIED)
...

(cf.
https://public-inbox.org/git/cover.1502462884.git@jupiterrise.com/
the entire thread).

No need to change _this_ patch in this series, but something to keep
in mind planning for a future follow-up work to clean things up.

Thanks.


Re: [PATCH 00/20] Read `packed-refs` using mmap()

2017-09-14 Thread Michael Haggerty
On 09/14/2017 10:23 PM, Johannes Schindelin wrote:
> On Wed, 13 Sep 2017, Michael Haggerty wrote:
> 
>> * `mmap()` the whole file rather than `read()`ing it.
> 
> On Windows, a memory-mapped file cannot be renamed. As a consequence, the
> following tests fail on `pu`:
> 
> [...]

Thanks for your quick investigation.

> This diff:
> 
> -- snip --
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index f9c5e771bdd..ee8b3603624 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -1306,13 +1308,13 @@ static int packed_transaction_finish(struct
> ref_store *ref_store,
>   char *packed_refs_path;
>  
>   packed_refs_path = get_locked_file_path(&refs->lock);
> + clear_snapshot(refs);
>   if (rename_tempfile(&refs->tempfile, packed_refs_path)) {
>   strbuf_addf(err, "error replacing %s: %s",
>   refs->path, strerror(errno));
>   goto cleanup;
>   }
>  
> - clear_snapshot(refs);
>   ret = 0;
>  
>  cleanup:
> -- snap --
> 
> reduces the failed tests to
> 
> t1410-reflog.counts.sh
> t3210-pack-refs.counts.sh
> t3211-peel-ref.counts.sh
> t5505-remote.counts.sh
> t5510-fetch.counts.sh
> t5600-clone-fail-cleanup.counts.sh

That change is a strict improvement on all systems; I'll integrate it
into the series.

> However, much as I tried to wrap my head around it, I could not even start
> to come up with a solution for e.g. the t1410 regression. The failure
> happens in `git branch -d one/two`.
> 
> The culprit: there is yet another unreleased snapshot while we try to
> finish the transaction.
> 
> It is the snapshot of the main_worktree_ref_store as acquired by
> add_head_info() (which is called from get_main_worktree(), which in turn
> was called from get_worktrees(), in turn having been called from
> find_shared_symref() that was called from delete_branches()), and it seems
> to me that there was never any plan to have that released, including its
> snapshot.

Yeah the idea was that the default situation would be that whenever a
`packed-refs` file needs to be read, it would be kept mmapped for the
life of the program (or until the file was detected to have been
changed). This was meant to be a replacement for the explicit
`ref_cache`. So any long-lived git process could be expected to have the
`packed-refs` file (or even multiple `packed-refs` files in the case of
submodules) mmapped.

That's obviously not going to work on Windows.

> [...]
> Do you have any idea how to ensure that all mmap()ed packed refs are
> released before attempting to finish a transaction?

[And from your other email:]
> This is only one example, as I figure that multiple worktrees could cause
> even more ref_stores to have unreleased snapshots of the packed-refs file.
> 
> I imagine that something like close_all_packs() is needed
> ("clear_all_snapshots()"?).

Yes, I wasn't really familiar with `close_all_packs()`, but it sounds
like it solves a similar problem.

Within a single process, we could make this work via an arduous process
of figuring out what functions might want to overwrite the `packed-refs`
file, and making sure that nobody up the call stack is iterating over
references during those function calls. If that condition is met, then
calling `clear_snapshot()` earlier, as you propose above, would decrease
the reference count to zero, causing the old snapshot to be freed, and
allowing the rename to succeed. We could even do

if (!clear_snapshot(refs))
BUG("packed-refs snapshot is still in use");

, analogously to `close_all_packs()`, to help find code that violates
the condition.

Similarly, it wouldn't be allowed to invoke a subprocess or hook
function while iterating over references, and one would have to clear
any current snapshots before doing so. It might even make sense to do
that at the same time as `close_all_packs()`, for example in a new
function `close_all_unnecessary_files()`.

But what about unrelated processes? Anybody reading `packed-refs` would
block anybody who wants to modify it, for example to delete a reference
or pack the references. I think this is worse than the packfile
situation. Packfiles all have unique names, so they never need to be
overwritten; at worst they are deleted. And the deletion is never
critical to correctness. If you can't delete a packfile, the only
consequence is that it hangs around until the next GC.

However, the `packed-refs` file always has the same name, and
overwriting it is sometimes critical for correctness. So it sounds to me
like even *readers* mustn't keep the file open or mmapped longer than
necessary!

(This makes me wonder, doesn't `rename_tempfile()` for the `packed-refs`
file *already* fail sometimes on Windows due to a concurrent reader?
Shouldn't it retry if it gets whatever error indicates that the failure
was due to the file being open?)

Anyway, if all of my reasoning is correct, it seems that the inescapable
conclusion is that having the `packed-refs` file 

Re: git diff --name-status for deleted files

2017-09-14 Thread Junio C Hamano
Gene Thomas  writes:

> Hello,
>   "git diff -name-status" is useful to list the files one
>   has changed but it does not list file that one has
>   deleted with "git rm". It would be really handy if it
>   did. I am using git 2.9.3 on Ubuntu Linux 16.10.

With or without --name-status option, "git diff" compares between
the contents you have in the index and in your working tree.  After
you modify contents of a file, i.e.

edit file
git add file

you would not see that file edited exactly because the file on the
filesystem is identical to what you added to the index with "git
add".

Your example works exactly the same way.  Instead of modifying the
contents of a file, removing the presence of the file and recording
that fact to the index (i.e. removing the path from the index, too)
is done with "git rm", so after running

git rm file

your working tree would lack "file" and so would your index.  Hence
you wouldn't see "git diff" report any difference on "file".

Perhaps you wanted "git diff HEAD", which is a way to compare
between the contents you have in the tip commit and the paths in
your working tree thru the index?


Git 2.14.1: t6500: error during test on musl libc

2017-09-14 Thread A. Wilcox

Hi there,

While bumping Git's version for our Linux distribution to 2.14.1, I've
run in to a new test failure in t6500-gc.sh.  This is the output of
the failing test with debug=t verbose=t:


expecting success:
# make sure we run a background auto-gc
test_commit make-pack &&
git repack &&
test_config gc.autopacklimit 1 &&
test_config gc.autodetach true &&

# create a ref whose loose presence we can use to detect a pack-refs run
git update-ref refs/heads/should-be-loose HEAD &&
test_path_is_file .git/refs/heads/should-be-loose &&

# now fake a concurrent gc that holds the lock; we can use our
# shell pid so that it looks valid.
hostname=$(hostname || echo unknown) &&
printf "$$ %s" "$hostname" >.git/gc.pid &&

# our gc should exit zero without doing anything
run_and_wait_for_auto_gc &&
test_path_is_file .git/refs/heads/should-be-loose

[master 28ecdda] make-pack
 Author: A U Thor 
 1 file changed, 1 insertion(+)
 create mode 100644 make-pack.t
Counting objects: 3, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), done.
Total 3 (delta 0), reused 0 (delta 0)
Auto packing the repository in background for optimum performance.
See "git help gc" for manual housekeeping.
File .git/refs/heads/should-be-loose doesn't exist.
not ok 8 - background auto gc respects lock for all operations
#
#   # make sure we run a background auto-gc
#   test_commit make-pack &&
#   git repack &&
#   test_config gc.autopacklimit 1 &&
#   test_config gc.autodetach true &&
#
#   # create a ref whose loose presence we can use to detect a pack-refs run
#   git update-ref refs/heads/should-be-loose HEAD &&
#   test_path_is_file .git/refs/heads/should-be-loose &&
#
#   # now fake a concurrent gc that holds the lock; we can use our
#   # shell pid so that it looks valid.
#   hostname=$(hostname || echo unknown) &&
#   printf "$$ %s" "$hostname" >.git/gc.pid &&
#
#   # our gc should exit zero without doing anything
#   run_and_wait_for_auto_gc &&
#   test_path_is_file .git/refs/heads/should-be-loose
#

# failed 1 among 8 test(s)
1..8


I admit I am mostly blind with the Git gc system.  Should I use strace
on the git-gc process at the end?  How would I accomplish that?  Is
there a better way of debugging this error further?

Core system stats:

Intel x86_64 E3-1280 v3 @ 3.60 GHz
musl libc 1.1.16+20
git 2.14.1, vanilla except for a patch to an earlier test due to
musl's inability to cope with EUC-JP
bash 4.3.48(1)-release

Thank you very much.

All the best,
--arw

-- 
A. Wilcox (awilfox)
Project Lead, Adélie Linux
http://adelielinux.org


Re: [PATCH 2/3] merge-base: return fork-point outside reflog

2017-09-14 Thread Junio C Hamano
Michael J Gruber  writes:

> In fact, per documentation "--fork-point" looks at the reflog in
> addition to doing the usual walk from the tip. The original design
> description in d96855ff51 ("merge-base: teach "--fork-point" mode",
> 2013-10-23) describes this as computing from a virtual merge-base of all
> the historical tips of refname. They may or may not all be present in
> the reflog (think pruning, non-ff fetching, fast forwarding etc.),
> so filtering by the current contents of the reflog is potentially
> harmful, and it does not seem to fulfill any purpose in the original
> design.

Let me think aloud, using the picture from the log message from that
commit.

                         o---B1
                        /
    ---o---o---B2--o---o---o---Base
            \
             B3
              \
               Derived

where the current tip of the "base" branch is at Base, but earlier
fetch observed that its tip used to be B3 and then B2 and then B1
before getting to the current commit, and the branch being rebased
on top of the latest "base" is based on commit B3.

So the logic tries to find a merge base between "Derived" and a
virtual merge commit across Base, B1, B2 and B3.  And it finds B3.

If for some reason we didn't have B3 in the reflog, then wouldn't
the merge base computation between Derived and a virtual merge
commit across Base, B1 and B2 (but not B3, because we no longer know
about it due to its absence from the reflog) find 'o' that is the
parent of B2 and B3?  Wouldn't that lead to both B3 and Derived
being replayed when the user of the fork-point option rebases the
history of Derived?

Perhaps that is the best we could do with a pruned reflog that lacks
B3, but if that is the case, I wonder if it may be better to fail
the request, saying that we cannot find the fork-point (because,
well, your reflog no longer has sufficient information), than to
silently give a wrong result; in this case, we can tell a correct
result (i.e. the merge base is one of the commits we still know was
at the tip) from a wrong one (i.e. the merge base is not any of the
commits in the reflog).

If we declare --fork-point is the best effort option and may give an
answer that is not better than without the option, then I think this
patch is OK, but that diminishes the value of the option as a
building block, I am afraid.

Callers that are more careful could ask merge-base with --fork-point
(and happily use it knowing that the result is indeed a commit that
used to be at the tip), or fall back to the result merge-base
without --fork-point gives (because you could do no better) and deal
with duplicates that may happen due to the imprecise determination.
With this change, these callers will get result from a call to
"merge-base --fork-point" that may or may not be definite, and they
cannot tell.  For lazy users, making the option itself to fall back
may be simpler to use, and certainly is a valid stance to take when
implementing a convenience option to a Porcelain command, but I do
not know if it is a good idea to throw "merge-base --fork-point"
into that category.
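For reference, the basic reflog-consulting behaviour of the option can be seen in a scratch repository (a sketch added here, not from the original mail; branch names and identity settings are arbitrary):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git -c user.email=you@example.com -c user.name=you commit -q --allow-empty -m base
git branch derived                   # "derived" forks from the base branch here
fork=$(git rev-parse HEAD)
git -c user.email=you@example.com -c user.name=you commit -q --allow-empty -m advance
base=$(git symbolic-ref --short HEAD)
git merge-base --fork-point "$base" derived   # prints $fork, found via the reflog
```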



>
> Remove the filtering and add a test for an out-of-reflog merge base.
>
> Reported-by: Ekelhart Jakob 
> Signed-off-by: Michael J Gruber 
> ---
>  builtin/merge-base.c  | 18 +++---
>  t/t6010-merge-base.sh |  8 
>  2 files changed, 11 insertions(+), 15 deletions(-)
>
> diff --git a/builtin/merge-base.c b/builtin/merge-base.c
> index 6dbd167d3b..926a7615ea 100644
> --- a/builtin/merge-base.c
> +++ b/builtin/merge-base.c
> @@ -186,23 +186,11 @@ static int handle_fork_point(int argc, const char **argv)
>* There should be one and only one merge base, when we found
>* a common ancestor among reflog entries.
>*/
> - if (!bases || bases->next) {
> + if (!bases || bases->next)
>   ret = 1;
> - goto cleanup_return;
> - }
> -
> - /* And the found one must be one of the reflog entries */
> - for (i = 0; i < revs.nr; i++)
> - if (>item->object == [i]->object)
> - break; /* found */
> - if (revs.nr <= i) {
> - ret = 1; /* not found */
> - goto cleanup_return;
> - }
> -
> - printf("%s\n", oid_to_hex(>item->object.oid));
> + else
> + printf("%s\n", oid_to_hex(>item->object.oid));
>  
> -cleanup_return:
>   free_commit_list(bases);
>   return ret;
>  }
> diff --git a/t/t6010-merge-base.sh b/t/t6010-merge-base.sh
> index 17fffd7998..850463d4f2 100755
> --- a/t/t6010-merge-base.sh
> +++ b/t/t6010-merge-base.sh
> @@ -267,6 +267,14 @@ test_expect_success '--fork-point works with empty reflog' '
>   test_cmp expect actual
>  '
>  
> +test_expect_success '--fork-point works with merge-base outside reflog' '
> + git -c core.logallrefupdates=false checkout no-reflog &&
> + git -c 

Re: [PATCH 0/4] Fixes from the per-repository object store series

2017-09-14 Thread Junio C Hamano
Jonathan Nieder  writes:

> This is a continuation of the series at [1].  That was part 1 ---
> you can think of this as part 0, since it contains the simplest and
> least controversial part of the series --- each patch in this series
> is a bugfix in its own right.
>
> Patch 1 should be familiar if you reviewed the series [1].  It is
> unchanged from the patch there, except to note Peff's ack.
>
> Patches 2-4 have not yet visited the list but are fixes (or, in the
> case of patch 4, cleanups) noticed as part of the same process of
> teaching object store code to handle multiple repositories.
>
> We hope that splitting these out should make them easier to review
> and for people to benefit from these fixes without having to wait
> for the rest of the series to settle.

One thing that is unclear is whether you are tentatively withdrawing
the longer series with this update, but I'd assume that is the case,
because the two obviously conflict with each other, and most of what
these four patches do is a subset of what the larger series you sent
earlier does.

Thanks.


Re: [PATCH v2] commit-template: change a message to be more intuitive

2017-09-14 Thread Junio C Hamano
Jeff King  writes:

>   # ------------------------ >8 ------------------------
>   # Do not modify or remove the line above.
>   # Everything below will be ignored.
>
> (I might say "everything below it" to be more precise).
>
> I dunno. We are deep in bikeshed territory here, and I admit I don't
> overly care that much in the first place.

So should I drop this topic altogether?  My inclination is to
declare that "everything below it will be ignored" is the best
version and stop reading further updates to this thread ;-)



Re: [PATCH v2] commit-template: change a message to be more intuitive

2017-09-14 Thread Kaartic Sivaraam
On Thu, 2017-09-14 at 09:36 +0200, Michael J Gruber wrote:
> 
> Also, given all the translations that we have, it seems somewhat strange
> to try and foresee and workaround possible misunderstandings of commonly
> used English phrases.

In case you're trying to say "translators have successfully
translated the phrase without misunderstanding it in many languages",
then I have to say that I foresaw the difficulty of translating it
into the languages for which a translation does not exist yet
(specifically my mother tongue). If that's not useful because it is
only a rare case, then this patch isn't either.

-- 
Kaartic


Re: commit-msg hook does not run on merge with --no-ff option

2017-09-14 Thread Kaartic Sivaraam
On Tue, 2017-09-12 at 13:24 -0500, Joseph Dunne wrote:
> Sorry I don't understand your question.  The commit-msg hook runs
> properly in all cases except when I perform a merge with the --no-ff
> option enabled.
> 

It's working just as the documentation says it does (emphasis mine),


This hook is invoked by **git commit**, and can be bypassed with the 
--no-verify option.
It takes a single parameter, the name of the file that holds the proposed 
commit log
message. Exiting with a non-zero status causes the git commit to abort.


It says that the 'commit-msg' hook is invoked only for a "commit"
(it's not a 'merge-msg' hook, you see; that doesn't exist anyway). If
you see the hook getting invoked for a merge then that's an issue, I
guess. For what kind of merges do you see the 'commit-msg' hook
getting invoked?

-- 
Kaartic


Dearest/Käraste

2017-09-14 Thread Asana Hajraf
Dearest,
I am Mrs. Asana Hajraf and I am married to Mr. Hassan Hajraf from kuwait for 19 
years without a child and my husband died in 2014. I am contacting you to let 
you know my desire to donate the sum of $4.5 million to charities in your 
country which I inherited from my late husband. Due to my illness of cancer 
were confirmed out of just eight months, so it is my desire to see that this 
money is invested to any organization of your choice and distributed each year 
among the charity organizations, motherless babies home, schools and support 
for the men and women homeless or what you may consider to be the benefit of 
those less fortunate. I do not want this fund to be invested in a manner 
without God. As soon as I receive your reply confirming your acceptance of the 
work I tell you, I will give all relevant information to authorize the release 
and transfer the money to you as my duly appointed representative.
Thanks,
Mrs Asana Hajraf.




Re: [PATCH 7/7] config: flip return value of store_write_*()

2017-09-14 Thread Jeff King
On Wed, Sep 13, 2017 at 02:25:28PM -0700, Jonathan Nieder wrote:

> > Let's flip them to follow the usual write() conventions and
> > update all callers. As these are local to config.c, it's
> > unlikely that we'd have new callers in any topics in flight
> > (which would be silently broken by our change). But just to
> > be on the safe side, let's rename them to just
> > write_section() and write_pairs().  That also accentuates
> > their relationship with write().
> >
> > Signed-off-by: Jeff King 
> 
> The caller only cares if it succeeded, right?  Could this return
> the customary 0 vs -1 instead of the number of bytes written?

Yes, it could. I went with "follow the conventions of write()" because
these are used in a big chain of write() calls (well, really
write_in_full). But given the current callers, it does not matter either
way.

Thanks for reviewing the series, and sorry if my comments have been a
bit terse. I'm trying to clear my pile before going out of town for a
few days (which I admit may have contributed to my desire for you to
prepare patches on top).

But either way, don't expect a re-roll until next week at the earliest.

-Peff


Re: [PATCH 6/7] notes-merge: use ssize_t for write_in_full() return value

2017-09-14 Thread Jeff King
On Wed, Sep 13, 2017 at 02:20:35PM -0700, Jonathan Nieder wrote:

> > --- a/notes-merge.c
> > +++ b/notes-merge.c
> > @@ -302,7 +302,7 @@ static void write_buf_to_worktree(const struct object_id *obj,
> > fd = xopen(path, O_WRONLY | O_EXCL | O_CREAT, 0666);
> >  
> > while (size > 0) {
> > -   long ret = write_in_full(fd, buf, size);
> > +   ssize_t ret = write_in_full(fd, buf, size);
> > if (ret < 0) {
> > /* Ignore epipe */
> > if (errno == EPIPE)
> > break;
> > die_errno("notes-merge");
> > } else if (!ret) {
> > die("notes-merge: disk full?");
> > }
> 
> These three lines are dead code.  How about the following, e.g. for
> squashing in?

Thanks, I didn't notice that.

I'd actually prefer it as a separate patch, since it needs to be
explained separately.

-Peff


Re: [PATCH 3/7] avoid "write_in_full(fd, buf, len) != len" pattern

2017-09-14 Thread Jeff King
On Wed, Sep 13, 2017 at 02:14:30PM -0700, Jonathan Nieder wrote:

> >I really wish every "write_in_full()" user would just
> >check against "<0" now, but this fixes the nasty and
> >stupid ones.
> 
> Ok, you convinced me.
> 
> Should we add a comment to cache.h as well encouraging this?

I'd be OK with a comment, though I don't know that it's strictly
necessary. It looks like most of it was just cargo-culted, so removing
the offending examples is sufficient.

> > [1] A careful reader may notice there is one way that
> > write_in_full() can return a different value. If we ask
> > write() to write N bytes and get a return value that is
> > _larger_ than N, we could return a larger total. But
> > besides the fact that this would imply a totally broken
> > version of write(), it would already invoke undefined
> > behavior. Our internal remaining counter is an unsigned
> > size_t, which means that subtracting too many bytes will
> > wrap it around to a very large number. So we'll instantly
> > begin reading off the end of the buffer, trying to write
> > gigabytes (or petabytes) of data.
> 
> This footnote just leaves me more confused, since as you mention,
> write() never would return a value greater than N.  Are you saying we
> need to defend against a broken platform where that isn't true?

No, I'm saying that my claim that write_in_full() can only return two
values (-1 and the original length) is not strictly true. But that it
doesn't matter in practice.

I don't think we need to defend against such a broken platform, but I
didn't want anybody reading the claim to say "aha, you forgot this
case". It is a case that does not matter.

-Peff


Re: [PATCH 2/7] get-tar-commit-id: check write_in_full() return against 0

2017-09-14 Thread Jeff King
On Wed, Sep 13, 2017 at 02:09:27PM -0700, Jonathan Nieder wrote:

> > We ask to write 41 bytes and make sure that the return value
> > is at least 41. This is the same "dangerous" pattern that
> > was fixed in the prior commit (wherein a negative return
> > value is promoted to unsigned), though it is not dangerous
> > here because our "41" is a constant, not an unsigned
> > variable.
> >
> > But we should convert it anyway to avoid modeling a
> > dangerous construct.
> >
> > Signed-off-by: Jeff King 
> > ---
> >  builtin/get-tar-commit-id.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> I kind of disagree with calling this dangerous (and I think that is
> what you alluded to above by putting it in quotes), but I like the
> postimage more than the preimage.

Right, this instance is fine, but the pattern of using "<" is not. If
you swapped out "41" for:

  size_t len = 41;

then it would be a bug. Which I think would surprise most people.

> The variable 'n' could be eliminated to simplify this further.  I
> realize that would go against the spirit of this patch, but (1) it's
> on-topic for the patch, since it is another ssize_t vs constant
> comparison and (2) as mentioned elsewhere in this thread, it's a very
> common idiom with read_in_full.  If we want to eliminate it then we
> could introduce a separate helper to distinguish between
> read_this_much_i_mean_it and read_this_much_or_to_eof.

Yes, I noticed that, too, after you brought up read_in_full() as a
potential source of problems. But I would rather deal with
read_in_full() separately on top. Can you do it as a separate patch on
top (possibly with other read_in_full() cleanups, though I think this is
the only "<" case that exists).

-Peff


Re: [PATCH v2] config: avoid "write_in_full(fd, buf, len) < len" pattern

2017-09-14 Thread Jeff King
On Thu, Sep 14, 2017 at 12:31:38AM +0100, Ramsay Jones wrote:

> > Hmm, about three or four years ago, I spent two or three evenings
> > getting git to compile -Wextra clean. I remember the signed/unsigned
> > issue was the cause of a large portion of the warnings issued by
> > the compiler. I was surprised that it took such a relatively short
> > time to do. However, it affected quite a large portion of the code, so
> > I didn't think Junio would even consider applying it. Also, I only used
> > gcc and was anticipating having additional warnings on clang (but I
> > didn't get around to trying).
> > 
> > Maybe I should give it another go. :-D
> 
> For example, I remember the patch given below reduced the number
> of warnings quite a bit (because it's an inline function in a
> header file).
> 
> I just tried it again tonight; the current master branch has 3532
> warnings when compiled with -Wextra, 1409 of which are -Wsign-compare
> warnings. After applying the patch below, those numbers are reduced
> by 344 to 3188/1065.

I'd love it if we were clean on -Wextra. My big question is how many
contortions we'd have to go through in the code. I don't mind at all if
we're actually cleaning as we go (e.g., using types of the correct
signedness, or preventing possible funny interactions). I'm just worried
it will turn into a bunch of noisy casts.

The patch you showed seems like an improvement to the code, but I don't
know if that would be the case for all of them. :)

-Peff


Re: [idea] File history tracking hints

2017-09-14 Thread Philip Oakley

From: "Johannes Schindelin" 

Hi Philip,

On Mon, 11 Sep 2017, Philip Oakley wrote:


From: "Pavel Kretov" 
> Hi all,
>
> Excuse me if the topic I'm going to raise here has been already
> discussed on the mailing list, forums, or IRC, but I couldn't find
> anything related.
>
> The problem:
>
> Git, being "a stupid content tracker", doesn't try to keep an eye on
> operations which happens to individual files; things like file renames
> aren't recorded during commit, but heuristically detected later.
>
> Unfortunately, the heuristic can only deal with simple file renames
> with no substantial content changes; it's helpless when you:
>
> - rename a file and change its content significantly;
> - split single file into several files;
> - merge several files into another;
> - copy entire file from another commit, and do other things like these.
>
> However, if we're able to preserve this information, it's possible
> not only to do more accurate 'git blame', but also merge revisions with
> fewer conflicts.
>
>
> The proposal:
>
> The idea is to let user give hints about what was changed during
> the commit. For example, if the user did a rename which wasn't
> automatically detected, he would append something like the following
> to his commit
> message:
>
>Tracking-hints: rename dev-vcs/git/git-1.0.ebuild ->
> dev-vcs/git/git-2.0.ebuild
>
> or (if full paths of affected files can be unambiguously omitted):
>
>Tracking-hints: rename git-1.0.ebuild -> git-2.0.ebuild
>
> There may be other hint types:
>
>Tracking-hint: recreate LICENSE.txt
>Tracking-hint: split main.c -> main.c cmdline.c
>Tracking-hint: merge linalg.py <- vector.py matrix.py
>
> or even something like this:
>
>Tracking-hint: copy json.py <-
> libs/json.py@4db88291251151d8c5c8e4f20430fa4def2cb2ed
>
> If a file transformation cannot be described by a single tracking
> hint, it shall be possible to specify a sequence of hints at once:
>
>Tracking-hint:
>split Utils.java -> AppHelpers.java StringHelpers.java
>recreate Utils.java
>
> Note that in the above example the order of operations really matters,
> so both lines have to reside in one 'Tracking-hint' block.
>
> * * *
>
> How do you think, is this idea worth implementing?
> Any other thoughts on this?
>
> -- Pavel Kretov.

Maybe use the "interpret-trailers" methods for standardising your hints
locally (in your team / workplace) to see how it goes and flesh out what
works and what doesn't. Trying to decide, a-priori, what are the right
hints is likely to be the hard part.


I think this adds a very valuable insight to this discussion: the current
state of Git's rename handling is based on the idea that you either record
the renames, or you detect them. Like, there is either "on" or "off". No
middle ground.

However, if you understand that there is also the possibility of hints
that can help any erroneous rename detection (and *everybody* who
seriously worked on a massive code base has seen that rename detection
fail in the most inopportune ways [*1*]), then you are on to something.

So I totally like the idea of introducing hints, possibly as trailers in
the commit message (or as refs/notes/rename/* or whatever) that can be
picked up by Git versions that know about them, and can be ignored by Git
versions that insist on the rename detection du jour. With a config option
to control the behavior, maybe, too.

Ciao,
Dscho

Footnote *1*: Just to name a couple of examples from my personal
experience, off the top of my head:

- license boiler plates often let Git detect renames/copies where there
 are none,

- even something as trivial as moving Java classes (and their dependent
 classes) between packages changes every line referring to said packages,
 causing Git's rename detection to go for a drink instead of doing its
 job,

- indentation changes overwhelm Git's rename detection,

- when rename detection would matter most, like, really a lot, to lift the
 burden from the human beings in front of the computer poring over
 hundreds of thousands of files moved from one directory tree to another,
 that's exactly when Git's rename detection says that there are too many
 files, here are my union rights, I am going home, good luck to you.

In light of such experiences, I have to admit that the notion that the
rename detection can always be improved in hindsight puts quite a bit of
insult to injury for those developers who are bitten by it.


Your list made me think that the hints should be directed toward what may be
considered existing solutions for those specific awkward cases.

So the hints could be (by type):
- template; licence; boiler-plate; standard; reference :: copy
- word-rename
- regex for word-substitution changes (e.g. which chars count as word
characters)
- regex for white-space changes (i.e. which chars are considered
whitespace)
- move-dir path/glob spec
- move-file path/glob spec

Re: [PATCH 1/2] test-lib: group system specific FIFO tests by system

2017-09-14 Thread Johannes Schindelin
Hi Michael,

On Thu, 14 Sep 2017, Michael J Gruber wrote:

> test-lib determines whether a file-system supports FIFOs and needs to do
> special casing for CYGWIN and MINGW. This separates those
> system-specific settings from those at a more central place.
> 
> Set mkfifo() to false in the central system-specific place so that the
> same test works everywhere.

The mkfifo() emulation of Cygwin seems to work, no? I think it works even
in MSYS2, but not in MINGW.

So maybe this patch should affect only the MINGW arm?

Ciao,
Dscho


Re: [PATCH 00/20] Read `packed-refs` using mmap()

2017-09-14 Thread Johannes Schindelin
Hi Michael,

On Thu, 14 Sep 2017, Michael Haggerty wrote:

> On 09/13/2017 07:15 PM, Michael Haggerty wrote:
> > [...]
> > * `mmap()` the whole file rather than `read()`ing it.
> > [...]
>
> Apparently this doesn't work on Windows, because the `snapshot` is
> keeping the `packed-refs` file open too long, so the new file can't be
> renamed on top of it.

Indeed, I sent you a mail about it before I checked for new mails ;-)

> I didn't realize that this is even allowed, but TIL that you can close a
> file while keeping it mmapped. Does that technique work on Windows? If
> so, I'll change v2 to do it as sketched below.

I do not know whether that technique works on Windows, but it would not
solve the problem *entirely*, as I outlined in my reply: in
delete_branches(), for example, the main_worktree_ref_store is implicitly
getting its snapshot initialized, and nothing seems to release it.

This is only one example, as I figure that multiple worktrees could cause
even more ref_stores to have unreleased snapshots of the packed-refs file.

I imagine that something like close_all_packs() is needed
("clear_all_snapshots()"?).

Ciao,
Dscho


git diff --name-status for deleted files

2017-09-14 Thread Gene Thomas
Hello,
  "git diff --name-status" is useful to list the files one has changed 
but it does not list files that one has deleted with "git rm". It would be 
really handy if it did. I am using git 2.9.3 on Ubuntu Linux 16.10.

Yours Sincerely,

Gene Thomas.


Re: RFC v3: Another proposed hash function transition plan

2017-09-14 Thread Johannes Schindelin
Hi Jonathan,

On Thu, 14 Sep 2017, Jonathan Nieder wrote:

> Johannes Schindelin wrote:
> > On Wed, 13 Sep 2017, Jonathan Nieder wrote:
> 
> >> [3] https://www.imperialviolet.org/2017/05/31/skipsha3.html,
> >
> > I had read this short after it was published, and had missed the updates.
> > One link in particular caught my eye:
> >
> > https://eprint.iacr.org/2012/476
> >
> > Essentially, the authors demonstrate that using SIMD technology can speed
> > up computation by factor 2 for longer messages (2kB being considered
> > "long" already). It is a little bit unclear to me from a cursory look
> > whether their fast algorithm computes SHA-256, or something similar.
> 
> The latter: that paper is about a variant on SHA-256 called SHA-256x4
> (or SHA-256x16 to take advantage of newer instructions).  It's a
> different hash function.  This is what I was alluding to at [1].

Thanks for the explanation!

> > As the author of that paper is also known to have contributed to OpenSSL,
> > I had a quick look and it would appear that a comment in
> > crypto/sha/asm/sha256-mb-x86_64.pl speaking about "lanes" suggests that
> > OpenSSL uses the ideas from the paper, even if b783858654 (x86_64 assembly
> > pack: add multi-block AES-NI, SHA1 and SHA256., 2013-10-03) does not talk
> > about the paper specifically.
> >
> > The numbers shown in
> > https://github.com/openssl/openssl/blob/master/crypto/sha/asm/keccak1600-x86_64.pl#L28
> > and in
> > https://github.com/openssl/openssl/blob/master/crypto/sha/asm/sha256-mb-x86_64.pl#L17
> >
> > are sufficiently satisfying.
> 
> This one is about actual SHA-256, but computing the hash of multiple
> streams in a single function call.  The paper to read is [2].  We could
> probably take advantage of it for e.g. bulk-checkin and index-pack.
> Most other code paths that compute hashes wouldn't be able to benefit
> from it.

Again, thanks for the explanation.

Ciao,
Dscho

> [1] 
> https://public-inbox.org/git/20170616212414.gc133...@aiede.mtv.corp.google.com/
> [2] https://eprint.iacr.org/2012/371
> 


git fast-export/import bug with -M -C

2017-09-14 Thread Juraj Oršulić
The commands should be self-explanatory. 0.2.0~20 is the first commit
where the reconstructed repository diverges; that commit had a
simultaneous copy and edit of one file. It seems that copy/rename
detection, enabled with -M -C, is confused by this. I reproduced it
with git 2.14 next @ 8fa685d.

git clone https://github.com/googlecartographer/cartographer_ros
mkdir copy && cd copy && git init
(cd ../cartographer_ros; git fast-export --all --date-order -M -C) |
git fast-import
git rev-parse 0.2.0~20 #2237e1d0a974977fbcb0737dd1fb5876a2b8e29d
git rev-parse 0.2.0~21 #cd1276a99ccffcc491d0b2e50296ec04347ba5f2
cd ../cartographer_ros
git rev-parse 0.2.0~20 #9d5b221ed41783b15c84bc90b71527194b8d9a49
git rev-parse 0.2.0~21 #cd1276a99ccffcc491d0b2e50296ec04347ba5f2


Re: RFC v3: Another proposed hash function transition plan

2017-09-14 Thread Johannes Schindelin
Hi,

On Thu, 14 Sep 2017, demerphq wrote:

> On 14 September 2017 at 17:23, Johannes Schindelin
>  wrote:
> >
> > SHA-256 has been hammered on a lot more than SHA3-256.
> 
> Last year that was even more true of SHA1 than it is true of SHA-256
> today.

I hope you are not deliberately trying to annoy me. I say that because you
seemed to be interested enough in cryptography to know that the known
attacks on SHA-256 *today* are unlikely to extend to Git's use case,
whereas the known attacks on SHA-1 *in 2005* were already raising doubts.

So while SHA-1 has been hammered on for longer than SHA-256, the latter
came out a lot less scathed than the former.

Besides, you are totally missing the point here that the choice is *not*
between SHA-1 and SHA-256, but between SHA-256 and SHA3-256.

After all, we would not consider any hash algorithm with known problems
(as far as Git's usage is concerned). The amount of scrutiny with which
the algorithm was investigated would only be a deciding factor among the
remaining choices, yes?

In any case, don't trust me on cryptography (just like I do not trust you
on that matter). Trust the cryptographers. I contacted some of my
colleagues who are responsible for crypto, and the two who seem to
disagree on pretty much everything agreed on this one thing: that SHA-256
would be a good choice for Git (and one of them suggested that it would be
much better than SHA3-256, because SHA-256 saw more cryptanalysis).

Ciao,
Johannes


Re: [PATCH 00/20] Read `packed-refs` using mmap()

2017-09-14 Thread Johannes Schindelin
Hi Michael,

On Wed, 13 Sep 2017, Michael Haggerty wrote:

> * `mmap()` the whole file rather than `read()`ing it.

On Windows, a memory-mapped file cannot be renamed. As a consequence, the
following tests fail on `pu`:

t1400-update-ref.sh
t1404-update-ref-errors.sh
t1405-main-ref-store.sh
t1408-packed-refs.sh
t1410-reflog.sh
t1430-bad-ref-name.sh
t1450-fsck.sh
t1507-rev-parse-upstream.sh
t2018-checkout-branch.sh
t2020-checkout-detach.sh
t2024-checkout-dwim.sh
t3200-branch.sh
t3204-branch-name-interpretation.sh
t3210-pack-refs.sh
t3211-peel-ref.sh
t3306-notes-prune.sh
t3400-rebase.sh
t3404-rebase-interactive.sh
t3420-rebase-autostash.sh
t3903-stash.sh
t3905-stash-include-untracked.sh
t4151-am-abort.sh
t5304-prune.sh
t5312-prune-corruption.sh
t5400-send-pack.sh
t5404-tracking-branches.sh
t5500-fetch-pack.sh
t5505-remote.sh
t5510-fetch.sh
t5514-fetch-multiple.sh
t5515-fetch-merge-logic.sh
t5516-fetch-push.sh
t5520-pull.sh
t5533-push-cas.sh
t5572-pull-submodule.sh
t5600-clone-fail-cleanup.sh
t5607-clone-bundle.sh
t6016-rev-list-graph-simplify-history.sh
t6030-bisect-porcelain.sh
t6032-merge-large-rename.sh
t6040-tracking-info.sh
t6050-replace.sh
t6500-gc.sh
t6501-freshen-objects.sh
t7003-filter-branch.sh
t7004-tag.sh
t7102-reset.sh
t7201-co.sh
t7406-submodule-update.sh
t7504-commit-msg-hook.sh
t7701-repack-unpack-unreachable.sh
t9300-fast-import.sh
t9902-completion.sh
t9903-bash-prompt.sh

This diff:

-- snip --
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index f9c5e771bdd..ee8b3603624 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1306,13 +1308,13 @@ static int packed_transaction_finish(struct ref_store *ref_store,
 	char *packed_refs_path;
 
 	packed_refs_path = get_locked_file_path(&refs->lock);
+	clear_snapshot(refs);
 	if (rename_tempfile(&refs->tempfile, packed_refs_path)) {
 		strbuf_addf(err, "error replacing %s: %s",
 			    refs->path, strerror(errno));
 		goto cleanup;
 	}
 
-	clear_snapshot(refs);
 	ret = 0;
 
 cleanup:
-- snap --

reduces the failed tests to

t1410-reflog.counts.sh
t3210-pack-refs.counts.sh
t3211-peel-ref.counts.sh
t5505-remote.counts.sh
t5510-fetch.counts.sh
t5600-clone-fail-cleanup.counts.sh

However, much as I tried to wrap my head around it, I could not even start
to come up with a solution for e.g. the t1410 regression. The failure
happens in `git branch -d one/two`.

The culprit: there is yet another unreleased snapshot while we try to
finish the transaction.

It is the snapshot of the main_worktree_ref_store as acquired by
add_head_info() (which is called from get_main_worktree(), which in turn
was called from get_worktrees(), in turn having been called from
find_shared_symref() that was called from delete_branches()), and it seems
to me that there was never any plan to have that released, including its
snapshot.

And the snapshot gets initialized upon add_head_info() calling
refs_resolve_ref_unsafe().

Further down in the delete_branches() function, however, we call
delete_ref() which in turn wants to overwrite the packed-refs file via an
atomic rename. That rename fails now because there is still that main
worktree's ref_store that has the snapshot mmap()ed.

I imagine that the rest of the test failures stems from the same root
cause.

Do you have any idea how to ensure that all mmap()ed packed refs are
released before attempting to finish a transaction?

Ciao,
Dscho


Re: RFC v3: Another proposed hash function transition plan

2017-09-14 Thread Jonathan Nieder
Johannes Schindelin wrote:
> On Wed, 13 Sep 2017, Jonathan Nieder wrote:

>> As a side note, I am probably misreading, but I found this set of
>> paragraphs a bit condescending.  It sounds to me like you are saying
>> "You are making the wrong choice of hash function and everything else
>> you are describing is irrelevant when compared to that monumental
>> mistake.  Please stop working on things I don't consider important".
>> With that reading it is quite demotivating to read.
>
> I am sorry you read it that way. I did not feel condescending when I wrote
> that mail, I felt annoyed by the side track, and anxious. In my mind, the
> transition is too important for side tracking, and I worry that we are not
> fast enough (imagine what would happen if a better attack was discovered
> that is not as easily detected as the one we know about?).

Thanks for clarifying.  That makes sense.

[...]
> As to *my* opinion: after reading https://goo.gl/gh2Mzc (is it really
> correct that its last update has been on March 6th?), my only concern is
> really that it still talks about SHA3-256 when I think that the
> performance benefits of SHA-256 (think: "Git at scale", and also hardware
> support) really make the latter a better choice.
>
> In order to be "ironed out", I think we need to talk about the
> implementation detail "Translation table". This is important. It needs to
> be *fast*.
>
> Speaking of *fast*, I could imagine that it would make sense to store the
> SHA-1 objects on disk, still, instead of converting them on the fly. I am
> not sure whether this is something we need to define in the document,
> though, as it may very well be premature optimization; Maybe mention that
> we could do this if necessary?
>
> Apart from that, I would *love* to see this document as The Official Plan
> that I can Show To The Manager so that I can ask to Allocate Time.

Sounds promising!

Thanks much for this feedback.  This is very helpful for knowing what
v4 of the doc needs.

The discussion of the translation table in [1] didn't make it to the
doc.  You're right that it needs to.

Caching SHA-1 objects (and the pros and cons involved) makes sense to
mention in an "ideas for future work" section.

An implementation plan with well-defined pieces for people to take on
and estimates of how much work each involves may be useful for Showing
To The Manager.  So I'll include a sketch of that for reviewers to
poke holes in, too.

Another thing the doc doesn't currently describe is how Git protocol
would work.  That's worth sketching in a "future work" section as
well.

Sorry it has been taking so long to get this out.  I think we should
have something ready to send on Monday.

Thanks,
Jonathan

[1] 
https://public-inbox.org/git/CAJo=hJtoX9=aylhhpujs7fuev9ciz_mnpnephuz8whui6g9...@mail.gmail.com/


Re: RFC v3: Another proposed hash function transition plan

2017-09-14 Thread Johannes Schindelin
Hi Linus,

On Wed, 13 Sep 2017, Linus Torvalds wrote:

> On Wed, Sep 13, 2017 at 6:43 AM, demerphq  wrote:
> >
> > SHA3 however uses a completely different design where it mixes a 1088
> > bit block into a 1600 bit state, for a leverage of 2:3, and the excess
> > is *preserved between each block*.
> 
> Yes. And considering that the SHA1 attack was actually predicated on
> the fact that each block was independent (no extra state between), I
> do think SHA3 is a better model.
> 
> So I'd rather see SHA3-256 than SHA256.

SHA-256 got much more cryptanalysis than SHA3-256, and apart from the
length-extension problem that does not affect Git's usage, there are no
known weaknesses so far.

It would seem that the experts I talked to were much more concerned about
that amount of attention than the particulars of the algorithm. My
impression was that the new features of SHA3 were less studied than the
well-known features of SHA2, and that the new-ness of SHA3 is not
necessarily a good thing.

You will have to deal with the fact that I trust the crypto experts'
opinion on this a lot more than your opinion. Sure, you learned from the
fact that you had been warned about SHA-1 already seeing theoretical
attacks in 2005 and still choosing to hard-wire it into Git. And yet, you
are still no more of a cryptography expert than I am.

Ciao,
Dscho


Re: RFC v3: Another proposed hash function transition plan

2017-09-14 Thread Jonathan Nieder
Hi,

Johannes Schindelin wrote:
> On Wed, 13 Sep 2017, Jonathan Nieder wrote:

>> [3] https://www.imperialviolet.org/2017/05/31/skipsha3.html,
>
> I had read this short after it was published, and had missed the updates.
> One link in particular caught my eye:
>
>   https://eprint.iacr.org/2012/476
>
> Essentially, the authors demonstrate that using SIMD technology can speed
> up computation by factor 2 for longer messages (2kB being considered
> "long" already). It is a little bit unclear to me from a cursory look
> whether their fast algorithm computes SHA-256, or something similar.

The latter: that paper is about a variant on SHA-256 called SHA-256x4
(or SHA-256x16 to take advantage of newer instructions).  It's a
different hash function.  This is what I was alluding to at [1].

> As the author of that paper is also known to have contributed to OpenSSL,
> I had a quick look and it would appear that a comment in
> crypto/sha/asm/sha256-mb-x86_64.pl speaking about "lanes" suggests that
> OpenSSL uses the ideas from the paper, even if b783858654 (x86_64 assembly
> pack: add multi-block AES-NI, SHA1 and SHA256., 2013-10-03) does not talk
> about the paper specifically.
>
> The numbers shown in
> https://github.com/openssl/openssl/blob/master/crypto/sha/asm/keccak1600-x86_64.pl#L28
> and in
> https://github.com/openssl/openssl/blob/master/crypto/sha/asm/sha256-mb-x86_64.pl#L17
>
> are sufficiently satisfying.

This one is about actual SHA-256, but computing the hash of multiple
streams in a single function call.  The paper to read is [2].  We could
probably take advantage of it for e.g. bulk-checkin and index-pack.
Most other code paths that compute hashes wouldn't be able to benefit
from it.

Thanks,
Jonathan

[1] 
https://public-inbox.org/git/20170616212414.gc133...@aiede.mtv.corp.google.com/
[2] https://eprint.iacr.org/2012/371


Re: RFC v3: Another proposed hash function transition plan

2017-09-14 Thread Johannes Schindelin
Hi Jonathan,

On Wed, 13 Sep 2017, Jonathan Nieder wrote:

> [3] https://www.imperialviolet.org/2017/05/31/skipsha3.html,

I had read this short after it was published, and had missed the updates.
One link in particular caught my eye:

https://eprint.iacr.org/2012/476

Essentially, the authors demonstrate that using SIMD technology can speed
up computation by factor 2 for longer messages (2kB being considered
"long" already). It is a little bit unclear to me from a cursory look
whether their fast algorithm computes SHA-256, or something similar.

As the author of that paper is also known to have contributed to OpenSSL,
I had a quick look and it would appear that a comment in
crypto/sha/asm/sha256-mb-x86_64.pl speaking about "lanes" suggests that
OpenSSL uses the ideas from the paper, even if b783858654 (x86_64 assembly
pack: add multi-block AES-NI, SHA1 and SHA256., 2013-10-03) does not talk
about the paper specifically.

The numbers shown in
https://github.com/openssl/openssl/blob/master/crypto/sha/asm/keccak1600-x86_64.pl#L28
and in
https://github.com/openssl/openssl/blob/master/crypto/sha/asm/sha256-mb-x86_64.pl#L17
are sufficiently satisfying.

Ciao,
Dscho


Re: [PATCH v2] test-lib: don't use ulimit in test prerequisites on cygwin

2017-09-14 Thread Jonathan Nieder
Ramsay Jones wrote:

> On cygwin (and MinGW), the 'ulimit' built-in bash command does not have
> the desired effect of limiting the resources of new processes, at least
> for the stack and file descriptors. However, it always returns success
> and leads to several test prerequisites being erroneously set to true.
>
> Add a check for cygwin and MinGW to the prerequisite expressions, using
> a 'test_have_prereq !MINGW,!CYGWIN' clause, to guard against using ulimit.
> This affects the prerequisite expressions for the ULIMIT_STACK_SIZE,
> CMDLINE_LIMIT and ULIMIT_FILE_DESCRIPTORS prerequisites.
>
> Signed-off-by: Ramsay Jones 
> ---
>  t/t1400-update-ref.sh |  5 ++++-
>  t/t6120-describe.sh   |  1 -
>  t/t7004-tag.sh        |  1 -
>  t/test-lib.sh         | 10 ++++++++--
>  4 files changed, 12 insertions(+), 5 deletions(-)

Reviewed-by: Jonathan Nieder 

Thank you.


Re: [PATCH v5 25/40] external-odb: add 'get_direct' support

2017-09-14 Thread Jonathan Tan
On Thu, 14 Sep 2017 10:39:35 +0200
Christian Couder  wrote:

> From the following email:
> 
> https://public-inbox.org/git/20170804145113.5ceaf...@twelve2.svl.corp.google.com/
> 
> it looks like his work is fundamentally about changing the rules of
> connectivity checks. Objects are split between "homegrown" objects and
> "imported" objects which are in separate pack files. Then references
> to imported objects are not checked during connectivity check.
> 
> I think changing connectivity rules is not necessary to make something
> like external odb work. For example when fetching a pack that refers
> to objects that are in an external odb, if access to this external odb
> has been configured, then the connectivity check will pass as the
> missing objects in the pack will be seen as already part of the repo.

There are still some nuances. For example, if an external ODB provides
both a tree and a blob that the tree references, do we fetch the tree in
order to call "have" on all its blobs, or do we trust the ODB that if it
has the tree, it has all the other objects? In my design, I do the
latter, but in the general case where we have multiple ODBs, we might
have to do the former. (And if we do the former, it seems to me that the
connectivity check must be performed "online" - that is, with the ODBs
being able to provide "get".)

(Also, my work extends all the way to fetch/clone [1], but admittedly I
have been taking it a step at a time and recently have only been
discussing how the local repo should handle the missing object
situation.)

[1] https://public-inbox.org/git/cover.1499800530.git.jonathanta...@google.com/

> Yeah, if some commands like fsck are used, then possibly all the
> objects will have to be requested from the external odb, as it may not
> be possible to fully check all the objects, especially the blobs,
> without accessing all their data. But I think this is a problem that
> could be dealt with in different ways. For example we could develop
> specific options in fsck so that it doesn't check the sha1 of objects
> that are marked with some specific attributes, or that are stored in
> external odbs, or that are bigger than some size.

The hard part is in dealing with missing commits and trees, I think, not
blobs.


Re: Rebase & submodules

2017-09-14 Thread Robert Dailey
On Thu, Sep 14, 2017 at 10:57 AM, Nicolas Morey-Chaisemartin
 wrote:
> Without changing your workflow too much,

If you mean to imply that you have other recommendations if I'm
willing to change my workflow, then please by all means share them.
I'm very interested. I'm not too hooked on my workflow.

> simply add an annotated tag to your branch before your rebase.
> This way the SHA1 will always exist. Unless you want to clean up at some
> point (branch merged ?) and then you can simply delete all those old tags.

This is definitely the best idea so far, although the maintenance
overhead of this could be high for long-lived branches with frequent
rebases. Maybe with bigger workflow changes there are other solutions?
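For the record, the suggestion can be sketched as follows; branch and tag
names are invented, and a `--amend` stands in for a real rebase:

```shell
# Pin the pre-rebase tip with an annotated tag so that a gitlink in a
# parent repository that records the old SHA1 keeps resolving (and the
# old commit survives gc, since the tag keeps it reachable).
set -e
dir=$(mktemp -d) && cd "$dir"
git init -q
git config user.email you@example.com && git config user.name you
git commit -q --allow-empty -m base
git checkout -q -b topic
git commit -q --allow-empty -m work
old=$(git rev-parse HEAD)
git tag -a -m 'pre-rebase snapshot' topic-pre-rebase-1
git commit -q --amend --allow-empty -m 'work, rewritten'  # stand-in for a rebase
git cat-file -t "$old"   # the old commit is still present, anchored by the tag
```

Deleting `topic-pre-rebase-1` later (once the branch is merged) releases the
old commit for pruning, which is the cleanup step mentioned above.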


Re: [PATCH 2/2] test-lib: ulimit does not limit on CYGWIN and MINGW

2017-09-14 Thread Ramsay Jones


On 14/09/17 15:52, Michael J Gruber wrote:
> ulimit succeeds (by return value) but does not limit on some systems.
> 
> Set ulimit() to false on these systems so that we do not rely on its
> output or effect. As an intended side-effect, ulimit-based
> prerequisites are set correctly (to "not-have") on these systems.
> 
> Reported-by: Ramsay Jones 
> Reported-by: Adam Dinwoodie 
> Reported-by: Johannes Schindelin 
> Signed-off-by: Michael J Gruber 
> ---
> This is independent of my series, but should best go before so that no
> ulimit based test is run on CYGWIN and MINGW.
> 
> It follows the basic assumption that a tool like ulimit is either
> present and functional or not present; and that we work around by
> defines or such when that assumption is broken.
> (Alternatively, we could set ULIMIT_LIMITS or so and depend on that.)

Heh, this was my first suggestion, if you recall, but I decided to
go a different way ... ;-)

Also, Johannes made a good suggestion, which led to a new version
of my patch (which could easily be extended to cover the FIFO).

I don't have a strong preference for either approach (but I would
have to test your patches, which I haven't done yet), so I would
be happy to see either applied.

ATB,
Ramsay Jones


[PATCH v2] test-lib: don't use ulimit in test prerequisites on cygwin

2017-09-14 Thread Ramsay Jones

On cygwin (and MinGW), the 'ulimit' built-in bash command does not have
the desired effect of limiting the resources of new processes, at least
for the stack and file descriptors. However, it always returns success
and leads to several test prerequisites being erroneously set to true.

Add a check for cygwin and MinGW to the prerequisite expressions, using
a 'test_have_prereq !MINGW,!CYGWIN' clause, to guard against using ulimit.
This affects the prerequisite expressions for the ULIMIT_STACK_SIZE,
CMDLINE_LIMIT and ULIMIT_FILE_DESCRIPTORS prerequisites.

Signed-off-by: Ramsay Jones 
---

Hi Junio,

Johannes suggested the improvement to this patch (namely using
the 'test_have_prereq !MINGW,!CYGWIN' sub-expression), which
makes it somewhat more compact and obvious.

Thanks Johannes!

ATB,
Ramsay Jones

 t/t1400-update-ref.sh |  5 ++++-
 t/t6120-describe.sh   |  1 -
 t/t7004-tag.sh        |  1 -
 t/test-lib.sh         | 10 ++++++++--
 4 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh
index dc98b4bc6..664a3a4e4 100755
--- a/t/t1400-update-ref.sh
+++ b/t/t1400-update-ref.sh
@@ -1253,7 +1253,10 @@ run_with_limited_open_files () {
(ulimit -n 32 && "$@")
 }
 
-test_lazy_prereq ULIMIT_FILE_DESCRIPTORS 'run_with_limited_open_files true'
+test_lazy_prereq ULIMIT_FILE_DESCRIPTORS '
+   test_have_prereq !MINGW,!CYGWIN &&
+   run_with_limited_open_files true
+'
 
 test_expect_success ULIMIT_FILE_DESCRIPTORS 'large transaction creating 
branches does not burst open file limit' '
 (
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index dd6dd9df9..3d45dc295 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -279,7 +279,6 @@ test_expect_success 'describe ignoring a borken submodule' '
grep broken out
 '
 
-# we require ulimit, this excludes Windows
 test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' '
i=1 &&
while test $i -lt 8000
diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
index 5bf5ace56..b545c33f8 100755
--- a/t/t7004-tag.sh
+++ b/t/t7004-tag.sh
@@ -1863,7 +1863,6 @@ test_expect_success 'version sort with very long 
prerelease suffix' '
git tag -l --sort=version:refname
 '
 
-# we require ulimit, this excludes Windows
 test_expect_success ULIMIT_STACK_SIZE '--contains and --no-contains work in a 
deep repo' '
>expect &&
i=1 &&
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 83f5d3dd2..32f9b2001 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1170,13 +1170,19 @@ run_with_limited_cmdline () {
(ulimit -s 128 && "$@")
 }
 
-test_lazy_prereq CMDLINE_LIMIT 'run_with_limited_cmdline true'
+test_lazy_prereq CMDLINE_LIMIT '
+   test_have_prereq !MINGW,!CYGWIN &&
+   run_with_limited_cmdline true
+'
 
 run_with_limited_stack () {
(ulimit -s 128 && "$@")
 }
 
-test_lazy_prereq ULIMIT_STACK_SIZE 'run_with_limited_stack true'
+test_lazy_prereq ULIMIT_STACK_SIZE '
+   test_have_prereq !MINGW,!CYGWIN &&
+   run_with_limited_stack true
+'
 
 build_option () {
git version --build-options |
-- 
2.14.0
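For context, the prerequisites guarded above all follow the same pattern:
the probe runs `ulimit` in a subshell so the lowered limit never leaks into
the test harness itself. A minimal sketch of that pattern (on Cygwin and
MinGW the `ulimit` call would succeed without actually limiting anything,
which is exactly what the new `!MINGW,!CYGWIN` guard accounts for):

```shell
# Mirror of run_with_limited_open_files from t/t1400-update-ref.sh: the
# lowered file-descriptor limit applies only inside ( ... ), so the
# parent shell keeps its original limit.
run_with_limited_open_files () {
	(ulimit -n 32 && "$@")
}

child_limit=$(run_with_limited_open_files sh -c 'ulimit -n')
parent_limit=$(ulimit -n)
echo "child=$child_limit parent=$parent_limit"
```

On a POSIX system the child reports 32 while the parent limit is untouched;
on Cygwin/MinGW the `ulimit -n 32` would report success without taking
effect, so the probe alone cannot be trusted there.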


[PATCH v2] commit-tree: do not complete line in -F input

2017-09-14 Thread Ross Kabus
"git commit-tree -F <file>", unlike "cat <file> | git
commit-tree" (i.e. feeding the same contents from the standard
input), added a missing final newline when the input ended in an
incomplete line.

Correct this inconsistency by leaving the incomplete line as-is,
as erring on the side of not touching the input is preferable
and expected for a plumbing command like "commit-tree".

Signed-off-by: Ross Kabus 
---
Fixed up commit message with feedback from Junio.

 builtin/commit-tree.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/builtin/commit-tree.c b/builtin/commit-tree.c
index 19e898fa4..2177251e2 100644
--- a/builtin/commit-tree.c
+++ b/builtin/commit-tree.c
@@ -102,7 +102,6 @@ int cmd_commit_tree(int argc, const char **argv, const char 
*prefix)
if (fd && close(fd))
die_errno("git commit-tree: failed to close 
'%s'",
  argv[i]);
-   strbuf_complete_line(&buffer);
continue;
}
 
-- 
2.13.1.windows.2
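For illustration, the fixed behaviour can be checked in a throwaway
repository. This is a hedged sketch assuming a git that includes this
patch; the file name and identity below are invented:

```shell
# Compare "-F <file>" with feeding the same file on stdin; with this fix
# the two stored commit messages are byte-for-byte identical even though
# the input ends in an incomplete line.
set -e
dir=$(mktemp -d) && cd "$dir"
git init -q
git config user.email you@example.com && git config user.name you
tree=$(git mktree </dev/null)              # empty tree to commit
printf 'subject with no final newline' >msg
c_stdin=$(git commit-tree "$tree" <msg)
c_file=$(git commit-tree -F msg "$tree")
git cat-file -s "$c_stdin"                 # object sizes now match
git cat-file -s "$c_file"
```

Before this patch the `-F` object would be one byte larger, because of the
newline that `strbuf_complete_line()` appended.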



Re: RFC v3: Another proposed hash function transition plan

2017-09-14 Thread Brandon Williams
On 09/14, Johannes Schindelin wrote:
> Hi Jonathan,
> 
> On Wed, 13 Sep 2017, Jonathan Nieder wrote:
> 
> > As a side note, I am probably misreading, but I found this set of
> > paragraphs a bit condescending.  It sounds to me like you are saying
> > "You are making the wrong choice of hash function and everything else
> > you are describing is irrelevant when compared to that monumental
> > mistake.  Please stop working on things I don't consider important".
> > With that reading it is quite demotivating to read.
> 
> I am sorry you read it that way. I did not feel condescending when I wrote
> that mail, I felt annoyed by the side track, and anxious. In my mind, the
> transition is too important for side tracking, and I worry that we are not
> fast enough (imagine what would happen if a better attack was discovered
> that is not as easily detected as the one we know about?).
> 
> > An alternative reading is that you are saying that the transition plan
> > described in this thread is not ironed out.  Can you spell that out
> > more?  What particular aspect of the transition plan (which is of
> > course orthogonal to the choice of hash function) are you discontent
> > with?
> 
> My impression from reading Junio's mail was that he does not consider the
> transition plan ironed out yet, and that he wants to spend time on
> discussing generation numbers right now.
> 
> I was particularly frightened by the suggestion to "reboot" [*1*].
> Hopefully I misunderstand and he meant "finishing touches" instead.
> 
> As to *my* opinion: after reading https://goo.gl/gh2Mzc (is it really
> correct that its last update has been on March 6th?), my only concern is
> really that it still talks about SHA3-256 when I think that the
> performance benefits of SHA-256 (think: "Git at scale", and also hardware
> support) really make the latter a better choice.
> 
> In order to be "ironed out", I think we need to talk about the
> implementation detail "Translation table". This is important. It needs to
> be *fast*.

Agreed. When that document was written it was hand-waved as an
implementation detail, but we should probably start ironing out
those details soon so that we have a concrete plan in place.

> 
> Speaking of *fast*, I could imagine that it would make sense to store the
> SHA-1 objects on disk, still, instead of converting them on the fly. I am
> not sure whether this is something we need to define in the document,
> though, as it may very well be premature optimization; Maybe mention that
> we could do this if necessary?
> 
> Apart from that, I would *love* to see this document as The Official Plan
> that I can Show To The Manager so that I can ask to Allocate Time.

Speaking of having a concrete plan, we discussed in the office the other day
about finally converting the doc into a Documentation patch.  That was
always our intention, but after writing up the doc we got busy working on
other projects.  Getting it in as a patch (with a more concrete road map)
is probably the next step we'd need to take.

I do want to echo what Jonathan has said in other parts of this thread,
that the transition plan itself doesn't depend on which hash function we
end up going with in the end.  I fully expect that for the transition
plan to succeed that we'll have infrastructure for dropping in different
hash functions so that we can do some sort of benchmarking before
selecting one to use.  This would also give us the ability to more
easily transition to another hash function when the time comes.

-- 
Brandon Williams


Re: [PATCH 00/20] Read `packed-refs` using mmap()

2017-09-14 Thread Michael Haggerty
On 09/13/2017 07:15 PM, Michael Haggerty wrote:
> [...]
> * `mmap()` the whole file rather than `read()`ing it.
> [...]
Apparently this doesn't work on Windows, because the `snapshot` is
keeping the `packed-refs` file open too long, so the new file can't be
renamed on top of it.

I didn't realize that this is even allowed, but TIL that you can close a
file while keeping it mmapped. Does that technique work on Windows? If
so, I'll change v2 to do it as sketched below.

Michael

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 8235ac8506..95c1cd2a27 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -35,11 +35,8 @@ struct snapshot {
 */
struct packed_ref_store *refs;

-   /*
-* The file descriptor of the `packed-refs` file (open in
-* read-only mode), or -1 if it is not open.
-*/
-   int fd;
+   /* Is the `packed-refs` file currently mmapped? */
+   int mmapped;

/*
 * The contents of the `packed-refs` file. If the file was
@@ -135,12 +132,11 @@ static void acquire_snapshot(struct snapshot
*snapshot)
  */
 static void clear_snapshot_buffer(struct snapshot *snapshot)
 {
-   if (snapshot->fd >= 0) {
+   if (snapshot->mmapped) {
if (munmap(snapshot->buf, snapshot->eof - snapshot->buf))
die_errno("error ummapping packed-refs file %s",
  snapshot->refs->path);
-   close(snapshot->fd);
-   snapshot->fd = -1;
+   snapshot->mmapped = 0;
} else {
free(snapshot->buf);
}
@@ -525,6 +521,7 @@ static const char *find_reference_location(struct
snapshot *snapshot,
 static struct snapshot *create_snapshot(struct packed_ref_store *refs)
 {
struct snapshot *snapshot = xcalloc(1, sizeof(*snapshot));
+   int fd;
struct stat st;
size_t size;
int sorted = 0;
@@ -533,8 +530,8 @@ static struct snapshot *create_snapshot(struct
packed_ref_store *refs)
acquire_snapshot(snapshot);
snapshot->peeled = PEELED_NONE;

-   snapshot->fd = open(refs->path, O_RDONLY);
-   if (snapshot->fd < 0) {
+   fd = open(refs->path, O_RDONLY);
+   if (fd < 0) {
if (errno == ENOENT) {
/*
 * This is OK; it just means that no
@@ -549,15 +546,16 @@ static struct snapshot *create_snapshot(struct
packed_ref_store *refs)
}
}

-   stat_validity_update(&snapshot->validity, snapshot->fd);
+   stat_validity_update(&snapshot->validity, fd);

-   if (fstat(snapshot->fd, &st) < 0)
+   if (fstat(fd, &st) < 0)
die_errno("couldn't stat %s", refs->path);

size = xsize_t(st.st_size);
-   snapshot->buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE,
- snapshot->fd, 0);
+   snapshot->buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
snapshot->eof = snapshot->buf + size;
+   snapshot->mmapped = 1;
+   close(fd);

/* If the file has a header line, process it: */
if (snapshot->buf < snapshot->eof && *snapshot->buf == '#') {


Re: Rebase & submodules

2017-09-14 Thread Nicolas Morey-Chaisemartin


Le 14/09/2017 à 17:39, Robert Dailey a écrit :
> So I often will have a submodule that points to one of my own forks,
> because I will have work done on a feature branch that hasn't been
> merged upstream yet. Assuming this merge takes a long time to get
> approved, I will occasionally rebase my topic branch to keep things up
> to date and clean.
>
> However, any previous commits in my parent repository that refer to a
> SHA1 prior to the rebase will now be pointing to dangling/orphaned
> commits, which means I wouldn't be able to go back to that commit in
> history and do `git submodule update` since that checkout will fail.
>
> One obvious solution to this is: Don't rebase. But, this could result
> in a lot of merging between the upstream and my fork which would make
> things not only ugly, but prevent me from making a pull request that
> makes sense to the upstream repository (They'd likely request a rebase
> in order to accept the PR, which I wouldn't be able to do for the
> reasons outlined above).
>
> Are there any other workflows that would support this kind of model better?

Without changing your workflow too much, simply add an annotated tag to your 
branch before your rebase.
This way the SHA1 will always exist. Unless you want to clean up at some point
(branch merged ?) and then you can simply delete all those old tags.

Nicolas




Re: RFC v3: Another proposed hash function transition plan

2017-09-14 Thread demerphq
On 14 September 2017 at 17:23, Johannes Schindelin
 wrote:
> Hi Junio,
>
> On Thu, 14 Sep 2017, Junio C Hamano wrote:
>
>> Jonathan Nieder  writes:
>>
>> > In other words, a long lifetime for the hash absolutely is a design
>> > goal.  Coping well with an unexpectedly short lifetime for the hash is
>> > also a design goal.
>> >
>> > If the hash function lasts 10 years then I am happy.
>>
>> Absolutely.  When two functions have similar expected remaining life
>> and are equally widely supported, then faster is better than slower.
>> Otherwise our primary goal when picking the function from candidates
>> should be to optimize for its remaining life and wider availability.
>
> SHA-256 has been hammered on a lot more than SHA3-256.

Last year that was even more true of SHA1 than it is true of SHA-256 today.

Anyway,
Yves
-- 
perl -Mre=debug -e "/just|another|perl|hacker/"


Rebase & submodules

2017-09-14 Thread Robert Dailey
So I often will have a submodule that points to one of my own forks,
because I will have work done on a feature branch that hasn't been
merged upstream yet. Assuming this merge takes a long time to get
approved, I will occasionally rebase my topic branch to keep things up
to date and clean.

However, any previous commits in my parent repository that refer to a
SHA1 prior to the rebase will now be pointing to dangling/orphaned
commits, which means I wouldn't be able to go back to that commit in
history and do `git submodule update` since that checkout will fail.

One obvious solution to this is: Don't rebase. But, this could result
in a lot of merging between the upstream and my fork which would make
things not only ugly, but prevent me from making a pull request that
makes sense to the upstream repository (They'd likely request a rebase
in order to accept the PR, which I wouldn't be able to do for the
reasons outlined above).

Are there any other workflows that would support this kind of model better?


Re: RFC v3: Another proposed hash function transition plan

2017-09-14 Thread Johannes Schindelin
Hi Junio,

On Thu, 14 Sep 2017, Junio C Hamano wrote:

> Jonathan Nieder  writes:
> 
> > In other words, a long lifetime for the hash absolutely is a design
> > goal.  Coping well with an unexpectedly short lifetime for the hash is
> > also a design goal.
> >
> > If the hash function lasts 10 years then I am happy.
> 
> Absolutely.  When two functions have similar expected remaining life
> and are equally widely supported, then faster is better than slower.
> Otherwise our primary goal when picking the function from candidates
> should be to optimize for its remaining life and wider availability.

SHA-256 has been hammered on a lot more than SHA3-256.

That would be a strong point in favor of SHA2.

Ciao,
Dscho


[PATCH 2/2] test-lib: ulimit does not limit on CYGWIN and MINGW

2017-09-14 Thread Michael J Gruber
ulimit succeeds (by return value) but does not limit on some systems.

Set ulimit() to false on these systems so that we do not rely on its
output or effect. As an intended side-effect, ulimit-based
prerequisites are set correctly (to "not-have") on these systems.

Reported-by: Ramsay Jones 
Reported-by: Adam Dinwoodie 
Reported-by: Johannes Schindelin 
Signed-off-by: Michael J Gruber 
---
This is independent of my series, but should best go before so that no
ulimit based test is run on CYGWIN and MINGW.

It follows the basic assumption that a tool like ulimit is either
present and functional or not present; and that we work around by
defines or such when that assumption is broken.
(Alternatively, we could set ULIMIT_LIMITS or so and depend on that.)

 t/test-lib.sh | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index b8a0b05102..f6ac60d487 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -998,6 +998,10 @@ case $uname_s in
mkfifo () {
false
}
+   # ulimit succeeds but does not limit
+   ulimit () {
+   false
+   }
# no POSIX permissions
# backslashes in pathspec are converted to '/'
# exec does not inherit the PID
@@ -1012,6 +1016,10 @@ case $uname_s in
mkfifo () {
false
}
+   # ulimit succeeds but does not limit
+   ulimit () {
+   false
+   }
test_set_prereq POSIXPERM
test_set_prereq EXECKEEPSPID
test_set_prereq CYGWIN
-- 
2.14.1.712.gda4591c8a2



[PATCH 1/2] test-lib: group system specific FIFO tests by system

2017-09-14 Thread Michael J Gruber
test-lib determines whether a file-system supports FIFOs and needs to do
special casing for CYGWIN and MINGW. This separates those system-specific
settings from those at a more central place.

Set mkfifo() to false in the central system-specific place so that the
same test works everywhere.

Signed-off-by: Michael J Gruber 
---
 t/test-lib.sh | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 5fbd8d4a90..b8a0b05102 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -994,6 +994,10 @@ case $uname_s in
pwd () {
builtin pwd -W
}
+   # no FIFOs
+   mkfifo () {
+   false
+   }
# no POSIX permissions
# backslashes in pathspec are converted to '/'
# exec does not inherit the PID
@@ -1004,6 +1008,10 @@ case $uname_s in
GIT_TEST_CMP=mingw_test_cmp
;;
 *CYGWIN*)
+   # no FIFOs
+   mkfifo () {
+   false
+   }
test_set_prereq POSIXPERM
test_set_prereq EXECKEEPSPID
test_set_prereq CYGWIN
@@ -1062,14 +1070,7 @@ test_i18ngrep () {
 
 test_lazy_prereq PIPE '
# test whether the filesystem supports FIFOs
-   case $(uname -s) in
-   CYGWIN*|MINGW*)
-   false
-   ;;
-   *)
-   rm -f testfifo && mkfifo testfifo
-   ;;
-   esac
+   rm -f testfifo && mkfifo testfifo
 '
 
 test_lazy_prereq SYMLINKS '
-- 
2.14.1.712.gda4591c8a2
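For readers unfamiliar with the trick: this works because a shell function
named after a command shadows the binary for subsequent calls in that
shell. A standalone sketch of the same mechanism:

```shell
cd "$(mktemp -d)"
# Shadow mkfifo with a shell function, as t/test-lib.sh now does for
# MinGW and Cygwin; the PIPE probe below then fails cleanly instead of
# creating a FIFO, so the prerequisite is correctly left unset.
mkfifo () {
	false
}

if rm -f testfifo && mkfifo testfifo
then
	pipe_prereq=yes
else
	pipe_prereq=no
fi
echo "PIPE prereq: $pipe_prereq"
```

This is why the `test_lazy_prereq PIPE` body no longer needs its own
`uname -s` case statement: the probe itself behaves correctly everywhere.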



Re: [PATCH v2] commit-template: change a message to be more intuitive

2017-09-14 Thread Jeff King
On Thu, Sep 14, 2017 at 07:47:51PM +0530, Kaartic Sivaraam wrote:

> > > - const char *explanation = _("Do not touch the line above.\nEverything 
> > > below will be removed.");
> > > + const char *explanation = _("Do not modify or remove the line 
> > > above.\nEverything below will be removed.");
> > I don't want to complicate things. But now - due to the repeated usage
> > of "remove" - these two sentences seem to be connected by an invisible
> > "or else" ("will" vs. "would" not withstanding). i.e. "or else
> > everything below will be removed, too", which is wrong. Before, they
> > were separated more clearly.
> 
> At least I don't think they are connected by an invisible "or else" because
> they appear on different lines,

You could switch out the second "removed" for "ignored", which I think
has the same meaning but avoids repeating the same word. So:

  # ------------------------ >8 ------------------------
  # Do not modify or remove the line above.
  # Everything below will be ignored.

(I might say "everything below it" to be more precise).

I dunno. We are deep in bikeshed territory here, and I admit I don't
overly care that much in the first place.

-Peff
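For what it's worth, the cut line under discussion can be exercised end to
end in a throwaway repository. A hedged sketch: it relies on
`--cleanup=scissors` truncating only when the message is edited, so a
no-op editor forces the edit; the file names are invented.

```shell
# Everything from the scissors line on is dropped from the recorded
# message; GIT_EDITOR=true performs a no-op "edit" so that the scissors
# cleanup mode actually kicks in.
set -e
dir=$(mktemp -d) && cd "$dir"
git init -q
git config user.email you@example.com && git config user.name you
cat >msg <<\EOF
subject

# ------------------------ >8 ------------------------
# Do not modify or remove the line above.
# Everything below will be ignored.
diff --git a/some-file b/some-file
EOF
GIT_EDITOR=true git commit -q --allow-empty --cleanup=scissors -e -F msg
git log -1 --format=%B
```

Only "subject" survives; the comment lines and the trailing diff are
stripped, which is the behaviour the wording above tries to describe.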


Re: [PATCH 0/3] merge-base --fork-point fixes

2017-09-14 Thread Jeff King
On Thu, Sep 14, 2017 at 03:15:17PM +0200, Michael J Gruber wrote:

> merge-base --fork-point does not quite work as advertised when the
> reflog is empty or partial. This series brings it in line with the
> documentation and, hopefully, with the original intent.
> 
> Michael J Gruber (3):
>   t6010: test actual test output
>   merge-base: return fork-point outside reflog
>   merge-base: find fork-point outside partial reflog

Patches 1 and 3 look sane to me. Your argument in patch 2 looks
plausible to me, but I'm not familiar enough with the mechanics of
fork-point to have a solid opinion.

-Peff


Re: [PATCH 3/3] merge-base: find fork-point outside partial reflog

2017-09-14 Thread Jeff King
On Thu, Sep 14, 2017 at 03:15:20PM +0200, Michael J Gruber wrote:

> In fork-point mode, merge-base adds the commit at refname to a list of
> candidates to walk from only when the reflog is empty. Therefore, it
> fails to find merge bases that are descendants of the most recent reflog
> entry.
> 
> Add the commit at the tip unconditionally so that a merge base on the
> history of refname is found in any case, independent of the state of the
> reflog.

OK, so the reflog we are interested in here is one where the tip commit
is not mentioned. I'm not sure how one gets into that state, but I agree
it makes sense to act as if it's there.

-Peff


Re: [PATCH 1/3] t6010: test actual test output

2017-09-14 Thread Jeff King
On Thu, Sep 14, 2017 at 03:15:18PM +0200, Michael J Gruber wrote:

> 4f21454b55 ("merge-base: handle --fork-point without reflog",
> 2016-10-12) introduced a fix for merge-base --fork-point without reflog
> and a test. While that test is fine, it updated neither the expected nor
> the actual output and tested that of the previous test instead.

Oops. The use of the existing "expect3" was intentional (the idea being
that we are repeating the identical step from the previous test with the
slight tweak of not having a reflog). But obviously failing to write to
"actual" at all was a mistake.

That said, I'm OK with reiterating the expected output.

-Peff


RE: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-14 Thread Kevin Willford
> From: Junio C Hamano [mailto:gits...@pobox.com]
> Sent: Wednesday, September 13, 2017 4:18 PM
> 
> Jacob Keller  writes:
> 
> > By definition if you do a sparse checkout, you're telling git "I only
> > care about the files in this sparse checkout, and do not concern me
> > with anything else"... So the proposed fix is "since git cleared the
> > skip-worktree flag, we should actually also copy the file out again."
> > but I think the real problem is that we're clearing skip worktree to
> > begin with?
> 
> As this 100% agree with what I've been saying, I do not have
> anything to add to the discussion at the moment, so I'll stop
> speaking now but will continue to monitor the thread so that I may
> hear a new argument and datapoint that would make me change my mind.
> 
> Thanks for a healthy discussion.

Hi Junio,

Thanks for the feedback.  I will give this implementation a try.  I still
have the following unanswered questions, if you wouldn't mind
answering.

1. Does this statement, "I only care about the files in this
sparse checkout, and do not concern me with anything else", mean
that git should not change files outside the sparse-checkout, whether
that be in the working directory or in the index?  Or does that only
apply to the working directory, while the index version of files can
change to whatever the git command would produce without using
sparse?  For example, if I am doing a 'git reset HEAD~1', should the
version in the index of files outside the sparse not be changed, or
change to the HEAD~1 version with the skip-worktree bit on?
2. How will this work with other git commands like merge, rebase,
cherry-pick, etc.?  
3. What about when there is a merge conflict with a file that is outside
the sparse checkout?  Should there not be a conflict because
git shouldn't be trying to merge the file since it is outside the sparse
checkout area?  Should the conflicted file get written outside the
sparse checkout so the user can resolve it?

Thanks,
Kevin
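As background for question 1, the skip-worktree bit at the center of this
discussion can be toggled and inspected directly with plumbing commands.
A throwaway-repo sketch (the file name is invented):

```shell
# Toggle and inspect the skip-worktree bit that sparse-checkout sets on
# paths outside the checkout area; 'git ls-files -v' prefixes such
# entries with 'S' instead of the usual 'H'.
set -e
dir=$(mktemp -d) && cd "$dir"
git init -q
git config user.email you@example.com && git config user.name you
echo content >file && git add file && git commit -qm init
git update-index --skip-worktree file
git ls-files -v file     # 'S file': skip-worktree bit set
git update-index --no-skip-worktree file
git ls-files -v file     # 'H file': bit cleared again
```

The whole debate above is about when, if ever, git is allowed to perform
that second step (clearing the bit) on the user's behalf.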



Re: [PATCH v2] commit-template: change a message to be more intuitive

2017-09-14 Thread Kaartic Sivaraam

On Thursday 14 September 2017 01:06 PM, Michael J Gruber wrote:

Kaartic Sivaraam venit, vidit, dixit 13.09.2017 15:05:

  void wt_status_add_cut_line(FILE *fp)
  {
-   const char *explanation = _("Do not touch the line above.\nEverything below 
will be removed.");
+   const char *explanation = _("Do not modify or remove the line 
above.\nEverything below will be removed.");

I don't want to complicate things. But now - due to the repeated usage
of "remove" - these two sentences seem to be connected by an invisible
"or else" ("will" vs. "would" not withstanding). i.e. "or else
everything below will be removed, too", which is wrong. Before, they
were separated more clearly.


At least I don't think they are connected by an invisible "or else" because
they appear on different lines,

   #
   # ------------------------ >8 ------------------------
   # Do not modify or remove the line above.
   # Everything below will be removed.
   diff --git a/hook-checks/hooks/applypatch-msg b/hook-checks/hooks/applypatch-msg
   new file mode 100755
   index 000..a5d7b84
   --- /dev/null
   +++ b/hook-checks/hooks/applypatch-msg

Also, given all the translations that we have, it seems somewhat strange
to try and foresee and work around possible misunderstandings of commonly
used English phrases.

I thought the translators wouldn't mind this! If they do, then it's
better to drop this patch.

---
Kaartic


Re: [PATCH 0/3] merge-base --fork-point fixes

2017-09-14 Thread Johannes Schindelin
Hi Michael,

On Thu, 14 Sep 2017, Michael J Gruber wrote:

> merge-base --fork-point does not quite work as advertised when the
> reflog is empty or partial. This series brings it in line with the
> documentation and, hopefully, with the original intent.

Thank you so much for working on this. I recently tried to use
--fork-point in a script of mine and it failed in this surprising way,
too. So here's hoping for this patch series being accepted quickly (and
sorry in advance for not having time to review...).

Ciao,
Dscho


[PATCH 0/3] merge-base --fork-point fixes

2017-09-14 Thread Michael J Gruber
merge-base --fork-point does not quite work as advertised when the
reflog is empty or partial. This series brings it in line with the
documentation and, hopefully, with the original intent.

Michael J Gruber (3):
  t6010: test actual test output
  merge-base: return fork-point outside reflog
  merge-base: find fork-point outside partial reflog

 builtin/merge-base.c  | 20 
 t/t6010-merge-base.sh | 21 +++--
 2 files changed, 23 insertions(+), 18 deletions(-)

-- 
2.14.1.712.gda4591c8a2
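The empty-reflog scenario these patches revolve around can be reproduced
in a throwaway repository. A sketch assuming a git with the 4f21454b55
fallback in place; branch names are invented:

```shell
# Reproduce t6010's empty-reflog setup: the branch is created with
# reflogs disabled, and --fork-point falls back to the ref's tip, which
# here is an ancestor of 'derived' and therefore the fork point.
set -e
dir=$(mktemp -d) && cd "$dir"
git init -q
git config user.email you@example.com && git config user.name you
git commit -q --allow-empty -m one
git branch base
git checkout -q -b derived
git commit -q --allow-empty -m two
git -c core.logallrefupdates=false branch no-reflog base
git merge-base --fork-point no-reflog derived
```

The printed commit matches `git rev-parse base`; without the fallback the
command would fail because the reflog of no-reflog provides no candidates.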



[PATCH 1/3] t6010: test actual test output

2017-09-14 Thread Michael J Gruber
4f21454b55 ("merge-base: handle --fork-point without reflog",
2016-10-12) introduced a fix for merge-base --fork-point without reflog
and a test. While that test is fine, it updated neither the expected nor
the actual output and tested that of the previous test instead.

Fix the test to test the merge-base output, not just its return code.

Signed-off-by: Michael J Gruber 
---
 t/t6010-merge-base.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t6010-merge-base.sh b/t/t6010-merge-base.sh
index 31db7b5f91..17fffd7998 100755
--- a/t/t6010-merge-base.sh
+++ b/t/t6010-merge-base.sh
@@ -262,8 +262,9 @@ test_expect_success 'using reflog to find the fork point' '
 
 test_expect_success '--fork-point works with empty reflog' '
git -c core.logallrefupdates=false branch no-reflog base &&
-   git merge-base --fork-point no-reflog derived &&
-   test_cmp expect3 actual
+   git rev-parse base >expect &&
+   git merge-base --fork-point no-reflog derived >actual &&
+   test_cmp expect actual
 '
 
 test_expect_success 'merge-base --octopus --all for complex tree' '
-- 
2.14.1.712.gda4591c8a2



[PATCH 3/3] merge-base: find fork-point outside partial reflog

2017-09-14 Thread Michael J Gruber
In fork-point mode, merge-base adds the commit at refname to a list of
candidates to walk from only when the reflog is empty. Therefore, it
fails to find merge bases that are descendants of the most recent reflog
entry.

Add the commit at the tip unconditionally so that a merge base on the
history of refname is found in any case, independent of the state of the
reflog.

Signed-off-by: Michael J Gruber 
---
 builtin/merge-base.c  | 2 +-
 t/t6010-merge-base.sh | 8 
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index 926a7615ea..b5b4cf7eac 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -174,7 +174,7 @@ static int handle_fork_point(int argc, const char **argv)
revs.initial = 1;
for_each_reflog_ent(refname, collect_one_reflog_ent, );
 
-   if (!revs.nr && !get_oid(refname, ))
+   if (!get_oid(refname, ))
add_one_commit(, );
 
for (i = 0; i < revs.nr; i++)
diff --git a/t/t6010-merge-base.sh b/t/t6010-merge-base.sh
index 850463d4f2..78342896c7 100755
--- a/t/t6010-merge-base.sh
+++ b/t/t6010-merge-base.sh
@@ -275,6 +275,14 @@ test_expect_success '--fork-point works with merge-base outside reflog' '
test_cmp expect actual
 '
 
+test_expect_success '--fork-point works with merge-base outside partial reflog' '
+   git -c core.logallrefupdates=true branch partial-reflog base &&
+   git rev-parse no-reflog >.git/refs/heads/partial-reflog &&
+   git rev-parse no-reflog >expect &&
+   git merge-base --fork-point partial-reflog no-reflog >actual &&
+   test_cmp expect actual
+'
+
 test_expect_success 'merge-base --octopus --all for complex tree' '
# Best common ancestor for JE, JAA and JDD is JC
# JE
-- 
2.14.1.712.gda4591c8a2



[PATCH 2/3] merge-base: return fork-point outside reflog

2017-09-14 Thread Michael J Gruber
4f21454b55 ("merge-base: handle --fork-point without reflog",
2016-10-12) fixed the case without reflog, but only partially:
The original code checks whether the merge base candidates are from the
list that we started with, which was the list of reflog entries before
4f21454b55 and the list of reflog entries - or the ref itself if empty -
after. The test from 4f21454b55 tested in a situation where the merge
base candidate equalled the commit at refname, so it was on this (1
item) list by accident.

In fact, per documentation "--fork-point" looks at the reflog in
addition to doing the usual walk from the tip. The original design
description in d96855ff51 ("merge-base: teach "--fork-point" mode",
2013-10-23) describes this as computing from a virtual merge-base of all
the historical tips of refname. They may or may not all be present in
the reflog (think pruning, non-ff fetching, fast forwarding etc.),
so filtering by the current contents of the reflog is potentially
harmful, and it does not seem to fulfill any purpose in the original
design.

Remove the filtering and add a test for an out-of-reflog merge base.

Reported-by: Ekelhart Jakob 
Signed-off-by: Michael J Gruber 
---
 builtin/merge-base.c  | 18 +++---
 t/t6010-merge-base.sh |  8 
 2 files changed, 11 insertions(+), 15 deletions(-)

diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index 6dbd167d3b..926a7615ea 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -186,23 +186,11 @@ static int handle_fork_point(int argc, const char **argv)
 * There should be one and only one merge base, when we found
 * a common ancestor among reflog entries.
 */
-   if (!bases || bases->next) {
+   if (!bases || bases->next)
ret = 1;
-   goto cleanup_return;
-   }
-
-   /* And the found one must be one of the reflog entries */
-   for (i = 0; i < revs.nr; i++)
-   if (>item->object == [i]->object)
-   break; /* found */
-   if (revs.nr <= i) {
-   ret = 1; /* not found */
-   goto cleanup_return;
-   }
-
-   printf("%s\n", oid_to_hex(>item->object.oid));
+   else
+   printf("%s\n", oid_to_hex(>item->object.oid));
 
-cleanup_return:
free_commit_list(bases);
return ret;
 }
diff --git a/t/t6010-merge-base.sh b/t/t6010-merge-base.sh
index 17fffd7998..850463d4f2 100755
--- a/t/t6010-merge-base.sh
+++ b/t/t6010-merge-base.sh
@@ -267,6 +267,14 @@ test_expect_success '--fork-point works with empty reflog' '
test_cmp expect actual
 '
 
+test_expect_success '--fork-point works with merge-base outside reflog' '
+   git -c core.logallrefupdates=false checkout no-reflog &&
+   git -c core.logallrefupdates=false commit --allow-empty -m "Commit outside reflogs" &&
+   git rev-parse base >expect &&
+   git merge-base --fork-point no-reflog derived >actual &&
+   test_cmp expect actual
+'
+
 test_expect_success 'merge-base --octopus --all for complex tree' '
# Best common ancestor for JE, JAA and JDD is JC
# JE
-- 
2.14.1.712.gda4591c8a2




Re: [PATCH 2/4] push, fetch: error out for submodule entries not pointing to commits

2017-09-14 Thread Heiko Voigt
On Tue, Sep 12, 2017 at 10:30:27AM -0700, Jonathan Nieder wrote:
> From: Stefan Beller 
> 
> The check_has_commit helper resolves a submodule entry to a
> commit when validating its existence. As a side effect this means it
> tolerates a submodule entry pointing to a tag, which is not a valid
> submodule entry that git commands would know how to cope with.
> 
> Tighten the check to require an actual commit, not a tag pointing to a
> commit.
> 
> Also improve the error handling when a submodule entry points to
> non-commit (e.g., a blob) to error out instead of warning and
> pretending the pointed to object doesn't exist.
> 
> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 

Looks good to me.

Cheers Heiko


Re: RFC v3: Another proposed hash function transition plan

2017-09-14 Thread Johannes Schindelin
Hi Jonathan,

On Wed, 13 Sep 2017, Jonathan Nieder wrote:

> As a side note, I am probably misreading, but I found this set of
> paragraphs a bit condescending.  It sounds to me like you are saying
> "You are making the wrong choice of hash function and everything else
> you are describing is irrelevant when compared to that monumental
> mistake.  Please stop working on things I don't consider important".
> With that reading it is quite demotivating to read.

I am sorry you read it that way. I did not feel condescending when I wrote
that mail, I felt annoyed by the side track, and anxious. In my mind, the
transition is too important for side tracking, and I worry that we are not
fast enough (imagine what would happen if a better attack was discovered
that is not as easily detected as the one we know about?).

> An alternative reading is that you are saying that the transition plan
> described in this thread is not ironed out.  Can you spell that out
> more?  What particular aspect of the transition plan (which is of
> course orthogonal to the choice of hash function) are you discontent
> with?

My impression from reading Junio's mail was that he does not consider the
transition plan ironed out yet, and that he wants to spend time on
discussing generation numbers right now.

I was particularly frightened by the suggestion to "reboot" [*1*].
Hopefully I misunderstand and he meant "finishing touches" instead.

As to *my* opinion: after reading https://goo.gl/gh2Mzc (is it really
correct that its last update has been on March 6th?), my only concern is
really that it still talks about SHA3-256 when I think that the
performance benefits of SHA-256 (think: "Git at scale", and also hardware
support) really make the latter a better choice.

In order to be "ironed out", I think we need to talk about the
implementation detail "Translation table". This is important. It needs to
be *fast*.

Speaking of *fast*, I could imagine that it would make sense to still
store the SHA-1 objects on disk instead of converting them on the fly. I
am not sure whether this is something we need to define in the document,
though, as it may very well be premature optimization; maybe mention that
we could do this if necessary?

Apart from that, I would *love* to see this document as The Official Plan
that I can Show To The Manager so that I can ask to Allocate Time.

Ciao,
Dscho

Footnote *1*:
https://public-inbox.org/git/xmqqa828733s@gitster.mtv.corp.google.com/


Re: how to remove from history just *one* version of a file/dir?

2017-09-14 Thread Jeff King
On Thu, Sep 14, 2017 at 07:32:11AM -0400, Robert P. J. Day wrote:

>   [is this the right place to ask questions about git usage? or is
> there a different forum where one can submit possibly embarrassingly
> silly questions?]

No, this is the right place for embarrassing questions. :)

>   say, early on, one commits a sizable directory of content, call it
> /mydir. that directory sits there for a while until it becomes obvious
> it's out of date and worthless and should never have been committed.
> the obvious solution would seem to be:
> 
>   $ git filter-branch --tree-filter 'rm -rf /mydir' HEAD
> 
> correct?

That would work, though note that using an --index-filter would be more
efficient (since it avoids checking out each tree as it walks the
history).
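For what it's worth, the --index-filter version of the same deletion might
look like this (a sketch, using the well-known `git rm --cached` idiom;
"mydir" stands in for the directory in question, and --ignore-unmatch keeps
the filter from failing on commits that do not contain it):

```shell
# throwaway repo: one early commit adds "mydir", a later one removes it
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name demo && git config user.email demo@example.com
mkdir mydir && echo stale >mydir/f && echo keep >keep
git add . && git commit -qm "add mydir (the unwanted content)"
git rm -q -r mydir && git commit -qm "drop mydir"

# rewrite history, touching only the index -- no tree is checked out
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch \
	--index-filter 'git rm -r --cached --ignore-unmatch mydir' HEAD
```

After the rewrite, the early commit no longer carries "mydir" in its tree,
while "keep" survives untouched.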

>   however, say one version of that directory was committed early on,
> then later tossed for being useless with "git rm", and subsequently
> replaced by newer content under exactly the same name. now i'd like to
> go back and delete the history related to that early version of
> /mydir, but not the second.

Makes sense as a goal.

>   obviously, i can't use the above command as it would delete both
> versions. so it appears the solution would be a trivial application of
> the "--commit-filter" option:
> 
>git filter-branch --commit-filter '
>  if [ "$GIT_COMMIT" = "" ] ; then
>skip_commit "$@";
>  else
>git commit-tree "$@";
>  fi' HEAD
> 
> where  is the commit that introduced the first version of
> /mydir. do i have that right? is there a simpler way to do this?

No, this won't work. Filter-branch is not walking the history and
applying the changes to each commit, like rebase does.  It's literally
operating on each commit object, and recall that each commit object
points to a tree that is a snapshot of the repository contents.

So if you skip a commit, that commit itself goes away. But the commit
after it (which didn't touch the unwanted contents) will still mention
those contents in its tree.

I think you want to stick with a --tree-filter (or an --index-filter),
but just selectively decide when to do the deletion. For example, if you
can tell the difference between the two states based on the presence of
some file, then perhaps:

  git filter-branch --prune-empty --index-filter '
if git rev-parse --verify :dir/sentinel >/dev/null 2>&1
then
  git rm --cached -rf dir
fi
  ' HEAD

The "--prune-empty" is optional, but will drop commits that become empty
because they _only_ touched that directory.

We use ":dir/sentinel" to see if the entry is in the index, because the
index filter won't have the tree checked out. Likewise, we need to use
"rm --cached" to just touch the index.

-Peff


Re: should "git tag" option of "-a" conflict with "-s", "-u"?

2017-09-14 Thread Jeff King
On Thu, Sep 14, 2017 at 06:08:10AM -0400, Robert P. J. Day wrote:

>   just noticed in "man git-tag":
> 
>   1)  "-a" explicitly claims to create an unsigned tag
>   2)  both "-s" and "-u" both claim to create signed tags
> 
> should that mean that it should be an error to try to use "-a" in
> combination with either of "-u" or "-s"? the synopsis does suggest
> that you can use only one of those at a time:
> 
>   git tag [-a | -s | -u ] ...
> 
> but i tested with (from memory) using both "-a" and "-s", and tag
> actually tried to do it, but failed simply because i had no key.
> clearly, though, it was making the attempt.
> 
>   can someone clarify this?

I think the documentation could be a little more clear. "-a" just asks
to create a tag object, rather than a "lightweight" tag. Using "-s" asks
for a signature, which requires having a tag object, and thus implies
"-a".

So there's no point in using the two together (since "-s" already
implies "-a"), but it also doesn't hurt to do so.
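To illustrate the distinction (a quick sketch in a throwaway repo; the tag
names are made up):

```shell
# throwaway repo just for the demonstration
demo=$(mktemp -d) && cd "$demo" && git init -q
git -c user.name=me -c user.email=me@example.com \
	commit -q --allow-empty -m init

git tag v1.0-light              # lightweight: just a ref, no tag object
git tag -a -m "release" v1.0    # annotated: creates an unsigned tag object
# git tag -s -m "release" v1.0s # signed: tag object + signature; implies -a

git cat-file -t v1.0-light      # the ref points straight at the commit
git cat-file -t v1.0            # the ref points at a tag object
```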

-Peff


how to remove from history just *one* version of a file/dir?

2017-09-14 Thread Robert P. J. Day

  [is this the right place to ask questions about git usage? or is
there a different forum where one can submit possibly embarrassingly
silly questions?]

  i've been perusing "git filter-branch", and i'm curious if i have
the right idea about how to very selectively get rid of some useless
history.

  say, early on, one commits a sizable directory of content, call it
/mydir. that directory sits there for a while until it becomes obvious
it's out of date and worthless and should never have been committed.
the obvious solution would seem to be:

  $ git filter-branch --tree-filter 'rm -rf /mydir' HEAD

correct?

  however, say one version of that directory was committed early on,
then later tossed for being useless with "git rm", and subsequently
replaced by newer content under exactly the same name. now i'd like to
go back and delete the history related to that early version of
/mydir, but not the second.

  obviously, i can't use the above command as it would delete both
versions. so it appears the solution would be a trivial application of
the "--commit-filter" option:

   git filter-branch --commit-filter '
 if [ "$GIT_COMMIT" = "" ] ; then
   skip_commit "$@";
 else
   git commit-tree "$@";
 fi' HEAD

where  is the commit that introduced the first version of
/mydir. do i have that right? is there a simpler way to do this?

rday

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



Re: [PATCH v5 10/40] Add initial external odb support

2017-09-14 Thread Christian Couder
On Thu, Aug 3, 2017 at 9:34 PM, Junio C Hamano  wrote:
> Christian Couder  writes:
>
>> diff --git a/external-odb.h b/external-odb.h
>> new file mode 100644
>> index 00..9989490c9e
>> --- /dev/null
>> +++ b/external-odb.h
>> @@ -0,0 +1,8 @@
>> +#ifndef EXTERNAL_ODB_H
>> +#define EXTERNAL_ODB_H
>> +
>> +const char *external_odb_root(void);
>> +int external_odb_has_object(const unsigned char *sha1);
>> +int external_odb_get_object(const unsigned char *sha1);
>
> Even though ancient codebase of ours deliberately omitted them, I
> think our recent trend is to explicitly spell "extern " in headers.
>
>> diff --git a/odb-helper.h b/odb-helper.h
>> new file mode 100644
>> index 00..5800661704
>> --- /dev/null
>> +++ b/odb-helper.h
>
> Likewise.

Ok, I am adding "extern " to the headers.


should "git tag" option of "-a" conflict with "-s", "-u"?

2017-09-14 Thread Robert P. J. Day

  just noticed in "man git-tag":

  1)  "-a" explicitly claims to create an unsigned tag
  2)  both "-s" and "-u" both claim to create signed tags

should that mean that it should be an error to try to use "-a" in
combination with either of "-u" or "-s"? the synopsis does suggest
that you can use only one of those at a time:

  git tag [-a | -s | -u ] ...

but i tested with (from memory) using both "-a" and "-s", and tag
actually tried to do it, but failed simply because i had no key.
clearly, though, it was making the attempt.

  can someone clarify this?

rday

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



Re: [PATCH v5 14/40] external-odb: accept only blobs for now

2017-09-14 Thread Christian Couder
On Thu, Aug 3, 2017 at 9:52 PM, Junio C Hamano  wrote:
> Christian Couder  writes:
>
>> The mechanism to decide which blobs should be sent to which
>> external object database will be very simple for now.
>> If the external odb helper supports any "put_*" instruction
>> all the new blobs will be sent to it.
>>
>> Signed-off-by: Christian Couder 
>> ---
>>  external-odb.c | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/external-odb.c b/external-odb.c
>> index 82fac702e8..a4f8c72e1c 100644
>> --- a/external-odb.c
>> +++ b/external-odb.c
>> @@ -124,6 +124,10 @@ int external_odb_put_object(const void *buf, size_t len,
>>  {
>>   struct odb_helper *o;
>>
>> + /* For now accept only blobs */
>> + if (strcmp(type, "blob"))
>> + return 1;
>> +
>
> I somehow doubt that a policy decision like this should be made at
> this layer.  Shouldn't it be encoded in the capability the other
> side supports, or determined at runtime per each individual object
> when a "put" is attempted (i.e. allow the other side to say "You
> tell me that you want me to store an object of type X and size Y;
> I cannot do that, sorry").

I agree that it would be conceptually better to be able to support
other kinds of objects in an external odb, but realistically most use
cases for 'get_*' and 'put_*' instructions are for storing/retrieving
blobs, as other kinds of objects are in specific formats that are well
supported by the current object store.

I also agree that it would be a nice feature if external odbs could
decide by themselves which objects they accept, and I really want to
leave the door open to a future improvement implementing that using
the capability mechanism or perhaps another mechanism. For now though
in "[PATCH v5 34/40] external-odb: use 'odb=magic' attribute to mark
odb blobs" the attribute system is used to decide which blobs are put
into which external odb as it is probably good enough for many use
cases. It is also simple to implement in Git and makes helpers simpler
to implement.


Re: [PATCH v5 13/40] external odb: add 'put_raw_obj' support

2017-09-14 Thread Christian Couder
On Thu, Aug 3, 2017 at 9:50 PM, Junio C Hamano  wrote:
> Christian Couder  writes:
>
>> Add support for a 'put_raw_obj' capability/instruction to send new
>> objects to an external odb. Objects will be sent as they are (in
>> their 'raw' format). They will not be converted to Git objects.
>>
>> For now any new Git object (blob, tree, commit, ...) would be sent
>> if 'put_raw_obj' is supported by an odb helper. This is not a great
>> default, but let's leave it to following commits to tweak that.
>>
>> Signed-off-by: Christian Couder 
>> ---
>
> I thought in an earlier step that I saw this thing initialized in
> the codepath that adds alternate object stores, which are read-only
> places we "borrow" from.  Being able to write into it is good, but
> conceptually it no longer feels correct to initialize it from the
> alternate object database initialization codepath.
>
> Another way to say it is that an object store, whether it is local
> or external, is not "alt" if it will result in storing new objects
> we locally create.  It's just an extension of our local object
> store.

I guess you are talking about the following code in "[PATCH v5 10/40]
Add initial external odb support":

+void prepare_external_alt_odb(void)
+{
+   static int linked_external;
+   const char *path;
+
+   if (linked_external)
+   return;
+
+   path = external_odb_root();
+   if (!access(path, F_OK)) {
+   link_alt_odb_entry(path, NULL, 0, "");
+   linked_external = 1;
+   }
+}
+
 void prepare_alt_odb(void)
 {
const char *alt;
@@ -650,6 +666,7 @@ void prepare_alt_odb(void)
link_alt_odb_entries(alt, strlen(alt), PATH_SEP, NULL, 0);

read_info_alternates(get_object_directory(), 0);
+   prepare_external_alt_odb();
 }

Would it be ok if I do the following:

- rename prepare_external_alt_odb() to just prepare_external_odb(), as
this would avoid confusion between alt_odbs and external odbs
- remove the call to prepare_external_odb() in prepare_alt_odb()
- add a prepare_alt_and_external_odb() that just calls
prepare_alt_odb() and then prepare_external_odb()
- replace all the calls to prepare_alt_odb() with calls to
prepare_alt_and_external_odb()

?


Re: [PATCH v5 25/40] external-odb: add 'get_direct' support

2017-09-14 Thread Christian Couder
On Thu, Aug 3, 2017 at 11:40 PM, Junio C Hamano  wrote:
> Christian Couder  writes:
>
>> This implements the 'get_direct' capability/instruction that makes
>> it possible for external odb helper scripts to pass blobs to Git
>> by directly writing them as loose objects files.
>
> I am not sure if the assumption is made clear in this series, but I
> am (perhaps incorrectly) guessing that it is assumed that the
> intended use of this feature is to offload access to large blobs
> by not including them in the initial clone.

Yeah, it could be used for that, but that's not the only interesting use case.

It could also be used for example if the working tree contains a huge
number of blobs and it is better to download only the blobs that are
needed when they are needed. In fact the code for 'get_direct' was
taken from Ben Peart's "read-object" patch series (actually from an
earlier version of this patch series):

https://public-inbox.org/git/20170714132651.170708-1-benpe...@microsoft.com/

> So from that point of
> view, I think it makes tons of sense to let the external helper to
> directly populate the database bypassing Git (i.e. instead of
> feeding data stream and have Git store it) like this "direct" method
> does.
>
> How does this compare with (and how well does this work with) what
> Jonathan Tan is doing recently?

From the following email:

https://public-inbox.org/git/20170804145113.5ceaf...@twelve2.svl.corp.google.com/

it looks like his work is fundamentally about changing the rules of
connectivity checks. Objects are split between "homegrown" objects and
"imported" objects which are in separate pack files. Then references
to imported objects are not checked during connectivity check.

I think changing connectivity rules is not necessary to make something
like external odb work. For example when fetching a pack that refers
to objects that are in an external odb, if access to this external odb
has been configured, then the connectivity check will pass as the
missing objects in the pack will be seen as already part of the repo.

Yeah, if some commands like fsck are used, then possibly all the
objects will have to be requested from the external odb, as it may not
be possible to fully check all the objects, especially the blobs,
without accessing all their data. But I think this is a problem that
could be dealt with in different ways. For example we could develop
specific options in fsck so that it doesn't check the sha1 of objects
that are marked with some specific attributes, or that are stored in
external odbs, or that are bigger than some size.


Re: [PATCH] test-lib: don't use ulimit in test prerequisites on cygwin

2017-09-14 Thread Michael J Gruber
Jonathan Nieder venit, vidit, dixit 13.09.2017 21:20:
> Ramsay Jones wrote:
> 
>> On cygwin (and MinGW), the 'ulimit' built-in bash command does not have
>> the desired effect of limiting the resources of new processes, at least
>> for the stack and file descriptors. However, it always returns success
>> and leads to several test prerequisites being erroneously set to true.
>>
>> Add a check for cygwin and MinGW to the prerequisite expressions, using
>> 'uname -s', and return false instead of (indirectly) using 'ulimit'.
>> This affects the prerequisite expressions for the ULIMIT_STACK_SIZE,
>> CMDLINE_LIMIT and ULIMIT_FILE_DESCRIPTORS prerequisites.
>>
>> Signed-off-by: Ramsay Jones 
>> ---
>>  t/t1400-update-ref.sh | 11 ++-
>>  t/t6120-describe.sh   |  1 -
>>  t/t7004-tag.sh|  1 -
>>  t/test-lib.sh | 22 --
>>  4 files changed, 30 insertions(+), 5 deletions(-)
> 
> Reviewed-by: Jonathan Nieder 
> 
> Thanks.
> 
> An alternative would be to do some more explicit feature-based test
> like using "ulimit" to set a limit and then reading back the limit in
> a separate process, but that feels like overkill.

It's still not clear whether these limits would be in effect or just
reported.

Rather than checking uname -s in several places, I think we should just
define ulimit to be false in one place, or rather set the prerequisite
there.
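Something like this, say (a hypothetical sketch of that one central place in
test-lib.sh; the CYGWIN/MINGW/MSYS patterns are assumptions about what
'uname -s' reports on those systems):

```shell
# neuter ulimit on platforms where it "succeeds" without applying limits,
# so every ulimit-based prerequisite correctly evaluates to false
case $(uname -s) in
*CYGWIN*|MINGW*|MSYS*)
	ulimit () {
		false
	}
	;;
esac
```

A shell function shadows the builtin of the same name, so the existing
prerequisite expressions would not need to change at all.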

Michael


Re: merge-base not working as expected when base is ahead

2017-09-14 Thread Michael J Gruber
Ekelhart Jakob venit, vidit, dixit 13.09.2017 17:07:
> Dear Git,
> 
> git merge-base --fork-point "master" is not working if master is already
> newer than my current branch.
> Very oddly it seems to work whenever you had the expected commit checked out 
> previously - what made it very tricky to detect this problem.
> 
> Example:
>  - Clone "https://github.com/jekelhart/GitInfoTry"
>  - Switch to branch "v1.0.0"
>  - git merge-base --fork-point "master"
>- or: git merge-base --fork-point "origin/master" 
>  - expected result: fork point "fbb1db34c6317a6e8b319c1ec261e97ca1672c22"
>  - but result is empty
> 
> In the repo where we created this example tree in the first place the command 
> returned the expected fork point. If you clone it new and fresh it does not 
> return any result anymore.
> 
> Works, however, on branch "v2.0.0". Assumption: because "master" is older.(?)
> I think it works locally because the command uses the reflog in addition(!), 
> however, it should work without the local reflog as well. (since the history 
> was not modified in any way)

The documentation lies, unfortunately. It claims that in fork-point mode,
"git merge-base" "also" looks at the reflog. In fact, the code explicitly
*discards* any merge bases that it finds but which are not in the
reflog. (The merge base that you find for v2.0.0 is in the reflog
because it's the checked out HEAD.)

Removing the corresponding check gives the merge base for v1.0.0 that
you expect, and git still passes all tests. But I'll look at the history
of these lines before I submit a patch.

Michael



[PATCH v4 0/4] imap-send: Fix and enable curl by default

2017-09-14 Thread Nicolas Morey-Chaisemartin
Changes since v3:
- Fix return code in patch #1
- Reword patch#4

Nicolas Morey-Chaisemartin (4):
  imap-send: return with error if curl failed
  imap-send: add wrapper to get server credentials if needed
  imap_send: setup_curl: retrieve credentials if not set in config file
  imap-send: use curl by default when possible

 imap-send.c | 60 
 1 file changed, 40 insertions(+), 20 deletions(-)

-- 
2.14.1.461.g503560879



[PATCH v4 1/4] imap-send: return with error if curl failed

2017-09-14 Thread Nicolas Morey-Chaisemartin
curl_append_msgs_to_imap always returned 0, whether curl failed or not.
Return a proper status so git imap-send will exit with an error code
if something went wrong.

Signed-off-by: Nicolas Morey-Chaisemartin 
---
 imap-send.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/imap-send.c b/imap-send.c
index b2d0b849bb..b5e332420a 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -1490,7 +1490,7 @@ static int curl_append_msgs_to_imap(struct imap_server_conf *server,
curl_easy_cleanup(curl);
curl_global_cleanup();
 
-   return 0;
+   return res != CURLE_OK;
 }
 #endif
 
-- 
2.14.1.461.g503560879




[PATCH v4 3/4] imap_send: setup_curl: retrieve credentials if not set in config file

2017-09-14 Thread Nicolas Morey-Chaisemartin
Up to this point, the curl mode only supported getting the username
and password from the gitconfig file while the legacy mode could also
fetch them using the credential API.

Signed-off-by: Nicolas Morey-Chaisemartin 
---
 imap-send.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/imap-send.c b/imap-send.c
index 1b8fbbd545..7e39993d95 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -1398,7 +1398,7 @@ static int append_msgs_to_imap(struct imap_server_conf *server,
 }
 
 #ifdef USE_CURL_FOR_IMAP_SEND
-static CURL *setup_curl(struct imap_server_conf *srvc)
+static CURL *setup_curl(struct imap_server_conf *srvc, struct credential *cred)
 {
CURL *curl;
struct strbuf path = STRBUF_INIT;
@@ -1411,6 +1411,7 @@ static CURL *setup_curl(struct imap_server_conf *srvc)
if (!curl)
die("curl_easy_init failed");
 
+   server_fill_credential(, cred);
curl_easy_setopt(curl, CURLOPT_USERNAME, server.user);
curl_easy_setopt(curl, CURLOPT_PASSWORD, server.pass);
 
@@ -1460,8 +1461,9 @@ static int curl_append_msgs_to_imap(struct imap_server_conf *server,
struct buffer msgbuf = { STRBUF_INIT, 0 };
CURL *curl;
CURLcode res = CURLE_OK;
+   struct credential cred = CREDENTIAL_INIT;
 
-   curl = setup_curl(server);
+   curl = setup_curl(server, );
curl_easy_setopt(curl, CURLOPT_READDATA, );
 
fprintf(stderr, "sending %d message%s\n", total, (total != 1) ? "s" : "");
@@ -1496,6 +1498,18 @@ static int curl_append_msgs_to_imap(struct imap_server_conf *server,
curl_easy_cleanup(curl);
curl_global_cleanup();
 
+   if (cred.username)
+   if (res == CURLE_OK)
+   credential_approve();
+#if LIBCURL_VERSION_NUM >= 0x070d01
+   else if (res == CURLE_LOGIN_DENIED)
+#else
+   else
+#endif
+   credential_reject();
+
+   credential_clear();
+
return res != CURLE_OK;
 }
 #endif
-- 
2.14.1.461.g503560879




[PATCH v4 4/4] imap-send: use curl by default when possible

2017-09-14 Thread Nicolas Morey-Chaisemartin
Set curl as the runtime default when it is available.
When linked against older curl versions (< 7_34_0) or without curl,
use the legacy imap implementation.

The goal is to validate feature parity between the legacy and
the curl implementation, deprecate the legacy implementation
later on and in the long term, hopefully drop it altogether.

Signed-off-by: Nicolas Morey-Chaisemartin 
---
 imap-send.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/imap-send.c b/imap-send.c
index 7e39993d95..af1e1576bd 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -35,11 +35,11 @@ typedef void *SSL;
 #include "http.h"
 #endif
 
-#if defined(USE_CURL_FOR_IMAP_SEND) && defined(NO_OPENSSL)
-/* only available option */
+#if defined(USE_CURL_FOR_IMAP_SEND)
+/* Always default to curl if it's available. */
 #define USE_CURL_DEFAULT 1
 #else
-/* strictly opt in */
+/* We don't have curl, so continue to use the historical implementation */
 #define USE_CURL_DEFAULT 0
 #endif
 
-- 
2.14.1.461.g503560879



[PATCH v4 2/4] imap-send: add wrapper to get server credentials if needed

2017-09-14 Thread Nicolas Morey-Chaisemartin
Signed-off-by: Nicolas Morey-Chaisemartin 
---
 imap-send.c | 34 --
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/imap-send.c b/imap-send.c
index b5e332420a..1b8fbbd545 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -926,6 +926,25 @@ static int auth_cram_md5(struct imap_store *ctx, struct imap_cmd *cmd, const cha
return 0;
 }
 
+static void server_fill_credential(struct imap_server_conf *srvc, struct credential *cred)
+{
+   if (srvc->user && srvc->pass)
+   return;
+
+   cred->protocol = xstrdup(srvc->use_ssl ? "imaps" : "imap");
+   cred->host = xstrdup(srvc->host);
+
+   cred->username = xstrdup_or_null(srvc->user);
+   cred->password = xstrdup_or_null(srvc->pass);
+
+   credential_fill(cred);
+
+   if (!srvc->user)
+   srvc->user = xstrdup(cred->username);
+   if (!srvc->pass)
+   srvc->pass = xstrdup(cred->password);
+}
+
+static struct imap_store *imap_open_store(struct imap_server_conf *srvc, char *folder)
 {
struct credential cred = CREDENTIAL_INIT;
@@ -1078,20 +1097,7 @@ static struct imap_store *imap_open_store(struct imap_server_conf *srvc, char *f
}
 #endif
imap_info("Logging in...\n");
-   if (!srvc->user || !srvc->pass) {
-   cred.protocol = xstrdup(srvc->use_ssl ? "imaps" : "imap");
-   cred.host = xstrdup(srvc->host);
-
-   cred.username = xstrdup_or_null(srvc->user);
-   cred.password = xstrdup_or_null(srvc->pass);
-
-   credential_fill();
-
-   if (!srvc->user)
-   srvc->user = xstrdup(cred.username);
-   if (!srvc->pass)
-   srvc->pass = xstrdup(cred.password);
-   }
+   server_fill_credential(srvc, &cred);
 
if (srvc->auth_method) {
struct imap_cmd_cb cb;
-- 
2.14.1.461.g503560879
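The fill-if-missing pattern this patch factors out can be illustrated in
plain shell. This is only a sketch: fake_fill stands in for git's
credential_fill(), and all names and values here are made up.

```shell
# Fill-if-missing credential lookup, same shape as server_fill_credential()
# in the patch: consult the (faked) credential store only when either half
# of the credential is missing, and keep whatever was already configured.
fake_fill() {
	# stand-in for git's credential_fill(): yields stored/prompted values
	filled_user=alice
	filled_pass=secret
}

server_fill_credential() {
	if test -n "$user" && test -n "$pass"
	then
		return	# both halves already configured; no lookup needed
	fi
	fake_fill
	: "${user:=$filled_user}"
	: "${pass:=$filled_pass}"
}

user=bob pass=
server_fill_credential
echo "$user:$pass"	# -> bob:secret
```

Note that, as in the patch, a caller that already has both user and pass
never touches the credential store at all.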




Re: [PATCH v5 19/40] lib-httpd: add upload.sh

2017-09-14 Thread Christian Couder
On Thu, Aug 3, 2017 at 10:07 PM, Junio C Hamano  wrote:
> Christian Couder  writes:
>
>> +OLDIFS="$IFS"
>> +IFS='&'
>> +set -- $QUERY_STRING
>> +IFS="$OLDIFS"
>> +
>> +while test $# -gt 0
>> +do
>> +key=${1%=*}
>> +val=${1#*=}
>
> When you see that ${V%X*} and ${V#*X} appear in a pair for the same
> variable V and same delimiter X, it almost always indicates a bug
> waiting to happen.
>
> What's the definition of "key" here?  A member of known set of short
> tokens, all of which consists only of alphanumeric, or something?

Yeah, the key can only be "sha1", "type", "size" or "delete", as can be
seen later in the code.

> Even if you do not currently plan to deal with a value with '=' in
> it, it may be prudent to double '%' above (and do not double '#').

Yeah I agree. Thanks for spotting this!
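The point can be seen with a value that itself contains the delimiter;
a small illustration (the key/value pair is made up):

```shell
# A query-string pair whose value itself contains the delimiter:
pair="type=commit=extra"

# ${pair%=*} strips the *shortest* matching suffix, mangling the key:
echo "${pair%=*}"	# -> type=commit

# Doubling '%' strips the longest suffix and yields the real key, while
# '#' (shortest prefix) is already correct for the value:
key=${pair%%=*}
val=${pair#*=}
echo "$key -> $val"	# -> type -> commit=extra
```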

> Style: indent your shell script with tabs.

Sure.

>> +case "$key" in
>> + "sha1") sha1="$val" ;;
>> + "type") type="$val" ;;
>> + "size") size="$val" ;;
>> + "delete") delete=1 ;;
>> + *) echo >&2 "unknown key '$key'" ;;
>> +esac
>
> Indent your shell script with tabs; case/esac and the labels used
> for each case arms all align at the same column.

Yeah, it will be properly indented in the version I will send soon.



Re: [PATCH v2] commit-template: change a message to be more intuitive

2017-09-14 Thread Michael J Gruber
Kaartic Sivaraam venit, vidit, dixit 13.09.2017 15:05:
> It's not good to use the phrase 'do not touch' to convey the information
> that the cut-line should not be modified or removed as it could possibly
> be mis-interpreted by a person who doesn't know that the word 'touch' has
> the meaning of 'tamper with'. Further, it could make translations a little
> difficult as it might not have the intended meaning in a few languages when
> translated as such.
> 
> So, use more intuitive terms in the sentence.
> 
> Signed-off-by: Kaartic Sivaraam 
> ---
>  wt-status.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/wt-status.c b/wt-status.c
> index 77c27c51134d2..be53579760ee7 100644
> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -934,7 +934,7 @@ size_t wt_status_locate_end(const char *s, size_t len)
>  
>  void wt_status_add_cut_line(FILE *fp)
>  {
> - const char *explanation = _("Do not touch the line above.\nEverything below will be removed.");
> + const char *explanation = _("Do not modify or remove the line above.\nEverything below will be removed.");

I don't want to complicate things. But now - due to the repeated usage
of "remove" - these two sentences seem to be connected by an invisible
"or else" ("will" vs. "would" not withstanding). i.e. "or else
everything below will be removed, too", which is wrong. Before, they
were separated more clearly.

Also, given all the translations that we have, it seems somewhat strange
to try to foresee and work around possible misunderstandings of commonly
used English phrases.

>   struct strbuf buf = STRBUF_INIT;
>  
>   fprintf(fp, "%c %s", comment_line_char, cut_line);
> 
> --
> https://github.com/git/git/pull/401
> 



Re: [PATCH v5 11/40] odb-helper: add odb_helper_init() to send 'init' instruction

2017-09-14 Thread Christian Couder
On Sun, Sep 10, 2017 at 2:12 PM, Lars Schneider
 wrote:
>
>> On 03 Aug 2017, at 10:18, Christian Couder  
>> wrote:
>>
>> +static void parse_capabilities(char *cap_buf,
>> +unsigned int *supported_capabilities,
>> +const char *process_name)
>> +{
>> + struct string_list cap_list = STRING_LIST_INIT_NODUP;
>> +
>> + string_list_split_in_place(&cap_list, cap_buf, '=', 1);
>> +
>> + if (cap_list.nr == 2 && !strcmp(cap_list.items[0].string, "capability")) {
>> + const char *cap_name = cap_list.items[1].string;
>> +
>> + if (!strcmp(cap_name, "get_git_obj")) {
>> + *supported_capabilities |= ODB_HELPER_CAP_GET_GIT_OBJ;
>> + } else if (!strcmp(cap_name, "get_raw_obj")) {
>> + *supported_capabilities |= ODB_HELPER_CAP_GET_RAW_OBJ;
>> + } else if (!strcmp(cap_name, "get_direct")) {
>> + *supported_capabilities |= ODB_HELPER_CAP_GET_DIRECT;
>> + } else if (!strcmp(cap_name, "put_git_obj")) {
>> + *supported_capabilities |= ODB_HELPER_CAP_PUT_GIT_OBJ;
>> + } else if (!strcmp(cap_name, "put_raw_obj")) {
>> + *supported_capabilities |= ODB_HELPER_CAP_PUT_RAW_OBJ;
>> + } else if (!strcmp(cap_name, "put_direct")) {
>> + *supported_capabilities |= ODB_HELPER_CAP_PUT_DIRECT;
>> + } else if (!strcmp(cap_name, "have")) {
>> + *supported_capabilities |= ODB_HELPER_CAP_HAVE;
>> + } else {
>> + warning("external process '%s' requested unsupported 
>> read-object capability '%s'",
>> + process_name, cap_name);
>> + }
>
> In 1514c8ed ("convert: refactor capabilities negotiation", 2017-06-30) I 
> introduced
> a simpler version of the capabilities negotiation. Maybe useful for you here, 
> too?

Yeah, actually there is also fa64a2fdbe (sub-process: refactor
handshake to common function, 2017-07-26) that Jonathan Tan wrote on
top of your changes and that adds subprocess_handshake(). So the
current code uses it like this:

static int start_object_process_fn(struct subprocess_entry *subprocess)
{
	static int versions[] = {1, 0};
	static struct subprocess_capability capabilities[] = {
		{ "get_git_obj", ODB_HELPER_CAP_GET_GIT_OBJ },
		{ "get_raw_obj", ODB_HELPER_CAP_GET_RAW_OBJ },
		{ "get_direct",  ODB_HELPER_CAP_GET_DIRECT  },
		{ "put_git_obj", ODB_HELPER_CAP_PUT_GIT_OBJ },
		{ "put_raw_obj", ODB_HELPER_CAP_PUT_RAW_OBJ },
		{ "put_direct",  ODB_HELPER_CAP_PUT_DIRECT  },
		{ "have",        ODB_HELPER_CAP_HAVE },
		{ NULL, 0 }
	};
	struct object_process *entry = (struct object_process *)subprocess;
	return subprocess_handshake(subprocess, "git-read-object", versions, NULL,
				    capabilities,
				    &entry->supported_capabilities);
}
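On the helper side, a script-mode helper that feeds the capability
parsing above could be sketched like this in shell. The shape follows
the odb-helper test script in the series; the exact I/O contract
(arguments, "capability=<name>" lines on init) is my assumption here.

```shell
# Hypothetical script-mode odb helper: on "init" it advertises its
# capabilities one per line, in the "capability=<name>" form that the
# capability parsing expects. The other command is only sketched.
odb_helper() {
	case "$1" in
	init)
		echo "capability=get_raw_obj"
		echo "capability=have"
		;;
	get_raw_obj)
		# $2 would be the sha1 of the requested object (illustrative)
		cat "${ODB_DIR:-/tmp/odb}/$2"
		;;
	esac
}

odb_helper init
```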


Re: [PATCH v5 12/40] t0400: add 'put_raw_obj' instruction to odb-helper script

2017-09-14 Thread Christian Couder
On Sun, Sep 10, 2017 at 2:12 PM, Lars Schneider
 wrote:
>
>> On 03 Aug 2017, at 10:18, Christian Couder  
>> wrote:
>>
>> To properly test passing objects from Git to an external odb
>> we need an odb-helper script that supports a 'put'
>> capability/instruction.
>>
>> For now we will support only sending raw blobs, so the
>> supported capability/instruction will be 'put_raw_obj'.
>
> What other kind of blobs do you imagine could we send?

As for the get instructions there could be 'put_git_obj' to send blobs
in the Git format and 'put_direct' to have the helper read directly
from the Git object store.

>> +put_raw_obj)
>> + sha1="$2"
>> + size="$3"
>> + kind="$4"
>> + writen=$(git hash-object -w -t "$kind" --stdin)
>> + test "$writen" = "$sha1" || die "bad sha1 passed '$sha1' vs writen 
>> '$writen'"
>
> Typo? Should it be "written"?

Yeah, thanks. It's fixed in the current version.


Re: [PATCH v5 00/40] Add initial experimental external ODB support

2017-09-14 Thread Christian Couder
On Sun, Sep 10, 2017 at 2:30 PM, Lars Schneider
 wrote:
>
>> On 03 Aug 2017, at 10:18, Christian Couder  
>> wrote:
>>
>> ...
>>
>> * The "helpers" (registered commands)
>>
>> Each helper manages access to one external ODB.
>>
>> There are 2 different modes for helper:
>>
>>  - Helpers configured using "odb..scriptCommand" are
>>launched each time Git wants to communicate with the 
>>external ODB. This is called "script mode".
>>
>>  - Helpers configured using "odb..subprocessCommand" are
>>launched launched once as a sub-process (using sub-process.h), and
>>Git communicates with them using packet lines. This is called
>>"process mode".
>
> I am curious, why would we support two modes? Wouldn't that increase
> the maintenance cost? Wouldn't the subprocess command be superior?
> I imagine the script mode eases testing, right?!

The script mode makes it much easier to write some helpers. For
example, as shown in t0430 at the end of the patch series, a helper
for a restartable bundle-based clone could look basically like this:

case "$1" in
init)
ref_hash=$(git rev-parse refs/odbs/magic/bundle) ||
die "couldn't find refs/odbs/magic/bundle"
GIT_NO_EXTERNAL_ODB=1 git cat-file blob "$ref_hash" >bundle_info ||
die "couldn't get blob $ref_hash"
bundle_url=$(sed -e 's/bundle url: //' bundle_info)
curl "$bundle_url" -o bundle_file ||
die "curl '$bundle_url' failed"
GIT_NO_EXTERNAL_ODB=1 git bundle unbundle bundle_file >unbundling_info ||
die "unbundling 'bundle_file' failed"
;;

>> These odb refs point to a blob that is stored in the Git
>> repository and contain information about the blob stored in the
>> external odb. This information can be specific to the external odb.
>> The repos can then share this information using commands like:
>>
>> `git fetch origin "refs/odbs/<odbname>/*:refs/odbs/<odbname>/*"`
>>
>> At the end of the current patch series, "git clone" is taught a
>> "--initial-refspec" option, that asks it to first fetch some specified
>> refs. This is used in the tests to fetch the odb refs first.
>>
>> This way only one "git clone" command can setup a repo using the
>> external ODB mechanism as long as the right helper is installed on the
>> machine and as long as the following options are used:
>>
>>  - "--initial-refspec " to fetch the odb refspec
>>  - "-c odb..command=" to configure the helper
>
> The "odb" config could, of course, go into the global git config.

Sure.

> The odbrefspec is optional, right?

Using "--initial-refspec " is optional. There will be more
information in the documentation about this option in the next version
of the series.

> I have the impression there are a number of topics on the list
> that tackle the "many/big objects in a Git repo" problem. Is
> there a write up about the status of them, how they relate
> to each other, and what the current problems are?
> I found the following but it looks abandoned:
> https://github.com/jrn/git-large-repositories

Yeah, it could be interesting to discuss all these topics together. On
the other hand people working on existing patch series, like me, have
to work on them and post new versions, as just discussing the topics
is not enough to move things forward.
Anyway, Junio and Jonathan Tan also asked me questions about how my
work relates to Jonathan's, so I will reply to them hopefully soon...