Re: Fetch-hooks

2018-02-09 Thread Leo Gaspard
On 02/10/2018 02:08 AM, Junio C Hamano wrote:
> Leo Gaspard  writes:
> 
>> On 02/10/2018 01:13 AM, Jeff King wrote:
>>> On Sat, Feb 10, 2018 at 12:49:31AM +0100, Leo Gaspard wrote:
 So the changes that are required are:
  * Adding a notification-only post-fetch hook
> 
> Maybe I missed a very early part of the discussion, but why does
> this even need a hook?  There are some numbers [*1*] of classes of
> valid reasons we may want to have hooks triggered by operations, but
> "always do something locally after doing something else locally,
> regardless of the outcome of that something else" feels like the
> most typical anti-pattern that we do not want a hook for.  If you
> are doing "git fetch" (or "git pull"), you already know you are
> doing that and you donot need a notification.  You just create a
> workflow specific script that calls fetch or pull, followed by
> whatever you want to do and use that, instead of doing "git pull",
> and that is not any extra work than writing a hook and installing
> it.
> 
> Unlike something like post-receive, which happens on the remote side
> where you may not even have an interactive access to, in response to
> the operation you locally do (i.e. "git push"), fetching and then
> doing something else in a repository you fetch into has no reason to
> be done as a hook.

I guess the very early part of the discussion you're speaking of is what
I was assuming after reading
https://marc.info/?l=git=132478296309094=2

Then, it's always possible to just write workflow-specific scripts that
manually run git fetch then copy the refs (resp. run git fetch then copy
the refs then run git merge) to get git myfetch (resp. git mypull) [1].

But then, the question is, why is there a pre-push hook? it's already
possible to have a custom script that first runs the hook then runs git
push, for most if not all of the use cases of the pre-push hook. Yet the
possibility to not change the end-user's workflow is what makes pre-push
(or pre-commit, with which the similarity is perhaps even more obvious)
so useful.

So the reason for a post-fetch in my opinion is the same as for a
pre-push / pre-commit: not changing the user's workflow, while providing
additional features.


[1]

Which makes me notice that actually the post-fetch hook technique we
were discussing with Peff suffers the same not-updating-FETCH_HEAD issue
that was discussed in this early thread: a `git pull` would try to merge
from refs/quarantine, I guess. So things are a bit harder than we thought.

I guess the tweak-fetch hook could be left “as-is”, but with git
automatically doing the “quarantine-ing” thing transparently so that the
references that the end-user (or the hook for that matter) see are the
“curated” ones? Then it's too late for me to think efficiently right
now, so that idea may be stupid or over-complex.


Re: [PATCH v1] worktree: set worktree environment in post-checkout hook

2018-02-09 Thread Lars Schneider

> On 10 Feb 2018, at 02:01, lars.schnei...@autodesk.com wrote:
> 
> From: Lars Schneider 
> 
> In ade546be47 (worktree: invoke post-checkout hook (unless
> --no-checkout), 2017-12-07) we taught Git to run the post-checkout hook
> in worktrees. Unfortunately, the environment of the hook was not made
> aware of the worktree. Consequently, a 'git rev-parse --show-toplevel'
> call in the post-checkout hook would return a wrong result.
> 
> Fix this by setting the 'GIT_WORK_TREE' environment variable to make
> Git calls within the post-checkout hook aware of the worktree.
> 
> Signed-off-by: Lars Schneider 
> ---
> 
> Hi,
> 
> I think this is a bug in Git 2.16. We noticed it because it caused a
> problem in Git LFS [1]. The modified test case fails with Git 2.16 and
> succeeds with this patch.
> 
> Cheers,
> Lars
> 
> 
> [1] https://github.com/git-lfs/git-lfs/issues/2848
> 
> 
> Notes:
>Base Ref: v2.16.1
>Web-Diff: https://github.com/larsxschneider/git/commit/214e9342e7
>Checkout: git fetch https://github.com/larsxschneider/git 
> fix-worktree-add-v1 && git checkout 214e9342e7
> 
> builtin/worktree.c  |  7 +--
> t/t2025-worktree-add.sh | 11 +--
> 2 files changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/builtin/worktree.c b/builtin/worktree.c
> index 7cef5b120b..032f9b86bf 100644
> --- a/builtin/worktree.c
> +++ b/builtin/worktree.c
> @@ -345,9 +345,12 @@ static int add_worktree(const char *path, const char 
> *refname,
>* Hook failure does not warrant worktree deletion, so run hook after
>* is_junk is cleared, but do return appropriate code when hook fails.
>*/
> - if (!ret && opts->checkout)
> - ret = run_hook_le(NULL, "post-checkout", oid_to_hex(_oid),
> + if (!ret && opts->checkout) {
> + struct argv_array env = ARGV_ARRAY_INIT;
> + argv_array_pushf(, "GIT_WORK_TREE=%s", absolute_path(path));
> + ret = run_hook_le(env.argv, "post-checkout", 
> oid_to_hex(_oid),
> oid_to_hex(>object.oid), "1", NULL);

As I hit "send" I realized that I forgot to cleanup.
@Junio: Can you squash this in?

argv_array_clear();

Thanks,
Lars

> + }
> 
>   argv_array_clear(_env);
>   strbuf_release();
> diff --git a/t/t2025-worktree-add.sh b/t/t2025-worktree-add.sh
> index 2b95944973..d022ac0c26 100755
> --- a/t/t2025-worktree-add.sh
> +++ b/t/t2025-worktree-add.sh
> @@ -455,19 +455,26 @@ post_checkout_hook () {
>   mkdir -p .git/hooks &&
>   write_script .git/hooks/post-checkout <<-\EOF
>   echo $* >hook.actual
> + git rev-parse --show-toplevel >>hook.actual
>   EOF
> }
> 
> test_expect_success '"add" invokes post-checkout hook (branch)' '
>   post_checkout_hook &&
> - printf "%s %s 1\n" $_z40 $(git rev-parse HEAD) >hook.expect &&
> + cat >hook.expect <<-EOF &&
> + $_z40 $(git rev-parse HEAD) 1
> + $(pwd)/gumby
> + EOF
>   git worktree add gumby &&
>   test_cmp hook.expect hook.actual
> '
> 
> test_expect_success '"add" invokes post-checkout hook (detached)' '
>   post_checkout_hook &&
> - printf "%s %s 1\n" $_z40 $(git rev-parse HEAD) >hook.expect &&
> + cat >hook.expect <<-EOF &&
> + $_z40 $(git rev-parse HEAD) 1
> + $(pwd)/grumpy
> + EOF
>   git worktree add --detach grumpy &&
>   test_cmp hook.expect hook.actual
> '
> 
> base-commit: 8279ed033f703d4115bee620dccd32a9ec94d9aa
> --
> 2.16.1
> 



Re: Fetch-hooks

2018-02-09 Thread Junio C Hamano
Leo Gaspard  writes:

> On 02/10/2018 01:13 AM, Jeff King wrote:
>> On Sat, Feb 10, 2018 at 12:49:31AM +0100, Leo Gaspard wrote:
>>> So the changes that are required are:
>>>  * Adding a notification-only post-fetch hook

Maybe I missed a very early part of the discussion, but why does
this even need a hook?  There are some numbers [*1*] of classes of
valid reasons we may want to have hooks triggered by operations, but
"always do something locally after doing something else locally,
regardless of the outcome of that something else" feels like the
most typical anti-pattern that we do not want a hook for.  If you
are doing "git fetch" (or "git pull"), you already know you are
doing that and you donot need a notification.  You just create a
workflow specific script that calls fetch or pull, followed by
whatever you want to do and use that, instead of doing "git pull",
and that is not any extra work than writing a hook and installing
it.

Unlike something like post-receive, which happens on the remote side
where you may not even have an interactive access to, in response to
the operation you locally do (i.e. "git push"), fetching and then
doing something else in a repository you fetch into has no reason to
be done as a hook.

[Footnote]

*1* I think the number was 5 when I last counted/enumerated, but
don't quote me on that ;-)


[no subject]

2018-02-09 Thread Bobby Smith

I am contacting you in respect of our late client who deposited fund
valued at the sum of $6,200,000.00 in my bank,he died with his family
in an auto accident. I have contacted you to stand as his heir and
inherit the funds, after the success of this transaction, we should
share the total funds 50%/50% each, reply back if you are interested
 (Bobby sim...@gmail.com)

Best regards,

Mr. Bobby Smith.

Re: [PATCH v6 0/7] convert: add support for different encodings

2018-02-09 Thread Lars Schneider

> On 09 Feb 2018, at 21:09, Junio C Hamano  wrote:
> 
> Documentation has core.checkRoundtripEncoding while t0028 and a
> comment in convert.c capitalize it differently.  I suspect that it
> would be more reader-friendly to update the documentation to match.

Agreed. I will wait a little for additional feedback and then send 
a new round.

Thanks,
Lars


[PATCH v1] worktree: set worktree environment in post-checkout hook

2018-02-09 Thread lars . schneider
From: Lars Schneider 

In ade546be47 (worktree: invoke post-checkout hook (unless
--no-checkout), 2017-12-07) we taught Git to run the post-checkout hook
in worktrees. Unfortunately, the environment of the hook was not made
aware of the worktree. Consequently, a 'git rev-parse --show-toplevel'
call in the post-checkout hook would return a wrong result.

Fix this by setting the 'GIT_WORK_TREE' environment variable to make
Git calls within the post-checkout hook aware of the worktree.

Signed-off-by: Lars Schneider 
---

Hi,

I think this is a bug in Git 2.16. We noticed it because it caused a
problem in Git LFS [1]. The modified test case fails with Git 2.16 and
succeeds with this patch.

Cheers,
Lars


[1] https://github.com/git-lfs/git-lfs/issues/2848


Notes:
Base Ref: v2.16.1
Web-Diff: https://github.com/larsxschneider/git/commit/214e9342e7
Checkout: git fetch https://github.com/larsxschneider/git 
fix-worktree-add-v1 && git checkout 214e9342e7

 builtin/worktree.c  |  7 +--
 t/t2025-worktree-add.sh | 11 +--
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/builtin/worktree.c b/builtin/worktree.c
index 7cef5b120b..032f9b86bf 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -345,9 +345,12 @@ static int add_worktree(const char *path, const char 
*refname,
 * Hook failure does not warrant worktree deletion, so run hook after
 * is_junk is cleared, but do return appropriate code when hook fails.
 */
-   if (!ret && opts->checkout)
-   ret = run_hook_le(NULL, "post-checkout", oid_to_hex(_oid),
+   if (!ret && opts->checkout) {
+   struct argv_array env = ARGV_ARRAY_INIT;
+   argv_array_pushf(, "GIT_WORK_TREE=%s", absolute_path(path));
+   ret = run_hook_le(env.argv, "post-checkout", 
oid_to_hex(_oid),
  oid_to_hex(>object.oid), "1", NULL);
+   }

argv_array_clear(_env);
strbuf_release();
diff --git a/t/t2025-worktree-add.sh b/t/t2025-worktree-add.sh
index 2b95944973..d022ac0c26 100755
--- a/t/t2025-worktree-add.sh
+++ b/t/t2025-worktree-add.sh
@@ -455,19 +455,26 @@ post_checkout_hook () {
mkdir -p .git/hooks &&
write_script .git/hooks/post-checkout <<-\EOF
echo $* >hook.actual
+   git rev-parse --show-toplevel >>hook.actual
EOF
 }

 test_expect_success '"add" invokes post-checkout hook (branch)' '
post_checkout_hook &&
-   printf "%s %s 1\n" $_z40 $(git rev-parse HEAD) >hook.expect &&
+   cat >hook.expect <<-EOF &&
+   $_z40 $(git rev-parse HEAD) 1
+   $(pwd)/gumby
+   EOF
git worktree add gumby &&
test_cmp hook.expect hook.actual
 '

 test_expect_success '"add" invokes post-checkout hook (detached)' '
post_checkout_hook &&
-   printf "%s %s 1\n" $_z40 $(git rev-parse HEAD) >hook.expect &&
+   cat >hook.expect <<-EOF &&
+   $_z40 $(git rev-parse HEAD) 1
+   $(pwd)/grumpy
+   EOF
git worktree add --detach grumpy &&
test_cmp hook.expect hook.actual
 '

base-commit: 8279ed033f703d4115bee620dccd32a9ec94d9aa
--
2.16.1



Re: Fetch-hooks

2018-02-09 Thread Leo Gaspard
On 02/10/2018 01:13 AM, Jeff King wrote:
> On Sat, Feb 10, 2018 at 12:49:31AM +0100, Leo Gaspard wrote:
>> So the changes that are required are:
>>  * Adding a notification-only post-fetch hook
>>  * For handling tags, there is a need to have a refmap for tags. Maybe
>> adding a remote.my-remote.fetchTags refmap, that would be used when
>> running with --tags, and having it default to “refs/tags/*:refs/tags/*”
>> to keep the current behavior by default?
> 
> Yeah, tag-following may be a little tricky, because it usually wants to
> write to refs/tags/. One workaround would be to have your config look
> like this:
> 
>   [remote "origin"]
>   fetch = +refs/heads/*:refs/quarantine/origin/heads/*
>   fetch = +refs/tags/*:refs/quarantine/origin/tags/*
>   tagOpt = --no-tags
> 
> That's not exactly the same thing, because it would fetch all tags, not
> just those that point to the history on the branches. But in most
> repositories and workflows the distinction doesn't matter.

Hmm... apart from the implementation complexity (of which I have no
idea), is there an argument against the idea of adding a
remote..fetchTagsTo refmap similar to remote..fetch but used
every time a tag is fetched? (well, maybe not exactly similar to
remote..fetch because we know the source is going to be
refs/tags/*; so just having the part of .fetch past the ':' would be
more like what's needed I guess)

The issue with your solution is that if the user runs 'git fetch
--tags', he will get the (potentially compromised) tags directly in his
refs/tags/.

> (By the way, the I specifically chose the name "refs/quarantine" instead
> of anything in "refs/remotes" because we'd want to make sure that the
> "git checkout" DWIM behavior cannot accidentally pull from quarantine).

(Indeed, I understood after reading it, and would likely not have
thought of it otherwise, thanks!)

>> The only remaining issue I can think of is: How do we avoid the issue
>> of the
>> trigger-only-hook-inciting-bad-behavior-by-hook-authors-who-really-want-modification
>> raised in the side-thread that Junio wrote in [1]? Maybe just writing
>> in the documentation that the hook should use a quarantine-like
>> approach if it wants modification would be enough to not have hook
>> authors try to modify the ref in the post-fetch hook?
> 
> I don't have a silver bullet there. Documenting the "right" way at least
> seems like a good first step.

So long as it's not a merge-blocker it's good with me! (but then I'm
likely not the one who's going to be pointed at when things go wrong in
a hook, so I'm clearly biased on this matter)


Re: Fetch-hooks

2018-02-09 Thread Jeff King
On Sat, Feb 10, 2018 at 12:49:31AM +0100, Leo Gaspard wrote:

> > I tend to agree with the direction of thinking you outlined: you're
> > generally better off completing the fetch to a local namespace that
> > tracks the other side completely, and then manipulating the local refs
> > as you see fit (e.g., fetching into refs/quarantine, and then migrating
> > "good" refs over to refs/remotes/origin).
> 
> Hmm... so do I understand it correctly when I say the process you're
> thinking about works like this?
>  * User installs hook for my-remote by running [something]
>  * User runs git fetch
>  * git fetch fetches remote refs/heads/* to local refs/quarantine/* (so
> I guess [something] changes the remote.my-remote.fetch refmap)
>  * When this is done `git fetch` runs a notification-only post-fetch
> hook (that would need to be added)
>  * The post-fetch hook then performs whatever it wants and updates the
> references in refs/remotes/my-remote/*

Yeah, that was roughly my thinking.

> So the changes that are required are:
>  * Adding a notification-only post-fetch hook
>  * For handling tags, there is a need to have a refmap for tags. Maybe
> adding a remote.my-remote.fetchTags refmap, that would be used when
> running with --tags, and having it default to “refs/tags/*:refs/tags/*”
> to keep the current behavior by default?

Yeah, tag-following may be a little tricky, because it usually wants to
write to refs/tags/. One workaround would be to have your config look
like this:

  [remote "origin"]
  fetch = +refs/heads/*:refs/quarantine/origin/heads/*
  fetch = +refs/tags/*:refs/quarantine/origin/tags/*
  tagOpt = --no-tags

That's not exactly the same thing, because it would fetch all tags, not
just those that point to the history on the branches. But in most
repositories and workflows the distinction doesn't matter.

(By the way, the I specifically chose the name "refs/quarantine" instead
of anything in "refs/remotes" because we'd want to make sure that the
"git checkout" DWIM behavior cannot accidentally pull from quarantine).

> The only remaining issue I can think of is: How do we avoid the issue
> of the
> trigger-only-hook-inciting-bad-behavior-by-hook-authors-who-really-want-modification
> raised in the side-thread that Junio wrote in [1]? Maybe just writing
> in the documentation that the hook should use a quarantine-like
> approach if it wants modification would be enough to not have hook
> authors try to modify the ref in the post-fetch hook?

I don't have a silver bullet there. Documenting the "right" way at least
seems like a good first step.

-Peff


Re: Fetch-hooks

2018-02-09 Thread Leo Gaspard
On 02/09/2018 11:30 PM, Jeff King wrote:
> On Fri, Feb 09, 2018 at 11:04:17PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> One thing that's not discussed yet, and I know just enough about for it
>> to tingle my spidey sense, but not enough to say for sure (CC'd Jeff &
>> Brandon who know more) is that this feature once shipped might cause
>> higher load on git hosting providers.
>>
>> This is because people will inevitably use it in popular projects for
>> some custom filtering, and because you're continually re-fetching and
>> inspecting stuff what used to be a really cheap no-op "pull" most of the
>> time is a more expensive negotiation every time before the client
>> rejects the refs again, and worse for hosting providers because you have
>> bespoke ref fetching strategies you have less odds of being able to
>> cache both the negotiation and the pack you serve.
> 
> Most of the discussion so far seems to be about "accept this ref or
> don't accept this ref", which seems OK. But if you are going to do
> custom tweaking like rewriting objects, or making it common to refuse
> some refs, then I think things get pretty inefficient for _everybody_.
> 
> The negotiation for future fetches uses the existing refs as the
> starting point. And if we don't know that we have the objects because
> there are no refs pointing at them, they're going to get transferred
> again. That's extra load no the server, and extra time for the user
> waiting on the network.

Oh. I thought the protocol git used was something like
client: I want to fetch refs A and B
server: so you'll need objects 12345678 and 90ABCDEF, A and B both point
to 12345678
client: please give me object 12345678
server: here it is
[...]

I was clearly wrong, thanks! (and thanks Ævar for your explanation in
the side-thread, too!)

> I tend to agree with the direction of thinking you outlined: you're
> generally better off completing the fetch to a local namespace that
> tracks the other side completely, and then manipulating the local refs
> as you see fit (e.g., fetching into refs/quarantine, and then migrating
> "good" refs over to refs/remotes/origin).

Hmm... so do I understand it correctly when I say the process you're
thinking about works like this?
 * User installs hook for my-remote by running [something]
 * User runs git fetch
 * git fetch fetches remote refs/heads/* to local refs/quarantine/* (so
I guess [something] changes the remote.my-remote.fetch refmap)
 * When this is done `git fetch` runs a notification-only post-fetch
hook (that would need to be added)
 * The post-fetch hook then performs whatever it wants and updates the
references in refs/remotes/my-remote/*

So the changes that are required are:
 * Adding a notification-only post-fetch hook
 * For handling tags, there is a need to have a refmap for tags. Maybe
adding a remote.my-remote.fetchTags refmap, that would be used when
running with --tags, and having it default to “refs/tags/*:refs/tags/*”
to keep the current behavior by default?

The only remaining issue I can think of is: How do we avoid the issue of
the
trigger-only-hook-inciting-bad-behavior-by-hook-authors-who-really-want-modification
raised in the side-thread that Junio wrote in [1]? Maybe just writing in
the documentation that the hook should use a quarantine-like approach if
it wants modification would be enough to not have hook authors try to
modify the ref in the post-fetch hook?

Thanks for all your thoughts, and hope we're getting somewhere!
Leo


PS: I'll read over the reviews once I'm all clear as to what exactly is
wanted for this patch, as most likely they'll just be dumped, given the
current state of affairs.

[1] https://marc.info/?l=git=132480559712592=2


[RFC PATCH 2/3] t1450-fsck: Add tests for HEAD of other worktrees

2018-02-09 Thread Elijah Newren
Signed-off-by: Elijah Newren 
---
 t/t1450-fsck.sh | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index cb4b66e29d..fa94c59458 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -102,6 +102,33 @@ test_expect_success 'HEAD link pointing at a funny place' '
grep "HEAD points to something strange" out
 '
 
+test_expect_failure 'other worktree HEAD link pointing at a funny object' '
+   test_when_finished "rm -rf .git/worktrees" &&
+   mkdir -p .git/worktrees/other &&
+   echo  
>.git/worktrees/other/HEAD &&
+   test_must_fail git fsck 2>out &&
+   cat out &&
+   grep "worktrees/other/HEAD: detached HEAD points" out
+'
+
+test_expect_failure 'other worktree HEAD link pointing at missing object' '
+   test_when_finished "rm -rf .git/worktrees" &&
+   mkdir -p .git/worktrees/other &&
+   echo "Contents missing from repo" | git hash-object --stdin 
>.git/worktrees/other/HEAD &&
+   test_must_fail git fsck 2>out &&
+   cat out &&
+   grep "worktrees/other/HEAD: invalid sha1 pointer" out
+'
+
+test_expect_failure 'other worktree HEAD link pointing at a funny place' '
+   test_when_finished "rm -rf .git/worktrees" &&
+   mkdir -p .git/worktrees/other &&
+   echo "ref: refs/funny/place" >.git/worktrees/other/HEAD &&
+   test_must_fail git fsck 2>out &&
+   cat out &&
+   grep "worktrees/other/HEAD points to something strange" out
+'
+
 test_expect_success 'email without @ is okay' '
git cat-file commit HEAD >basis &&
sed "s/@/AT/" basis >okay &&
-- 
2.16.1.75.gc01c8fdd7d



Re: [PATCH 046/194] object-store: move replace_objects back to object-store

2018-02-09 Thread Junio C Hamano
Stefan Beller  writes:

> @@ -32,7 +31,15 @@ struct object_store {
>* Objects that should be substituted by other objects
>* (see git-replace(1)).
>*/
> - struct replace_objects replacements;
> + struct replace_objects {
> + /*
> +  * An array of replacements.  The array is kept sorted by the 
> original
> +  * sha1.
> +  */
> + struct replace_object **items;
> +
> + int alloc, nr;
> + } replacements;
>  
>   /*
>* A fast, rough count of the number of objects in the repository.
> @@ -49,7 +56,7 @@ struct object_store {
>   unsigned packed_git_initialized : 1;
>  };
>  #define OBJECT_STORE_INIT \
> - { NULL, MRU_INIT, ALTERNATES_INIT, REPLACE_OBJECTS_INIT, 0, 0, 0 }
> + { NULL, MRU_INIT, ALTERNATES_INIT, { NULL, 0, 0 }, 0, 0, 0 }

Not the primary thrust of this topic, but we may want to convert
these to use designated initializers after this series is done.


[RFC PATCH 0/3] Make fsck check other worktree HEADs

2018-02-09 Thread Elijah Newren
This patchset adds checking of other worktree HEADs to fsck.

The reason I've marked this RFC is that I'm worried my incidental
reliance on "worktrees/$WORKTREE/HEAD" resolving as a ref (in patch 3)
might raise some flags for others.  In particular, in [1] Peff said
that this refname resolves right now mostly by accident and will
probably stop working in the future.  However, I feel that since fsck
checks the storage format as well as contents, it seems natural that a
change of storage model would result in the fsck code changing and
thus that I'm not locking in any particular ref format long term with
these changes.  But I want to flag this issue for discussion.

[1] https://public-inbox.org/git/20180207181706.ga4...@sigill.intra.peff.net/

Elijah Newren (3):
  fsck: Move fsck_head_link() to get_default_heads() to avoid some
globals
  t1450-fsck: Add tests for HEAD of other worktrees
  fsck: Check HEAD of other worktrees as well

 builtin/fsck.c  | 73 -
 t/t1450-fsck.sh | 27 +
 2 files changed, 84 insertions(+), 16 deletions(-)

-- 
2.16.1.75.gc01c8fdd7d



[RFC PATCH 1/3] fsck: Move fsck_head_link() to get_default_heads() to avoid some globals

2018-02-09 Thread Elijah Newren
This will make it easier to check the HEAD of other worktrees from fsck.

Signed-off-by: Elijah Newren 
---
 builtin/fsck.c | 27 ---
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 04846d46f9..4132034170 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -33,8 +33,6 @@ static int check_strict;
 static int keep_cache_objects;
 static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT;
 static struct fsck_options fsck_obj_options = FSCK_OPTIONS_DEFAULT;
-static struct object_id head_oid;
-static const char *head_points_at;
 static int errors_found;
 static int write_lost_and_found;
 static int verbose;
@@ -453,8 +451,15 @@ static int fsck_handle_ref(const char *refname, const 
struct object_id *oid,
return 0;
 }
 
+static int fsck_head_link(const char **head_points_at,
+ struct object_id *head_oid);
+
 static void get_default_heads(void)
 {
+   const char *head_points_at;
+   struct object_id head_oid;
+
+   fsck_head_link(_points_at, _oid);
if (head_points_at && !is_null_oid(_oid))
fsck_handle_ref("HEAD", _oid, 0, NULL);
for_each_rawref(fsck_handle_ref, NULL);
@@ -548,33 +553,34 @@ static void fsck_object_dir(const char *path)
stop_progress();
 }
 
-static int fsck_head_link(void)
+static int fsck_head_link(const char **head_points_at,
+ struct object_id *head_oid)
 {
int null_is_error = 0;
 
if (verbose)
fprintf(stderr, "Checking HEAD link\n");
 
-   head_points_at = resolve_ref_unsafe("HEAD", 0, _oid, NULL);
-   if (!head_points_at) {
+   *head_points_at = resolve_ref_unsafe("HEAD", 0, head_oid, NULL);
+   if (!*head_points_at) {
errors_found |= ERROR_REFS;
return error("Invalid HEAD");
}
-   if (!strcmp(head_points_at, "HEAD"))
+   if (!strcmp(*head_points_at, "HEAD"))
/* detached HEAD */
null_is_error = 1;
-   else if (!starts_with(head_points_at, "refs/heads/")) {
+   else if (!starts_with(*head_points_at, "refs/heads/")) {
errors_found |= ERROR_REFS;
return error("HEAD points to something strange (%s)",
-head_points_at);
+*head_points_at);
}
-   if (is_null_oid(_oid)) {
+   if (is_null_oid(head_oid)) {
if (null_is_error) {
errors_found |= ERROR_REFS;
return error("HEAD: detached HEAD points at nothing");
}
fprintf(stderr, "notice: HEAD points to an unborn branch 
(%s)\n",
-   head_points_at + 11);
+   *head_points_at + 11);
}
return 0;
 }
@@ -686,7 +692,6 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
 
git_config(fsck_config, NULL);
 
-   fsck_head_link();
if (connectivity_only) {
for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
for_each_packed_object(mark_packed_for_connectivity, NULL, 0);
-- 
2.16.1.75.gc01c8fdd7d



[RFC PATCH 3/3] fsck: Check HEAD of other worktrees as well

2018-02-09 Thread Elijah Newren
This takes advantage of the fact that "worktrees/$WORKTREE/HEAD" will
currently resolve as a ref, which may not be true in the future with
different ref backends.  I don't think it locks us in to supporting
resolving other worktree HEADs with this syntax, as I view the
user-visible error message as more of a pointer to a pathname that the
user will need to fix.  If the underlying ref storage changes, naturally
both this code and the hint may need to change to match.

Signed-off-by: Elijah Newren 
---
 builtin/fsck.c  | 60 +
 t/t1450-fsck.sh |  6 +++---
 2 files changed, 51 insertions(+), 15 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 4132034170..850b217e93 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -451,17 +451,51 @@ static int fsck_handle_ref(const char *refname, const 
struct object_id *oid,
return 0;
 }
 
-static int fsck_head_link(const char **head_points_at,
+static void get_worktree_names(struct string_list *names)
+{
+   struct strbuf path = STRBUF_INIT;
+   DIR *dir;
+   struct dirent *d;
+
+   strbuf_addf(, "%s/worktrees", get_git_common_dir());
+   dir = opendir(path.buf);
+   strbuf_release();
+   if (dir) {
+   while ((d = readdir(dir)) != NULL) {
+   if (!is_dot_or_dotdot(d->d_name))
+   string_list_append(names, d->d_name);
+   }
+   closedir(dir);
+   }
+}
+
+static int fsck_head_link(const char *head_ref_name,
+ const char **head_points_at,
  struct object_id *head_oid);
 
 static void get_default_heads(void)
 {
const char *head_points_at;
struct object_id head_oid;
+   struct string_list worktree_names = STRING_LIST_INIT_DUP;
+   struct string_list_item *worktree_item;
+   struct strbuf head_name = STRBUF_INIT;
 
-   fsck_head_link(_points_at, _oid);
+   fsck_head_link("HEAD", _points_at, _oid);
if (head_points_at && !is_null_oid(_oid))
fsck_handle_ref("HEAD", _oid, 0, NULL);
+
+   get_worktree_names(_names);
+   for_each_string_list_item(worktree_item, _names) {
+   strbuf_reset(_name);
+   strbuf_addf(_name, "worktrees/%s/HEAD",
+   worktree_item->string);
+   fsck_head_link(head_name.buf, _points_at, _oid);
+   if (head_points_at && !is_null_oid(_oid))
+   fsck_handle_ref(head_name.buf, _oid, 0, NULL);
+   }
+   strbuf_release(_name);
+
for_each_rawref(fsck_handle_ref, NULL);
if (include_reflogs)
for_each_reflog(fsck_handle_reflog, NULL);
@@ -553,34 +587,36 @@ static void fsck_object_dir(const char *path)
stop_progress();
 }
 
-static int fsck_head_link(const char **head_points_at,
+static int fsck_head_link(const char *head_ref_name,
+ const char **head_points_at,
  struct object_id *head_oid)
 {
int null_is_error = 0;
 
if (verbose)
-   fprintf(stderr, "Checking HEAD link\n");
+   fprintf(stderr, "Checking %s link\n", head_ref_name);
 
-   *head_points_at = resolve_ref_unsafe("HEAD", 0, head_oid, NULL);
+   *head_points_at = resolve_ref_unsafe(head_ref_name, 0, head_oid, NULL);
if (!*head_points_at) {
errors_found |= ERROR_REFS;
-   return error("Invalid HEAD");
+   return error("Invalid %s", head_ref_name);
}
-   if (!strcmp(*head_points_at, "HEAD"))
+   if (!strcmp(*head_points_at, head_ref_name))
/* detached HEAD */
null_is_error = 1;
else if (!starts_with(*head_points_at, "refs/heads/")) {
errors_found |= ERROR_REFS;
-   return error("HEAD points to something strange (%s)",
-*head_points_at);
+   return error("%s points to something strange (%s)",
+head_ref_name, *head_points_at);
}
if (is_null_oid(head_oid)) {
if (null_is_error) {
errors_found |= ERROR_REFS;
-   return error("HEAD: detached HEAD points at nothing");
+   return error("%s: detached HEAD points at nothing",
+head_ref_name);
}
-   fprintf(stderr, "notice: HEAD points to an unborn branch 
(%s)\n",
-   *head_points_at + 11);
+   fprintf(stderr, "notice: %s points to an unborn branch (%s)\n",
+   head_ref_name, *head_points_at + 11);
}
return 0;
 }
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index fa94c59458..3da651be4c 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -102,7 +102,7 @@ test_expect_success 'HEAD link 

Re: [PATCH 042/194] object-store: move alternates API to new alternates.h

2018-02-09 Thread Junio C Hamano
"brian m. carlson"  writes:

>> +#include "strbuf.h"
>> +#include "sha1-array.h"
>> +
>> +struct alternates {
>> +struct alternate_object_database *list;
>> +struct alternate_object_database **tail;
>> +};
>> +#define ALTERNATES_INIT { NULL, NULL }
>
> I was surprised to find that this patch not only moves the alternates
> API to a new location, but introduces this struct.  I certainly think
> the struct is a good idea, but it should probably go in a separate
> patch, or at least get a mention in the commit message.

Yeah, I tend to agree that splitting it into two patches may make
more sense, especially given that this is already close to 200 patch
series ;-)



Re: [PATCH 1/2] update-index doc: note a fixed bug in the untracked cache

2018-02-09 Thread Ævar Arnfjörð Bjarmason

On Fri, Feb 09 2018, Junio C. Hamano jotted:

> Ævar Arnfjörð Bjarmason  writes:
>
>>> Will queue with the above typofix, together with the other one.  I
>>> am not sure if we want to say "Before 2.17", though.
>>
>> I'm just keeping in mind the user who later on upgrades git from say
>> 2.14 to 2.18 or something, and is able to find in the docs when/why this
>> new warning got added, which helps diagnose it.
>
> Ah, no, that is not what I meant.  I just didn't think '2.17' in
> that sentence may not be understood as "Git version 2.17" by most
> readers.

Ah, I see. Yes I agree, sorry. Do you mind fixing it up to just say
"Before Git version 2.17, ..." ?


Re: Fetch-hooks

2018-02-09 Thread Ævar Arnfjörð Bjarmason

On Fri, Feb 09 2018, Leo Gaspard jotted:

> On 02/09/2018 11:04 PM, Ævar Arnfjörð Bjarmason wrote:>>> You could also
> have some intermediate step between these two, where
 e.g. your refspec for "origin" is
 "+refs/heads/*:refs/remotes/origin-untrusted/*" instead of the default
 "+refs/heads/*:refs/remotes/origin/*", you fetch all refs to that
 location, then you move them (with some alias/hook) to
 "refs/remotes/origin/*" once they're seen to be "OK".
>>>
>>> That is indeed another possibility, but then the idea is to make things
>>> as transparent as possible for the end-user, not to completely change
>>> their git workflow. As such, considering only signed commits to be part
>>> of the upstream seems to make sense to me?
>>
>> I mean this would be something that would be part of a post-fetch hook,
>> so it would be as transparent as what you're doing to the user, with the
>> difference that it doesn't need to involve changes to what you slurp
>> down from the server.
>>
>> I.e. we'd just fetch into refs/remotes/origin-untrusted/, then we run
>> your post-fetch hook and you go over the new refs, and copy what you
>> like (usually everything) to refs/remotes/origin/*.
>
> Hmm... but that would then require a post-fetch hook, wouldn't it? And
> about a post-fetch hook, if I understood correctly, Junio in [1] had a
> quite nice argument against it:
>
> Although I do not deeply care between such a "trigger to only
> notify, no touching" hook and a full-blown "allow hook writers to
> easily lie about what happened in the fetch" hook, I was hoping that
> we would get this right and useful if we were spending our brain
> bandwidth on it. I am not very fond of an easier "trigger to only
> notify" hook because people are bound to misuse the interface and
> try updating the refs anyway, making it easy to introduce
> inconsistencies between refs and FETCH_HEAD that will confuse the
> later "merge" step.
>
> Otherwise, if it doesn't require a post-fetch hook, then it would
> require the end-user to first fetch, then run the
> `copy-trusted-refs-over` script, which would add stuff to the user's
> workflow.
>
> Did I miss another possibility?

Yes, the assumption is that there would be a post-fetch hook of some
sort to ferry the refs over from the quarantine.

My reading of that thread is not that Junio's outright against such a
facility (and also it was 2011 and the discussion can be re-visited),
but rather that there's specific concerns that need to be kept in mind
so reliable things involving the ref store don't become brittle as a
result of unstable user hooks, or us offering an interface that's easily
misused.

>> [...]
>>
>> One thing that's not discussed yet, and I know just enough about for it
>> to tingle my spidey sense, but not enough to say for sure (CC'd Jeff &
>> Brandon who know more) is that this feature once shipped might cause
>> higher load on git hosting providers.
>>
>> This is because people will inevitably use it in popular projects for
>> some custom filtering, and because you're continually re-fetching and
>> inspecting stuff what used to be a really cheap no-op "pull" most of the
>> time is a more expensive negotiation every time before the client
>> rejects the refs again, and worse for hosting providers because you have
>> bespoke ref fetching strategies you have less odds of being able to
>> cache both the negotiation and the pack you serve.
>>
>> I.e. you want this for some security feature where 99.99% of the time
>> you accept all refs, but most people will probably use this to implement
>> dynamic Turing-complete refspecs.
>>
>> Maybe that's worrying about nothing, but worth thinking about.
>
> Well... First, I must say I didn't really understand your last paragraph
> about Turing-complete refspecs.
>
> But my understanding of how the fetch-hook patchset I sent this evening
> works is that it first receives all the objects from the hosting
> provider, then locally moves the refs, but never actually discards the
> downloaded objects (well, until a `git gc` I guess).
>
> So I don't think the network traffic with the provider would be any
> different wrt. what it is now, even if a tweak-fetch hook rejects some
> commits? Then again I don't know git's internals enough to be have even
> a bit of certainty about what I'm saying right now, so...
>
>
> [1] https://marc.info/?l=git=132480559712592=2

As Jeff notes in the side-thread in
<20180209223011.ga24...@sigill.intra.peff.net> when you do a "fetch" the
protocol is not negotiating on the basis of what loose unreferenced
(because you threw away the ref!) objects you have already, but what
the ref commonality is between you and the server (and this is just on
a best-effort basis).

So for example, let's say you only accept the "master" branch and have a
hook that's refusing updates to the "dev" branch, fetch is going to be
re-negotiating fetching the difference between the two over the 

Re: [PATCH 1/2] update-index doc: note a fixed bug in the untracked cache

2018-02-09 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

>> Will queue with the above typofix, together with the other one.  I
>> am not sure if we want to say "Before 2.17", though.
>
> I'm just keeping in mind the user who later on upgrades git from say
> 2.14 to 2.18 or something, and is able to find in the docs when/why this
> new warning got added, which helps diagnose it.

Ah, no, that is not what I meant.  I just didn't think '2.17' in
that sentence may not be understood as "Git version 2.17" by most
readers.


Re: Fetch-hooks

2018-02-09 Thread Junio C Hamano
Jeff King  writes:

> The negotiation for future fetches uses the existing refs as the
> starting point. And if we don't know that we have the objects because
> there are no refs pointing at them, they're going to get transferred
> again. That's extra load no the server, and extra time for the user
> waiting on the network.
>
> I tend to agree with the direction of thinking you outlined: you're
> generally better off completing the fetch to a local namespace that
> tracks the other side completely, and then manipulating the local refs
> as you see fit (e.g., fetching into refs/quarantine, and then migrating
> "good" refs over to refs/remotes/origin).

Thanks for a dose of sanity ;-)


Re: "git bisect run make" adequate to locate first unbuildable commit?

2018-02-09 Thread Philip Oakley

From: "Robert P. J. Day" 

On Fri, 9 Feb 2018, Philip Oakley, CEng MIET wrote:

(apologies for using the fancy letters after the name ID...)



From: "Robert P. J. Day" 
>
> writing a short tutorial on "git bisect" and, all the details of
> special exit code 125 aside, if one wanted to locate the first
> unbuildable commit, would it be sufficient to just run?
>
>  $ git bisect run make
>
> as i read it, make returns either 0, 1 or 2 so there doesn't appear
> to be any possibility of weirdness with clashing with a 125 exit code.
> am i overlooking some subtle detail here i should be aware of? thanks.
>
> rday

In the spirit of pedanticism, one should also clarify the word
"first", in that it's not a linear search for _an_ unbuildable
commit, but that one is looking for the transition between an
unbroken sequence of unbuildable commits, which transitions to
buildable commits, and its the transition that is sought. (there
could be many random unbuildable commits within a sequence in some
folks' processes!)


 quite so, i should have been more precise.

rday


The other two things that may be happening (in the wider bisect discussion) 
that I've heard of are:
1. there may be feature branches that bypass the known good starting commit, 
which can cause understanding issues as those side branches that predate the 
start point are also considered potential bu commits.
2. if you just want the first parent check for the bad commit point, that 
mark the second parents of merges as being good.


Also, I'd expect that the skipped commits aren't 'counted' (too hard?) for 
the bisect algorithm's reporting.


https://stackoverflow.com/questions/5638211/how-do-you-get-git-bisect-to-ignore-merged-branches 
contains a number of the ideas..


Philip



Re: [PATCH 2/2] fetch: add tweak-fetch hook

2018-02-09 Thread Junio C Hamano
Leo Gaspard  writes:

> +tweak-fetch
> +~~~
> +
> +This hook is invoked by 'git fetch' (commonly called by 'git pull'), after 
> refs
> +have been fetched from the remote repository. It is not executed, if nothing 
> was
> +fetched.

Need to tighten explanation of "nothing was fetched".  If the only
change I made to my repository is that I created a new branch that
points at an existing object since last time you fetched, you would
obtain no new object when you fetch from me.  Would that count as
"nothing was fetched"?  Or would it be still fetching something
(i.e. your remote-tracking hierarchy will record the fact that I now
have this new branch)?

> +   SP not-for-merge|merge|ignore SP  SP 
>  LF
> + ...
> +The `` is the remote's name for the ref that was fetched, and
> +`` is a name of a remote-tracking branch, like
> +"refs/remotes/origin/master". `` can be undefined if the 
> fetched
> +ref is not being stored in a local refname. In this case, it will be set to 
> `@`,

Don't use "@"; leave it empty instead.

> +TODO: Add documentation for the “ignore” parameter. Unfortunately, I'm not
> +really sure I get what this does or what invariants it is supposed to 
> maintain
> +(eg. all “ignore” updates at the end of the refs list?), so this may also
> +require code changes.

If you are not using the feature, wouldn't it make more sense not to
add it in the first place?



Re: [PATCH 1/2] fetch: preparations for tweak-fetch hook

2018-02-09 Thread Junio C Hamano
Leo Gaspard  writes:

> From: Léo Gaspard 
>
> No behavior changes yet, only some groundwork for the next change.
>
> The refs_result structure combines a status code with a ref map, which
> can be NULL even on success. 

Sorry, but I have absolutely no idea what this sentence wants to do
by being here in the log message.  "even on success" makes it sound
as if it is normal to be NULL when status code != success, but it is
not even clear why it is the case, and "can be NULL" implies that it
is not always NULL, but it does not make it clear when it is NULL
and when it is not NULL when code == success.  If you find the need
to explain what each field of a new struct you are introducing to
the codebase means (and it probably is a good idea for this one),
perhaps the right place to do so is in in-code comment next to the
struct definition?

It seems that you also moved add_existing() in the file, for no
apparent reason.

> This will be needed when there's a
> tweak-fetch hook, because it can filter out all refs, while still
> succeeding.
>
> fetch_refs returns a refs_result, so that it can modify the ref_map.

The description keeps saying "ref map", but it is unclear what it
is, why it is a good thing to allow the caller "modify" it, and what
kind of modification the caller wants to make for what reason.

Also, in C, we tend to avoid passing structure by value.  It seems
that the patch makes a callchain involving backfill_tags() return
this struct by value, which goes against the convention.  I suspect
that it should be trivial to have the caller supply an on-stack
instance and pass the pointer to it down the callchain, though.

> Based-on-patch-by: Joey Hess 
> Signed-off-by: Leo Gaspard 
> ---
>  builtin/fetch.c | 68 
> +
>  1 file changed, 44 insertions(+), 24 deletions(-)
>
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 7bbcd26fa..76dc05f61 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -34,6 +34,11 @@ enum {
>   TAGS_SET = 2
>  };
>  
> +struct refs_result {
> + struct ref *new_refs;
> + int status;
> +};
> +
>  static int fetch_prune_config = -1; /* unspecified */
>  static int prune = -1; /* unspecified */
>  #define PRUNE_BY_DEFAULT 0 /* do we prune by default? */
> @@ -57,6 +62,18 @@ static int shown_url = 0;
>  static int refmap_alloc, refmap_nr;
>  static const char **refmap_array;
>  
> +static int add_existing(const char *refname, const struct object_id *oid,
> + int flag, void *cbdata)
> +{
> + struct string_list *list = (struct string_list *)cbdata;
> + struct string_list_item *item = string_list_insert(list, refname);
> + struct object_id *old_oid = xmalloc(sizeof(*old_oid));
> +
> + oidcpy(old_oid, oid);
> + item->util = old_oid;
> + return 0;
> +}
> +
>  static int git_fetch_config(const char *k, const char *v, void *cb)
>  {
>   if (!strcmp(k, "fetch.prune")) {
> @@ -217,18 +234,6 @@ static void add_merge_config(struct ref **head,
>   }
>  }
>  
> -static int add_existing(const char *refname, const struct object_id *oid,
> - int flag, void *cbdata)
> -{
> - struct string_list *list = (struct string_list *)cbdata;
> - struct string_list_item *item = string_list_insert(list, refname);
> - struct object_id *old_oid = xmalloc(sizeof(*old_oid));
> -
> - oidcpy(old_oid, oid);
> - item->util = old_oid;
> - return 0;
> -}
> -
>  static int will_fetch(struct ref **head, const unsigned char *sha1)
>  {
>   struct ref *rm = *head;
> @@ -920,15 +925,20 @@ static int quickfetch(struct ref *ref_map)
>   return check_connected(iterate_ref_map, , );
>  }
>  
> -static int fetch_refs(struct transport *transport, struct ref *ref_map)
> +static struct refs_result fetch_refs(struct transport *transport,
> + struct ref *ref_map)
>  {
> - int ret = quickfetch(ref_map);
> - if (ret)
> - ret = transport_fetch_refs(transport, ref_map);
> - if (!ret)
> - ret |= store_updated_refs(transport->url,
> + struct refs_result ret;
> + ret.status = quickfetch(ref_map);
> + if (ret.status) {
> + ret.status = transport_fetch_refs(transport, ref_map);
> + }
> + if (!ret.status) {
> + ret.new_refs = ref_map;
> + ret.status |= store_updated_refs(transport->url,
>   transport->remote->name,
> - ref_map);
> + ret.new_refs);
> + }
>   transport_unlock_pack(transport);
>   return ret;
>  }
> @@ -1048,9 +1058,11 @@ static struct transport *prepare_transport(struct 
> remote *remote, int deepen)
>   return transport;
>  }
>  
> -static void backfill_tags(struct transport *transport, struct ref *ref_map)
> +static struct refs_result backfill_tags(struct transport *transport,
> 

Re: Fetch-hooks

2018-02-09 Thread Jeff King
On Fri, Feb 09, 2018 at 11:04:17PM +0100, Ævar Arnfjörð Bjarmason wrote:

> One thing that's not discussed yet, and I know just enough about for it
> to tingle my spidey sense, but not enough to say for sure (CC'd Jeff &
> Brandon who know more) is that this feature once shipped might cause
> higher load on git hosting providers.
> 
> This is because people will inevitably use it in popular projects for
> some custom filtering, and because you're continually re-fetching and
> inspecting stuff what used to be a really cheap no-op "pull" most of the
> time is a more expensive negotiation every time before the client
> rejects the refs again, and worse for hosting providers because you have
> bespoke ref fetching strategies you have less odds of being able to
> cache both the negotiation and the pack you serve.

Most of the discussion so far seems to be about "accept this ref or
don't accept this ref", which seems OK. But if you are going to do
custom tweaking like rewriting objects, or making it common to refuse
some refs, then I think things get pretty inefficient for _everybody_.

The negotiation for future fetches uses the existing refs as the
starting point. And if we don't know that we have the objects because
there are no refs pointing at them, they're going to get transferred
again. That's extra load no the server, and extra time for the user
waiting on the network.

I tend to agree with the direction of thinking you outlined: you're
generally better off completing the fetch to a local namespace that
tracks the other side completely, and then manipulating the local refs
as you see fit (e.g., fetching into refs/quarantine, and then migrating
"good" refs over to refs/remotes/origin).

-Peff


Re: Fetch-hooks

2018-02-09 Thread Leo Gaspard
On 02/09/2018 11:04 PM, Ævar Arnfjörð Bjarmason wrote:>>> You could also
have some intermediate step between these two, where
>>> e.g. your refspec for "origin" is
>>> "+refs/heads/*:refs/remotes/origin-untrusted/*" instead of the default
>>> "+refs/heads/*:refs/remotes/origin/*", you fetch all refs to that
>>> location, then you move them (with some alias/hook) to
>>> "refs/remotes/origin/*" once they're seen to be "OK".
>>
>> That is indeed another possibility, but then the idea is to make things
>> as transparent as possible for the end-user, not to completely change
>> their git workflow. As such, considering only signed commits to be part
>> of the upstream seems to make sense to me?
> 
> I mean this would be something that would be part of a post-fetch hook,
> so it would be as transparent as what you're doing to the user, with the
> difference that it doesn't need to involve changes to what you slurp
> down from the server.
> 
> I.e. we'd just fetch into refs/remotes/origin-untrusted/, then we run
> your post-fetch hook and you go over the new refs, and copy what you
> like (usually everything) to refs/remotes/origin/*.

Hmm... but that would then require a post-fetch hook, wouldn't it? And
about a post-fetch hook, if I understood correctly, Junio in [1] had a
quite nice argument against it:

Although I do not deeply care between such a "trigger to only
notify, no touching" hook and a full-blown "allow hook writers to
easily lie about what happened in the fetch" hook, I was hoping that
we would get this right and useful if we were spending our brain
bandwidth on it. I am not very fond of an easier "trigger to only
notify" hook because people are bound to misuse the interface and
try updating the refs anyway, making it easy to introduce
inconsistencies between refs and FETCH_HEAD that will confuse the
later "merge" step.

Otherwise, if it doesn't require a post-fetch hook, then it would
require the end-user to first fetch, then run the
`copy-trusted-refs-over` script, which would add stuff to the user's
workflow.

Did I miss another possibility?

> [...]
> 
> One thing that's not discussed yet, and I know just enough about for it
> to tingle my spidey sense, but not enough to say for sure (CC'd Jeff &
> Brandon who know more) is that this feature once shipped might cause
> higher load on git hosting providers.
> 
> This is because people will inevitably use it in popular projects for
> some custom filtering, and because you're continually re-fetching and
> inspecting stuff what used to be a really cheap no-op "pull" most of the
> time is a more expensive negotiation every time before the client
> rejects the refs again, and worse for hosting providers because you have
> bespoke ref fetching strategies you have less odds of being able to
> cache both the negotiation and the pack you serve.
> 
> I.e. you want this for some security feature where 99.99% of the time
> you accept all refs, but most people will probably use this to implement
> dynamic Turing-complete refspecs.
> 
> Maybe that's worrying about nothing, but worth thinking about.

Well... First, I must say I didn't really understand your last paragraph
about Turing-complete refspecs.

But my understanding of how the fetch-hook patchset I sent this evening
works is that it first receives all the objects from the hosting
provider, then locally moves the refs, but never actually discards the
downloaded objects (well, until a `git gc` I guess).

So I don't think the network traffic with the provider would be any
different wrt. what it is now, even if a tweak-fetch hook rejects some
commits? Then again I don't know git's internals enough to be have even
a bit of certainty about what I'm saying right now, so...


[1] https://marc.info/?l=git=132480559712592=2


Re: [PATCH 1/2] update-index doc: note a fixed bug in the untracked cache

2018-02-09 Thread Ævar Arnfjörð Bjarmason

On Fri, Feb 09 2018, Junio C. Hamano jotted:

> Ævar Arnfjörð Bjarmason   writes:
>
>> +Before 2.17, the untracked cache had a bug where replacing a directory
>> +with a symlink to another directory could cause it to incorrectly show
>> +files tracked by git as untracked. See the "status: add a failing test
>> +showing a core.untrackedCache bug" commit to git.git. A workaround for
>> +that was (and this might work for other undiscoverd bugs in the
>> +future):
>
> s/undiscoverd/undiscovered/
>
> But more importantly, would it help _us_ to encourage people to
> squelch the diagnoses without telling us about potential breakage, I
> wonder, by telling them to do this for other undiscovered cases,
> too?

You mean including something like "if you see this the git ML would like
to hear about it"?

> Will queue with the above typofix, together with the other one.  I
> am not sure if we want to say "Before 2.17", though.

I'm just keeping in mind the user who later on upgrades git from say
2.14 to 2.18 or something, and is able to find in the docs when/why this
new warning got added, which helps diagnose it.

>> +
>> +
>> +$ git -c core.untrackedCache=false status
>> +
>> +
>> +This bug has also been shown to affect non-symlink cases of replacing
>> +a directory with a file when it comes to the internal structures of
>> +the untracked cache, but no case has been found where this resulted in
>> +wrong "git status" output.
>> +
>>  File System Monitor
>>  ---


Re: [PATCH 0/2] update-index doc: note new caveats in 2.17

2018-02-09 Thread Ævar Arnfjörð Bjarmason

On Fri, Feb 09 2018, Junio C. Hamano jotted:

> Ævar Arnfjörð Bjarmason   writes:
>
>> When users upgrade to 2.17 they're going to have git yelling at them
>> (as my users did) on existing checkouts, with no indication whatsoever
>> that it's due to the UC or how to fix it.
>
> Wait.  Are you saying that the new warning is "warning" against a
> condition that is not an error?

No I mean the warning itself gives you no indication what the solution
is, or why it might be happening, and because it's printed on every
occurrence we had "git status" invocations on some big repos where there
would be hundreds of lines of the same warning for different
now-nonexisting directories.

Deferring it and just printing "we had N cases of..." would be better,
but would make the logic a lot more complex, I'm not sure how common
this would be in the wild, but as noted happened on a large proportion
of our checkouts, so having something mentioning this in the docs is
good.

I only had that git version out for about an hour, but had some users rm
-rfing their checkouts and re-cloning because they couldn't figure out
how to fix it.

Is noting it in the docs going to help all those users? No, but at least
we'll have something Google-able they're likely to find.

>> ... doesn't it only warn under that mode? I.e.:
>>
>> -"could not open directory '%s'")
>> +"core.untrackedCache: could not open directory '%s'")
>
> For example, if it attempts to open a directory which does *not*
> have to exist, and sees an error from opendir() due to ENOENT, then
> I do not think it should be giving a warning.  If we positively know
> that a directory should exist there and we cannot open it, of course
> we should warn.  Also, if we know a directory should be readable
> when it exists, then we should be warning when we see an error other
> than ENOENT.

*nod*, so not UC-specific, even though I've only seen it in relation to
that.


Re: [PATCH 003/194] object-store: move packed_git and packed_git_mru to object store

2018-02-09 Thread Junio C Hamano
Stefan Beller  writes:

> Patch generated by
>
>  2. Applying the semantic patch contrib/coccinelle/packed_git.cocci
> to adjust callers.

About this part...

> diff --git a/contrib/coccinelle/packed_git.cocci 
> b/contrib/coccinelle/packed_git.cocci
> new file mode 100644
> index 00..da317a51a9
> --- /dev/null
> +++ b/contrib/coccinelle/packed_git.cocci
> @@ -0,0 +1,7 @@
> +@@ @@
> +- packed_git
> ++ the_repository->objects.packed_git
> +
> +@@ @@
> +- packed_git_mru
> ++ the_repository->objects.packed_git_mru

The above is correct for one-time transition turning pre-transition
code to post the_repository world, but I am not sure if we want to
have it in contrib/coccinelle/, where "make coccicheck" looks at, as
a way to continuously keep an eye on "style" violations like using
strbuf_addf() for a constant when strbuf_addstr() suffices.

Wouldn't we need a mechanism to ensure that this file will *not* be
used in "make coccicheck" somehow?




Re: Fetch-hooks

2018-02-09 Thread Ævar Arnfjörð Bjarmason

On Thu, Feb 08 2018, Leo Gaspard jotted:

> On 02/08/2018 10:06 PM, Ævar Arnfjörð Bjarmason wrote:>> Hmm, OK, so I
> guess I'll try to update the patch when I get some time to
>>> delve into git's internals, as my use case (forbidding some fetches)
>>> couldn't afaik be covered by a wrapper hook.
>>
>> Per my reading of
>> https://public-inbox.org/git/20111224234212.ga21...@gnu.kitenet.net/
>> what Joey implemented is not what you described in your initial mail.
>>
>> His is a *post*-fetch hook, we've already done the fetch and are just
>> telling you as a courtesy what refs changed. You could also implement
>> this as some cronjob that polls git for-each-ref but it's easier as a
>> hook, fine.
>
> I was thinking along the lines of
> https://marc.info/?l=git=132486687023893=2
> with high-level description at
> https://marc.info/?l=git=132480559712592=2
>
> With the high-level description given here, I'm pretty sure I can hack a
> hook together to make things work as I want them to.
>
>> What you're describing is something like a pre-fetch hook analogous to
>> the pre-receive hooks, where you're fed refs updated on the remote on
>> stdin, and can say you don't want some of those to be updated.
>>
>> This may just be a lack of imagination on my part, but I don't see how
>> that's sensible at all.
>>
>> The refs we fetch are our *copy* of the remote refs, why would you not
>> want to track the upstream remote. You're going to refuse some branches
>> and what? Be further behind until some point in the future where the tip
>> is GPG-signed and you accept it, at which poich you'll need to do more
>> work than if you were up-to-date with the almost-GPG-signed version?
>
> That's about it. I want all fetching to be blocked in case of the tip
> not being signed. As there is a pre-push hook ensuring committers don't
> forget to sign before pushing, the only case the tip could not be signed
> is in case of an attack, which means it's better to just force-push
> master because any git repo that fetched it is doomed anyway. Definitely
> would not want to allow an untrusted revision get into anything that
> could even remotely be taken as “endorsed” by the user.
>
> (BTW, in order to avoid the case of someone forgetting to sign the
> commit and not having installed the pre-push hook, there can be holes in
> the commit-signing chain, the drawback being that the committer pushing
> a signed commit takes responsibility for all unsigned commits directly
> preceding his -- allowing them to recover in case of a mistaken push)
>
>> I think you're confusing two things here. One is the reasonable concern
>> of wanting to not have your local copy of remote refs have undesirable
>> content, but a pre-fetch hook is not the way to accomplish that.
>
> Well, a pre-fetch hook is a possible way of accomplishing that, and I
> don't know of any better one?
>
>> The other is e.g. to ensure that you never locally check out some "bad"
>> ref, we don't have hook support for that, but could add it,
>> e.g. git-checkout and git reset --hard could be taught about some
>> pre-checkout hook.
>
> Issue is, once we have to fix checkout and reset, all other commands
> that potentially touch the worktree also have to be fixed (eg. I don't
> know whether worktree add triggers pre-checkout?)
>
> Also, this requires the hook to store a database of all the paths that
> have been checked, because there is no logic in how one may choose to
> checkout the repo. While having a tweak-fetch hook would make the
> implementation straightforward, because at the time of invoking the hook
> the “refname at remote” commit is already trusted, and the “object name”
> is the commit whose validity we want to check, so we just have to check
> the path between those two. (I don't know if you checked my current
> scripts, but basically as the set of allowed PGP keys can change at any
> commit, it's only possible to check a commit path, not a single commit
> out-of-nowhere)

Right, it's certainly the case that refusing the refs is a more
effective gatekeeper, we're not going to have hooks for all the
low-level stuff that might inspect sha1s or check them out.

My assumption was that hooking into just a couple of things would be
good enough, but yes, there's other trade-offs.

> The only issue that could arise with a tweak-fetch hook is in case of a
> force-fetch (and even maybe it's not even an actual issue, I haven't
> given it real thought yet), but this can reasonably be banned, as once a
> commit is signed it enters the “real” master branch, that should never
> be moved backward, as it can't be the sign of an attack.
>
>> You could also have some intermediate step between these two, where
>> e.g. your refspec for "origin" is
>> "+refs/heads/*:refs/remotes/origin-untrusted/*" instead of the default
>> "+refs/heads/*:refs/remotes/origin/*", you fetch all refs to that
>> location, then you move them (with some alias/hook) to
>> "refs/remotes/origin/*" once 

Re: [PATCH 1/2] update-index doc: note a fixed bug in the untracked cache

2018-02-09 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> +Before 2.17, the untracked cache had a bug where replacing a directory
> +with a symlink to another directory could cause it to incorrectly show
> +files tracked by git as untracked. See the "status: add a failing test
> +showing a core.untrackedCache bug" commit to git.git. A workaround for
> +that was (and this might work for other undiscoverd bugs in the
> +future):

s/undiscoverd/undiscovered/

But more importantly, would it help _us_ to encourage people to
squelch the diagnoses without telling us about potential breakage, I
wonder, by telling them to do this for other undiscovered cases,
too?

Will queue with the above typofix, together with the other one.  I
am not sure if we want to say "Before 2.17", though.

> +
> +
> +$ git -c core.untrackedCache=false status
> +
> +
> +This bug has also been shown to affect non-symlink cases of replacing
> +a directory with a file when it comes to the internal structures of
> +the untracked cache, but no case has been found where this resulted in
> +wrong "git status" output.
> +
>  File System Monitor
>  ---


[PATCH 1/2] fetch: preparations for tweak-fetch hook

2018-02-09 Thread Leo Gaspard
From: Léo Gaspard 

No behavior changes yet, only some groundwork for the next change.

The refs_result structure combines a status code with a ref map, which
can be NULL even on success. This will be needed when there's a
tweak-fetch hook, because it can filter out all refs, while still
succeeding.

fetch_refs returns a refs_result, so that it can modify the ref_map.

Based-on-patch-by: Joey Hess 
Signed-off-by: Leo Gaspard 
---
 builtin/fetch.c | 68 +
 1 file changed, 44 insertions(+), 24 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 7bbcd26fa..76dc05f61 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -34,6 +34,11 @@ enum {
TAGS_SET = 2
 };
 
+struct refs_result {
+   struct ref *new_refs;
+   int status;
+};
+
 static int fetch_prune_config = -1; /* unspecified */
 static int prune = -1; /* unspecified */
 #define PRUNE_BY_DEFAULT 0 /* do we prune by default? */
@@ -57,6 +62,18 @@ static int shown_url = 0;
 static int refmap_alloc, refmap_nr;
 static const char **refmap_array;
 
+static int add_existing(const char *refname, const struct object_id *oid,
+   int flag, void *cbdata)
+{
+   struct string_list *list = (struct string_list *)cbdata;
+   struct string_list_item *item = string_list_insert(list, refname);
+   struct object_id *old_oid = xmalloc(sizeof(*old_oid));
+
+   oidcpy(old_oid, oid);
+   item->util = old_oid;
+   return 0;
+}
+
 static int git_fetch_config(const char *k, const char *v, void *cb)
 {
if (!strcmp(k, "fetch.prune")) {
@@ -217,18 +234,6 @@ static void add_merge_config(struct ref **head,
}
 }
 
-static int add_existing(const char *refname, const struct object_id *oid,
-   int flag, void *cbdata)
-{
-   struct string_list *list = (struct string_list *)cbdata;
-   struct string_list_item *item = string_list_insert(list, refname);
-   struct object_id *old_oid = xmalloc(sizeof(*old_oid));
-
-   oidcpy(old_oid, oid);
-   item->util = old_oid;
-   return 0;
-}
-
 static int will_fetch(struct ref **head, const unsigned char *sha1)
 {
struct ref *rm = *head;
@@ -920,15 +925,20 @@ static int quickfetch(struct ref *ref_map)
return check_connected(iterate_ref_map, , );
 }
 
-static int fetch_refs(struct transport *transport, struct ref *ref_map)
+static struct refs_result fetch_refs(struct transport *transport,
+   struct ref *ref_map)
 {
-   int ret = quickfetch(ref_map);
-   if (ret)
-   ret = transport_fetch_refs(transport, ref_map);
-   if (!ret)
-   ret |= store_updated_refs(transport->url,
+   struct refs_result ret;
+   ret.status = quickfetch(ref_map);
+   if (ret.status) {
+   ret.status = transport_fetch_refs(transport, ref_map);
+   }
+   if (!ret.status) {
+   ret.new_refs = ref_map;
+   ret.status |= store_updated_refs(transport->url,
transport->remote->name,
-   ref_map);
+   ret.new_refs);
+   }
transport_unlock_pack(transport);
return ret;
 }
@@ -1048,9 +1058,11 @@ static struct transport *prepare_transport(struct remote 
*remote, int deepen)
return transport;
 }
 
-static void backfill_tags(struct transport *transport, struct ref *ref_map)
+static struct refs_result backfill_tags(struct transport *transport,
+   struct ref *ref_map)
 {
int cannot_reuse;
+   struct refs_result res;
 
/*
 * Once we have set TRANS_OPT_DEEPEN_SINCE, we can't unset it
@@ -1069,12 +1081,14 @@ static void backfill_tags(struct transport *transport, 
struct ref *ref_map)
transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, NULL);
transport_set_option(transport, TRANS_OPT_DEPTH, "0");
transport_set_option(transport, TRANS_OPT_DEEPEN_RELATIVE, NULL);
-   fetch_refs(transport, ref_map);
+   res = fetch_refs(transport, ref_map);
 
if (gsecondary) {
transport_disconnect(gsecondary);
gsecondary = NULL;
}
+
+   return res;
 }
 
 static int do_fetch(struct transport *transport,
@@ -1083,6 +1097,7 @@ static int do_fetch(struct transport *transport,
struct string_list existing_refs = STRING_LIST_INIT_DUP;
struct ref *ref_map;
struct ref *rm;
+   struct refs_result res;
int autotags = (transport->remote->fetch_tags == 1);
int retcode = 0;
 
@@ -1135,7 +1150,10 @@ static int do_fetch(struct transport *transport,
   transport->url);
}
}
-   if (fetch_refs(transport, ref_map)) {
+
+   res = fetch_refs(transport, ref_map);
+   ref_map = res.new_refs;
+   if (res.status) {

[PATCH 2/2] fetch: add tweak-fetch hook

2018-02-09 Thread Leo Gaspard
From: Léo Gaspard 

The tweak-fetch hook is fed lines on stdin for all refs that were
fetched, and outputs on stdout possibly modified lines. Its output is
then parsed and used when `git fetch` updates the remote tracking refs,
records the entries in FETCH_HEAD, and produces its report.

The modifications here are heavily based on prior work by Joey Hess.

Based-on-patch-by: Joey Hess 
Signed-off-by: Leo Gaspard 
---
 Documentation/githooks.txt  |  37 +++
 builtin/fetch.c | 210 +++-
 t/t5574-fetch-tweak-fetch-hook.sh   |  90 
 templates/hooks--tweak-fetch.sample |  24 +
 4 files changed, 359 insertions(+), 2 deletions(-)
 create mode 100755 t/t5574-fetch-tweak-fetch-hook.sh
 create mode 100755 templates/hooks--tweak-fetch.sample

diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index f877f7b7c..1b4a18bf0 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -177,6 +177,43 @@ This hook can be used to perform repository validity 
checks, auto-display
 differences from the previous HEAD if different, or set working dir metadata
 properties.
 
+tweak-fetch
+~~~
+
+This hook is invoked by 'git fetch' (commonly called by 'git pull'), after refs
+have been fetched from the remote repository. It is not executed, if nothing 
was
+fetched.
+
+The output of the hook is used to update the remote-tracking branches, and
+`.git/FETCH_HEAD`, in preparation for a later merge operation done by 'git
+merge'.
+
+It takes no arguments, but is fed a line of the following format on its 
standard
+input for each ref that was fetched.
+
+   SP not-for-merge|merge|ignore SP  SP  
LF
+
+Where the "not-for-merge" flag indicates the ref is not to be merged into the
+current branch, and the "merge" flag indicates that 'git merge' should later
+merge it.
+
+The `` is the remote's name for the ref that was fetched, and
+`` is a name of a remote-tracking branch, like
+"refs/remotes/origin/master". `` can be undefined if the fetched
+ref is not being stored in a local refname. In this case, it will be set to 
`@`,
+an invalide refspec, so that scripts can be written more easily.
+
+TODO: Add documentation for the “ignore” parameter. Unfortunately, I'm not
+really sure I get what this does or what invariants it is supposed to maintain
+(eg. all “ignore” updates at the end of the refs list?), so this may also
+require code changes.
+
+The hook must consume all of its standard input, and output back lines of the
+same format. It can modify its input as desired, including adding or removing
+lines, updating the sha1 (i.e. re-point the remote-tracking branch), changing
+the merge flag, and changing the `` (i.e. use different
+remote-tracking branch).
+
 post-merge
 ~~
 
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 76dc05f61..1bb394530 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -28,6 +28,8 @@ static const char * const builtin_fetch_usage[] = {
NULL
 };
 
+static const char tweak_fetch_hook[] = "tweak-fetch";
+
 enum {
TAGS_UNSET = 0,
TAGS_DEFAULT = 1,
@@ -181,6 +183,206 @@ static struct option builtin_fetch_options[] = {
OPT_END()
 };
 
+static int feed_tweak_fetch_hook(int in, int out, void *data)
+{
+   struct ref *ref;
+   struct strbuf buf = STRBUF_INIT;
+   const char *kw, *peer_ref;
+   char oid_buf[GIT_SHA1_HEXSZ + 1];
+   int ret;
+
+   for (ref = data; ref; ref = ref->next) {
+   if (ref->fetch_head_status == FETCH_HEAD_MERGE)
+   kw = "merge";
+   else if (ref->fetch_head_status == FETCH_HEAD_IGNORE)
+   kw = "ignore";
+   else
+   kw = "not-for-merge";
+   if (!ref->name)
+   die("trying to fetch an inexistant ref");
+   if (ref->peer_ref && ref->peer_ref->name)
+   peer_ref = ref->peer_ref->name;
+   else
+   peer_ref = "@";
+   strbuf_addf(, "%s %s %s %s\n",
+   oid_to_hex_r(oid_buf, >old_oid), kw,
+   ref->name, peer_ref);
+   }
+
+   ret = write_in_full(out, buf.buf, buf.len) != buf.len;
+   if (ret)
+   warning("%s hook failed to consume all its input",
+   tweak_fetch_hook);
+   close(out);
+   strbuf_release();
+   return ret;
+}
+
+static struct ref *parse_tweak_fetch_hook_line(char *l,
+   struct string_list *existing_refs)
+{
+   struct ref *ref = NULL, *peer_ref = NULL;
+   struct string_list_item *peer_item = NULL;
+   char *words[4];
+   int i, word = 0;
+   char *problem;
+
+   for (i = 0; l[i]; i++) {
+   if (isspace(l[i])) {
+   l[i] = '\0';
+   

Re: [PATCH 0/2] update-index doc: note new caveats in 2.17

2018-02-09 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> When users upgrade to 2.17 they're going to have git yelling at them
> (as my users did) on existing checkouts, with no indication whatsoever
> that it's due to the UC or how to fix it.

Wait.  Are you saying that the new warning is "warning" against a
condition that is not an error?

> ... doesn't it only warn under that mode? I.e.:
>
> -"could not open directory '%s'")
> +"core.untrackedCache: could not open directory '%s'")

For example, if it attempts to open a directory which does *not*
have to exist, and sees an error from opendir() due to ENOENT, then
I do not think it should be giving a warning.  If we positively know
that a directory should exist there and we cannot open it, of course
we should warn.  Also, if we know a directory should be readable
when it exists, then we should be warning when we see an error other
than ENOENT.

So...


[PATCH 0/2] fetch: add tweak-fetch hook

2018-02-09 Thread Leo Gaspard
On 02/09/2018 09:20 PM, Joey Hess wrote:> Yes; my patches are under the
same GPL-2 as the rest of git.

Thanks! So here comes my patch series, heavily based on yours.

There are some things to bear in mind while reviewing it:

 * This is my first real attempt at contributing to git, which means I
could be very very far off-track

 * Most of it is based on trying to make the 6-year-old patch series
work and pass all the tests, so if a new feature has been added since
then I likely didn't notice it or don't know how to handle it correctly

There are still three TODO's in the code:

 * In the documentation, one stating that I don't really get what this
“ignore” parameter exactly does, and whether it should be handled
specially (a prime example of a new feature I'm not really sure how to
handle, somewhere in the code it's written all the “ignore” references
are at the end of the list, but I'm already not self-confident enough
about the difference between “merge” and “not-for-merge” to even
consider making a good choice about how to handle “ignore”)

 * In `builtins/fetch.c`, function `do_fetch`, there is a conflict of
interest between placing the `prune` before the `fetch` (as done by
commit 10a6cc889 ("fetch --prune: Run prune before fetching",
2014-01-02)), and placing the `fetch` before the `prune` (which would
allow hooks that rename the local-ref to not be prune'd and then
re-fetched when doing a `git fetch --prune` -- without that a hook that
would want to both read the old commit information and rename the
local-ref would not be able to). Or maybe this question means actually
there should be a third solution? but I don't really know what. Maybe
also hooking into the prune operation?

 * In `templates/hooks--tweak-fetch.sample`, the “check this actually
works” todo, as I'd rather first check this patch series is not too far
off-topic before doing non-essential work -- anyway another version of
the patch series will be required for the other two TODO's, so I can fix
it at this point.

That being said, what do you think about these patches?

Thanks for your time!
Leo Gaspard


Re: [PATCH v2 11/17] fetch tests: fetch as well as fetch []

2018-02-09 Thread Eric Sunshine
On Fri, Feb 9, 2018 at 3:27 PM, Jeff King  wrote:
> On Fri, Feb 09, 2018 at 09:05:00PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> >> +   key=$(echo $1 | sed -e 's/^remote\.origin/fetch/')
>> >
>> > Faster (thus more Windows-friendly) assuming that $1 always starts
>> > with "remote.origin":
>> >
>> > key=fetch${u#remote.origin}
>>
>> Tests fail with this and I'm not excited to be the first user in git's
>> test suite to use some novel shell feature, no existing uses of
>> ${u[...].
>>
>> I also think stuff like this is on the wrong side of cleverness
>> v.s. benefit. I can't find any reference to this syntax in bash or dash
>> manpages (forward-search "${u"), but echo | sed is obvious, and it's not
>> going to make a big difference for Windows.
>
> The "u" isn't the magic, it's the "#". I.e.:
>
>   key=fetch${1#remote.origin}
>
> and it's used all over the place in our scripts.  I'm not sure why Eric
> wrote "u". :)

Because I was testing the expression interactively in the shell and
assigned to a variable arbitrarily named "u". When I copy/pasted the
working expression from the shell session into the email, I stupidly
forgot to change the "u" to "1".


Re: [PATCH v2 11/17] fetch tests: fetch as well as fetch []

2018-02-09 Thread Junio C Hamano
Junio C Hamano  writes:

> Ævar Arnfjörð Bjarmason  writes:
>
 +   key=$(echo $1 | sed -e 's/^remote\.origin/fetch/')
>>>
>>> Faster (thus more Windows-friendly) assuming that $1 always starts
>>> with "remote.origin":
>>>
>>> key=fetch${u#remote.origin}
>>
>> Tests fail with this and I'm not excited to be the first user in git's
>> test suite to use some novel shell feature, no existing uses of
>> ${u[...].
>
> s/u/1/, I think.

Ah, that's been already pointed out.  Sorry for the noise.


Re: [PATCH v2 00/17] document & test fetch pruning & add fetch.pruneTags

2018-02-09 Thread Ævar Arnfjörð Bjarmason

On Thu, Feb 08 2018, Junio C. Hamano jotted:

> Ævar Arnfjörð Bjarmason  writes:
>
>> On Thu, Feb 08 2018, Ævar Arnfjörð Bjarmason jotted:
>>
>>> As noted in my 87h8quytmq@evledraar.gmail.com there was a bug I
>>> noticed in v3 where it would segfault on some git-fetch invocations,
>>> but there were not tests anywhere that caught that.
>>
>> ...and of course this whole submission this time around should be v4,
>> not v2, but I didn't notice that I didn't adjust the subject prefix
>> before sending. Junio: Hopefully you can pick it up anyway without too
>> much trouble, sorry.
>>
>> FWIW I've deployed this to production @ work to some tens of k of
>> machines (low "k" of which have users using git) without any issues.
>
> Will replace.  I found that the resolution of conflicts necessary
> with the jh/partial-clone topic is a bit different from the previous
> version due to addition of an extra parameter to fetch_one(), and I
> think I didn't botch it, but please double check when I push the
> results out in a few hours.

Thanks. FWIW I've looked over that resolution in pu and it LGTM.


[PATCH 1/2] update-index doc: note a fixed bug in the untracked cache

2018-02-09 Thread Ævar Arnfjörð Bjarmason
Document the bug tested for in my "status: add a failing test showing
a core.untrackedCache bug" and fixed in Duy's "dir.c: fix missing dir
invalidation in untracked code".

Since this is very likely something others will encounter in the
future on older versions, and it's not obvious how to fix it let's
document both that it exists, and how to "fix" it with a one-off
command.

As noted in that commit, even though this bug gets the untracked cache
into a bad state, we have not yet found a case where this is user
visible, and thus it makes sense for these docs to focus on the
symlink case only.

Signed-off-by: Ævar Arnfjörð Bjarmason 
Signed-off-by: Junio C Hamano 
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/git-update-index.txt | 16 
 1 file changed, 16 insertions(+)

diff --git a/Documentation/git-update-index.txt 
b/Documentation/git-update-index.txt
index bdb0342593..e30b185918 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -464,6 +464,22 @@ command reads the index; while when 
`--[no-|force-]untracked-cache`
 are used, the untracked cache is immediately added to or removed from
 the index.
 
+Before 2.17, the untracked cache had a bug where replacing a directory
+with a symlink to another directory could cause it to incorrectly show
+files tracked by git as untracked. See the "status: add a failing test
+showing a core.untrackedCache bug" commit to git.git. A workaround for
+that was (and this might work for other undiscoverd bugs in the
+future):
+
+
+$ git -c core.untrackedCache=false status
+
+
+This bug has also been shown to affect non-symlink cases of replacing
+a directory with a file when it comes to the internal structures of
+the untracked cache, but no case has been found where this resulted in
+wrong "git status" output.
+
 File System Monitor
 ---
 
-- 
2.15.1.424.g9478a66081



[PATCH 0/2] update-index doc: note new caveats in 2.17

2018-02-09 Thread Ævar Arnfjörð Bjarmason
On Fri, Feb 09 2018, Junio C. Hamano jotted:
> Ævar Arnfjörð Bjarmason  writes:
>> I think you / Nguyễn may not have seen my
>> <87d11omi2o@evledraar.gmail.com>
>> (https://public-inbox.org/git/87d11omi2o@evledraar.gmail.com/)
>>
>> As noted there I think it's best to proceed without the "dir.c: stop
>> ignoring opendir[...]" patch.
>>
>> It's going to be a bad regression in 2.17 if it ends up spewing pagefuls
>> of warnings in previously working repos if the UC is on.
>
> Well, I am not sure if it is a regression to diagnose problematic
> untracked cache information left by earlier version of the software
> with bugs.  After all, this is still an experimental feature, and we
> do want to see the warning to serve its purpose to diagnose possible
> remaining bugs, and new similar ones when a newer iteration of the
> feature introduces, no?

Fair enough. I just wanted to make sure it had been seen. It wasn't
obvious whether it had just been missed since there was no follow-up
there or note in WC.

We were disagreeing to the extent that some of this should be
documented in 878tcbmbqb@evledraar.gmail.com and related, and I
see you've ejected the doc patch I had in that series.

I really think we should be saying something like what this new doc
series says, it's conceptually on top of Duy's series but applies on
top of master.

I think there's room to quibble about whether to include 1/2 at all
since it's a relatively obscure edge case.

2/2 however is something I think a *lot* of users are going to hit. I
just skimmed the relevant bits of git-config and git-update-index now,
and nothing describes the UC as an experimental feature, and it's been
well-known to improve performance.

When users upgrade to 2.17 they're going to have git yelling at them
(as my users did) on existing checkouts, with no indication whatsoever
that it's due to the UC or how to fix it.

Let's at least documente it. I also wonder if the "dir.c: stop
ignoring opendir() error in open_cached_dir()" change shouldn't have
something in the warning itself about it being caused by the UC,
doesn't it only warn under that mode? I.e.:

-"could not open directory '%s'")
+"core.untrackedCache: could not open directory '%s'")

Ævar Arnfjörð Bjarmason (2):
  update-index doc: note a fixed bug in the untracked cache
  update-index doc: note the caveat with "could not open..."

 Documentation/git-update-index.txt | 26 ++
 1 file changed, 26 insertions(+)

-- 
2.15.1.424.g9478a66081



[PATCH 2/2] update-index doc: note the caveat with "could not open..."

2018-02-09 Thread Ævar Arnfjörð Bjarmason
Note the caveat where 2.17 is stricter about index validation
potentially causing "could not open directory" warnings when git is
upgraded. See the preceding "dir.c: stop ignoring opendir() error in
open_cached_dir()" change.

This caused some mayhem when I upgraded git to a version with this
series at Booking.com, and other users have doubtless enabled the UC
extension and are in for a surprise when they upgrade. Let's give them
a headsup in the docs.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/git-update-index.txt | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/git-update-index.txt 
b/Documentation/git-update-index.txt
index e30b185918..0c81600d8c 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -480,6 +480,16 @@ a directory with a file when it comes to the internal 
structures of
 the untracked cache, but no case has been found where this resulted in
 wrong "git status" output.
 
+There are also cases where existing indexes written by git versions
+before 2.17 will reference directories that don't exist anymore,
+potentially causing many "could not open directory" warnings to be
+printed on "git status". These are new warnings for existing issues
+that were previously silently discarded.
+
+As with the bug described above the solution is to one-off do a "git
+status" run with `core.untrackedCache=false` to flush out the leftover
+bad data.
+
 File System Monitor
 ---
 
-- 
2.15.1.424.g9478a66081



Re: [PATCH v2 11/17] fetch tests: fetch as well as fetch []

2018-02-09 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

>>> +   key=$(echo $1 | sed -e 's/^remote\.origin/fetch/')
>>
>> Faster (thus more Windows-friendly) assuming that $1 always starts
>> with "remote.origin":
>>
>> key=fetch${u#remote.origin}
>
> Tests fail with this and I'm not excited to be the first user in git's
> test suite to use some novel shell feature, no existing uses of
> ${u[...].

s/u/1/, I think.


Re: "git bisect run make" adequate to locate first unbuildable commit?

2018-02-09 Thread Philip Oakley, CEng MIET

From: "Robert P. J. Day" 


 writing a short tutorial on "git bisect" and, all the details of
special exit code 125 aside, if one wanted to locate the first
unbuildable commit, would it be sufficient to just run?

 $ git bisect run make

 as i read it, make returns either 0, 1 or 2 so there doesn't appear
to be any possibility of weirdness with clashing with a 125 exit code.
am i overlooking some subtle detail here i should be aware of? thanks.

rday



In the spirit of pedanticism, one should also clarify the word "first", in 
that it's not a linear search for _an_ unbuildable commit, but that one is 
looking for the transition between an unbroken sequence of unbuildable 
commits, which transitions to buildable commits, and its the transition that 
is sought. (there could be many random unbuildable commits within a sequence 
in some folks' processes!)

--
Philip 



Re: "git bisect run make" adequate to locate first unbuildable commit?

2018-02-09 Thread Robert P. J. Day
On Fri, 9 Feb 2018, Philip Oakley, CEng MIET wrote:

> From: "Robert P. J. Day" 
> >
> > writing a short tutorial on "git bisect" and, all the details of
> > special exit code 125 aside, if one wanted to locate the first
> > unbuildable commit, would it be sufficient to just run?
> >
> >  $ git bisect run make
> >
> > as i read it, make returns either 0, 1 or 2 so there doesn't appear
> > to be any possibility of weirdness with clashing with a 125 exit code.
> > am i overlooking some subtle detail here i should be aware of? thanks.
> >
> > rday
>
> In the spirit of pedanticism, one should also clarify the word
> "first", in that it's not a linear search for _an_ unbuildable
> commit, but that one is looking for the transition between an
> unbroken sequence of unbuildable commits, which transitions to
> buildable commits, and its the transition that is sought. (there
> could be many random unbuildable commits within a sequence in some
> folks' processes!)

  quite so, i should have been more precise.

rday


[PATCH v5 17/17] fetch: make the --prune-tags work with

2018-02-09 Thread Ævar Arnfjörð Bjarmason
Make the new --prune-tags option work properly when git-fetch is
invoked with a  parameter instead of a 
parameter.

This change is split off from the introduction of --prune-tags due to
the relative complexity of munging the incoming argv, which is easier
to review as a separate change.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/git-fetch.txt | 21 ++---
 builtin/fetch.c | 17 ++---
 t/t5510-fetch.sh| 16 +++-
 3 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/Documentation/git-fetch.txt b/Documentation/git-fetch.txt
index af12310f75..e319935597 100644
--- a/Documentation/git-fetch.txt
+++ b/Documentation/git-fetch.txt
@@ -175,24 +175,15 @@ It's reasonable to e.g. configure `fetch.pruneTags=true` 
in
 run, without making every invocation of `git fetch` without `--prune`
 an error.
 
-Another special case of `--prune-tags` is that
-`refs/tags/*:refs/tags/*` will not be implicitly provided if an URL is
-being fetched. I.e.:
-
-
-$ git fetch  --prune --prune-tags
-
-
-Will prune no tags, as opposed to:
+Pruning tags with `--prune-tags` also works when fetching a URL
+instead of a named remote. These will all prune tags not found on
+origin:
 
 
 $ git fetch origin --prune --prune-tags
-
-
-To prune tags given a URL supply the refspec explicitly:
-
-
-$ git fetch  --prune 'refs/tags/*:refs/tags/*'
+$ git fetch origin --prune 'refs/tags/*:refs/tags/*'
+$ git fetch  --prune --prune-tags
+$ git fetch  --prune 'refs/tags/*:refs/tags/*'
 
 
 OUTPUT
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 55a0fc37be..c96f17a9a3 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1283,7 +1283,10 @@ static int fetch_one(struct remote *remote, int argc, 
const char **argv, int pru
static const char **refs = NULL;
struct refspec *refspec;
int ref_nr = 0;
+   int j = 0;
int exit_code;
+   int maybe_prune_tags;
+   int remote_via_config = remote_is_configured(remote, 0);
 
if (!remote)
die(_("No remote repository specified.  Please, specify either 
a URL or a\n"
@@ -1311,13 +1314,21 @@ static int fetch_one(struct remote *remote, int argc, 
const char **argv, int pru
prune_tags = PRUNE_TAGS_BY_DEFAULT;
}
 
-   if (prune_tags_ok && prune_tags && remote_is_configured(remote, 0))
+   maybe_prune_tags = prune_tags_ok && prune_tags;
+   if (maybe_prune_tags && remote_via_config)
add_prune_tags_to_fetch_refspec(remote);
 
+   if (argc > 0 || (maybe_prune_tags && !remote_via_config)) {
+   size_t nr_alloc = st_add3(argc, maybe_prune_tags, 1);
+   refs = xcalloc(nr_alloc, sizeof(const char *));
+   if (maybe_prune_tags) {
+   refs[j++] = xstrdup("refs/tags/*:refs/tags/*");
+   ref_nr++;
+   }
+   }
+
if (argc > 0) {
-   int j = 0;
int i;
-   refs = xcalloc(st_add(argc, 1), sizeof(const char *));
for (i = 0; i < argc; i++) {
if (!strcmp(argv[i], "tag")) {
i++;
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index 227dd70b7b..dce2371302 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -738,18 +738,15 @@ test_configured_prune unset unset unset true pruned  kept 
\
"--prune origin +refs/heads/*:refs/remotes/origin/*"
 
 # Pruning that also takes place if a file:// url replaces a named
-# remote, with the exception of --prune-tags on the command-line
-# (arbitrary limitation).
-#
-# However, because there's no implicit
+# remote. However, because there's no implicit
 # +refs/heads/*:refs/remotes/origin/* refspec and supplying it on the
 # command-line negates --prune-tags, the branches will not be pruned.
 test_configured_prune_type unset unset unset unset kept   kept   "origin 
--prune-tags" "name"
 test_configured_prune_type unset unset unset unset kept   kept   "origin 
--prune-tags" "link"
 test_configured_prune_type unset unset unset unset pruned pruned "origin 
--prune --prune-tags" "name"
-test_configured_prune_type unset unset unset unset kept   kept   "origin 
--prune --prune-tags" "link"
+test_configured_prune_type unset unset unset unset kept   pruned "origin 
--prune --prune-tags" "link"
 test_configured_prune_type unset unset unset unset pruned pruned "--prune 
--prune-tags origin" "name"
-test_configured_prune_type unset unset unset unset kept   kept   "--prune 
--prune-tags origin" "link"
+test_configured_prune_type unset unset unset unset kept   pruned "--prune 
--prune-tags 

[PATCH v5 16/17] fetch: add a --prune-tags option and fetch.pruneTags config

2018-02-09 Thread Ævar Arnfjörð Bjarmason
Add a --prune-tags option to git-fetch, along with fetch.pruneTags
config option and a -P shorthand (-p is --prune). This allows for
doing any of:

git fetch -p -P
git fetch --prune --prune-tags
git fetch -p -P origin
git fetch --prune --prune-tags origin

Or simply:

git config fetch.prune true &&
git config fetch.pruneTags true &&
git fetch

Instead of the much more verbose:

git fetch --prune origin 'refs/tags/*:refs/tags/*' 
'+refs/heads/*:refs/remotes/origin/*'

Before this feature it was painful to support the use-case of pulling
from a repo which is having both its branches *and* tags deleted
regularly, and have our local references to reflect upstream.

At work we create deployment tags in the repo for each rollout, and
there's *lots* of those, so they're archived within weeks for
performance reasons.

Without this change it's hard to centrally configure such repos in
/etc/gitconfig (on servers that are only used for working with
them). You need to set fetch.prune=true globally, and then for each
repo:

git -C {} config --replace-all remote.origin.fetch 
"refs/tags/*:refs/tags/*" "^\+*refs/tags/\*:refs/tags/\*$"

Now I can simply set fetch.pruneTags=true in /etc/gitconfig as well,
and users running "git pull" will automatically get the pruning
semantics I want.

Even though "git remote" has corresponding "prune" and "update
--prune" subcommands I'm intentionally not adding a corresponding
prune-tags or "update --prune --prune-tags" mode to that command.

It's advertised (as noted in my recent "git remote doc: correct
dangerous lies about what prune does") as only modifying remote
tracking references, whereas any --prune-tags option is always going
to modify what from the user's perspective is a local copy of the tag,
since there's no such thing as a remote tracking tag.

Ideally add_prune_tags_to_fetch_refspec() would be something that
would use ALLOC_GROW() to grow the 'fetch` member of the 'remote'
struct. Instead I'm realloc-ing remote->fetch and adding the
tag_refspec to the end.

The reason is that parse_{fetch,push}_refspec which allocate the
refspec (ultimately remote->fetch) struct are called many places that
don't have access to a 'remote' struct. It would be hard to change all
their callsites to be amenable to carry around the bookkeeping
variables required for dynamic allocation.

All the other callers of the API first incrementally construct the
string version of the refspec in remote->fetch_refspec via
add_fetch_refspec(), before finally calling parse_fetch_refspec() via
some variation of remote_get().

It's less of a pain to deal with the one special case that needs to
modify already constructed refspecs than to chase down and change all
the other callsites. The API I'm adding is intentionally not
generalized because if we add more of these we'd probably want to
re-visit how this is done.

See my "Re: [BUG] git remote prune removes local tags, depending on
fetch config" (87po6ahx87@evledraar.gmail.com;
https://public-inbox.org/git/87po6ahx87@evledraar.gmail.com/) for
more background info.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/config.txt   | 14 +++
 Documentation/fetch-options.txt| 14 ++-
 Documentation/git-fetch.txt| 47 +++
 builtin/fetch.c| 32 ++--
 contrib/completion/git-completion.bash |  2 +-
 remote.c   | 14 +++
 remote.h   |  3 ++
 t/t5510-fetch.sh   | 70 ++
 8 files changed, 191 insertions(+), 5 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 0f27af5760..e254bfd531 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1401,6 +1401,14 @@ fetch.prune::
option was given on the command line.  See also `remote..prune`
and the PRUNING section of linkgit:git-fetch[1].
 
+fetch.pruneTags::
+   If true, fetch will automatically behave as if the
+   `refs/tags/*:refs/tags/*` refspec was provided when pruning,
+   if not set already. This allows for setting both this option
+   and `fetch.prune` to maintain a 1=1 mapping to upstream
+   refs. See also `remote..pruneTags` and the PRUNING
+   section of linkgit:git-fetch[1].
+
 fetch.output::
Control how ref update status is printed. Valid values are
`full` and `compact`. Default value is `full`. See section
@@ -2945,6 +2953,12 @@ remote..prune::
remove any remote-tracking references that no longer exist on the
remote (as if the `--prune` option was given on the command line).
Overrides `fetch.prune` settings, if any.
+
+remote..pruneTags::
+   When set to true, fetching from this remote by default will also
+   remove any local tags that no longer exist on the remote if pruning
+   is activated in general 

[PATCH v5 15/17] fetch tests: add scaffolding for the new fetch.pruneTags

2018-02-09 Thread Ævar Arnfjörð Bjarmason
The fetch.pruneTags configuration doesn't exist yet, but will be added
in a subsequent commit. Since testing for it requires adding new
parameters to the test_configured_prune function it's easier to review
this patch first to assert that no functional changes are introduced
yet.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t5510-fetch.sh | 92 ++--
 1 file changed, 50 insertions(+), 42 deletions(-)

diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index 9c87fa6106..1713111006 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -562,10 +562,12 @@ test_configured_prune () {
 test_configured_prune_type () {
fetch_prune=$1
remote_origin_prune=$2
-   expected_branch=$3
-   expected_tag=$4
-   cmdline=$5
-   mode=$6
+   fetch_prune_tags=$3
+   remote_origin_prune_tags=$4
+   expected_branch=$5
+   expected_tag=$6
+   cmdline=$7
+   mode=$8
 
if test -z "$cmdline_setup"
then
@@ -590,14 +592,16 @@ test_configured_prune_type () {
cmdline="$new_cmdline"
fi
 
-   test_expect_success "$mode prune fetch.prune=$1 
remote.origin.prune=$2${5:+ $5}; branch:$3 tag:$4" '
+   test_expect_success "$mode prune fetch.prune=$1 remote.origin.prune=$2 
fetch.pruneTags=$3 remote.origin.pruneTags=$4${7:+ $7}; branch:$5 tag:$6" '
# make sure a newbranch is there in . and also in one
git branch -f newbranch &&
git tag -f newtag &&
(
cd one &&
test_unconfig fetch.prune &&
+   test_unconfig fetch.pruneTags &&
test_unconfig remote.origin.prune &&
+   test_unconfig remote.origin.pruneTags &&
git fetch '"$cmdline_setup"' &&
git rev-parse --verify refs/remotes/origin/newbranch &&
git rev-parse --verify refs/tags/newtag
@@ -612,7 +616,9 @@ test_configured_prune_type () {
cd one &&
git_fetch_c="" &&
set_config_tristate fetch.prune $fetch_prune &&
+   set_config_tristate fetch.pruneTags $fetch_prune_tags &&
set_config_tristate remote.origin.prune 
$remote_origin_prune &&
+   set_config_tristate remote.origin.pruneTags 
$remote_origin_prune_tags &&
 
if test "$mode" != "link"
then
@@ -641,57 +647,59 @@ test_configured_prune_type () {
 
 # $1 config: fetch.prune
 # $2 config: remote..prune
-# $3 expect: branch to be pruned?
-# $4 expect: tag to be pruned?
-# $5 git-fetch $cmdline:
+# $3 config: fetch.pruneTags
+# $4 config: remote..pruneTags
+# $5 expect: branch to be pruned?
+# $6 expect: tag to be pruned?
+# $7 git-fetch $cmdline:
 #
-# $1$2$3 $4 $5
-test_configured_prune unset unset kept   kept   ""
-test_configured_prune unset unset kept   kept   "--no-prune"
-test_configured_prune unset unset pruned kept   "--prune"
-test_configured_prune unset unset kept   pruned \
+# $1$2$3$4$5 $6 $7
+test_configured_prune unset unset unset unset kept   kept   ""
+test_configured_prune unset unset unset unset kept   kept   "--no-prune"
+test_configured_prune unset unset unset unset pruned kept   "--prune"
+test_configured_prune unset unset unset unset kept   pruned \
"--prune origin refs/tags/*:refs/tags/*"
-test_configured_prune unset unset pruned pruned \
+test_configured_prune unset unset unset unset pruned pruned \
"--prune origin refs/tags/*:refs/tags/* 
+refs/heads/*:refs/remotes/origin/*"
 
-test_configured_prune false unset kept   kept   ""
-test_configured_prune false unset kept   kept   "--no-prune"
-test_configured_prune false unset pruned kept   "--prune"
+test_configured_prune false unset unset unset kept   kept   ""
+test_configured_prune false unset unset unset kept   kept   "--no-prune"
+test_configured_prune false unset unset unset pruned kept   "--prune"
 
-test_configured_prune true  unset pruned kept   ""
-test_configured_prune true  unset pruned kept   "--prune"
-test_configured_prune true  unset kept   kept   "--no-prune"
+test_configured_prune true  unset unset unset pruned kept   ""
+test_configured_prune true  unset unset unset pruned kept   "--prune"
+test_configured_prune true  unset unset unset kept   kept   "--no-prune"
 
-test_configured_prune unset false kept   kept   ""
-test_configured_prune unset false kept   kept   "--no-prune"
-test_configured_prune unset false pruned kept   "--prune"
+test_configured_prune unset false unset unset kept   kept   ""
+test_configured_prune unset false unset unset kept   kept   "--no-prune"
+test_configured_prune unset false unset unset pruned kept   "--prune"
 
-test_configured_prune false 

[PATCH v5 07/17] fetch tests: add a tag to be deleted to the pruning tests

2018-02-09 Thread Ævar Arnfjörð Bjarmason
Add a tag to be deleted to the fetch --prune tests. The tag is always
kept for now, which is the expected behavior, but now I can add a test
for tag pruning in a later commit.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t5510-fetch.sh | 93 
 1 file changed, 53 insertions(+), 40 deletions(-)

diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index ab8b25344d..fad65bd885 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -552,21 +552,25 @@ test_configured_prune () {
fetch_prune=$1
remote_origin_prune=$2
expected_branch=$3
-   cmdline=$4
+   expected_tag=$4
+   cmdline=$5
 
-   test_expect_success "prune fetch.prune=$1 remote.origin.prune=$2${4:+ 
$4}; branch:$3" '
+   test_expect_success "prune fetch.prune=$1 remote.origin.prune=$2${5:+ 
$5}; branch:$3 tag:$4" '
# make sure a newbranch is there in . and also in one
git branch -f newbranch &&
+   git tag -f newtag &&
(
cd one &&
test_unconfig fetch.prune &&
test_unconfig remote.origin.prune &&
git fetch &&
-   git rev-parse --verify refs/remotes/origin/newbranch
+   git rev-parse --verify refs/remotes/origin/newbranch &&
+   git rev-parse --verify refs/tags/newtag
) &&
 
# now remove it
git branch -d newbranch &&
+   git tag -d newtag &&
 
# then test
(
@@ -582,6 +586,14 @@ test_configured_prune () {
kept)
git rev-parse --verify 
refs/remotes/origin/newbranch
;;
+   esac &&
+   case "$expected_tag" in
+   pruned)
+   test_must_fail git rev-parse --verify 
refs/tags/newtag
+   ;;
+   kept)
+   git rev-parse --verify refs/tags/newtag
+   ;;
esac
)
'
@@ -590,44 +602,45 @@ test_configured_prune () {
 # $1 config: fetch.prune
 # $2 config: remote..prune
 # $3 expect: branch to be pruned?
-# $4 git-fetch $cmdline:
+# $4 expect: tag to be pruned?
+# $5 git-fetch $cmdline:
 #
-# $1$2$3 $4
-test_configured_prune unset unset kept   ""
-test_configured_prune unset unset kept   "--no-prune"
-test_configured_prune unset unset pruned "--prune"
-
-test_configured_prune false unset kept   ""
-test_configured_prune false unset kept   "--no-prune"
-test_configured_prune false unset pruned "--prune"
-
-test_configured_prune true  unset pruned ""
-test_configured_prune true  unset pruned "--prune"
-test_configured_prune true  unset kept   "--no-prune"
-
-test_configured_prune unset false kept   ""
-test_configured_prune unset false kept   "--no-prune"
-test_configured_prune unset false pruned "--prune"
-
-test_configured_prune false false kept   ""
-test_configured_prune false false kept   "--no-prune"
-test_configured_prune false false pruned "--prune"
-
-test_configured_prune true  false kept   ""
-test_configured_prune true  false pruned "--prune"
-test_configured_prune true  false kept   "--no-prune"
-
-test_configured_prune unset true  pruned ""
-test_configured_prune unset true  kept   "--no-prune"
-test_configured_prune unset true  pruned "--prune"
-
-test_configured_prune false true  pruned ""
-test_configured_prune false true  kept   "--no-prune"
-test_configured_prune false true  pruned "--prune"
-
-test_configured_prune true  true  pruned ""
-test_configured_prune true  true  pruned "--prune"
-test_configured_prune true  true  kept   "--no-prune"
+# $1$2$3 $4 $5
+test_configured_prune unset unset kept   kept   ""
+test_configured_prune unset unset kept   kept   "--no-prune"
+test_configured_prune unset unset pruned kept   "--prune"
+
+test_configured_prune false unset kept   kept   ""
+test_configured_prune false unset kept   kept   "--no-prune"
+test_configured_prune false unset pruned kept   "--prune"
+
+test_configured_prune true  unset pruned kept   ""
+test_configured_prune true  unset pruned kept   "--prune"
+test_configured_prune true  unset kept   kept   "--no-prune"
+
+test_configured_prune unset false kept   kept   ""
+test_configured_prune unset false kept   kept   "--no-prune"
+test_configured_prune unset false pruned kept   "--prune"
+
+test_configured_prune false false kept   kept   ""
+test_configured_prune false false kept   kept   "--no-prune"
+test_configured_prune false false pruned kept   "--prune"
+
+test_configured_prune true  false kept   kept   ""
+test_configured_prune true  false pruned kept   "--prune"
+test_configured_prune 

[PATCH v5 06/17] fetch tests: re-arrange arguments for future readability

2018-02-09 Thread Ævar Arnfjörð Bjarmason
Re-arrange the arguments to the test_configured_prune() function used
in this test to pass the arguments to --fetch last. A subsequent
change will test for more elaborate fetch arguments, including long
refspecs. It'll be more readable to be able to wrap those on a new
line of their own.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t5510-fetch.sh | 82 ++--
 1 file changed, 44 insertions(+), 38 deletions(-)

diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index 11da97f9b7..ab8b25344d 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -551,10 +551,10 @@ set_config_tristate () {
 test_configured_prune () {
fetch_prune=$1
remote_origin_prune=$2
-   cmdline=$3
-   expected_branch=$4
+   expected_branch=$3
+   cmdline=$4
 
-   test_expect_success "prune fetch.prune=$1 remote.origin.prune=$2${3:+ 
$3}; branch:$4" '
+   test_expect_success "prune fetch.prune=$1 remote.origin.prune=$2${4:+ 
$4}; branch:$3" '
# make sure a newbranch is there in . and also in one
git branch -f newbranch &&
(
@@ -587,41 +587,47 @@ test_configured_prune () {
'
 }
 
-test_configured_prune unset unset ""   kept
-test_configured_prune unset unset "--no-prune" kept
-test_configured_prune unset unset "--prune"pruned
-
-test_configured_prune false unset ""   kept
-test_configured_prune false unset "--no-prune" kept
-test_configured_prune false unset "--prune"pruned
-
-test_configured_prune true  unset ""   pruned
-test_configured_prune true  unset "--prune"pruned
-test_configured_prune true  unset "--no-prune" kept
-
-test_configured_prune unset false ""   kept
-test_configured_prune unset false "--no-prune" kept
-test_configured_prune unset false "--prune"pruned
-
-test_configured_prune false false ""   kept
-test_configured_prune false false "--no-prune" kept
-test_configured_prune false false "--prune"pruned
-
-test_configured_prune true  false ""   kept
-test_configured_prune true  false "--prune"pruned
-test_configured_prune true  false "--no-prune" kept
-
-test_configured_prune unset true  ""   pruned
-test_configured_prune unset true  "--no-prune" kept
-test_configured_prune unset true  "--prune"pruned
-
-test_configured_prune false true  ""   pruned
-test_configured_prune false true  "--no-prune" kept
-test_configured_prune false true  "--prune"pruned
-
-test_configured_prune true  true  ""   pruned
-test_configured_prune true  true  "--prune"pruned
-test_configured_prune true  true  "--no-prune" kept
+# $1 config: fetch.prune
+# $2 config: remote..prune
+# $3 expect: branch to be pruned?
+# $4 git-fetch $cmdline:
+#
+# $1$2$3 $4
+test_configured_prune unset unset kept   ""
+test_configured_prune unset unset kept   "--no-prune"
+test_configured_prune unset unset pruned "--prune"
+
+test_configured_prune false unset kept   ""
+test_configured_prune false unset kept   "--no-prune"
+test_configured_prune false unset pruned "--prune"
+
+test_configured_prune true  unset pruned ""
+test_configured_prune true  unset pruned "--prune"
+test_configured_prune true  unset kept   "--no-prune"
+
+test_configured_prune unset false kept   ""
+test_configured_prune unset false kept   "--no-prune"
+test_configured_prune unset false pruned "--prune"
+
+test_configured_prune false false kept   ""
+test_configured_prune false false kept   "--no-prune"
+test_configured_prune false false pruned "--prune"
+
+test_configured_prune true  false kept   ""
+test_configured_prune true  false pruned "--prune"
+test_configured_prune true  false kept   "--no-prune"
+
+test_configured_prune unset true  pruned ""
+test_configured_prune unset true  kept   "--no-prune"
+test_configured_prune unset true  pruned "--prune"
+
+test_configured_prune false true  pruned ""
+test_configured_prune false true  kept   "--no-prune"
+test_configured_prune false true  pruned "--prune"
+
+test_configured_prune true  true  pruned ""
+test_configured_prune true  true  pruned "--prune"
+test_configured_prune true  true  kept   "--no-prune"
 
 test_expect_success 'all boundary commits are excluded' '
test_commit base &&
-- 
2.15.1.424.g9478a66081



[PATCH v5 14/17] git-fetch & config doc: link to the new PRUNING section

2018-02-09 Thread Ævar Arnfjörð Bjarmason
Amend the documentation for fetch.prune, fetch..prune and
--prune to link to the recently added PRUNING section.

I'd have liked to link directly to it with "<>" from
fetch-options.txt, since it's included in git-fetch.txt (git-pull.txt
also includes it, but doesn't include that option). However making a
reference across files yields this error:

[...]/Documentation/git-fetch.xml:226: element xref: validity
error : IDREF attribute linkend references an unknown ID "PRUNING"

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/config.txt| 6 +-
 Documentation/fetch-options.txt | 3 +++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 0e25b2c92b..0f27af5760 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1398,7 +1398,8 @@ fetch.unpackLimit::
 
 fetch.prune::
If true, fetch will automatically behave as if the `--prune`
-   option was given on the command line.  See also `remote..prune`.
+   option was given on the command line.  See also `remote..prune`
+   and the PRUNING section of linkgit:git-fetch[1].
 
 fetch.output::
Control how ref update status is printed. Valid values are
@@ -2944,6 +2945,9 @@ remote..prune::
remove any remote-tracking references that no longer exist on the
remote (as if the `--prune` option was given on the command line).
Overrides `fetch.prune` settings, if any.
++
+See also `remote..prune` and the PRUNING section of
+linkgit:git-fetch[1].
 
 remotes.::
The list of remotes which are fetched by "git remote update
diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
index fb6bebbc61..9f5c85ad96 100644
--- a/Documentation/fetch-options.txt
+++ b/Documentation/fetch-options.txt
@@ -74,6 +74,9 @@ ifndef::git-pull[]
line or in the remote configuration, for example if the remote
was cloned with the --mirror option), then they are also
subject to pruning.
++
+See the PRUNING section below for more details.
+
 endif::git-pull[]
 
 ifndef::git-pull[]
-- 
2.15.1.424.g9478a66081



[PATCH v5 12/17] git fetch doc: add a new section to explain the ins & outs of pruning

2018-02-09 Thread Ævar Arnfjörð Bjarmason
Add a new section to canonically explain how remote reference pruning
works, and how users should be careful about using it in conjunction
with tag refspecs in particular.

A subsequent commit will update the git-remote documentation to refer
to this section, and details the motivation for writing this in the
first place.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/git-fetch.txt | 49 +
 1 file changed, 49 insertions(+)

diff --git a/Documentation/git-fetch.txt b/Documentation/git-fetch.txt
index b153aefa68..e94bcfb8c3 100644
--- a/Documentation/git-fetch.txt
+++ b/Documentation/git-fetch.txt
@@ -99,6 +99,55 @@ The latter use of the `remote..fetch` values can 
be
 overridden by giving the `--refmap=` parameter(s) on the
 command line.
 
+PRUNING
+---
+
+Git has a default disposition of keeping data unless it's explicitly
+thrown away; this extends to holding onto local references to branches
+on remotes that have themselves deleted those branches.
+
+If left to accumulate, these stale references might make performance
+worse on big and busy repos that have a lot of branch churn, and
+e.g. make the output of commands like `git branch -a --contains
+` needlessly verbose, as well as impacting anything else
+that'll work with the complete set of known references.
+
+These remote-tracking references can be deleted as a one-off with
+either of:
+
+
+# While fetching
+$ git fetch --prune 
+
+# Only prune, don't fetch
+$ git remote prune 
+
+
+To prune references as part of your normal workflow without needing to
+remember to run that, set `fetch.prune` globally, or
+`remote..prune` per-remote in the config. See
+linkgit:git-config[1].
+
+Here's where things get tricky and more specific. The pruning feature
+doesn't actually care about branches, instead it'll prune local <->
+remote-references as a function of the refspec of the remote (see
+`` and <> above).
+
+Therefore if the refspec for the remote includes
+e.g. `refs/tags/*:refs/tags/*`, or you manually run e.g. `git fetch
+--prune  "refs/tags/*:refs/tags/*"` it won't be stale remote
+tracking branches that are deleted, but any local tag that doesn't
+exist on the remote.
+
+This might not be what you expect, i.e. you want to prune remote
+``, but also explicitly fetch tags from it, so when you fetch
+from it you delete all your local tags, most of which may not have
+come from the `` remote in the first place.
+
+So be careful when using this with a refspec like
+`refs/tags/*:refs/tags/*`, or any other refspec which might map
+references from multiple remotes to the same local namespace.
+
 OUTPUT
 --
 
-- 
2.15.1.424.g9478a66081



[PATCH v5 11/17] fetch tests: fetch as well as fetch []

2018-02-09 Thread Ævar Arnfjörð Bjarmason
When a remote URL is supplied on the command-line the internals of the
fetch are different, in particular the code in get_ref_map(). An
earlier version of the subsequent fetch.pruneTags patch hid a segfault
because the difference wasn't tested for.

Now all the tests are run as both of the variants of:

git fetch
git -c [...] fetch $(git config remote.origin.url) $(git config 
remote.origin.fetch)

I'm using -c because while the [fetch] config just set by
set_config_tristate will be picked up, the remote.origin.* config
won't override it as intended.

Work around that and turn this into a purely command-line test by
always setting the variables on the command-line, and translate any
setting of remote.origin.X into fetch.X.

The reason for choosing the names "name" and "link" as opposed to
e.g. "named" and "url" is because they're the same length, which makes
the test output easier to read as it will be aligned.

Due to shellscript quoting madness it's not worthwhile to do all of
this within a test_expect_success, but do the parts that can easily be
done there, including the one-time setting of variables that don't
change between runs to be used by subsequent runs in the 'prune_type
setup' test.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t5510-fetch.sh | 44 
 1 file changed, 40 insertions(+), 4 deletions(-)

diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index dfc749f576..9c87fa6106 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -548,18 +548,49 @@ set_config_tristate () {
;;
*)
git config "$1" "$2"
+   key=$(echo $1 | sed -e 's/^remote\.origin/fetch/')
+   git_fetch_c="$git_fetch_c -c $key=$2"
;;
esac
 }
 
 test_configured_prune () {
+   test_configured_prune_type "$@" "name"
+   test_configured_prune_type "$@" "link"
+}
+
+test_configured_prune_type () {
fetch_prune=$1
remote_origin_prune=$2
expected_branch=$3
expected_tag=$4
cmdline=$5
-
-   test_expect_success "prune fetch.prune=$1 remote.origin.prune=$2${5:+ 
$5}; branch:$3 tag:$4" '
+   mode=$6
+
+   if test -z "$cmdline_setup"
+   then
+   test_expect_success 'setup cmdline_setup variable for 
subsequent test' '
+   remote_url="file://$(git -C one config 
remote.origin.url)" &&
+   remote_fetch="$(git -C one config remote.origin.fetch)" 
&&
+   cmdline_setup="\"$remote_url\" \"$remote_fetch\""
+   '
+   fi
+
+   if test "$mode" = 'link'
+   then
+   new_cmdline=""
+
+   if test "$cmdline" = ""
+   then
+   new_cmdline=$cmdline_setup
+   else
+   new_cmdline=$(printf "%s" "$cmdline" | perl -pe 
's[origin(?!/)]["'"$remote_url"'"]g')
+   fi
+
+   cmdline="$new_cmdline"
+   fi
+
+   test_expect_success "$mode prune fetch.prune=$1 
remote.origin.prune=$2${5:+ $5}; branch:$3 tag:$4" '
# make sure a newbranch is there in . and also in one
git branch -f newbranch &&
git tag -f newtag &&
@@ -567,7 +598,7 @@ test_configured_prune () {
cd one &&
test_unconfig fetch.prune &&
test_unconfig remote.origin.prune &&
-   git fetch &&
+   git fetch '"$cmdline_setup"' &&
git rev-parse --verify refs/remotes/origin/newbranch &&
git rev-parse --verify refs/tags/newtag
) &&
@@ -579,10 +610,15 @@ test_configured_prune () {
# then test
(
cd one &&
+   git_fetch_c="" &&
set_config_tristate fetch.prune $fetch_prune &&
set_config_tristate remote.origin.prune 
$remote_origin_prune &&
 
-   git fetch '"$cmdline"' &&
+   if test "$mode" != "link"
+   then
+   git_fetch_c=""
+   fi &&
+   git$git_fetch_c fetch '"$cmdline"' &&
case "$expected_branch" in
pruned)
test_must_fail git rev-parse --verify 
refs/remotes/origin/newbranch
-- 
2.15.1.424.g9478a66081



[PATCH v5 13/17] git remote doc: correct dangerous lies about what prune does

2018-02-09 Thread Ævar Arnfjörð Bjarmason
The "git remote prune " command uses the same machinery as "git
fetch  --prune", and shares all the same caveats, but its
documentation has suggested that it'll just "delete stale
remote-tracking branches under ".

This isn't true, and hasn't been true since at least v1.8.5.6 (the
oldest version I could be bothered to test).

E.g. if "refs/tags/*:refs/tags/*" is explicitly set in the refspec of
the remote, it'll delete all local tags  doesn't know about.

Instead, briefly give the reader just enough of a hint that this
option might constitute a shotgun aimed at their foot, and point them
to the new PRUNING section in the git-fetch documentation which
explains all the nuances of what this facility does.

See "[BUG] git remote prune removes local tags, depending on fetch
config" (caci5s_39wnrbfjlfn0xhcy+uewtfn2ymnacrc86z6pjutjw...@mail.gmail.com)
by Michael Giuffrida for the initial report.

Reported-by: Michael Giuffrida 
Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/git-remote.txt | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-remote.txt b/Documentation/git-remote.txt
index 577b969c1b..4feddc0293 100644
--- a/Documentation/git-remote.txt
+++ b/Documentation/git-remote.txt
@@ -172,10 +172,14 @@ With `-n` option, the remote heads are not queried first 
with
 
 'prune'::
 
-Deletes all stale remote-tracking branches under .
-These stale branches have already been removed from the remote repository
-referenced by , but are still locally available in
-"remotes/".
+Deletes stale references associated with . By default, stale
+remote-tracking branches under  are deleted, but depending on
+global configuration and the configuration of the remote we might even
+prune local tags that haven't been pushed there. Equivalent to `git
+fetch --prune `, except that no new references will be fetched.
++
+See the PRUNING section of linkgit:git-fetch[1] for what it'll prune
+depending on various configuration.
 +
 With `--dry-run` option, report what branches will be pruned, but do not
 actually prune them.
@@ -189,7 +193,7 @@ remotes.default is not defined, all remotes which do not 
have the
 configuration parameter remote..skipDefaultUpdate set to true will
 be updated.  (See linkgit:git-config[1]).
 +
-With `--prune` option, prune all the remotes that are updated.
+With `--prune` option, run pruning against all the remotes that are updated.
 
 
 DISCUSSION
-- 
2.15.1.424.g9478a66081



[PATCH v5 05/17] fetch tests: refactor in preparation for testing tag pruning

2018-02-09 Thread Ævar Arnfjörð Bjarmason
In a subsequent commit this function will learn to test for tag
pruning, prepare for that by making space for more variables, and
making it clear that "expected" here refers to branches.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t5510-fetch.sh | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index 668c54be41..11da97f9b7 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -549,9 +549,12 @@ set_config_tristate () {
 }
 
 test_configured_prune () {
-   fetch_prune=$1 remote_origin_prune=$2 cmdline=$3 expected=$4
+   fetch_prune=$1
+   remote_origin_prune=$2
+   cmdline=$3
+   expected_branch=$4
 
-   test_expect_success "prune fetch.prune=$1 remote.origin.prune=$2${3:+ 
$3}; $4" '
+   test_expect_success "prune fetch.prune=$1 remote.origin.prune=$2${3:+ 
$3}; branch:$4" '
# make sure a newbranch is there in . and also in one
git branch -f newbranch &&
(
@@ -572,7 +575,7 @@ test_configured_prune () {
set_config_tristate remote.origin.prune 
$remote_origin_prune &&
 
git fetch $cmdline &&
-   case "$expected" in
+   case "$expected_branch" in
pruned)
test_must_fail git rev-parse --verify 
refs/remotes/origin/newbranch
;;
-- 
2.15.1.424.g9478a66081



[PATCH v5 01/17] fetch: don't redundantly NULL something calloc() gave us

2018-02-09 Thread Ævar Arnfjörð Bjarmason
Stop redundantly NULL-ing the last element of the refs structure,
which was retrieved via calloc(), and is thus guaranteed to be
pre-NULL'd.

This code dates back to b888d61c83 ("Make fetch a builtin",
2007-09-10), where wasn't any reason to do this back then either, it's
just boilerplate left over from when git-fetch was initially
introduced.

The motivation for this change was to make a subsequent change which
would also modify the refs variable smaller, since it won't have to
copy this redundant "NULL the last + 1 item" pattern.

We may not end up keeping that change, but as this pattern is still
pointless, so let's fix it.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 builtin/fetch.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 7bbcd26faf..b34665db9e 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1302,7 +1302,6 @@ static int fetch_one(struct remote *remote, int argc, 
const char **argv)
} else
refs[j++] = argv[i];
}
-   refs[j] = NULL;
ref_nr = j;
}
 
-- 
2.15.1.424.g9478a66081



[PATCH v5 03/17] fetch: stop accessing "remote" variable indirectly

2018-02-09 Thread Ævar Arnfjörð Bjarmason
Access the "remote" variable passed to the fetch_one() directly rather
than through the gtransport wrapper struct constructed in this
function for other purposes.

This makes the code more readable, as it's now obvious that the remote
struct doesn't somehow get munged by the prepare_transport() function
above, which takes the "remote" struct as an argument and constructs
the "gtransport" struct, containing among other things the "remote"
struct.

A subsequent change will copy this pattern to access a new
remote->prune_tags field, but without the use of the gtransport
variable. It's useful once that change lands to see that the two
pieces of code behave exactly the same.

This pattern of accessing the container struct was added in
737c5a9cde ("fetch: make --prune configurable", 2013-07-13) when this
code was initially introduced.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 builtin/fetch.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 72085e30b9..a7705bc150 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1280,8 +1280,8 @@ static int fetch_one(struct remote *remote, int argc, 
const char **argv)
 
if (prune < 0) {
/* no command line request */
-   if (0 <= gtransport->remote->prune)
-   prune = gtransport->remote->prune;
+   if (0 <= remote->prune)
+   prune = remote->prune;
else if (0 <= fetch_prune_config)
prune = fetch_prune_config;
else
-- 
2.15.1.424.g9478a66081



[PATCH v5 04/17] remote: add a macro for "refs/tags/*:refs/tags/*"

2018-02-09 Thread Ævar Arnfjörð Bjarmason
Add a macro with the refspec string "refs/tags/*:refs/tags/*". There's
been a pre-defined struct version of this since e0aaa29ff3 ("Have a
constant extern refspec for "--tags"", 2008-04-17), but nothing that
could be passed to e.g. add_fetch_refspec().

This will be used in subsequent commits to avoid hardcoding this
string in multiple places.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 remote.c | 1 +
 remote.h | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/remote.c b/remote.c
index 4e93753e19..356c123e3e 100644
--- a/remote.c
+++ b/remote.c
@@ -22,6 +22,7 @@ static struct refspec s_tag_refspec = {
"refs/tags/*"
 };
 
+/* See TAG_REFSPEC for the string version */
 const struct refspec *tag_refspec = _tag_refspec;
 
 struct counted_string {
diff --git a/remote.h b/remote.h
index 1f6611be21..80fea6dd11 100644
--- a/remote.h
+++ b/remote.h
@@ -297,4 +297,6 @@ extern int parseopt_push_cas_option(const struct option *, 
const char *arg, int
 extern int is_empty_cas(const struct push_cas_option *);
 void apply_push_cas(struct push_cas_option *, struct remote *, struct ref *);
 
+#define TAG_REFSPEC "refs/tags/*:refs/tags/*"
+
 #endif
-- 
2.15.1.424.g9478a66081



[PATCH v5 10/17] fetch tests: expand case/esac for later change

2018-02-09 Thread Ævar Arnfjörð Bjarmason
Expand a compact case/esac statement for a later change that'll add
more logic to the body of the "*" case. This is a whitespace-only
change.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t5510-fetch.sh | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index 88d38e0819..dfc749f576 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -543,8 +543,12 @@ test_expect_success "should be able to fetch with 
duplicate refspecs" '
 set_config_tristate () {
# var=$1 val=$2
case "$2" in
-   unset)  test_unconfig "$1" ;;
-   *)  git config "$1" "$2" ;;
+   unset)
+   test_unconfig "$1"
+   ;;
+   *)
+   git config "$1" "$2"
+   ;;
esac
 }
 
-- 
2.15.1.424.g9478a66081



[PATCH v5 02/17] fetch: trivially refactor assignment to ref_nr

2018-02-09 Thread Ævar Arnfjörð Bjarmason
Trivially refactor an assignment to make a subsequent patch
smaller. The "ref_nr" variable is initialized to 0 earlier, just as
"j" is, and "j" is only incremented in that loop, so this change isn't
a logic error.

This change simplifies a subsequent change, which will split the
incrementing of "ref_nr" into two blocks.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 builtin/fetch.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index b34665db9e..72085e30b9 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1301,8 +1301,8 @@ static int fetch_one(struct remote *remote, int argc, 
const char **argv)
argv[i], argv[i]);
} else
refs[j++] = argv[i];
+   ref_nr++;
}
-   ref_nr = j;
}
 
sigchain_push_common(unlock_pack_on_signal);
-- 
2.15.1.424.g9478a66081



[PATCH v5 09/17] fetch tests: double quote a variable for interpolation

2018-02-09 Thread Ævar Arnfjörð Bjarmason
If the $cmdline variable contains arguments with spaces they won't be
interpolated correctly, since the body of the test is single quoted,
and because test-lib.sh does its own eval().

This will be used in a subsequent commit to pass arguments that need
to be quoted to git-fetch, i.e. a file:// path to fetch, which will
have a space in it.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t5510-fetch.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index dacdb8759c..88d38e0819 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -578,7 +578,7 @@ test_configured_prune () {
set_config_tristate fetch.prune $fetch_prune &&
set_config_tristate remote.origin.prune 
$remote_origin_prune &&
 
-   git fetch $cmdline &&
+   git fetch '"$cmdline"' &&
case "$expected_branch" in
pruned)
test_must_fail git rev-parse --verify 
refs/remotes/origin/newbranch
-- 
2.15.1.424.g9478a66081



[PATCH v5 00/17] document & test fetch pruning & add fetch.pruneTags

2018-02-09 Thread Ævar Arnfjörð Bjarmason
Here's a v5 (correct subject line this time!). Many thanks to Eric for
a thorough review.

I'll spare you the per-patch changelog. These are all minor commit
message / doc / comment wording changes, with the exception of making
a bit of the test code better, and adding a \n for grep
portability. tbdiff at the end.

Ævar Arnfjörð Bjarmason (17):
  fetch: don't redundantly NULL something calloc() gave us
  fetch: trivially refactor assignment to ref_nr
  fetch: stop accessing "remote" variable indirectly
  remote: add a macro for "refs/tags/*:refs/tags/*"
  fetch tests: refactor in preparation for testing tag pruning
  fetch tests: re-arrange arguments for future readability
  fetch tests: add a tag to be deleted to the pruning tests
  fetch tests: test --prune and refspec interaction
  fetch tests: double quote a variable for interpolation
  fetch tests: expand case/esac for later change
  fetch tests: fetch   as well as fetch []
  git fetch doc: add a new section to explain the ins & outs of pruning
  git remote doc: correct dangerous lies about what prune does
  git-fetch & config doc: link to the new PRUNING section
  fetch tests: add scaffolding for the new fetch.pruneTags
  fetch: add a --prune-tags option and fetch.pruneTags config
  fetch: make the --prune-tags work with 

 Documentation/config.txt   |  20 ++-
 Documentation/fetch-options.txt|  17 ++-
 Documentation/git-fetch.txt|  87 
 Documentation/git-remote.txt   |  14 +-
 builtin/fetch.c|  54 ++--
 contrib/completion/git-completion.bash |   2 +-
 remote.c   |  15 +++
 remote.h   |   5 +
 t/t5510-fetch.sh   | 238 +++--
 9 files changed, 391 insertions(+), 61 deletions(-)

tbdiff:

 1: da0992d97c =  1: da0992d97c fetch: don't redundantly NULL something 
calloc() gave us
 2: c781d18a29 !  2: 899430a3b2 fetch: trivially refactor assignment to ref_nr
@@ -7,8 +7,8 @@
 "j" is, and "j" is only incremented in that loop, so this change isn't
 a logic error.
 
-This change makes a subsequent change which splits the incrementing of
-"ref_nr" into two blocks.
+This change simplifies a subsequent change, which will split the
+incrementing of "ref_nr" into two blocks.
 
 Signed-off-by: Ævar Arnfjörð Bjarmason 
 
 3: 1203ef6e35 =  3: 98e3a28bdf fetch: stop accessing "remote" variable 
indirectly
 4: 1d7956c444 =  4: 384c1fc318 remote: add a macro for 
"refs/tags/*:refs/tags/*"
 5: a0dc3eb024 =  5: ec92777861 fetch tests: refactor in preparation for 
testing tag pruning
 6: 7ed1561e3d =  6: 1c23526223 fetch tests: re-arrange arguments for future 
readability
 7: 6dbf0e688d =  7: 59abe07b71 fetch tests: add a tag to be deleted to the 
pruning tests
 8: 9fc3589793 =  8: af7acef671 fetch tests: test --prune and refspec 
interaction
 9: a4487f9389 =  9: cb72187362 fetch tests: double quote a variable for 
interpolation
10: b8c07e2d42 = 10: 300c1c0985 fetch tests: expand case/esac for later change
11: 6341131ee0 ! 11: 550e7df594 fetch tests: fetch   as well as 
fetch []
@@ -60,15 +60,12 @@
 -  test_expect_success "prune fetch.prune=$1 remote.origin.prune=$2${5:+ 
$5}; branch:$3 tag:$4" '
 +  mode=$6
 +
-+  if ! test -e prune-type-setup-done
++  if test -z "$cmdline_setup"
 +  then
-+  test_expect_success 'prune_type setup' '
-+  git -C one config remote.origin.url >one.remote-url &&
-+  git -C one config remote.origin.fetch >one.remote-fetch 
&&
-+  remote_url="file://$(cat one.remote-url)" &&
-+  remote_fetch="$(cat one.remote-fetch)" &&
-+  cmdline_setup="\"$remote_url\" \"$remote_fetch\"" &&
-+  touch prune-type-setup-done
++  test_expect_success 'setup cmdline_setup variable for 
subsequent test' '
++  remote_url="file://$(git -C one config 
remote.origin.url)" &&
++  remote_fetch="$(git -C one config remote.origin.fetch)" 
&&
++  cmdline_setup="\"$remote_url\" \"$remote_fetch\""
 +  '
 +  fi
 +
12: 824c3ed4c1 ! 12: 273f4c603f git fetch doc: add a new section to explain the 
ins & outs of pruning
@@ -32,7 +32,7 @@
 +` needlessly verbose, as well as impacting anything else
 +that'll work with the complete set of known references.
 +
-+These remote tracking references can be deleted as a one-off with
++These remote-tracking references can be deleted as a one-off with
 +either of:
 +
 +
@@ -44,13 +44,13 @@
 +
 +
 +To prune references as part of your normal workflow without needing to

[PATCH v5 08/17] fetch tests: test --prune and refspec interaction

2018-02-09 Thread Ævar Arnfjörð Bjarmason
Add a test for the interaction between explicitly provided refspecs
and fetch.prune.

There's no point in adding this boilerplate to every combination of
unset/false/true, it's instructive and sufficient to show that no
matter if the variable is unset, false or true the refspec on the
command-line overrides any configuration variable.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t5510-fetch.sh | 12 
 1 file changed, 12 insertions(+)

diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index fad65bd885..dacdb8759c 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -609,6 +609,10 @@ test_configured_prune () {
 test_configured_prune unset unset kept   kept   ""
 test_configured_prune unset unset kept   kept   "--no-prune"
 test_configured_prune unset unset pruned kept   "--prune"
+test_configured_prune unset unset kept   pruned \
+   "--prune origin refs/tags/*:refs/tags/*"
+test_configured_prune unset unset pruned pruned \
+   "--prune origin refs/tags/*:refs/tags/* 
+refs/heads/*:refs/remotes/origin/*"
 
 test_configured_prune false unset kept   kept   ""
 test_configured_prune false unset kept   kept   "--no-prune"
@@ -625,6 +629,10 @@ test_configured_prune unset false pruned kept   "--prune"
 test_configured_prune false false kept   kept   ""
 test_configured_prune false false kept   kept   "--no-prune"
 test_configured_prune false false pruned kept   "--prune"
+test_configured_prune false false kept   pruned \
+   "--prune origin refs/tags/*:refs/tags/*"
+test_configured_prune false false pruned pruned \
+   "--prune origin refs/tags/*:refs/tags/* 
+refs/heads/*:refs/remotes/origin/*"
 
 test_configured_prune true  false kept   kept   ""
 test_configured_prune true  false pruned kept   "--prune"
@@ -641,6 +649,10 @@ test_configured_prune false true  pruned kept   "--prune"
 test_configured_prune true  true  pruned kept   ""
 test_configured_prune true  true  pruned kept   "--prune"
 test_configured_prune true  true  kept   kept   "--no-prune"
+test_configured_prune true  true  kept   pruned \
+   "--prune origin refs/tags/*:refs/tags/*"
+test_configured_prune true  true  pruned pruned \
+   "--prune origin refs/tags/*:refs/tags/* 
+refs/heads/*:refs/remotes/origin/*"
 
 test_expect_success 'all boundary commits are excluded' '
test_commit base &&
-- 
2.15.1.424.g9478a66081



Re: [PATCH v2 11/17] fetch tests: fetch as well as fetch []

2018-02-09 Thread Jeff King
On Fri, Feb 09, 2018 at 09:05:00PM +0100, Ævar Arnfjörð Bjarmason wrote:

> >> diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
> >> @@ -548,18 +548,52 @@ set_config_tristate () {
> >> *)
> >> git config "$1" "$2"
> >> +   key=$(echo $1 | sed -e 's/^remote\.origin/fetch/')
> >
> > Faster (thus more Windows-friendly) assuming that $1 always starts
> > with "remote.origin":
> >
> > key=fetch${u#remote.origin}
> 
> Tests fail with this and I'm not excited to be the first user in git's
> test suite to use some novel shell feature, no existing uses of
> ${u[...].
> 
> I also think stuff like this is on the wrong side of cleverness
> v.s. benefit. I can't find any reference to this syntax in bash or dash
> manpages (forward-search "${u"), but echo | sed is obvious, and it's not
> going to make a big difference for Windows.

The "u" isn't the magic, it's the "#". I.e.:

  key=fetch${1#remote.origin}

and it's used all over the place in our scripts.  I'm not sure why Eric
wrote "u". :)

-Peff


Re: Fetch-hooks

2018-02-09 Thread Joey Hess
Leo Gaspard wrote:
> I just wanted to check, you did not put the Signed-off-by line in
> patches in https://marc.info/?l=git=132491485901482=2
> 
> Could you confirm that the patches you sent are “covered under an
> appropriate open source license and I have the right under that license
> to submit that work with modifications, whether created in whole or in
> part by me, under the same open source license (unless I am permitted to
> submit under a different license), as indicated in the file”, so that I
> could send the patch I wrote based on yours with a Signed-off-by line of
> my own without breaking the DCO?

Yes; my patches are under the same GPL-2 as the rest of git.

-- 
see shy jo


signature.asc
Description: PGP signature


Re: [PATCH v6 0/7] convert: add support for different encodings

2018-02-09 Thread Junio C Hamano
Documentation has core.checkRoundtripEncoding while t0028 and a
comment in convert.c capitalize it differently.  I suspect that it
would be more reader-friendly to update the documentation to match.




Re: [PATCH v2 11/17] fetch tests: fetch as well as fetch []

2018-02-09 Thread Ævar Arnfjörð Bjarmason

On Fri, Feb 09 2018, Eric Sunshine jotted:

> On Thu, Feb 8, 2018 at 11:19 AM, Ævar Arnfjörð Bjarmason
>  wrote:
>> When a remote URL is supplied on the command-line the internals of the
>> fetch are different, in particular the code in get_ref_map(). An
>> earlier version of the subsequent fetch.pruneTags patch hid a segfault
>> because the difference wasn't tested for.
>>
>> Now all the tests are run as both of the variants of:
>>
>> git fetch
>> git -c [...] fetch $(git config remote.origin.url) $(git config 
>> remote.origin.fetch)
>>
>> I'm using -c because while the [fetch] config just set by
>> set_config_tristate will be picked up, the remote.origin.* config
>> won't override it as intended.
>>
>> Work around that and turn this into a purely command-line test by
>> always setting the variables on the command-line, and translate any
>> setting of remote.origin.X into fetch.X.
>> [...]
>> Signed-off-by: Ævar Arnfjörð Bjarmason 
>> ---
>> diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
>> @@ -548,18 +548,52 @@ set_config_tristate () {
>> *)
>> git config "$1" "$2"
>> +   key=$(echo $1 | sed -e 's/^remote\.origin/fetch/')
>
> Faster (thus more Windows-friendly) assuming that $1 always starts
> with "remote.origin":
>
> key=fetch${u#remote.origin}

Tests fail with this and I'm not excited to be the first user in git's
test suite to use some novel shell feature, no existing uses of
${u[...].

I also think stuff like this is on the wrong side of cleverness
v.s. benefit. I can't find any reference to this syntax in bash or dash
manpages (forward-search "${u"), but echo | sed is obvious, and it's not
going to make a big difference for Windows.

>> +   git_fetch_c="$git_fetch_c -c $key=$2"
>> ;;
>> esac
>>  }
>>
>> +test_configured_prune_type () {
>> fetch_prune=$1
>> remote_origin_prune=$2
>> expected_branch=$3
>> expected_tag=$4
>> cmdline=$5
>> -
>> -   test_expect_success "prune fetch.prune=$1 
>> remote.origin.prune=$2${5:+ $5}; branch:$3 tag:$4" '
>> +   mode=$6
>> +
>> +   if ! test -e prune-type-setup-done
>> +   then
>> +   test_expect_success 'prune_type setup' '
>> +   git -C one config remote.origin.url >one.remote-url 
>> &&
>> +   git -C one config remote.origin.fetch 
>> >one.remote-fetch &&
>> +   remote_url="file://$(cat one.remote-url)" &&
>> +   remote_fetch="$(cat one.remote-fetch)" &&
>
> Is there a reason that these values need to be captured to files
> (which are otherwise not used) before being assigned to variables?
> That is, wouldn't this work?
>
> remote_url="file://$(git -C one config remote.origin.url)" &&
> remote_fetch="$(git -C one config remote.origin.fetch)" &&

Nope, I'll just do that. This was cruft left over from an earlier
version which I didn't clean up.

>> +   cmdline_setup="\"$remote_url\" \"$remote_fetch\"" &&
>> +   touch prune-type-setup-done
>
> Why does "prune-type-setup-done" need to be a file rather than a
> simple shell variable (which is global by default even when assigned
> inside test_expect_success)?
>
> Also, since the purpose of this code seems to compute 'cmdline_setup'
> just once, can't you do away with 'prune-type-setup-done' altogether
> and change:
>
> if ! test -e prune-type-setup-done
>
> to:
>
> if test -z "$cmdline_setup"
>
>> +   '
>> +   fi

Yup, that's much simpler. Thanks.


Re: [PATCH v6 4/7] utf8: add function to detect a missing UTF-16/32 BOM

2018-02-09 Thread Lars Schneider

> On 09 Feb 2018, at 20:28, Junio C Hamano  wrote:
> 
> lars.schnei...@autodesk.com writes:
> 
>> From: Lars Schneider 
>> 
>> If the endianness is not defined in the encoding name, then let's
>> ...
>> [3] https://encoding.spec.whatwg.org/#utf-16le
>> 
>> Signed-off-by: Lars Schneider 
>> 
>> utf
>> ---
> 
> Huh?

That is garbage, sorry! :-(

Can you remove it for me?

Thanks,
Lars


Re: totally confused as to what "git bisect skip" is supposed to do

2018-02-09 Thread Robert P. J. Day
On Fri, 9 Feb 2018, Junio C Hamano wrote:

> "Robert P. J. Day"  writes:
>
> >   i'm confused ... why, after skipping a good chunk in the
> > interval [v4.13,v4.14], do i still have exactly 7300 revisions to
> > bisect? what am i so hopelessly misunderstanding here?
>
> Are you really "skipping" a chunk in the interval?

  i don't know ... am i? how should i interpret the man page?

> I thought that "git bisect skip" is a way for you to respond, when
> "git bisect" gave you a commit to test, saying "sorry, I cannot test
> that exact version, please offer me something else to test".  And
> each time you say that, you are not narrowing the search space in
> any way, so it is understandable that the number of candidate bad
> commits will not decrease.

  i caught that part of the man page, but i'm trying to understand
what i'm reading here:

https://blog.smart.ly/2015/02/03/git-bisect-debugging-with-feature-branches/

  particularly this suggestion:

  $ git bisect start master 75369f4a4c026772242368d870872562a3b693cb

  $ for rev in $(git rev-list
75369f4a4c026772242368d870872562a3b693cb..master --merges
--first-parent); do
>   git rev-list $rev^2 --not $rev^
> done | xargs git bisect skip

my interpretation of that page is how, after starting a bisection, one
can use "git bisect skip" to explicitly disqualify a sizable number of
commits from consideration, speeding up the subsequent bisection.

  if that's not what that web page is describing, then what *is* it
talking about?

rday


Re: [PATCH 2/2] packfile: refactor hash search with fanout table

2018-02-09 Thread Jonathan Tan
On Fri, 9 Feb 2018 19:03:48 +0100
René Scharfe  wrote:

> Going from unsigned to signed int means the patch breaks support for
> more than 2G pack entries, which was put with 326bf39677 (Use uint32_t
> for all packed object counts.) in 2007.

Ah, good catch. I'll wait to see if there are any more comments, then
send out a new version.

> > +int bsearch_hash(const unsigned char *sha1, const void *fanout_,
> > +const void *table_, size_t stride)
> > +{
> > +   const uint32_t *fanout = fanout_;
> 
> Why hide the type?  It doesn't make the function more generic.

I thought that the fanout_ parameter could come from a variety of
sources (e.g. direct mmap - void *, or mmap with some pointer arithmetic
- char *) so I just picked the generic one. But now I realize that that
could lead to unaligned reads, which is probably not a good idea. I'll
update it.

For consistency, I'll also update table_ to be unsigned char *.
(Unsigned because it is primarily interpreted as hashes, which use
"unsigned char *" in the Git code.)

> Why not use sha1_pos()?  I guess because it avoids the overhead of the
> accessor function, right?  And I wonder how much of difference it makes.

Yes, overhead of the accessor function. We would also need to modify
sha1_pos to take in a function that we can pass userdata to (to contain
the stride).

> A binary search function for embedded hashes just needs the key, a
> pointer to the first hash in the array, the stride and the number of
> elements.  It can then be used with or without a fanout table, making it
> more versatile.  Just a thought.

I specifically want to include the fanout table in the calculation here,
because it will be used by subsequent patches that also incorporate the
fanout table.


Re: totally confused as to what "git bisect skip" is supposed to do

2018-02-09 Thread Junio C Hamano
"Robert P. J. Day"  writes:

>   i'm confused ... why, after skipping a good chunk in the interval
> [v4.13,v4.14], do i still have exactly 7300 revisions to bisect? what
> am i so hopelessly misunderstanding here?

Are you really "skipping" a chunk in the interval?

I thought that "git bisect skip" is a way for you to respond, when
"git bisect" gave you a commit to test, saying "sorry, I cannot test
that exact version, please offer me something else to test".  And
each time you say that, you are not narrowing the search space in
any way, so it is understandable that the numver of candidate bad
commits will not decrease.




totally confused as to what "git bisect skip" is supposed to do

2018-02-09 Thread Robert P. J. Day

  all right, i'm sure i'm just being an idiot, but i always thought i
knew what "git bisect skip" did and, now that i'm trying to put
together a simple example, i'm utterly confused so here's a stripped
down example.

  with linus kernel source code, start bisecting [v4.13,v4.14]:

$ git bisect start v4.14 v4.13
Bisecting: 7300 revisions left to test after this (roughly 13 steps)
[15d8ffc96464f6571ecf22043c45fad659f11bdd] Merge tag 'mmc-v4.14' of
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
$

  now, based on how linux kernel release candidates work, i want to
skip a sizable chunk of those revisions:

$ git bisect skip v4.14-rc1..v4.14-rc2
Bisecting: 7300 revisions left to test after this (roughly 13 steps)
[15d8ffc96464f6571ecf22043c45fad659f11bdd] Merge tag 'mmc-v4.14' of
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
$

  i'm confused ... why, after skipping a good chunk in the interval
[v4.13,v4.14], do i still have exactly 7300 revisions to bisect? what
am i so hopelessly misunderstanding here?

rday


Re: [PATCH] CodingGuidelines: mention "static" and "extern"

2018-02-09 Thread Jeff King
On Fri, Feb 09, 2018 at 11:07:38AM -0800, Jonathan Tan wrote:

> On Thu, 8 Feb 2018 18:14:06 -0500
> Eric Sunshine  wrote:
> 
> > On Thu, Feb 8, 2018 at 4:38 PM, Jeff King  wrote:
> > > Subject: [PATCH] CodingGuidelines: mention "static" and "extern"
> > > [...]
> > >
> > > Signed-off-by: Jeff King 
> > > ---
> > > diff --git a/Documentation/CodingGuidelines 
> > > b/Documentation/CodingGuidelines
> > > @@ -386,6 +386,11 @@ For C programs:
> > > + - Variables and functions local to a given source file should be marked
> > > +   with "static". Variables that are visible to other source files
> > > +   must be declared with "extern" in header files. However, function
> > > +   declarations should not use "extern", as that is already the default.
> > 
> > Perhaps:
> > 
> > ... as that is already the default, unless declarations in the
> > header are already "extern", in which case consistency
> > may favor mirroring existing usage.
> > 
> > or something.
> 
> I would prefer not mirroring existing usage in this case - I think it's
> better if the code becomes eventually consistent in not using extern.

Agreed. I think individual bullets in CodingGuidelines should be
definitive where possible, describing the end-game we strive for.

-Peff


Re: [PATCH 1/1] Mark messages for translations

2018-02-09 Thread Jeff King
On Fri, Feb 09, 2018 at 11:15:06AM -0800, Junio C Hamano wrote:

> Junio C Hamano  writes:
> 
> >> -  if ! grep "Invalid gitfile format" .err
> >> +  if ! test_i18ngrep "invalid gitfile format" .err
> >
> > Shouldn't this rather be like so instead?
> >
> > if test_i18ngrep ! "invalid gitfile format" .err
> >
> > Ditto for the other negated use of test_i18ngrep we see in the same
> > file in this patch.
> 
> Sorry, my thinko.  These two ones want to be written in the patch
> as-is.

Yes, I think so, but we may want to avoid this anti-pattern (since
usually "! test_i18ngrep" is a sign of something wrong. It seems like
these tests are doing more manual reporting work than is necessary, and
could just be relying on helpers to report errors.

Something like the patch below, though I'm not sure if we'd want to
leave it as "grep" (if applying on master), or have "test_i18ngrep" in
the preimage (if basing on top of Alexander's patch).

-Peff

---
 t/t0002-gitfile.sh | 54 +++
 1 file changed, 11 insertions(+), 43 deletions(-)

diff --git a/t/t0002-gitfile.sh b/t/t0002-gitfile.sh
index 9670e8cbe6..74b7307997 100755
--- a/t/t0002-gitfile.sh
+++ b/t/t0002-gitfile.sh
@@ -10,15 +10,6 @@ objpath() {
echo "$1" | sed -e 's|\(..\)|\1/|'
 }
 
-objck() {
-   p=$(objpath "$1")
-   if test ! -f "$REAL/objects/$p"
-   then
-   echo "Object not found: $REAL/objects/$p"
-   false
-   fi
-}
-
 test_expect_success 'initial setup' '
REAL="$(pwd)/.real" &&
mv .git "$REAL"
@@ -26,30 +17,14 @@ test_expect_success 'initial setup' '
 
 test_expect_success 'bad setup: invalid .git file format' '
echo "gitdir $REAL" >.git &&
-   if git rev-parse 2>.err
-   then
-   echo "git rev-parse accepted an invalid .git file"
-   false
-   fi &&
-   if ! grep "Invalid gitfile format" .err
-   then
-   echo "git rev-parse returned wrong error"
-   false
-   fi
+   test_must_fail git rev-parse 2>.err &&
+   test_i18ngrep "Invalid gitfile format" .err
 '
 
 test_expect_success 'bad setup: invalid .git file path' '
echo "gitdir: $REAL.not" >.git &&
-   if git rev-parse 2>.err
-   then
-   echo "git rev-parse accepted an invalid .git file path"
-   false
-   fi &&
-   if ! grep "Not a git repository" .err
-   then
-   echo "git rev-parse returned wrong error"
-   false
-   fi
+   test_must_fail git rev-parse 2>.err &&
+   test_i18ngrep "Not a git repository" .err
 '
 
 test_expect_success 'final setup + check rev-parse --git-dir' '
@@ -60,7 +35,7 @@ test_expect_success 'final setup + check rev-parse --git-dir' 
'
 test_expect_success 'check hash-object' '
echo "foo" >bar &&
SHA=$(cat bar | git hash-object -w --stdin) &&
-   objck $SHA
+   test_path_is_file "$REAL/objects/$(objpath $SHA)"
 '
 
 test_expect_success 'check cat-file' '
@@ -69,29 +44,22 @@ test_expect_success 'check cat-file' '
 '
 
 test_expect_success 'check update-index' '
-   if test -f "$REAL/index"
-   then
-   echo "Hmm, $REAL/index exists?"
-   false
-   fi &&
+   test_path_is_missing "$REAL/index" &&
rm -f "$REAL/objects/$(objpath $SHA)" &&
git update-index --add bar &&
-   if ! test -f "$REAL/index"
-   then
-   echo "$REAL/index not found"
-   false
-   fi &&
-   objck $SHA
+   test_path_is_file "$REAL/index" &&
+   test_path_is_file "$REAL/objects/$(objpath $SHA)" &&
+   false
 '
 
 test_expect_success 'check write-tree' '
SHA=$(git write-tree) &&
-   objck $SHA
+   test_path_is_file "$REAL/objects/$(objpath $SHA)"
 '
 
 test_expect_success 'check commit-tree' '
SHA=$(echo "commit bar" | git commit-tree $SHA) &&
-   objck $SHA
+   test_path_is_file "$REAL/objects/$(objpath $SHA)"
 '
 
 test_expect_success 'check rev-list' '


Re: [PATCH v6 4/7] utf8: add function to detect a missing UTF-16/32 BOM

2018-02-09 Thread Junio C Hamano
lars.schnei...@autodesk.com writes:

> From: Lars Schneider 
>
> If the endianness is not defined in the encoding name, then let's
> ...
> [3] https://encoding.spec.whatwg.org/#utf-16le
>
> Signed-off-by: Lars Schneider 
>
> utf
> ---

Huh?

>  utf8.c | 13 +
>  utf8.h | 16 
>  2 files changed, 29 insertions(+)


Re: [PATCH 1/1] Mark messages for translations

2018-02-09 Thread Junio C Hamano
Junio C Hamano  writes:

>> -if ! grep "Invalid gitfile format" .err
>> +if ! test_i18ngrep "invalid gitfile format" .err
>
> Shouldn't this rather be like so instead?
>
>   if test_i18ngrep ! "invalid gitfile format" .err
>
> Ditto for the other negated use of test_i18ngrep we see in the same
> file in this patch.

Sorry, my thinko.  These two ones want to be written in the patch
as-is.



Re: Fetch-hooks

2018-02-09 Thread Leo Gaspard
On 02/08/2018 06:02 PM, Leo Gaspard wrote:
> On 02/08/2018 04:30 PM, Joey Hess wrote:
>> [...]
> 
> Hmm, OK, so I guess I'll try to update the patch when I get some time to
> delve into git's internals, as my use case (forbidding some fetches)
> couldn't afaik be covered by a wrapper hook.

Joey,

I just wanted to check, you did not put the Signed-off-by line in
patches in https://marc.info/?l=git=132491485901482=2

Could you confirm that the patches you sent are “covered under an
appropriate open source license and I have the right under that license
to submit that work with modifications, whether created in whole or in
part by me, under the same open source license (unless I am permitted to
submit under a different license), as indicated in the file”, so that I
could send the patch I wrote based on yours with a Signed-off-by line of
my own without breaking the DCO?

Thanks!
Leo


Re: [PATCH] CodingGuidelines: mention "static" and "extern"

2018-02-09 Thread Jonathan Tan
On Thu, 8 Feb 2018 18:14:06 -0500
Eric Sunshine  wrote:

> On Thu, Feb 8, 2018 at 4:38 PM, Jeff King  wrote:
> > Subject: [PATCH] CodingGuidelines: mention "static" and "extern"
> > [...]
> >
> > Signed-off-by: Jeff King 
> > ---
> > diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
> > @@ -386,6 +386,11 @@ For C programs:
> > + - Variables and functions local to a given source file should be marked
> > +   with "static". Variables that are visible to other source files
> > +   must be declared with "extern" in header files. However, function
> > +   declarations should not use "extern", as that is already the default.
> 
> Perhaps:
> 
> ... as that is already the default, unless declarations in the
> header are already "extern", in which case consistency
> may favor mirroring existing usage.
> 
> or something.

I would prefer not mirroring existing usage in this case - I think it's
better if the code becomes eventually consistent in not using extern.


Re: [PATCH v2 01/17] fetch: don't redundantly NULL something calloc() gave us

2018-02-09 Thread Ævar Arnfjörð Bjarmason

On Fri, Feb 09 2018, Eric Sunshine jotted:

> On Thu, Feb 8, 2018 at 11:19 AM, Ævar Arnfjörð Bjarmason
>  wrote:
>> Stop redundantly NULL-ing the last element of the refs structure,
>> which was retrieved via calloc(), and is thus guaranteed to be
>> pre-NULL'd.
>> [...]
>> Signed-off-by: Ævar Arnfjörð Bjarmason 
>> ---
>> diff --git a/builtin/fetch.c b/builtin/fetch.c
>> @@ -1302,7 +1302,6 @@ static int fetch_one(struct remote *remote, int argc, 
>> const char **argv)
>> } else
>> refs[j++] = argv[i];
>> }
>> -   refs[j] = NULL;
>> ref_nr = j;
>> }
>
> This is purely subjective, and I neglected to mention it as early as
> v1, but I find that this change hurts readability. Specifically, as
> I'm scanning or reading code, explicit termination conditions, like
> this NULL assignment, are things I'm expecting to see; they're part of
> the idiom of the language. When they're missing, I have to stop, go
> back, and study the code more carefully to see if the "missing bit" is
> a bug or is intentional. And, it's easy to misread xcalloc() as
> xmalloc(), meaning that it's likely that one studying the code would
> conclude that the missing NULL assignment is a bug.
>
> If anything, if this patch wants to eliminate redundancy, I'd expect
> it to change xcalloc() to xmalloc() and keep the NULL assignment, thus
> leaving the idiom intact.

Thanks for all your review, really appreciate it.

I'm going to keep this as-is, reasons:

 * With calloc it's easier to look at the values in a debugger, you get
   NULLs instead of some random garbage for e.g. the ref src/dst. It
   makes it clear it's unset / the tail value.

 * Ditto fewer things to step through / inpect in a debugger. E.g. I
   have a dump of variables before/after in the debugger, with
   assignments like this it's just adding verbosity & something to
   eyeball for something that's never going to change.

 * If there's a bug in the code using calloc is likely to reveal it
   sooner, since you'll be deref-ing NULL instead of some stray
   (possibly still valid) pointer you got from malloc.

 * It looks more in line with established idioms in the codebase. I
   wasn't able to find other cases where we were double-NULL ing
   calloc'd data, but rather stuff like this:

   git grep -W '\bpattern\b' -- '*/ls-remote.c'


Re: [PATCH 2/3] t: teach 'test_must_fail' to save the command's stderr to a file

2018-02-09 Thread Jeff King
On Fri, Feb 09, 2018 at 01:57:10PM -0500, Jeff King wrote:

> Here's what it looks like as a patch.
> 
> -- >8 --
> Subject: [PATCH] t: send verbose test-helper output to fd 4

That applies on 'master'. If we go this route, we'd want this on
sg/test-i18ngrep, which is in 'next' right now:

-- >8 --
Subject: [PATCH] test_i18n_grep: send error messages to fd 4

These were newly added in 63b1a175ee (t: make
'test_i18ngrep' more informative on failure, 2018-02-08),
and so missed the conversion in commit X.

Signed-off-by: Jeff King 
---
(The X above is whatever you queue the original at).

 t/test-lib-functions.sh | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index cd2082..aabee13e5d 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -742,18 +742,18 @@ test_i18ngrep () {
shift
! grep "$@" && return 0
 
-   echo >&2 "error: '! grep $@' did find a match in:"
+   echo >&4 "error: '! grep $@' did find a match in:"
else
grep "$@" && return 0
 
-   echo >&2 "error: 'grep $@' didn't find a match in:"
+   echo >&4 "error: 'grep $@' didn't find a match in:"
fi
 
if test -s "$last_arg"
then
-   cat >&2 "$last_arg"
+   cat >&4 "$last_arg"
else
-   echo >&2 ""
+   echo >&4 ""
fi
 
return 1
-- 
2.16.1.464.gc4bae515b7



Re: [PATCH v3 04/42] git-completion.bash: introduce __gitcomp_builtin

2018-02-09 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> +# This function is equivalent to
> +#
> +#__gitcomp "$(git xxx --git-completion-helper) ..."
> +#
> +# except that the output is cached. Accept 1-3 arguments:
> +# 1: the git command to execute, this is also the cache key
> +# 2: extra options to be added on top (e.g. negative forms)
> +# 3: options to be excluded

The options="${options/ $i / }" substitution in a loop is cute.  The
third argument to this helper is an IFS separated list of regular
expressions to match options that we do not want to see in the
completion, so 

__gitcomp_builtin foo '' '--a[^ ]*'

presumably would exclude all options whose names begin with 'a' ;-).

> + if [ -z "$options" ]; then
> + # leading and trailing spaces are significant to make
> + # option removal work correctly.
> + options=" $(__git ${cmd/_/ } --git-completion-helper) $incl "
> + for i in $excl; do
> + options="${options/ $i / }"
> + done
> + eval "$var=\"$options\""
> + fi
> +
> + __gitcomp "$options"
> +}
> +
>  # Variation of __gitcomp_nl () that appends to the existing list of
>  # completion candidates, COMPREPLY.
>  __gitcomp_nl_append ()


Re: [PATCH 2/3] t: teach 'test_must_fail' to save the command's stderr to a file

2018-02-09 Thread Jeff King
On Fri, Feb 09, 2018 at 10:36:19AM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> >   2. The "-x" problems aren't specific to test_must_fail at all. They're
> >  a general issue with shell functions.
> >
> > I'm not entirely happy with saying "if you want to use -x, please use
> > bash". But given that it actually solves the problems everywhere with no
> > further effort, is it really that bad a solution?
> >
> > For the error messages from test_must_fail, could we go in the same
> > direction, and send them to descriptor 4 rather than 2? We've already
> > staked out descriptor 4 as something magical that must be left alone
> > (see 9be795fb). If we can rely on that, then it becomes a convenient way
> > for functions to make sure their output is going to the script's stderr.
> 
> That sounds clever and rather attractive.  It isn't that much of an
> layering violation for test-lib-functions.sh::test_must_fail to have
> such an intimate knowledge on how test_-lib.sh::test_eval_ sets up
> the file descriptors, either.

Here's what it looks like as a patch.

-- >8 --
Subject: [PATCH] t: send verbose test-helper output to fd 4

Test helper functions like test_must_fail may write messages
to stderr when they see a problem. When the tests are run
with "--verbose", this ends up on the test script's stderr,
and the user can read it.

But there's a problem. Some tests record stderr as part of
the test, like:

  test_must_fail git foo 2>output &&
  test_i18ngrep expected.message output

In this case any message from test_must_fail goes into
"output", making the --verbose output less useful. It also
means we might accidentally match it in the second line,
though in practice we tend to produce these messages only on
error. So we'd abort the test when the first command fails.

Let's send this user-facing output directly to descriptor 4,
which always points to the original stderr (or /dev/null in
non-verbose mode). It's already forbidden to redirect
descriptor 4, since we use it for BASH_XTRACEFD, as
explained in 9be795fbce (t5615: avoid re-using descriptor 4,
2017-12-08).

Signed-off-by: Jeff King 
---
I've considered a patch to teach "make test-lint" to complain if it sees
" 4>" in any shell script, to try to enforce that rule automatically.
But maybe that's overkill (and also it felt funny to put it in
check-non-portable-shell.pl, but we can always change the name ;) ).

 t/test-lib-functions.sh | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 1701fe2a06..a156b81aa5 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -625,22 +625,22 @@ test_must_fail () {
exit_code=$?
if test $exit_code -eq 0 && ! list_contains "$_test_ok" success
then
-   echo >&2 "test_must_fail: command succeeded: $*"
+   echo >&4 "test_must_fail: command succeeded: $*"
return 1
elif test_match_signal 13 $exit_code && list_contains "$_test_ok" 
sigpipe
then
return 0
elif test $exit_code -gt 129 && test $exit_code -le 192
then
-   echo >&2 "test_must_fail: died by signal $(($exit_code - 128)): 
$*"
+   echo >&4 "test_must_fail: died by signal $(($exit_code - 128)): 
$*"
return 1
elif test $exit_code -eq 127
then
-   echo >&2 "test_must_fail: command not found: $*"
+   echo >&4 "test_must_fail: command not found: $*"
return 1
elif test $exit_code -eq 126
then
-   echo >&2 "test_must_fail: valgrind error: $*"
+   echo >&4 "test_must_fail: valgrind error: $*"
return 1
fi
return 0
@@ -678,7 +678,7 @@ test_expect_code () {
return 0
fi
 
-   echo >&2 "test_expect_code: command exited with $exit_code, we wanted 
$want_code $*"
+   echo >&4 "test_expect_code: command exited with $exit_code, we wanted 
$want_code $*"
return 1
 }
 
@@ -710,7 +710,7 @@ test_cmp_bin() {
 # not output anything when they fail.
 verbose () {
"$@" && return 0
-   echo >&2 "command failed: $(git rev-parse --sq-quote "$@")"
+   echo >&4 "command failed: $(git rev-parse --sq-quote "$@")"
return 1
 }
 
-- 
2.16.1.464.gc4bae515b7



Re: What's cooking in git.git (Jan 2018, #04; Wed, 31)

2018-02-09 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> On Thu, Feb 01 2018, Junio C. Hamano jotted:
>
>> * nd/fix-untracked-cache-invalidation (2018-01-24) 5 commits
>>  - dir.c: stop ignoring opendir() error in open_cached_dir()
>>  - update-index doc: note a fixed bug in the untracked cache
>>  - dir.c: fix missing dir invalidation in untracked code
>>  - dir.c: avoid stat() in valid_cached_dir()
>>  - status: add a failing test showing a core.untrackedCache bug
>>
>>  Some bugs around "untracked cache" feature have been fixed.
>>
>>  Will merge to 'next'.
>
> I think you / Nguyễn may not have seen my
> <87d11omi2o@evledraar.gmail.com>
> (https://public-inbox.org/git/87d11omi2o@evledraar.gmail.com/)
>
> As noted there I think it's best to proceed without the "dir.c: stop
> ignoring opendir[...]" patch.
>
> It's going to be a bad regression in 2.17 if it ends up spewing pagefuls
> of warnings in previously working repos if the UC is on.

Well, I am not sure if it is a regression to diagnose problematic
untracked cache information left by earlier version of the software
with bugs.  After all, this is still an experimental feature, and we
do want to see the warning to serve its purpose to diagnose possible
remaining bugs, and new similar ones when a newer iteration of the
feature introduces, no?


Re: What's cooking in git.git (Jan 2018, #04; Wed, 31)

2018-02-09 Thread Ævar Arnfjörð Bjarmason

On Thu, Feb 01 2018, Junio C. Hamano jotted:

> * nd/fix-untracked-cache-invalidation (2018-01-24) 5 commits
>  - dir.c: stop ignoring opendir() error in open_cached_dir()
>  - update-index doc: note a fixed bug in the untracked cache
>  - dir.c: fix missing dir invalidation in untracked code
>  - dir.c: avoid stat() in valid_cached_dir()
>  - status: add a failing test showing a core.untrackedCache bug
>
>  Some bugs around "untracked cache" feature have been fixed.
>
>  Will merge to 'next'.

I think you / Nguyễn may not have seen my
<87d11omi2o@evledraar.gmail.com>
(https://public-inbox.org/git/87d11omi2o@evledraar.gmail.com/)

As noted there I think it's best to proceed without the "dir.c: stop
ignoring opendir[...]" patch.

It's going to be a bad regression in 2.17 if it ends up spewing pagefuls
of warnings in previously working repos if the UC is on.


Re: [PATCH 2/3] t: teach 'test_must_fail' to save the command's stderr to a file

2018-02-09 Thread Junio C Hamano
Jeff King  writes:

>   2. The "-x" problems aren't specific to test_must_fail at all. They're
>  a general issue with shell functions.
>
> I'm not entirely happy with saying "if you want to use -x, please use
> bash". But given that it actually solves the problems everywhere with no
> further effort, is it really that bad a solution?
>
> For the error messages from test_must_fail, could we go in the same
> direction, and send them to descriptor 4 rather than 2? We've already
> staked out descriptor 4 as something magical that must be left alone
> (see 9be795fb). If we can rely on that, then it becomes a convenient way
> for functions to make sure their output is going to the script's stderr.

That sounds clever and rather attractive.  It isn't that much of an
layering violation for test-lib-functions.sh::test_must_fail to have
such an intimate knowledge on how test_-lib.sh::test_eval_ sets up
the file descriptors, either.



Re: What's cooking in git.git (Jan 2018, #04; Wed, 31)

2018-02-09 Thread Ævar Arnfjörð Bjarmason

On Fri, Feb 09 2018, Johannes Schindelin jotted:

> Hi,
>
> On Thu, 1 Feb 2018, Ævar Arnfjörð Bjarmason wrote:
>
>> On Thu, Feb 01 2018, Junio C. Hamano jotted:
>>
>> > * ab/wildmatch-tests (2018-01-30) 10 commits
>> >  - wildmatch test: mark test as EXPENSIVE_ON_WINDOWS
>> >  - test-lib: add an EXPENSIVE_ON_WINDOWS prerequisite
>> >  - wildmatch test: create & test files on disk in addition to in-memory
>> >  - wildmatch test: perform all tests under all wildmatch() modes
>> >  - wildmatch test: use test_must_fail, not ! for test-wildmatch
>> >  - wildmatch test: remove dead fnmatch() test code
>> >  - wildmatch test: use a paranoia pattern from nul_match()
>> >  - wildmatch test: don't try to vertically align our output
>> >  - wildmatch test: use more standard shell style
>> >  - wildmatch test: indent with tabs, not spaces
>> >
>> >  More tests for wildmatch functions.
>> >
>> >  Expecting an update.
>> >  cf. <87vaga9mgf@evledraar.gmail.com>
>>
>> The 2018-01-30 series is the update mentioned in
>> 87vaga9mgf@evledraar.gmail.com. You probably noticed this / just
>> didn't adjust the note since you queued in in pu already, but just in
>> case: the known issues in it have been resolved, but hopefully Johannes
>> Schindelin can test it on Windows & report.
>
> Sorry, I did not have time to look at this. All I can say is that the `pu`
> builds are green for a couple of days already. Which I celebrate!

Thanks, if you get time it would be great to know if:

time GIT_TEST_LONG=1 ./t3070-wildmatch.sh

Runs cleanly for you, and how long it takes now. Even though it's going
to take a long time still, I optimized the test a lot so I expect it'll
be quicker.


Re: [PATCH 3/3] t1404: use 'test_must_fail stderr='

2018-02-09 Thread Junio C Hamano
SZEDER Gábor  writes:

> On Fri, Feb 9, 2018 at 4:16 AM, Eric Sunshine  wrote:
>> On Thu, Feb 8, 2018 at 9:42 PM, SZEDER Gábor  wrote:
>>> Instead of 'test_must_fail git cmd... 2>output.err', which redirects
>>> the standard error of the 'test_must_fail' helper function as well,
>>> causing various issues as discussed in the previous patch.
>>
>> ECANTPARSE: This either wants to say:
>>
>> "Instead of , do ."
>
> The "do " part is in the subject line.

Please don't.  The title should not be a half-sentence that begins
the first paragraph.


Re: [PATCH 1/1] Mark messages for translations

2018-02-09 Thread Junio C Hamano
Alexander Shopov  writes:

> Small changes in messages to fit the style and typography of rest
> Reuse already translated messages if possible
> Do not translate messages aimed at developers of git
> Fix unit tests depending on the original string
> Use `test_i18ngrep` for tests with translatable strings
> Change and verifyrest of tests via `make GETTEXT_POISON=1 test`

Perhaps end each sentence with a full-stop?

> diff --git a/t/t0002-gitfile.sh b/t/t0002-gitfile.sh
> index 9670e8cbe..797dcf95b 100755
> --- a/t/t0002-gitfile.sh
> +++ b/t/t0002-gitfile.sh
> @@ -31,7 +31,7 @@ test_expect_success 'bad setup: invalid .git file format' '
>   echo "git rev-parse accepted an invalid .git file"
>   false
>   fi &&
> - if ! grep "Invalid gitfile format" .err
> + if ! test_i18ngrep "invalid gitfile format" .err

Shouldn't this rather be like so instead?

if test_i18ngrep ! "invalid gitfile format" .err

Ditto for the other negated use of test_i18ngrep we see in the same
file in this patch.


Re: [PATCH 2/2] packfile: refactor hash search with fanout table

2018-02-09 Thread René Scharfe
Am 02.02.2018 um 23:36 schrieb Jonathan Tan:
> Subsequent patches will introduce file formats that make use of a fanout
> array and a sorted table containing hashes, just like packfiles.
> Refactor the hash search in packfile.c into its own function, so that
> those patches can make use of it as well.
> 
> Signed-off-by: Jonathan Tan 
> ---
>   packfile.c| 19 +--
>   sha1-lookup.c | 24 
>   sha1-lookup.h | 21 +
>   3 files changed, 50 insertions(+), 14 deletions(-)
> 
> diff --git a/packfile.c b/packfile.c
> index 58bdced3b..29f5dc239 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -1712,7 +1712,8 @@ off_t find_pack_entry_one(const unsigned char *sha1,
>   {
>   const uint32_t *level1_ofs = p->index_data;
>   const unsigned char *index = p->index_data;
> - unsigned hi, lo, stride;
> + unsigned stride;
> + int ret;
>   
>   if (!index) {
>   if (open_pack_index(p))
> @@ -1725,8 +1726,6 @@ off_t find_pack_entry_one(const unsigned char *sha1,
>   index += 8;
>   }
>   index += 4 * 256;
> - hi = ntohl(level1_ofs[*sha1]);
> - lo = ((*sha1 == 0x0) ? 0 : ntohl(level1_ofs[*sha1 - 1]));
>   if (p->index_version > 1) {
>   stride = 20;
>   } else {
> @@ -1734,17 +1733,9 @@ off_t find_pack_entry_one(const unsigned char *sha1,
>   index += 4;
>   }
>   
> - while (lo < hi) {
> - unsigned mi = lo + (hi - lo) / 2;
> - int cmp = hashcmp(index + mi * stride, sha1);
> -
> - if (!cmp)
> - return nth_packed_object_offset(p, mi);
> - if (cmp > 0)
> - hi = mi;
> - else
> - lo = mi+1;
> - }
> + ret = bsearch_hash(sha1, level1_ofs, index, stride);
> + if (ret >= 0)
> + return nth_packed_object_offset(p, ret);

Going from unsigned to signed int means the patch breaks support for
more than 2G pack entries, which was put with 326bf39677 (Use uint32_t
for all packed object counts.) in 2007.

>   return 0;
>   }
>   
> diff --git a/sha1-lookup.c b/sha1-lookup.c
> index 4cf3ebd92..d11c5e526 100644
> --- a/sha1-lookup.c
> +++ b/sha1-lookup.c
> @@ -99,3 +99,27 @@ int sha1_pos(const unsigned char *sha1, void *table, 
> size_t nr,
>   } while (lo < hi);
>   return -lo-1;
>   }
> +
> +int bsearch_hash(const unsigned char *sha1, const void *fanout_,
> +  const void *table_, size_t stride)
> +{
> + const uint32_t *fanout = fanout_;

Why hide the type?  It doesn't make the function more generic.

> + const unsigned char *table = table_;
> + int hi, lo;
> +
> + hi = ntohl(fanout[*sha1]);
> + lo = ((*sha1 == 0x0) ? 0 : ntohl(fanout[*sha1 - 1]));
> +
> + while (lo < hi) {
> + unsigned mi = lo + (hi - lo) / 2;
> + int cmp = hashcmp(table + mi * stride, sha1);
> +
> + if (!cmp)
> + return mi;
> + if (cmp > 0)
> + hi = mi;
> + else
> + lo = mi + 1;
> + }
> + return -lo - 1;
> +}
> diff --git a/sha1-lookup.h b/sha1-lookup.h
> index cf5314f40..3c59e9cb1 100644
> --- a/sha1-lookup.h
> +++ b/sha1-lookup.h
> @@ -7,4 +7,25 @@ extern int sha1_pos(const unsigned char *sha1,
>   void *table,
>   size_t nr,
>   sha1_access_fn fn);
> +
> +/*
> + * Searches for sha1 in table, using the given fanout table to determine the
> + * interval to search, then using binary search. Returns the element index of
> + * the position found if successful, -i-1 if not (where i is the index of the
> + * least element that is greater than sha1).
> + *
> + * Takes the following parameters:
> + *
> + *  - sha1: the hash to search for
> + *  - fanout: a 256-element array of NETWORK-order 32-bit integers; the 
> integer
> + *at position i represents the number of elements in table whose first 
> byte
> + *is less than or equal to i
> + *  - table: a sorted list of hashes with optional extra information in 
> between
> + *  - stride: distance between two consecutive elements in table (should be
> + *GIT_MAX_RAWSZ or greater)
> + *
> + * This function does not verify the validity of the fanout table.
> + */
> +extern int bsearch_hash(const unsigned char *sha1, const void *fanout,
> + const void *table, size_t stride);
>   #endif
> 

Why not use sha1_pos()?  I guess because it avoids the overhead of the
accessor function, right?  And I wonder how much of difference it makes.

A binary search function for embedded hashes just needs the key, a
pointer to the first hash in the array, the stride and the number of
elements.  It can then be used with or without a fanout table, making it
more versatile.  Just a thought.

René


Re: [PATCH] rebase -p: fix incorrect commit message when calling `git merge`.

2018-02-09 Thread Johannes Schindelin
Hi,

On Thu, 8 Feb 2018, gregory.herr...@oracle.com wrote:

> From: Gregory Herrero 
> 
> Since commit dd6fb0053 ("rebase -p: fix quoting when calling `git
> merge`"), commit message of the merge commit being rebased is passed to
> the merge command using a subshell executing 'git rev-parse --sq-quote'.

Thanks for fixing my incomplete fix.

Ciao,
Johannes


Re: What's cooking in git.git (Jan 2018, #04; Wed, 31)

2018-02-09 Thread Johannes Schindelin
Hi,

On Thu, 1 Feb 2018, Ævar Arnfjörð Bjarmason wrote:

> On Thu, Feb 01 2018, Junio C. Hamano jotted:
> 
> > * ab/wildmatch-tests (2018-01-30) 10 commits
> >  - wildmatch test: mark test as EXPENSIVE_ON_WINDOWS
> >  - test-lib: add an EXPENSIVE_ON_WINDOWS prerequisite
> >  - wildmatch test: create & test files on disk in addition to in-memory
> >  - wildmatch test: perform all tests under all wildmatch() modes
> >  - wildmatch test: use test_must_fail, not ! for test-wildmatch
> >  - wildmatch test: remove dead fnmatch() test code
> >  - wildmatch test: use a paranoia pattern from nul_match()
> >  - wildmatch test: don't try to vertically align our output
> >  - wildmatch test: use more standard shell style
> >  - wildmatch test: indent with tabs, not spaces
> >
> >  More tests for wildmatch functions.
> >
> >  Expecting an update.
> >  cf. <87vaga9mgf@evledraar.gmail.com>
> 
> The 2018-01-30 series is the update mentioned in
> 87vaga9mgf@evledraar.gmail.com. You probably noticed this / just
> didn't adjust the note since you queued in in pu already, but just in
> case: the known issues in it have been resolved, but hopefully Johannes
> Schindelin can test it on Windows & report.

Sorry, I did not have time to look at this. All I can say is that the `pu`
builds are green for a couple of days already. Which I celebrate!

Ciao,
Dscho

Dear friend,

2018-02-09 Thread Baari Abdul


Dear friend,

 

I  Mr.  Baari Abdul, Head of Operation at Bank of Africa. I want invite into a 
business overture which involves an amount of $ 22.3 million. At your 
acceptance, this amount will be transferred to your name as a foreign partner.

 

I need your help to get this fund to be transfer out from here to your account, 
and we share at a ratio of 50% each. You will receive this amount by bank 
transfer.

Please send your full name...

You’re directly phone numbers...

Address

 

I will detail you more about this transaction but i need above data to make 
some vital changes in the transit account which will make your name appear as 
the true beneficiary of the fund. You have to contact me through my private 
e-mail at (baariabdul...@gmail.com) your prompt reply will be highly 
appreciated.

Sincerely,

CONTACTS ME THROUGH MY INFORMATION

Bank of Africa.(Boa).BF

Mr.  Baari Abdul  .


.


should "git bisect skip" not visually reduce number of revisions left?

2018-02-09 Thread Robert P. J. Day

  perhaps i'm misreading something, but i'm trying to put together a
hands-on example on how to use "git bisect" with feature branches as
explained here:

https://blog.smart.ly/2015/02/03/git-bisect-debugging-with-feature-branches/

and i'm using the linux kernel source as the content, so i started a
bisection session with:

$ git bisect start v4.15 v4.14
Bisecting: 8497 revisions left to test after this (roughly 13 steps)
[5d352e69c60e54b5f04d6e337a1d2bf0dbf3d94a] Merge tag 'media/v4.15-1' of
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
$

  so far, apparently, so good. now i'm running the script:

  for rev in $(git rev-list v4.14..v4.15 --merges --first-parent) ; do
echo "=== ${rev} ==="
git rev-list ${rev}^2 --not ${rev}^
git rev-list ${rev}^2 --not ${rev}^ | xargs git bisect skip
  done

so students can see the "git bisect skip" operations happening (there
will be 435 of them). first few lines of output verifying the opening
skip operations:

=== 24b1cccf922914f3d6eeb84036dde8338bc03abb ===
1df37383a8aeabb9b418698f0bcdffea01f4b1b2
Bisecting: 8497 revisions left to test after this (roughly 13 steps)
[5d352e69c60e54b5f04d6e337a1d2bf0dbf3d94a] Merge tag 'media/v4.15-1' of
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
=== 32c6cdf75c9270848d2d0ed7c814eba05b47081f ===
dd085168a74c99c3ebe7f813069e412eb8444243
8a95b74d50825067fb6c8af7f9db03e711b1cb9d
36b3a7726886f24c4209852a58e64435bde3af98
5beda7d54eafece4c974cfa9fbb9f60fb18fd20a
1d080f096fe33f031d26e19b3ef0146f66b8b0f1
7e702d17ed138cf4ae7c00e8c00681ed464587c7
40d4071ce2d20840d224b4a77b5dc6f752c9ab15
Bisecting: 8497 revisions left to test after this (roughly 13 steps)
[5d352e69c60e54b5f04d6e337a1d2bf0dbf3d94a] Merge tag 'media/v4.15-1' of
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
=== 07b0137c0268b8d0694a5f09284449353a1a6fed ===
... and so on ...

  what's weird(?) is that, while it's running right now, every "git
bisect skip" operation doesn't change the apparent number of revisions
left to examine -- it's always:

  Bisecting: 8497 revisions left to test after this (roughly 13 steps)

is that not supposed to change to reflect the increasing number of
revisions to skip? or does that stay the same for the entire process
until i start to bisect? or am i just doing something wrong?

rday


[BUG] Integer overflow when supplying large context value to diff --unified

2018-02-09 Thread Branko Majic
Hello,

Git versions tested: 2.13.6, 2.1.4

When passing-in a large context value for the --unified option for
git-diff, Git will produce an invalid-looking range information for
hunks.

For example, if running 'git diff --unified=10 HEAD^', the output will
include (this is just a run against my local git repo):

@@ -42,23 +42,23 @@ master_doc = 'index'

Note the numbers for denoting begin/end line etc "look fine"(they're
within the expected numeric range).

Now, if we pass on a big value to it (2 to the power of 32 divided by 2,
e.g. enough so that signed long int can't hold it without overflowing),
e.g. 'git diff --unified=2147483648 HEAD^', the output will include
(again, just a sample):

@@ -2147483700,4294967295- +2147483700,4294967295- @@

Note that the begin/end line numbers are way out of range. The diff
itself will actually contain no context lines.

Best regards

-- 
Branko Majic
XMPP: bra...@majic.rs
Please use only Free formats when sending attachments to me.

Бранко Мајић
XMPP: bra...@majic.rs
Молим вас да додатке шаљете искључиво у слободним форматима.


pgpxCCb6k1ljg.pgp
Description: OpenPGP digital signature


Re: [PATCH 2/3] t: teach 'test_must_fail' to save the command's stderr to a file

2018-02-09 Thread Jeff King
On Fri, Feb 09, 2018 at 03:42:34AM +0100, SZEDER Gábor wrote:

> To check that a git command fails and to inspect its error message we
> usually execute a command like this throughout our test suite:
> 
>   test_must_fail git command --option 2>output.err
> 
> Note that this command doesn't limit the redirection to the git
> command, but it redirects the standard error of the 'test_must_fail'
> helper function as well.  This is wrong for several reasons:
> 
>   - If that git command were to succeed or die for an unexpected
> reason e.g. signal, then 'test_must_fail's helpful error message
> would end up in the given file instead of on the terminal or in
> 't/test-results/$T.out', when the test is run with '-v' or
> '--verbose-log', respectively.
> 
>   - If the test is run with '-x', the trace of executed commands in
> 'test_must_fail' will go to stderr as well (except for more recent
> Bash versions supporting $BASH_XTRACEFD), and then in turn will
> end up in the given file.

I have to admit that I'm slightly negative on this approach, just
because:

  1. It requires every caller to do something special, rather than just
 using normal redirection. And the feature isn't as powerful as
 redirection. E.g., some callers do:

   test_must_fail foo >output 2>&1

 But:

   test_must_fail stderr=output foo >output

 is not quite right (stdout and stderr outputs may clobber each
 other, because they have independent position pointers).

  2. The "-x" problems aren't specific to test_must_fail at all. They're
 a general issue with shell functions.

I'm not entirely happy with saying "if you want to use -x, please use
bash". But given that it actually solves the problems everywhere with no
further effort, is it really that bad a solution?

For the error messages from test_must_fail, could we go in the same
direction, and send them to descriptor 4 rather than 2? We've already
staked out descriptor 4 as something magical that must be left alone
(see 9be795fb). If we can rely on that, then it becomes a convenient way
for functions to make sure their output is going to the script's stderr.

-Peff


Re: [PATCH v3 42/42] git-completion.bash: add GIT_COMPLETION_OPTIONS=all config

2018-02-09 Thread Ævar Arnfjörð Bjarmason

On Fri, Feb 09 2018, Nguyễn Thái Ngọc Duy jotted:

> By default, some option names (mostly --force, scripting related or for
> internal use) are not completable for various reasons. When
> GIT_COMPLETION_OPTIONS is set to all, all options (except hidden ones)
> are completable.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  contrib/completion/git-completion.bash |  6 +-
>  parse-options.c| 11 +++
>  2 files changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/contrib/completion/git-completion.bash 
> b/contrib/completion/git-completion.bash
> index 0ddf40063b..0cfa489a8e 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -36,6 +36,10 @@
>  #
>  # When set to "1", do not include "DWIM" suggestions in git-checkout
>  # completion (e.g., completing "foo" when "origin/foo" exists).
> +#
> +#   GIT_COMPLETION_OPTIONS
> +#
> +# When set to "all", complete all possible options

I was going to suggest some wording like:

When set to "all", include options considered unsafe such as --force
in the completion.

However per your cover letter it's not just used for that:

 10 --force
  4 --rerere-autoupdate
  1 --unsafe-paths
  1 --thin
  1 --overwrite-ignore
  1 --open-files-in-pager
  1 --null
  1 --ext-grep
  1 --exit-code
  1 --auto

I wonder if we shouldn't just make this only about --force, I don't see
why "git grep --o" should only show --or but not
--open-files-in-pager, and e.g. "git grep --" is already verbose so
we're not saving much by excluding those.

Then this could just become:

GIT_COMPLETION_SHOWUNSAFEOPTIONS=1

Or other similar boolean variable, for consistency with all the "*SHOW*
variables already in git-completion.bash.

>  case "$COMP_WORDBREAKS" in
>  *:*) : great ;;
> @@ -303,7 +307,7 @@ __gitcomp_builtin ()
>   if [ -z "$options" ]; then
>   # leading and trailing spaces are significant to make
>   # option removal work correctly.
> - options=" $(__git ${cmd/_/ } --git-completion-helper) $incl "
> + options=" $(__git ${cmd/_/ } 
> --git-completion-helper=$GIT_COMPLETION_OPTIONS) $incl "
>   for i in $excl; do
>   options="${options/ $i / }"
>   done
> diff --git a/parse-options.c b/parse-options.c
> index 979577ba2c..5b8b2b376e 100644
> --- a/parse-options.c
> +++ b/parse-options.c
> @@ -430,14 +430,17 @@ void parse_options_start(struct parse_opt_ctx_t *ctx,
>   * many options that do not suppress it properly.
>   */
>  static int show_gitcomp(struct parse_opt_ctx_t *ctx,
> - const struct option *opts)
> + const struct option *opts,
> + const char *arg)
>  {
>   for (; opts->type != OPTION_END; opts++) {
>   const char *suffix = "";
>
>   if (!opts->long_name)
>   continue;
> - if (opts->flags & (PARSE_OPT_HIDDEN | PARSE_OPT_NOCOMPLETE))
> + if (opts->flags & PARSE_OPT_HIDDEN)
> + continue;
> + if ((opts->flags & PARSE_OPT_NOCOMPLETE) && strcmp(arg, "all"))
>   continue;
>
>   switch (opts->type) {
> @@ -498,8 +501,8 @@ int parse_options_step(struct parse_opt_ctx_t *ctx,
>   goto show_usage;
>
>   /* lone --git-completion-helper is asked by git-completion.bash 
> */
> - if (ctx->total == 1 && !strcmp(arg + 1, 
> "-git-completion-helper"))
> - return show_gitcomp(ctx, options);
> + if (ctx->total == 1 && skip_prefix(arg + 1, 
> "-git-completion-helper=", ))
> + return show_gitcomp(ctx, options, arg);
>
>   if (arg[1] != '-') {
>   ctx->opt = arg + 1;


Re: [PATCH v3 14/14] commit-graph: build graph from starting commits

2018-02-09 Thread Derrick Stolee

On 2/9/2018 8:02 AM, SZEDER Gábor wrote:

On Thu, Feb 8, 2018 at 9:37 PM, Derrick Stolee  wrote:

Teach git-commit-graph to read commits from stdin when the
--stdin-commits flag is specified. Commits reachable from these
commits are added to the graph. This is a much faster way to construct
the graph than inspecting all packed objects, but is restricted to
known tips.

For the Linux repository, 700,000+ commits were added to the graph
file starting from 'master' in 7-9 seconds, depending on the number
of packfiles in the repo (1, 24, or 120).

It seems something went wrong with '--stdin-commits' in v3, look:

   ~/src/git (commit-graph-v2 %)$ time { git rev-parse HEAD | ./git
commit-graph --write --update-head --stdin-commits ; }
   ee3223fe116bf7031a6c1ad6d41e0456beefa754

   real  0m1.199s
   user  0m1.123s
   sys   0m0.024s

   ~/src/git (commit-graph-v3 %)$ time { git rev-parse HEAD | ./git
commit-graph write --update-head --stdin-commits ; }
   ee3223fe116bf7031a6c1ad6d41e0456beefa754

   real  0m30.766s
   user  0m29.120s
   sys   0m0.546s


Thanks, Szeder. You're right. This is the diff that I forgot to apply in 
the last commit:


diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 28d043b..175b967 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -257,7 +257,7 @@ static int graph_write(int argc, const char **argv)

    has_existing = !!get_graph_head_hash(opts.pack_dir, 
_graph_hash);


-   if (opts.stdin_packs) {
+   if (opts.stdin_packs || opts.stdin_commits) {
    struct strbuf buf = STRBUF_INIT;
    nr_lines = 0;
    alloc_lines = 128;


I'll work to create a test that ensures we are only adding commits 
reachable from specific commits to prevent this regression.


Thanks,
-Stolee


Re: categorization, documentation and packaging of "git core" commands

2018-02-09 Thread Robert P. J. Day
On Wed, 7 Feb 2018, Todd Zullinger wrote:

> Robert P. J. Day wrote:
> > not to belabour this (and i'm sure it's *way* too late for that),
> > but fedora has the following packaging scheme.  first, there's a
> > bunch of stuff in "git-core", which has no dependencies on any
> > other git-related packages.
>
> The split in Fedora between git and git-core is done to minimize the
> dependencies required for a minimal git install.  The initial reason
> was to to allow installing the git-core package on systems, in
> containers, etc. without requiring perl and its various dependencies
> to be installed.
>
> The name git-core was not chosen to imply any official status as
> core versus contrib from upstream.

  ... snip ...

  oh, i understand completely (i particularly like that fedora
supports a "git-extras" package with loads of cool stuff). remember
that this all started when i pointed out that, over at
https://git-scm.com/doc, there is a link entitled "Reference Manual"
(https://git-scm.com/docs) that assures the reader, "The official and
comprehensive man pages that are included in the Git package itself",
so it's just a matter of someone deciding what *exactly* corresponds
to "the Git package itself" to make sure all relevant man pages can be
found there.

  just being pedantic, as is my wont.

rday


Re: "git bisect run make" adequate to locate first unbuildable commit?

2018-02-09 Thread Christian Couder
On Fri, Feb 9, 2018 at 2:20 PM, Robert P. J. Day  wrote:
>
>   writing a short tutorial on "git bisect" and, all the details of
> special exit code 125 aside, if one wanted to locate the first
> unbuildable commit, would it be sufficient to just run?
>
>   $ git bisect run make
>
>   as i read it, make returns either 0, 1 or 2 so there doesn't appear
> to be any possibility of weirdness with clashing with a 125 exit code.
> am i overlooking some subtle detail here i should be aware of? thanks.

I think you are not overlooking anything.


[PATCH v6 6/7] convert: add tracing for 'working-tree-encoding' attribute

2018-02-09 Thread lars . schneider
From: Lars Schneider 

Add the GIT_TRACE_WORKING_TREE_ENCODING environment variable to enable
tracing for content that is reencoded with the 'working-tree-encoding'
attribute. This is useful to debug encoding issues.

Signed-off-by: Lars Schneider 
---
 convert.c| 25 +
 t/t0028-working-tree-encoding.sh |  2 ++
 2 files changed, 27 insertions(+)

diff --git a/convert.c b/convert.c
index dc9e2db6b5..5b49416ee1 100644
--- a/convert.c
+++ b/convert.c
@@ -266,6 +266,29 @@ static int will_convert_lf_to_crlf(size_t len, struct 
text_stat *stats,
 
 }
 
+static void trace_encoding(const char *context, const char *path,
+  const char *encoding, const char *buf, size_t len)
+{
+   static struct trace_key coe = TRACE_KEY_INIT(WORKING_TREE_ENCODING);
+   struct strbuf trace = STRBUF_INIT;
+   int i;
+
+   strbuf_addf(, "%s (%s, considered %s):\n", context, path, 
encoding);
+   for (i = 0; i < len && buf; ++i) {
+   strbuf_addf(
+   ,"| \e[2m%2i:\e[0m %2x \e[2m%c\e[0m%c",
+   i,
+   (unsigned char) buf[i],
+   (buf[i] > 32 && buf[i] < 127 ? buf[i] : ' '),
+   ((i+1) % 8 && (i+1) < len ? ' ' : '\n')
+   );
+   }
+   strbuf_addchars(, '\n', 1);
+
+   trace_strbuf(, );
+   strbuf_release();
+}
+
 static struct encoding {
const char *name;
struct encoding *next;
@@ -325,6 +348,7 @@ static int encode_to_git(const char *path, const char *src, 
size_t src_len,
error(error_msg, path, enc->name);
}
 
+   trace_encoding("source", path, enc->name, src, src_len);
dst = reencode_string_len(src, src_len, default_encoding, enc->name,
  _len);
if (!dst) {
@@ -340,6 +364,7 @@ static int encode_to_git(const char *path, const char *src, 
size_t src_len,
else
error(msg, path, enc->name, default_encoding);
}
+   trace_encoding("destination", path, default_encoding, dst, dst_len);
 
strbuf_attach(buf, dst, dst_len, dst_len + 1);
return 1;
diff --git a/t/t0028-working-tree-encoding.sh b/t/t0028-working-tree-encoding.sh
index f9ce3e5ef5..01789ae1b8 100755
--- a/t/t0028-working-tree-encoding.sh
+++ b/t/t0028-working-tree-encoding.sh
@@ -4,6 +4,8 @@ test_description='working-tree-encoding conversion via 
gitattributes'
 
 . ./test-lib.sh
 
+GIT_TRACE_WORKING_TREE_ENCODING=1 && export GIT_TRACE_WORKING_TREE_ENCODING
+
 test_expect_success 'setup test repo' '
git config core.eol lf &&
 
-- 
2.16.1



[PATCH v6 1/7] strbuf: remove unnecessary NUL assignment in xstrdup_tolower()

2018-02-09 Thread lars . schneider
From: Lars Schneider 

Since 3733e69464 (use xmallocz to avoid size arithmetic, 2016-02-22) we
allocate the buffer for the lower case string with xmallocz(). This
already ensures a NUL at the end of the allocated buffer.

Remove the unnecessary assignment.

Signed-off-by: Lars Schneider 
---
 strbuf.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/strbuf.c b/strbuf.c
index 1df674e919..55b7daeb35 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -781,7 +781,6 @@ char *xstrdup_tolower(const char *string)
result = xmallocz(len);
for (i = 0; i < len; i++)
result[i] = tolower(string[i]);
-   result[i] = '\0';
return result;
 }
 
-- 
2.16.1



[PATCH v6 2/7] strbuf: add xstrdup_toupper()

2018-02-09 Thread lars . schneider
From: Lars Schneider 

Create a copy of an existing string and make all characters upper case.
Similar xstrdup_tolower().

This function is used in a subsequent commit.

Signed-off-by: Lars Schneider 
---
 strbuf.c | 12 
 strbuf.h |  1 +
 2 files changed, 13 insertions(+)

diff --git a/strbuf.c b/strbuf.c
index 55b7daeb35..b635f0bdc4 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -784,6 +784,18 @@ char *xstrdup_tolower(const char *string)
return result;
 }
 
+char *xstrdup_toupper(const char *string)
+{
+   char *result;
+   size_t len, i;
+
+   len = strlen(string);
+   result = xmallocz(len);
+   for (i = 0; i < len; i++)
+   result[i] = toupper(string[i]);
+   return result;
+}
+
 char *xstrvfmt(const char *fmt, va_list ap)
 {
struct strbuf buf = STRBUF_INIT;
diff --git a/strbuf.h b/strbuf.h
index 14c8c10d66..df7ced53ed 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -607,6 +607,7 @@ __attribute__((format (printf,2,3)))
 extern int fprintf_ln(FILE *fp, const char *fmt, ...);
 
 char *xstrdup_tolower(const char *);
+char *xstrdup_toupper(const char *);
 
 /**
  * Create a newly allocated string using printf format. You can do this easily
-- 
2.16.1



[PATCH v6 0/7] convert: add support for different encodings

2018-02-09 Thread lars . schneider
From: Lars Schneider 

Hi,

Patches 1-4, 6 are preparation and helper functions.
Patch 5,7 are the actual change.

This series depends on Torsten's 8462ff43e4 (convert_to_git():
safe_crlf/checksafe becomes int conv_flags, 2018-01-13) which is already
in next.

Changes since v5:

* added 'core.checkRoundtripEncoding' to let the end user define a list
  of encodings for which Git performs conversion round trip checks
  (Junio's feedback)
* add a test case for round trip conversions
* move all round trip related changes into a new patch
* rename has_missing_utf_bom() to is_missing_required_utf_bom() (Junio)
* remove dead code in conversion round trip check (Junio)
* use *.proj files instead of *.txt files as example (Torsten's feedback)
* recommend explicitly setting the eol attribute if working-tree-encoding
  attribute is used (Torsten)
* improve test case and check that 'git ls-files --eol' works as expected
* rename variables from checkout_encoding to working_tree_encoding


Thanks,
Lars

   RFC: 
https://public-inbox.org/git/bdb9b884-6d17-4be3-a83c-f67e2afa2...@gmail.com/
v1: 
https://public-inbox.org/git/20171211155023.1405-1-lars.schnei...@autodesk.com/
v2: 
https://public-inbox.org/git/2017122915.39680-1-lars.schnei...@autodesk.com/
v3: 
https://public-inbox.org/git/20180106004808.77513-1-lars.schnei...@autodesk.com/
v4: 
https://public-inbox.org/git/20180120152418.52859-1-lars.schnei...@autodesk.com/
v5: https://public-inbox.org/git/20180129201855.9182-1-tbo...@web.de/


Base Ref:
Web-Diff: https://github.com/larsxschneider/git/commit/9e31832f3c
Checkout: git fetch https://github.com/larsxschneider/git encoding-v6 && git 
checkout 9e31832f3c


### Interdiff (v5..v6):

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 0e25b2c92b..d7a56054a5 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -530,6 +530,12 @@ core.autocrlf::
This variable can be set to 'input',
in which case no output conversion is performed.

+core.checkRoundtripEncoding::
+   A comma separated list of encodings that Git performs UTF-8 round
+   trip checks on if they are used in an `working-tree-encoding`
+   attribute (see linkgit:gitattributes[5]). The default value is
+   `SHIFT-JIS`.
+
 core.symlinks::
If false, symbolic links are checked out as small plain files that
contain the link text. linkgit:git-update-index[1] and
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index a8dbf4be30..ea5a9509c6 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -283,10 +283,10 @@ visualize the content.

 In these cases you can tell Git the encoding of a file in the working
 directory with the `working-tree-encoding` attribute. If a file with this
-attributes is added to Git, then Git reencodes the content from the
-specified encoding to UTF-8 and stores the result in its internal data
-structure (called "the index"). On checkout the content is encoded
-back to the specified encoding.
+attribute is added to Git, then Git reencodes the content from the
+specified encoding to UTF-8. Finally, Git stores the UTF-8 encoded
+content in its internal data structure (called "the index"). On checkout
+the content is reencoded back to the specified encoding.

 Please note that using the `working-tree-encoding` attribute may have a
 number of pitfalls:
@@ -296,32 +296,38 @@ number of pitfalls:
   expected encoding. Consequently, these files will appear different
   which typically causes trouble. This is in particular the case for
   older Git versions and alternative Git implementations such as JGit
-  or libgit2 (as of January 2018).
+  or libgit2 (as of February 2018).

-- Reencoding content to non-UTF encodings (e.g. SHIFT-JIS) can cause
-  errors as the conversion might not be round trip safe.
+- Reencoding content to non-UTF encodings can cause errors as the
+  conversion might not be UTF-8 round trip safe. If you suspect your
+  encoding to not be round trip safe, then add it to 
`core.checkRoundtripEncoding`
+  to make Git check the round trip encoding (see linkgit:git-config[1]).
+  SHIFT-JIS (Japanese character set) is known to have round trip issues
+  with UTF-8 and is checked by default.

 - Reencoding content requires resources that might slow down certain
   Git operations (e.g 'git checkout' or 'git add').

-Use the `working-tree-encoding` attribute only if you cannot store a file in
-UTF-8 encoding and if you want Git to be able to process the content as
-text.
+Use the `working-tree-encoding` attribute only if you cannot store a file
+in UTF-8 encoding and if you want Git to be able to process the content
+as text.

-Use the following attributes if your '*.txt' files are UTF-16 encoded
-with byte order mark (BOM) and you want Git to perform automatic line
-ending conversion based on your platform.
+As an example, use the following attributes if your '*.proj' files 

  1   2   >