date:20180928

Re: [PATCH v6 3/7] eoie: add End of Index Entry (EOIE) extension

2018-09-28 Thread Duy Nguyen

On Wed, Sep 26, 2018 at 03:54:38PM -0400, Ben Peart wrote:
> +
> +#define EOIE_SIZE (4 + GIT_SHA1_RAWSZ) /* <4-byte offset> + <20-byte hash> */
> +#define EOIE_SIZE_WITH_HEADER (4 + 4 + EOIE_SIZE) /* <4-byte signature> + 
> <4-byte length> + EOIE_SIZE */

If you make these variables instead of macros, you can use
the_hash_algo, which makes this code sha256-friendlier and probably
can explain less, e.g. ...

> +
> +static size_t read_eoie_extension(const char *mmap, size_t mmap_size)
> +{
> + /*
> +  * The end of index entries (EOIE) extension is guaranteed to be last
> +  * so that it can be found by scanning backwards from the EOF.
> +  *
> +  * "EOIE"
> +  * <4-byte length>
> +  * <4-byte offset>
> +  * <20-byte hash>
> +  */

uint32_t EOIE_SIZE = 4 + the_hash_algo->rawsz;
uint32_t EOIE_SIZE_WITH_HEADER = 4 + 4 + EOIE_SIZE;

> + const char *index, *eoie;
> + uint32_t extsize;
> + size_t offset, src_offset;
> + unsigned char hash[GIT_MAX_RAWSZ];
> + git_hash_ctx c;
--
Duy

Re: [PATCH v3 5/6] split-index: don't compare stat data of entries already marked for split index

2018-09-28 Thread Duy Nguyen

On Fri, Sep 28, 2018 at 06:24:58PM +0200, SZEDER Gábor wrote:
> When unpack_trees() constructs a new index, it copies cache entries
> from the original index [1].  prepare_to_write_split_index() has to
> deal with this, and it has a dedicated code path for copied entries
> that are present in the shared index, where it compares the cached
> data in the corresponding copied and original entries.  If the cached
> data matches, then they are considered the same; if it differs, then
> the copied entry will be marked for inclusion as a replacement entry
> in the just about to be written split index by setting the
> CE_UPDATE_IN_BASE flag.
> 
> However, a cache entry already has its CE_UPDATE_IN_BASE flag set upon
> reading the split index, if the entry already has a replacement entry
> there, or upon refreshing the cached stat data, if the corresponding
> file was modified.  The state of this flag is then preserved when
> unpack_trees() copies a cache entry from the shared index.
> 
> So modify prepare_to_write_split_index() to check the copied cache
> entries' CE_UPDATE_IN_BASE flag first, and skip the thorough
> comparison of cached data if the flag is already set.

OK so this is an optimization, not a bug fix. Right?

> Note that comparing the cached data in copied and original entries in

s/cached data/cached stat data/ ? I was confused for a bit.

> the shared index might actually be entirely unnecessary.  In theory
> all code paths refreshing the cached stat data of an entry in the
> shared index should set the CE_UPDATE_IN_BASE flag in that entry, and
> unpack_trees() should preserve this flag when copying cache entries.
> This means that the cached data is only ever changed if the
> CE_UPDATE_IN_BASE flag is set as well.  Our test suite seems to
> confirm this: instrumenting the conditions in question and running the
> test suite repeatedly with 'GIT_TEST_SPLIT_INDEX=yes' showed that the
> cached data in a copied entry differs from the data in the shared
> entry only if its CE_UPDATE_IN_BASE flag is indeed set.

Yes I was probably just being paranoid (or sticking to simpler
checks). I was told that split index is computation expensive and not
doing unnecesary/expensive checks may help. But let's leave it for
later.

> + } else {
> + /*
> +  * Thoroughly compare the cached data to see
> +  * whether it should be marked for inclusion
> +  * in the split index.
> +  *
> +  * This comparison might be unnecessary, as
> +  * code paths modifying the cached data do
> +  * set CE_UPDATE_IN_BASE as well.
> +  */
> + const unsigned int ondisk_flags =
> + CE_STAGEMASK | CE_VALID |
> + CE_EXTENDED_FLAGS;
> + unsigned int ce_flags, base_flags, ret;
> + ce_flags = ce->ce_flags;
> + base_flags = base->ce_flags;
> + /* only on-disk flags matter */
> + ce->ce_flags   &= ondisk_flags;
> + base->ce_flags &= ondisk_flags;
> + ret = memcmp(>ce_stat_data, 
> >ce_stat_data,
> +  offsetof(struct cache_entry, name) 
> -
> +  offsetof(struct cache_entry, 
> ce_stat_data));
> + ce->ce_flags = ce_flags;
> + base->ce_flags = base_flags;

Maybe make this block a separate function (compare_ce_content or
something). The amount of indentation is getting too high.

> + if (ret)
> + ce->ce_flags |= CE_UPDATE_IN_BASE;
> + }
>   discard_cache_entry(base);
>   si->base->cache[ce->index - 1] = ce;
>   }
> -- 
> 2.19.0.361.gafc87ffe72
>

Re: [PATCH v3 6/6] split-index: smudge and add racily clean cache entries to split index

2018-09-28 Thread Duy Nguyen

On Fri, Sep 28, 2018 at 06:24:59PM +0200, SZEDER Gábor wrote:
> diff --git a/t/t1701-racy-split-index.sh b/t/t1701-racy-split-index.sh
> index fbb77046da..5dc221ef38 100755
> --- a/t/t1701-racy-split-index.sh
> +++ b/t/t1701-racy-split-index.sh
> @@ -148,7 +148,7 @@ done
>  
>  for trial in $trials
>  do
> - test_expect_failure "update the split index when a racily clean cache 
> entry is stored only in the shared index $trial" '
> + test_expect_success "update the split index when a racily clean cache 
> entry is stored only in the shared index #$trial" '

The the new '#' before '$trial' intended?

>   rm -f .git/index .git/sharedindex.* &&
>  
>   # The next three commands must be run within the same
> @@ -170,8 +170,6 @@ do
>   # entry of racy-file is only stored in the shared index.
>   # A corresponding replacement cache entry with smudged
>   # stat data should be added to the new split index.
> - #
> - # Alas, such a smudged replacement entry is not added!
>   git update-index --add other-file &&
>  
>   # Subsequent git commands should notice the smudged
> @@ -182,7 +180,7 @@ done
>  
>  for trial in $trials
>  do
> - test_expect_failure "update the split index after unpack trees() copied 
> a racily clean cache entry from the shared index $trial" '
> + test_expect_success "update the split index after unpack trees() copied 
> a racily clean cache entry from the shared index #$trial" '
>   rm -f .git/index .git/sharedindex.* &&
>  
>   # The next three commands must be run within the same
--
Duy

wrong output on fail

2018-09-28 Thread Paul Wratt

--
...
Total 21 (delta 8), reused 0 (delta 0)
error: RPC failed; result=56, HTTP code = 0
fatal: The remote end hung up unexpectedly
fatal: The remote end hung up unexpectedly
Everything up-to-date
--

I am getting the above from "git push".

I am having intermittent HTTPS connection failures with my ISP
(Vodafone NZ, mobile), although HTTP works fine. It may even be a
peering issue.

The failures above are accounted for, but that last line is definitely wrong

Cheers
Paul

Re: [PATCH v6 3/7] eoie: add End of Index Entry (EOIE) extension

2018-09-28 Thread SZEDER Gábor

On Wed, Sep 26, 2018 at 03:54:38PM -0400, Ben Peart wrote:
> diff --git a/read-cache.c b/read-cache.c
> index 6ba99e2c96..80255d3088 100644
> --- a/read-cache.c
> +++ b/read-cache.c

> +static size_t read_eoie_extension(const char *mmap, size_t mmap_size)
> +{

<>

> + the_hash_algo->final_fn(hash, );
> + if (hashcmp(hash, (const unsigned char *)index))
> + return 0;

Please use !hasheq() instead of hashcmp().

Re: [PATCH] git doc: direct bug reporters to mailing list archive (Re: [PATCH v2] git.txt: mention mailing list archive)

2018-09-28 Thread Jonathan Nieder

Jonathan Nieder wrote:
> Junio C Hamano wrote:

>> OK.  This unfortunately came a bit too late for today's integration
>> cycle, so I'll leave this in my inbox and replace what is queued
>> with it later.
>>
>> Unless there is another one to supersede this proposal comes before
>> that happens, that is.
>>
>> Thanks.
>
> Sounds good.  Thanks for the heads up, and thanks for the heads up.

Ahem, what I meant is "and thanks for the help with writing the
patch".

Sincerely,
Jonathan

Re: [PATCH] git doc: direct bug reporters to mailing list archive (Re: [PATCH v2] git.txt: mention mailing list archive)

2018-09-28 Thread Jonathan Nieder

Junio C Hamano wrote:

> OK.  This unfortunately came a bit too late for today's integration
> cycle, so I'll leave this in my inbox and replace what is queued
> with it later.
>
> Unless there is another one to supersede this proposal comes before
> that happens, that is.
>
> Thanks.

Sounds good.  Thanks for the heads up, and thanks for the heads up.

Jonathan

Re: [PATCH] git doc: direct bug reporters to mailing list archive (Re: [PATCH v2] git.txt: mention mailing list archive)

2018-09-28 Thread Junio C Hamano

Jonathan Nieder  writes:

> My experience is that bug reporters are very sensitive to hints the
> project gives about what kind of bugs they want to receive.  I'd
> rather make use of that lesson now instead of waiting to relearn it in
> the wild.  Here goes.

OK.  This unfortunately came a bit too late for today's integration
cycle, so I'll leave this in my inbox and replace what is queued 
with it later.

Unless there is another one to supersede this proposal comes before
that happens, that is.

Thanks.

Re: [PATCH] submodule: Alllow staged changes for get_superproject_working_tree

2018-09-28 Thread Junio C Hamano

Sam McKelvie  writes:

>> Or perhaps
>> 
>> rev-parse: --show-superproject-working-tree should work during a merge
>> 
>> may be more to the point.  It does not hint the root cause of the
>> bug like the other one, but is more direct how the breakage would
>> have been observed by the end users.
>> 
> Haha, that is closer to my original title that Stefan wanted to change:
>
> submodule.c: Make get_superproject_working_tree() work when superproject has 
> unmerged changes of the submodule reference
>
> Though I could see why mine was too long.
>
> I’m really happy with both your suggestions

I've pushed out with the "rev-parse: ..." as the title.

Thanks for the fix.

Re: [PATCH v6 4/7] config: add new index.threads config setting

2018-09-28 Thread Junio C Hamano

Ramsay Jones  writes:

>>     if (!nr) {
>>     ieot_blocks = istate->cache_nr / THREAD_COST;
>> -   if (ieot_blocks < 1)
>> -   ieot_blocks = 1;
>>     cpus = online_cpus();
>>     if (ieot_blocks > cpus - 1)
>>     ieot_blocks = cpus - 1;
>
> So, am I reading this correctly - you need cpus > 2 before an
> IEOT extension block is written out?
>
> OK.

Why should we be even calling online_cpus() in this codepath to
write the index in a single thread to begin with?

The number of cpus that readers would use to read this index file
has nothing to do with the number of cpus available to this
particular writer process.

Re: [PATCH v3 4/4] transport.c: introduce core.alternateRefsPrefixes

2018-09-28 Thread Taylor Blau

On Fri, Sep 28, 2018 at 01:30:57AM -0400, Jeff King wrote:
> On Thu, Sep 27, 2018 at 09:25:45PM -0700, Taylor Blau wrote:
>
> > The recently-introduced "core.alternateRefsCommand" allows callers to
> > specify with high flexibility the tips that they wish to advertise from
> > alternates. This flexibility comes at the cost of some inconvenience
> > when the caller only wishes to limit the advertisement to one or more
> > prefixes.
> >
> > For example, to advertise only tags, a caller using
> > 'core.alternateRefsCommand' would have to do:
> >
> >   $ git config core.alternateRefsCommand ' \
> >   git -C "$1" for-each-ref refs/tags --format="%(objectname)"'
>
> This has the same "$@" issue as the previous one, I think (which only
> makes your point about it being cumbersome more true!).

Hmm. I'll be curious to how you respond to my other message about the
same topic. I feel that whatever the outcome there is will affect both
locations in the same way.

> > In the case that the caller wishes to specify multiple prefixes, they
> > may separate them by whitespace. If "core.alternateRefsCommand" is set,
> > it will take precedence over "core.alternateRefsPrefixes".
>
> Just a meta-comment: I don't particularly mind this discussion in the
> commit message, but since these points ought to be in the documentation
> anyway, it may make sense to omit them here in the name of brevity.

Sure, that makes sense.

> > +core.alternateRefsPrefixes::
> > +   When listing references from an alternate, list only references that 
> > begin
> > +   with the given prefix. Prefixes match as if they were given as 
> > arguments to
> > +   linkgit:git-for-each-ref[1]. To list multiple prefixes, separate them 
> > with
> > +   whitespace. If `core.alternateRefsCommand` is set, setting
> > +   `core.alternateRefsPrefixes` has no effect.
>
> Looks good.
>
> > diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
> > index 503dde35a4..3449967cc7 100755
> > --- a/t/t5410-receive-pack.sh
> > +++ b/t/t5410-receive-pack.sh
> > @@ -46,4 +46,12 @@ test_expect_success 'with core.alternateRefsCommand' '
> > test_cmp expect actual.haves
> >  '
> >
> > +test_expect_success 'with core.alternateRefsPrefixes' '
> > +   test_config -C fork core.alternateRefsPrefixes "refs/tags" &&
> > +   git rev-parse one three two >expect &&
> > +   printf "" | git receive-pack fork >actual &&
> > +   extract_haves actual.haves &&
> > +   test_cmp expect actual.haves
> > +'
>
> If you follow my suggestion on the test setup from the last patch, it
> would make sense to just put "refs/heads/public/" here. Although neither
> that nor what you have here tests the whitespace separation. Possibly
> there should be a third hierarchy.

Sounds good; that's what I did.

> > diff --git a/transport.c b/transport.c
> > index e271b66603..83474add28 100644
> > --- a/transport.c
> > +++ b/transport.c
> > @@ -1341,6 +1341,11 @@ static void fill_alternate_refs_command(struct 
> > child_process *cmd,
> > argv_array_pushf(>args, "--git-dir=%s", repo_path);
> > argv_array_push(>args, "for-each-ref");
> > argv_array_push(>args, "--format=%(objectname)");
> > +
> > +   if (!git_config_get_value("core.alternateRefsPrefixes", 
> > )) {
> > +   argv_array_push(>args, "--");
> > +   argv_array_split(>args, value);
> > +   }
>
> And this part looks good.

Thanks for the review of this patch, too.

Thanks,
Taylor

Re: [PATCH] git-rebase.sh: fix typos in error messages

2018-09-28 Thread Junio C Hamano

Junio C Hamano  writes:

> ...  However, because the same mistakes are inherited to
> builtin/rebase.c by these topics, we'd need a matching fix to
> correct 07664161 ("builtin rebase: error out on incompatible
> option/mode combinations", 2018-08-08) and either squash the fix
> into that commit, or queue it on top of pk/rebase-in-c-5-test topic.
>
> Will queue; thanks.

Here is what I'd queue, too.

-- >8 --
Subject: [PATCH] rebase: fix typos in error messages

The separator between words in a multi-word option name is a dash,
not an underscore.

Inspired by a matching change by Ralf Thielow for the scripted
version of "git rebase".

Signed-off-by: Junio C Hamano 
---
 builtin/rebase.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/rebase.c b/builtin/rebase.c
index 1a697d70c9..0f9a40aae5 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -1135,15 +1135,15 @@ int cmd_rebase(int argc, const char **argv, const char 
*prefix)
 *   git-rebase.txt caveats with "unless you know what you 
are doing"
 */
if (options.rebase_merges)
-   die(_("error: cannot combine '--preserve_merges' with "
+   die(_("error: cannot combine '--preserve-merges' with "
  "'--rebase-merges'"));
 
if (options.rebase_merges) {
if (strategy_options.nr)
-   die(_("error: cannot combine '--rebase_merges' with "
+   die(_("error: cannot combine '--rebase-merges' with "
  "'--strategy-option'"));
if (options.strategy)
-   die(_("error: cannot combine '--rebase_merges' with "
+   die(_("error: cannot combine '--rebase-merges' with "
  "'--strategy'"));
}
 
-- 
2.19.0-271-gfe8321ec05

Re: [PATCH v3 3/4] transport.c: introduce core.alternateRefsCommand

2018-09-28 Thread Taylor Blau

On Fri, Sep 28, 2018 at 01:26:13AM -0400, Jeff King wrote:
> On Thu, Sep 27, 2018 at 09:25:42PM -0700, Taylor Blau wrote:
>
> > Let the repository that has alternates configure this command to avoid
> > trusting the alternate to provide us a safe command to run in the shell.
> > To behave differently on each alternate (e.g., only list tags from
> > alternate A, only heads from B) provide the path of the alternate as the
> > first argument.
>
> Well, you also need to pass the path so it knows which repo to look at.
> Which I think is the primary reason we do it, but behaving differently
> for each alternate is another option.

Yeah. I think that the clearer argument is yours, so I'll amend my copy.
I am thinking of:

  To find the alternate, pass its absolute path as the first argument.

How does that sound?

> > +core.alternateRefsCommand::
> > +   When advertising tips of available history from an alternate, use the 
> > shell to
> > +   execute the specified command instead of linkgit:git-for-each-ref[1]. 
> > The
> > +   first argument is the absolute path of the alternate. Output must be of 
> > the
> > +   form: `%(objectname)`, where multiple tips are separated by newlines.
>
> I wonder if people may be confused about the %(objectname) syntax, since
> it's specific to for-each-ref.  Now that we've simplified the output
> format to a single value, perhaps we should define it more directly.
> E.g., like:
>
>   The output should contain one hex object id per line (i.e., the same
>   as produced by `git for-each-ref --format='%(objectname)'`).

I think that that's clearer, thanks. I applied it pretty much as you
suggested, but changed 'should' to 'must' and dropped the leading 'the'.

> Now that we've dropped the refname requirement from the output, it is
> more clear that this really does not have to be about refs at all.  In
> the most technical sense, what we really allow in the output is any
> object id X for which the alternate promises it has all objects
> reachable from X. Ref tips are a convenient and efficient way of
> providing that, but they are not the only possibility (and likewise, it
> is fine to omit duplicates or even tips that are ancestors of other
> tips).
>
> I think that's probably getting _too_ technical, though. It probably
> makes sense to just keep thinking of these as "what are the ref tips".

Yep, I agree completely.

> > +This is useful when a repository only wishes to advertise some of its
> > +alternate's references as ".have"'s. For example, to only advertise branch
>
> Maybe put ".have" into backticks for formatting?

Good idea, thanks. I took this locally as suggested.

> > +heads, configure `core.alternateRefsCommand` to the path of a script which 
> > runs
> > +`git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads`.
>
> Does that script actually work? Because of the way we invoke shell
> commands with arguments, I think we'd end up with:
>
>   git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads "$@"

I think that you're right...

> Possibly for-each-ref would ignore the extra path argument (thinking
> it's a ref pattern that just doesn't match), but it's definitely not
> what you intended. You'd have to write:
>
>   f() { git --git-dir=$1 ...etc; } f
>
> in the usual way. That's a minor pain, but it's what makes the more
> direct:
>
>   /my/script
>
> work.

...but this was what I was trying to get across with saying "...to the
path of a script which runs...", such that we would get the implicit
scoping that you make explicit in your example with "f() { ... }; f".

Does that seem OK as-is after the additional context? I think that after
reading your response, it seems to be confusing, so perhaps it should be
changed...

> The other alternative is to pass $GIT_DIR in the environment on behalf
> of the program. Then writing:
>
>   git for-each-ref --format='%(objectname)' refs/heads
>
> would Just Work. But it's a bit subtle, since it is not immediately
> obvious that the command is meant to run in a different repository.

I think that we discussed this approach a bit off-list, and I had the
idea that it was too fragile to work in practice, and that it would be
too surprising for callers to suddenly be in a different world.

I say this not because it wouldn't make this particular scenario more
convenient, which it uncountably would, but because it would make other
scenarios _more_ complicated.

For example, if a caller uses an alternate reference backed, perhaps,
MySQL (or anything that _isn't_ Git), they're not going to want to have
these GIT_ environment variable set.

So, I think that the greatest common denominator between the two is to
pass the alternate's absolute path as the first argument.

> > diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
> > new file mode 100755
> > index 00..503dde35a4
> > --- /dev/null
> > +++ b/t/t5410-receive-pack.sh
> > @@ -0,0 +1,49 @@
> > +#!/bin/sh
> > +
> > +test_description='git

Re: [PATCH] git-rebase.sh: fix typos in error messages

2018-09-28 Thread Junio C Hamano

Ralf Thielow  writes:

> Signed-off-by: Ralf Thielow 
> ---
>  git-rebase.sh | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

This patch itself will soon stop mattering once the group of
rebase-in-c topics graduate, which hopefully will happen during this
cycle.  However, because the same mistakes are inherited to
builtin/rebase.c by these topics, we'd need a matching fix to
correct 07664161 ("builtin rebase: error out on incompatible
option/mode combinations", 2018-08-08) and either squash the fix
into that commit, or queue it on top of pk/rebase-in-c-5-test topic.

Will queue; thanks.

>
> diff --git a/git-rebase.sh b/git-rebase.sh
> index 7973447645..45b6ee9c0e 100755
> --- a/git-rebase.sh
> +++ b/git-rebase.sh
> @@ -553,15 +553,15 @@ then
>   # Note: incompatibility with --interactive is just a strong warning;
>   #   git-rebase.txt caveats with "unless you know what you are doing"
>   test -n "$rebase_merges" &&
> - die "$(gettext "error: cannot combine '--preserve_merges' with 
> '--rebase-merges'")"
> + die "$(gettext "error: cannot combine '--preserve-merges' with 
> '--rebase-merges'")"
>  fi
>  
>  if test -n "$rebase_merges"
>  then
>   test -n "$strategy_opts" &&
> - die "$(gettext "error: cannot combine '--rebase_merges' with 
> '--strategy-option'")"
> + die "$(gettext "error: cannot combine '--rebase-merges' with 
> '--strategy-option'")"
>   test -n "$strategy" &&
> - die "$(gettext "error: cannot combine '--rebase_merges' with 
> '--strategy'")"
> + die "$(gettext "error: cannot combine '--rebase-merges' with 
> '--strategy'")"
>  fi
>  
>  if test -z "$rebase_root"

Re: [PATCH] submodule: Alllow staged changes for get_superproject_working_tree

2018-09-28 Thread Sam McKelvie




> On Sep 28, 2018, at 11:00 AM, Junio C Hamano  wrote:
> 
> Sam McKelvie  writes:
> 
>>> Ah, that, too.  I meant to correct triple ell, though ;-)
>>> ...
>> 
>> I wholeheartedly approve of that plan and your tweaking commit below. Thank 
>> you, Junio.
> 
> Thanks for a fix.  But now I re-read the title and think about it,
> this is mistitled.  The word 'stage' in "ls-files --stage" is not
> about 'stage' people use when they talk about "staged changes" at
> all.  The latter is what "diff --cached" is about---what's different
> between HEAD and the index.  The 'stage' "ls-files --stage" talks
> about is "which parent the cache entry came from, among common
> ancestor, us, or the other branch being merged".
> 
> Also we are not really "allowing" with this change; "allowing" makes
> it sound as if we were earlier "forbidding" with a good reason and
> the change is lifting the limitation.
> 
> We used to incorrectly die when superproject is resolving a conflict
> for the submodule we are currently in, and that is what the patch
> fixed.
> 
> submodule: parse output of conflicted ls-files in superproject correctly
> 
> is the shortest I could come up with, while touching all the points
> that need to be touched and still being technically not-incorrect.
> 
> Or perhaps
> 
> rev-parse: --show-superproject-working-tree should work during a merge
> 
> may be more to the point.  It does not hint the root cause of the
> bug like the other one, but is more direct how the breakage would
> have been observed by the end users.
> 
Haha, that is closer to my original title that Stefan wanted to change:

submodule.c: Make get_superproject_working_tree() work when superproject has 
unmerged changes of the submodule reference

Though I could see why mine was too long.

I’m really happy with both your suggestions. If you still think you can squash 
it with your own tweaks, great. Let me know
If you’d prefer another patch.

Re: [PATCH] strbuf.h: format according to coding guidelines

2018-09-28 Thread Stefan Beller

On Fri, Sep 28, 2018 at 2:42 PM Junio C Hamano  wrote:
>
> Stefan Beller  writes:
>
> > The previous patch suggested the strbuf header to be the leading example
> > of how we would want our APIs to be documented. This may lead to some
> > scrutiny of that code and the coding style (which is different from the
> > API documentation style) and hence might be taken as an example on how
> > to format code as well.
> >
> > So let's format strbuf.h in a way that we'd like to see:
> > * omit the extern keyword from function declarations
> > * name all parameters (usually the parameters are obvious from its type,
> >   but consider exceptions like
> >   `int strbuf_getwholeline_fd(struct strbuf *, int, int);`
> > * break overly long lines
> >
> > Signed-off-by: Stefan Beller 
> > ---
> >
> > Maybe this on top of Junios guideline patch?
>
> If we were to do this, I'd rather see us do a very good job on this
> file first, with "We are going to use this file as the best current
> practice model for an API header file" to begin its log message,
> and then recommend its use as the model on top.

I started going through that file and undoing "the naming all parameters"
and additionally started to remove any obvious parameter name
(mostly the first argument, that is of struct strbuf, and sometimes
was called sb or out)

But it seems in addition to all this we also want to re-read the
documentation and make sure it reflects the API accurately and
describes it well.

Re: [PATCH] strbuf.h: format according to coding guidelines

2018-09-28 Thread Junio C Hamano

Stefan Beller  writes:

> The previous patch suggested the strbuf header to be the leading example
> of how we would want our APIs to be documented. This may lead to some
> scrutiny of that code and the coding style (which is different from the
> API documentation style) and hence might be taken as an example on how
> to format code as well.
>
> So let's format strbuf.h in a way that we'd like to see:
> * omit the extern keyword from function declarations
> * name all parameters (usually the parameters are obvious from its type,
>   but consider exceptions like
>   `int strbuf_getwholeline_fd(struct strbuf *, int, int);`
> * break overly long lines
>
> Signed-off-by: Stefan Beller 
> ---
>
> Maybe this on top of Junios guideline patch?

If we were to do this, I'd rather see us do a very good job on this
file first, with "We are going to use this file as the best current
practice model for an API header file" to begin its log message,
and then recommend its use as the model on top.

Re: [PATCH] strbuf.h: format according to coding guidelines

2018-09-28 Thread Junio C Hamano

Junio C Hamano  writes:

> I actually do not mind the rule to be more like
>
>  * Use the same parameter names used in the function declaration when
>the description in the API documentation refers the parameter.

Assuming that we adopt the above guideline, let's extending it to
the original patch's review.

The following is a good example.  FIRST and SECOND would have been
upcased if this followed my earlier illustration to make them stand
out as references to the parameters, but it is already readable
without upcasing _and_ naming parameters is helping here.

 /**
  * Compare two buffers. Returns an integer less than, equal to, or greater
  * than zero if the first buffer is found, respectively, to be less than,
  * to match, or be greater than the second buffer.
  */
-extern int strbuf_cmp(const struct strbuf *, const struct strbuf *);
+int strbuf_cmp(const struct strbuf *first, const struct strbuf *second);

The next one could be improved and say something like: "Remove LEN
bytes in the strbuf SB starting at offset POS", as it already had
'pos' and 'len' that are readily usable.  Notice that "Remove LEN
bytes starting at offset POS" is a sufficiently clear description
and that is why I do not think we should require all parameters to
be named.

 /**
  * Remove given amount of data from a given position of the buffer.
  */
-extern void strbuf_remove(struct strbuf *, size_t pos, size_t len);
+void strbuf_remove(struct strbuf *sb, size_t pos, size_t len);

The last example is a job half-done.  The original had pos and len
parameters and referred to them in the text, but just said "with the
given data".  Now we have data and data_len, "the given data" can
and should be clarified by referring to them.

 /**
  * Remove the bytes between `pos..pos+len` and replace it with the given
  * data.
  */
-extern void strbuf_splice(struct strbuf *, size_t pos, size_t len,
- const void *, size_t);
+void strbuf_splice(struct strbuf *sb, size_t pos, size_t len,
+  const void *data, size_t data_len);

[PATCH] git doc: direct bug reporters to mailing list archive (Re: [PATCH v2] git.txt: mention mailing list archive)

2018-09-28 Thread Jonathan Nieder

Subject: git doc: direct bug reporters to mailing list archive

The mailing list archive can help a user encountering a bug to tell
whether a recent regression has already been reported and whether a
longstanding bug has already had some discussion to start their
thinking.

Based-on-patch-by: Martin Ågren 
Improved-by: Junio C Hamano 
Signed-off-by: Jonathan Nieder 
---
Junio C Hamano wrote:
>> Jonathan Nieder wrote:

>>> Hm.  I think this encourages a behavior that I want to discourage:
>>> assuming that if a bug has already been reported then there's nothing
>>> more for the new user to add.
[...]
> Yup, in short I think any one of the above three is good enough.
> Let's just pick one and move on.  Unless somebody sends in an
> improvement that can be applied readily, by default I'll just "git
> am" the one Martin sent, as that is the easiest thing to do ;-).

I assume this is meant as a nudge, in which case nudge successful. ;-)

My experience is that bug reporters are very sensitive to hints the
project gives about what kind of bugs they want to receive.  I'd
rather make use of that lesson now instead of waiting to relearn it in
the wild.  Here goes.

Thanks, both.

 Documentation/git.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index 74a9d7edb4..8e6a92e8ba 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -858,7 +858,9 @@ Reporting Bugs
 
 Report bugs to the Git mailing list  where the
 development and maintenance is primarily done.  You do not have to be
-subscribed to the list to send a message there.
+subscribed to the list to send a message there.  See the list archive
+at https://public-inbox.org/git for previous bug reports and other
+discussions.
 
 Issues which are security relevant should be disclosed privately to
 the Git Security mailing list .
-- 
2.19.0.605.g01d371f741

Re: [PATCH] Improvement to only call Git Credential Helper once

2018-09-28 Thread Kyle Hubert

Thank you for the review, commenting inline.

On Fri, Sep 28, 2018 at 3:29 PM Junio C Hamano  wrote:
> Kyle Hubert  writes:
>
> > Subject: Re: [PATCH] Improvement to only call Git Credential Helper once
>
> Nobody will send in a patch to worsen things, so phrases like
> "Improvement to" that convey no useful information has no place on
> the title.
>
> There probably are multiple ways that credential helpers are not
> called just once and many of them probably are legit (e.g. "get" is
> not the only request one helper can receive).  It is unclear why
> "only call helper once" is an improvement unless the reader reads
> more, which means the title could be a lot more improved.
>
> Side note: this matters because in "git shortlog --no-merges"
> the title is the only thing you can tell your readers what
> your contribution was about.

Yes, thank you. I just joined the mailing list. I'm getting a sense
for the style here. If this patch moves forward, I will rewrite the
title.

> > When calling the Git Credential Helper that is set in the git config,
> > the get command can return a credential. Git immediately turns around
> > and calls the store command, even though that credential was just
> > retrieved by the Helper.
>
> Good summary of the current behaviour.  A paragraph break here would
> make the result easier to read.

Check.

> > This creates two side effects. First of all,
> > if the Helper requires a passphrase, the user has to type it in
> > twice.
>
> Hmph, because...?  If I am reading the flow correctly, an
> application would
>
>  - call credential_fill(), which returns when both username and
>password are obtained by a "get" request to one of the helpers.
>
>  - use the credential to authenticate with a service and find that
>it was accepted.
>
>  - call credential_approve(), which does "store" to all the helpers.
>
> Where does the "twice" come from?
>
> Ah, that is not between the application and the service, but between
> the helper and the user you are required to "unlock" the helper?
>
> OK, that wish makes sense.
>
> It does not make much sense to ask helper A for credential and then
> tell it to write it back the same thing.
>
> HOWEVER.  Let me play a devil's advocate.
>
> The "store" does not have to necessarily mean "write it back", no?
>
> Imagine a helper that is connected to an OTP password device that
> gives a different passcode every time button A is pressed, and there
> are two other buttons B to tell the device that the password was
> accepted and C to tell the device that the password was rejected
> (i.e. we are out of sync).  "get" would press button A and read the
> output, "store" would press button B and "erase" would press button
> C, I would imagine.  With the current credential.c framework, you
> can construct such a helper.  The proposed patch that stops calling
> "store" unconditionally makes it impossible to build.

This example helper would require knowing external state between
button pushes. What about aborting the operation? Regardless of the
example, I can understand your concern. If a helper depended on
receiving confirmation and rejection, this change would break that
behavior.

The patch is intended to address the common problem of a multi-user
system running on a server in headless mode, in other words, without
libsecret available via DBus. As such, this patch could eliminate
having to double type passwords for every git operation accessing the
stored credential.

> > Secondly, if the user has a number of helpers, this retrieves
> > the credential from one service and writes it to all services.
>
> It is unclear why you think it is a bad thing.  You need to
> elaborate.
>
> On the other hand, I can think of a case to illustrate why it is a
> bad idea to unconditionally stop calling "store" to other helpers.
> If one helper is a read-only "encrypted on disk" one, you may want
> to require passphrase to "decrypt" to implement the "get" request to
> the helper.  You would then overlay a "stay only in-core for a short
> time" helper and give higher priority to it.
>
> By doing so, on the first "get" request will ask the in-core one,
> which says "I dunno", then the encrypted-on-disk one interacts with
> the end-user and gives the credential.  The current code "store"s to
> the in-core one as well as the encrypted-on-disk one, and second and
> subseqhent "get" request can be served from that in-core helper.
>
> Side note: and the "store" to encrypted-on-disk one may not
> even need passphrase, even if "get" from it may need one.
>
> "We got the credential from some helper, so we won't call store"
> makes it impossible to build such an arrangement.
>
> The above is a devil's advocate response in that I do not mean to
> say that your proposed workaround does not solve *your* immediate
> need, but to point out that you are closing many doors for needs
> other people would have, or needs they already satisfy by taking
>

Re: Git for Windows for Unix?

2018-09-28 Thread Jonathan Nieder

Ævar Arnfjörð Bjarmason wrote:
> On Thu, Sep 27 2018, Jonathan Nieder wrote:

>> That said, that seems to me like a lot of work to avoid adding some
>> patches to "next" that belong in "next" anyway.  I understand why the
>> Git for Windows maintainer does not always have time to upstream
>> promptly, which is why I suggest working with him to find a way to
>> help with that.
>>
>> If there's something I'm missing and Git is actually an uncooperative
>> upstream like the cases you've mentioned, then I'd be happy to learn
>> about that so we can fix it, too.
>
> That's one and valid way to look at it, convergence would be ideal.
>
> Another way to look at it, which is closer to what I was thinking about,
> is to just view GFW as some alternate universe "next" branch (which by
> my count is ~2-3k commits ahead of master[1]).

You could view it that way, but I don't.  Many Git for Windows patches
have never even visited the Git mailing list.

Thanks,
Jonathan

Re: [PATCH v6 4/7] config: add new index.threads config setting

2018-09-28 Thread Ramsay Jones




On 28/09/18 20:41, Ben Peart wrote:
> 
> 
> On 9/28/2018 1:07 PM, Junio C Hamano wrote:
>> Ben Peart  writes:
>>
 Why does multithreading have to be disabled in this test?
>>>
>>> If multi-threading is enabled, it will write out the IEOT extension
>>> which changes the SHA and causes the test to fail.
>>
>> I think it is a design mistake to let the writing processes's
>> capability decide what is written in the file to be read later by a
>> different process, which possibly may have different capability.  If
>> you are not writing with multiple threads, it should not matter if
>> that writer process is capable of and configured to spawn 8 threads
>> if the process were reading the file---as it is not reading the file
>> it is writing right now.
>>
>> I can understand if the design is to write IEOT only if the
>> resulting index is expected to become large enough (above an
>> arbitrary threshold like 100k entries) to matter.  I also can
>> understand if IEOT is omitted when the repository configuration says
>> that no process is allowed to read the index with multi-threaded
>> codepath in that repository.
>>
> 
> There are two different paths which determine how many blocks are written to 
> the IEOT.  The first is the default path.  On this path, the number of blocks 
> is determined by the number of cache entries divided by the THREAD_COST.  If 
> there are sufficient entries to make it faster to use threading, then it will 
> automatically use enough blocks to optimize the performance of reading the 
> entries across multiple threads.
> 
> I currently cap the maximum number of blocks to be the number of cores that 
> would be available to process them on that same machine purely as an 
> optimization.  The majority of the time, the index will be read from the same 
> machine that it was written on so this works well.  Before I added that 
> logic, you would usually end up with more blocks than available threads which 
> meant some threads had more to do than the other threads and resulted in 
> worse performance.  For example, 4 blocks across 3 threads results in the 1st 
> thread having twice as much work to do as the other threads.
> 
> If the index is copied to a machine with a different number of cores, it will 
> still all work - it just may not be optimal for that machine.  This is self 
> correcting because as soon as the index is written out, it will be optimized 
> for that machine.
> 
> If the "automatically try to make it perform optimally" logic doesn't work 
> for some reason, we have path #2.
> 
> The second path is when the user specifies a specific number of blocks via 
> the GIT_TEST_INDEX_THREADS= environment variable or the index.threads= 
> config setting.  If they ask for n blocks, they will get n blocks.  This is 
> the "I know what I'm doing and want to control the behavior" path.
> 
> I just added one additional test (see patch below) to avoid a divide by zero 
> bug and simplify things a bit.  With this change, if there are fewer than two 
> blocks, the IEOT extension is not written out as it isn't needed.  The load 
> would be single threaded anyway so there is no reason to write out a IEOT 
> extensions that won't be used.
> 
> 
> 
> diff --git a/read-cache.c b/read-cache.c
> index f5d766088d..a1006fa824 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -2751,18 +2751,23 @@ static int do_write_index(struct index_state *istate, 
> struct tempfile *tempfil
> e,
>  */
>     if (!nr) {
>     ieot_blocks = istate->cache_nr / THREAD_COST;
> -   if (ieot_blocks < 1)
> -   ieot_blocks = 1;
>     cpus = online_cpus();
>     if (ieot_blocks > cpus - 1)
>     ieot_blocks = cpus - 1;

So, am I reading this correctly - you need cpus > 2 before an
IEOT extension block is written out?

OK.

ATB,
Ramsay Jones

>     } else {
>     ieot_blocks = nr;
>     }
> -   ieot = xcalloc(1, sizeof(struct index_entry_offset_table)
> -   + (ieot_blocks * sizeof(struct index_entry_offset)));
> -   ieot->nr = 0;
> -   ieot_work = DIV_ROUND_UP(entries, ieot_blocks);
> +
> +   /*
> +    * no reason to write out the IEOT extension if we don't
> +    * have enough blocks to utilize multi-threading
> +    */
> +   if (ieot_blocks > 1) {
> +   ieot = xcalloc(1, sizeof(struct 
> index_entry_offset_table)
> +   + (ieot_blocks * sizeof(struct 
> index_entry_offset)));
> +   ieot->nr = 0;
> +   ieot_work = DIV_ROUND_UP(entries, ieot_blocks);
> +   }
>     }
>  #endif
> 
>

Re: [PATCH] FYI / RFC: submodules: introduce repo-like workflow

2018-09-28 Thread Stefan Beller

On Fri, Sep 28, 2018 at 12:26 PM Ævar Arnfjörð Bjarmason
 wrote:
>
>
> On Thu, Sep 27 2018, Stefan Beller wrote:
>
> > Internally we have rolled out this as an experiment for
> > "submodules replacing the repo tool[1]". The repo tool is described as:
> >
> > Repo unifies Git repositories when necessary, performs uploads to the
> > Gerrit revision control system, and automates parts of the Android
> > development workflow. Repo is not meant to replace Git, only to make
> > it easier to work with Git in the context of Android. The repo command
> > is an executable Python script that you can put anywhere in your path.
> >
> > In working with the Android source files, you use Repo for
> > across-network operations. For example, with a single Repo command you
> > can download files from multiple repositories into your local working
> > directory.
> >
> > In most situations, you can use Git instead of Repo, or mix Repo and
> > Git commands to form complex commands.
> >
> > [1] https://source.android.com/setup/develop/
>
> Some questions just out of curiosity, not for this patch in particular:
>
> Those docs seem to describe the situation without this patch, with this
> patch is the repo tool fully replaced?

Yeah I started by describing the status quo. Some points to add:

* The repo UX looks very similar to perforce, as that was used internally
   at Google at the time a lot. Git wasn't around for long in the days of
   early Android. Now that Git is well known, new grads joining are more likely
   to know Git than perforce.
* repo has no tests because it started as a small script
* maintenance/releases of repo are not ideal.

So yes, one of our long term goals is to replace repo with Git-only
(on the client side) workflow.

> How are you planning to migrate from repo to this on a repository-data
> basis

We don't plan to migrate the existing clients. You'd have to re-clone.

>, does gerrit also populate .gitmodules files appropriately, which
> your clone --recurse-submodules will pick up, but repo will just ignore,
> so you can use the two in parallel?

Gerrit has native support to update submodules in a superproject.
So if you submit code to a project that is a submodule of a superproject
the superproject updates its gitlink. (and if you use a topic based workflow
to submit to two projects, it will show up as an atomic update in the
superproject by having just one commit updating two gitlinks)

Also Gerrit has a plugin "supermanifest" [1], which tracks repo
manifests and mirrors changes from the manifest into the
superproject, e.g. adding a new project to the manifest will
add a new submodule to the superproject.

[1] https://gerrit.googlesource.com/plugins/supermanifest/

With both Gerrits internal superproject subscription and that plugin
it should be possible to use repo or git-submodule in parallel (on an
organisational level, i.e. you choose one of them, and your coworker
chooses the other one)

> Now "repo init -u" takes a URL to a manifest of repositories to stitch
> together, I've understood from past conversations (but am not sure) that
> this is used e.g. by downstream Android vendors so they can use what
> Google's using + whatever they have in-house, i.e. make the manifest the
> set of open source repos plus some (e.g. drivers specific to their
> device). How is that sort of workflow going to work where you
> (presumably) have do that via .gitmodules + commit entries in trees?

The manifest is tracked in its own manifest repo, so today they fork
that project, with the superproject you'd fork the superproject and modify
the .gitmodules file as needed.

> They run their own Gerrit install with some magic to sync back & forth?
>
> I assume that now the recursive "checkout" relies on all the origin/HEAD
> symbolic refs pointing to "master", but how is this going to deal with
> incorporating a repo whose main branch has a different name?

Change the name? Or pin it via sha1.

> E.g. "trunk" or "blead"? Perhaps some interaction with
> checkout.defaultRemote + submodule..branch= could make "git
> checkout :mainbranch" DWYM.
>
> > * Fetching changes ("repo sync") needs to fetch all repositories, as there
> >   is no central place that tracks what has changed. In a superproject
> >   git fetch can determine which submodules need fetching.  In Androids
> >   case the daily change is only in a few repositories (think 10s), so
> >   migrating to a superproject would save an order of magnitude in fetch
> >   traffic for daily updates of developers.
>
> Interesting that in all this time with the reliance on a central server
> repo wasn't already asking some custom API "what repos changed since xyz
> time" to narrow that down, but hey, .gitmodules + commit entries in
> trees will do it for you.

It's complicated. Shawn really grew to dislike the repo tool and wanted
to have it gone as quickly as possible, so no hacks that extend its life. ;-)
I would prefer to keep

Re: [PATCH] strbuf.h: format according to coding guidelines

2018-09-28 Thread Junio C Hamano

Junio C Hamano  writes:

> Stefan Beller  writes:
>
>> So let's format strbuf.h in a way that we'd like to see:
>> * omit the extern keyword from function declarations
>
> OK
>
>> * name all parameters (usually the parameters are obvious from its type,
>>   but consider exceptions like
>>   `int strbuf_getwholeline_fd(struct strbuf *, int, int);`
>
> I am mildly against giving names to obvious ones.

Just to make sure nobody listening from sidelines do not
misunderstand, ones like getwholeline_fd() that takes more than one
parameter with the same time is a prime example of a function that
take non-obvious parameters that MUST be named.  

What I am mildly against is the rule that says "name ALL
parameters".  I tend to think these names make headers harder to
read, and prefer to keep them to the minimum.

I actually do not mind the rule to be more like

 * Use the same parameter names used in the function declaration when
   the description in the API documentation refers the parameter.

That _forces_ people to write

/**
 * Read a whole line up to the terminating character 
 * TERM (typically LF or NUL) from the file descriptor FD
 * and replace the contents of the strbuf SB with the
 * result ...
 */
int strbuf_getwholeline_fd(struct strbuf *sb, int fd, int term);

which is mostly equivalent to having a rule to always name all
parameters, while still allowing "sb" to be omitted by rephrasing
"the contents of the given strbuf with the result ..." and I
consider that a good flexibility to have.

Re: Git for Windows for Unix?

2018-09-28 Thread Ævar Arnfjörð Bjarmason

On Thu, Sep 27 2018, Jonathan Nieder wrote:

> Hi,
>
> Ævar Arnfjörð Bjarmason wrote:
>
>> So it's similar to various packages that have "alternates" and are semi
>> or permanently forked, like emacs & xemacs, JDK etc., although I can't
>> recall one offhand that's quite similar to GFW v.s. git.git.
>>
>> My only stake in this is I thought it would be neat to be able to "apt
>> install git-for-windows", but I understand there's a support burden, but
>> if some *nix packagers are interested, maybe never taking it out of the
>> Debian equivalent of "experimental" and saying "this is unsupported, go
>> to the GFW tracker..." when bugs are filed would cut down on the support
>> burden.
>
> If someone else wants to package git-for-windows for Debian, I am
> happy to offer them advice and will not stop them.

Thanks, presumably this is going to be more involved than just
downloading the source .deb, replacing the source tarball &
s/2.19.0/2.19.0.windows.../, since presumably both would need to be
aware of each other not to conflict, or would need update-alternatives
integration.

> That said, that seems to me like a lot of work to avoid adding some
> patches to "next" that belong in "next" anyway.  I understand why the
> Git for Windows maintainer does not always have time to upstream
> promptly, which is why I suggest working with him to find a way to
> help with that.
>
> If there's something I'm missing and Git is actually an uncooperative
> upstream like the cases you've mentioned, then I'd be happy to learn
> about that so we can fix it, too.

That's one and valid way to look at it, convergence would be ideal.

Another way to look at it, which is closer to what I was thinking about,
is to just view GFW as some alternate universe "next" branch (which by
my count is ~2-3k commits ahead of master[1]).

>From that perspective if/how/when it's all converging is just inside
baseball. I can install git from "testing" in Debian, and get the same
version that's probably stress tested by tens of thousands of users, or
"experimental" (next) that's probably run by tens or hundreds of people,
maybe thousands.

I'd just like to have some easy way to install the gfw, which I'm
assuming is used by a lot more users than any of those.

One reason to do that these days is to get git-stash and git-rebase
that's a lot faster, yeah I could just pick those from the ML or build
from "pu", but I'm betting the GFW version has been tested a lot more.

I can just build it for myself, the reason I sent this mail was just to
a) see if there were some packagers that were interested in making
non-windows packages for it (e.g. some linux distros) b) it wasn't clear
to me before what the extent of the divergence with vanilla git was, and
that there wasn't much Windows-specific about GFW, it's just a fork
whose maintainer happens to just make releases on that OS (but as I've
learned, also tests them on Linux VMs).

1. $ git log --max-parents=1 --pretty=format:%s 
v2.19.0...v2.19.0.windows.1|sort|uniq|wc -l
   2346

   sort|uniq because without that we get ~27k commits, a lot of these
   seem to be automated commits made by some bots.

Re: [PATCH] strbuf.h: format according to coding guidelines

2018-09-28 Thread Junio C Hamano

Stefan Beller  writes:

> So let's format strbuf.h in a way that we'd like to see:
> * omit the extern keyword from function declarations

OK

> * name all parameters (usually the parameters are obvious from its type,
>   but consider exceptions like
>   `int strbuf_getwholeline_fd(struct strbuf *, int, int);`

I am mildly against giving names to obvious ones.

Re: [PATCH] Documentation/CodingGuidelines: How to document new APIs

2018-09-28 Thread Martin Ågren

On Fri, 28 Sep 2018 at 18:51, Junio C Hamano  wrote:
> We recommend documenting in the header over documenting near the
> implementation to encourage people to write the docs that are
> readable without peeking at the implemention.

s/implemention/implementation/

> - - When you come up with an API, document it.
> + - When you come up with an API, document it the functions and the

s/it the/the/

> +   structures in the header file that exposes the API to its callers.
> +   Use what is in "strbuf.h" as a model to decide the appropriate tone
> +   in which the description is given, and the level of details to put
> +   in the description.

Martin

Re: [PATCH v6 4/7] config: add new index.threads config setting

2018-09-28 Thread Ben Peart





On 9/28/2018 1:07 PM, Junio C Hamano wrote:

Ben Peart  writes:


Why does multithreading have to be disabled in this test?


If multi-threading is enabled, it will write out the IEOT extension
which changes the SHA and causes the test to fail.


I think it is a design mistake to let the writing processes's
capability decide what is written in the file to be read later by a
different process, which possibly may have different capability.  If
you are not writing with multiple threads, it should not matter if
that writer process is capable of and configured to spawn 8 threads
if the process were reading the file---as it is not reading the file
it is writing right now.

I can understand if the design is to write IEOT only if the
resulting index is expected to become large enough (above an
arbitrary threshold like 100k entries) to matter.  I also can
understand if IEOT is omitted when the repository configuration says
that no process is allowed to read the index with multi-threaded
codepath in that repository.



There are two different paths which determine how many blocks are 
written to the IEOT.  The first is the default path.  On this path, the 
number of blocks is determined by the number of cache entries divided by 
the THREAD_COST.  If there are sufficient entries to make it faster to 
use threading, then it will automatically use enough blocks to optimize 
the performance of reading the entries across multiple threads.


I currently cap the maximum number of blocks to be the number of cores 
that would be available to process them on that same machine purely as 
an optimization.  The majority of the time, the index will be read from 
the same machine that it was written on so this works well.  Before I 
added that logic, you would usually end up with more blocks than 
available threads which meant some threads had more to do than the other 
threads and resulted in worse performance.  For example, 4 blocks across 
3 threads results in the 1st thread having twice as much work to do as 
the other threads.


If the index is copied to a machine with a different number of cores, it 
will still all work - it just may not be optimal for that machine.  This 
is self correcting because as soon as the index is written out, it will 
be optimized for that machine.


If the "automatically try to make it perform optimally" logic doesn't 
work for some reason, we have path #2.


The second path is when the user specifies a specific number of blocks 
via the GIT_TEST_INDEX_THREADS= environment variable or the 
index.threads= config setting.  If they ask for n blocks, they will 
get n blocks.  This is the "I know what I'm doing and want to control 
the behavior" path.


I just added one additional test (see patch below) to avoid a divide by 
zero bug and simplify things a bit.  With this change, if there are 
fewer than two blocks, the IEOT extension is not written out as it isn't 
needed.  The load would be single threaded anyway so there is no reason 
to write out a IEOT extensions that won't be used.




diff --git a/read-cache.c b/read-cache.c
index f5d766088d..a1006fa824 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2751,18 +2751,23 @@ static int do_write_index(struct index_state 
*istate, struct tempfile *tempfil

e,
 */
if (!nr) {
ieot_blocks = istate->cache_nr / THREAD_COST;
-   if (ieot_blocks < 1)
-   ieot_blocks = 1;
cpus = online_cpus();
if (ieot_blocks > cpus - 1)
ieot_blocks = cpus - 1;
} else {
ieot_blocks = nr;
}
-   ieot = xcalloc(1, sizeof(struct index_entry_offset_table)
-   + (ieot_blocks * sizeof(struct 
index_entry_offset)));

-   ieot->nr = 0;
-   ieot_work = DIV_ROUND_UP(entries, ieot_blocks);
+
+   /*
+* no reason to write out the IEOT extension if we don't
+* have enough blocks to utilize multi-threading
+*/
+   if (ieot_blocks > 1) {
+   ieot = xcalloc(1, sizeof(struct 
index_entry_offset_table)
+   + (ieot_blocks * sizeof(struct 
index_entry_offset)));

+   ieot->nr = 0;
+   ieot_work = DIV_ROUND_UP(entries, ieot_blocks);
+   }
}
 #endif

Re: [PATCH] Improvement to only call Git Credential Helper once

2018-09-28 Thread Junio C Hamano

Kyle Hubert  writes:

> Subject: Re: [PATCH] Improvement to only call Git Credential Helper once

Nobody will send in a patch to worsen things, so phrases like
"Improvement to" that convey no useful information has no place on
the title.

There probably are multiple ways that credential helpers are not
called just once and many of them probably are legit (e.g. "get" is
not the only request one helper can receive).  It is unclear why
"only call helper once" is an improvement unless the reader reads
more, which means the title could be a lot more improved.

Side note: this matters because in "git shortlog --no-merges"
the title is the only thing you can tell your readers what
your contribution was about.

> When calling the Git Credential Helper that is set in the git config,
> the get command can return a credential. Git immediately turns around
> and calls the store command, even though that credential was just
> retrieved by the Helper. 

Good summary of the current behaviour.  A paragraph break here would
make the result easier to read.

> This creates two side effects. First of all,
> if the Helper requires a passphrase, the user has to type it in
> twice.

Hmph, because...?  If I am reading the flow correctly, an
application would

 - call credential_fill(), which returns when both username and
   password are obtained by a "get" request to one of the helpers.

 - use the credential to authenticate with a service and find that
   it was accepted.

 - call credential_approve(), which does "store" to all the helpers.

Where does the "twice" come from?

Ah, that is not between the application and the service, but between
the helper and the user you are required to "unlock" the helper?

OK, that wish makes sense.

It does not make much sense to ask helper A for credential and then
tell it to write it back the same thing.

HOWEVER.  Let me play a devil's advocate.

The "store" does not have to necessarily mean "write it back", no?

Imagine a helper that is connected to an OTP password device that
gives a different passcode every time button A is pressed, and there
are two other buttons B to tell the device that the password was
accepted and C to tell the device that the password was rejected
(i.e. we are out of sync).  "get" would press button A and read the
output, "store" would press button B and "erase" would press button
C, I would imagine.  With the current credential.c framework, you
can construct such a helper.  The proposed patch that stops calling
"store" unconditionally makes it impossible to build.

> Secondly, if the user has a number of helpers, this retrieves
> the credential from one service and writes it to all services.

It is unclear why you think it is a bad thing.  You need to
elaborate.

On the other hand, I can think of a case to illustrate why it is a
bad idea to unconditionally stop calling "store" to other helpers.
If one helper is a read-only "encrypted on disk" one, you may want
to require passphrase to "decrypt" to implement the "get" request to
the helper.  You would then overlay a "stay only in-core for a short
time" helper and give higher priority to it.

By doing so, on the first "get" request will ask the in-core one,
which says "I dunno", then the encrypted-on-disk one interacts with
the end-user and gives the credential.  The current code "store"s to
the in-core one as well as the encrypted-on-disk one, and second and
subseqhent "get" request can be served from that in-core helper.

Side note: and the "store" to encrypted-on-disk one may not
even need passphrase, even if "get" from it may need one.

"We got the credential from some helper, so we won't call store"
makes it impossible to build such an arrangement.

The above is a devil's advocate response in that I do not mean to
say that your proposed workaround does not solve *your* immediate
need, but to point out that you are closing many doors for needs
other people would have, or needs they already satisfy by taking
advantage of the current behaviour the proposed patch is breaking.

So, I dunno.  I certainly do not think it is a bad idea to stop
feeding _other_ helpers.  I also do not think it is a good idea to
unconditionally stop calling "store" to the same helper, but I can
see the benefit for having an option to skip "store" to the same
helper.  I am not sure if there should be an option to stop feeding
other helpers.

Thanks.

[PATCH] git-rebase.sh: fix typos in error messages

2018-09-28 Thread Ralf Thielow

Signed-off-by: Ralf Thielow 
---
 git-rebase.sh | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/git-rebase.sh b/git-rebase.sh
index 7973447645..45b6ee9c0e 100755
--- a/git-rebase.sh
+++ b/git-rebase.sh
@@ -553,15 +553,15 @@ then
# Note: incompatibility with --interactive is just a strong warning;
#   git-rebase.txt caveats with "unless you know what you are doing"
test -n "$rebase_merges" &&
-   die "$(gettext "error: cannot combine '--preserve_merges' with 
'--rebase-merges'")"
+   die "$(gettext "error: cannot combine '--preserve-merges' with 
'--rebase-merges'")"
 fi
 
 if test -n "$rebase_merges"
 then
test -n "$strategy_opts" &&
-   die "$(gettext "error: cannot combine '--rebase_merges' with 
'--strategy-option'")"
+   die "$(gettext "error: cannot combine '--rebase-merges' with 
'--strategy-option'")"
test -n "$strategy" &&
-   die "$(gettext "error: cannot combine '--rebase_merges' with 
'--strategy'")"
+   die "$(gettext "error: cannot combine '--rebase-merges' with 
'--strategy'")"
 fi
 
 if test -z "$rebase_root"
-- 
2.19.0.599.g22e244bd67

Re: [PATCH] FYI / RFC: submodules: introduce repo-like workflow

2018-09-28 Thread Ævar Arnfjörð Bjarmason

On Thu, Sep 27 2018, Stefan Beller wrote:

> Internally we have rolled out this as an experiment for
> "submodules replacing the repo tool[1]". The repo tool is described as:
>
> Repo unifies Git repositories when necessary, performs uploads to the
> Gerrit revision control system, and automates parts of the Android
> development workflow. Repo is not meant to replace Git, only to make
> it easier to work with Git in the context of Android. The repo command
> is an executable Python script that you can put anywhere in your path.
>
> In working with the Android source files, you use Repo for
> across-network operations. For example, with a single Repo command you
> can download files from multiple repositories into your local working
> directory.
>
> In most situations, you can use Git instead of Repo, or mix Repo and
> Git commands to form complex commands.
>
> [1] https://source.android.com/setup/develop/

Some questions just out of curiosity, not for this patch in particular:

Those docs seem to describe the situation without this patch, with this
patch is the repo tool fully replaced?

How are you planning to migrate from repo to this on a repository-data
basis, does gerrit also populate .gitmodules files appropriately, which
your clone --recurse-submodules will pick up, but repo will just ignore,
so you can use the two in parallel?

Now "repo init -u" takes a URL to a manifest of repositories to stitch
together, I've understood from past conversations (but am not sure) that
this is used e.g. by downstream Android vendors so they can use what
Google's using + whatever they have in-house, i.e. make the manifest the
set of open source repos plus some (e.g. drivers specific to their
device). How is that sort of workflow going to work where you
(presumably) have do that via .gitmodules + commit entries in trees?
They run their own Gerrit install with some magic to sync back & forth?

I assume that now the recursive "checkout" relies on all the origin/HEAD
symbolic refs pointing to "master", but how is this going to deal with
incorporating a repo whose main branch has a different name?
E.g. "trunk" or "blead"? Perhaps some interaction with
checkout.defaultRemote + submodule..branch= could make "git
checkout :mainbranch" DWYM.

> * Fetching changes ("repo sync") needs to fetch all repositories, as there
>   is no central place that tracks what has changed. In a superproject
>   git fetch can determine which submodules need fetching.  In Androids
>   case the daily change is only in a few repositories (think 10s), so
>   migrating to a superproject would save an order of magnitude in fetch
>   traffic for daily updates of developers.

Interesting that in all this time with the reliance on a central server
repo wasn't already asking some custom API "what repos changed since xyz
time" to narrow that down, but hey, .gitmodules + commit entries in
trees will do it for you.

> * Sometimes when the dependencies are not on a clear repository boundary
>   one would like to have git-bisect available across the different
>   repositories, which repo cannot provide due to its design.

I assume that you're not upgrading independently to e.g. every single
linux commit, just stable releases, so does bisecting deal with knowing
that e.g. a breakage occurred when linux.git was updated from v4.10 to
v4.12, and then to go within the repo itself and bisect from there, or
is that done manually?

[PATCH 0/4] Multiple subtree split fixes regarding complex repos

2018-09-28 Thread Strain, Roger L

We recently (about eight months ago) transitioned to git source control systems 
for several very large, very complex systems. We brought over several active 
versions requiring maintenance updates, and also set up several subtree repos 
to manage code shared between the systems. Recently, we attempted to push 
updates back to those subtrees and encountered errors. I believe I have 
identified and corrected the errors we found in our repos, and would like to 
contribute those fixes back.

Commands to demonstrate both failures using the current version of the subtree 
script are here:
https://gist.github.com/FoxFireX/1b794384612b7fd5e7cd157cff96269e

Short summary of three problems involved:
1. Split using rejoins fails in some cases where a commit has a parent which 
was a parent commit further upstream from a rejoin, causing a new initial 
commit to be created, which is not related to the original subtree commits.
2. Split using rejoins fails to generate a merge commit which may have triaged 
the previous problem, but instead elected to use only the parent which is not 
connected to the original subtree commits. (This may occur when the commit and 
both parents all share the same subtree hash.)
3. Split ignoring joins also ignores the original add commit, which causes 
content prior to the add to be considered part of the subtree graph, changing 
the commit hashes so it is not connected to the original subtree commits.

The following commits address each problem individually, along with a single 
commit that makes no functional change but performs a small refactor of the 
existing code. Hopefully that will make reviewing it a simpler task. This is my 
first attempt at submitting a patch back, so apologies if I've made any errors 
in the process.

Strain, Roger L (4):
  subtree: refactor split of a commit into standalone method
  subtree: make --ignore-joins pay attention to adds
  subtree: use commits before rejoins for splits
  subtree: improve decision on merges kept in split

 contrib/subtree/git-subtree.sh | 129 +
 1 file changed, 83 insertions(+), 46 deletions(-)

-- 
2.19.0.windows.1

Re: [PATCH v3 3/5] fsmonitor: update GIT_TEST_FSMONITOR support

2018-09-28 Thread Junio C Hamano

Ben Peart  writes:

> Junio, can you squash in the following patch or would you prefer I
> reroll the entire series?

Squash it to f8cd77d5 ("fsmonitor: update GIT_TEST_FSMONITOR
support", 2018-09-18) and use the two new lines in the log message?

I can do that.

Thanks.

Re: [PATCH v6 3/7] eoie: add End of Index Entry (EOIE) extension

2018-09-28 Thread Ben Peart





On 9/27/2018 8:19 PM, SZEDER Gábor wrote:

On Wed, Sep 26, 2018 at 03:54:38PM -0400, Ben Peart wrote:

The End of Index Entry (EOIE) is used to locate the end of the variable


Nit: perhaps start with:

   The End of Index Entry (EOIE) optional extension can be used to ...

to make it clearer for those who don't immediately realize the
significance of the upper case 'E' in the extension's signature.


length index entries and the beginning of the extensions. Code can take
advantage of this to quickly locate the index extensions without having
to parse through all of the index entries.

Because it must be able to be loaded before the variable length cache
entries and other index extensions, this extension must be written last.
The signature for this extension is { 'E', 'O', 'I', 'E' }.

The extension consists of:

- 32-bit offset to the end of the index entries

- 160-bit SHA-1 over the extension types and their sizes (but not
their contents).  E.g. if we have "TREE" extension that is N-bytes
long, "REUC" extension that is M-bytes long, followed by "EOIE",
then the hash would be:

SHA-1("TREE" +  +
"REUC" + )

Signed-off-by: Ben Peart 
---
  Documentation/technical/index-format.txt |  23 
  read-cache.c | 151 +--
  t/README |   5 +
  t/t1700-split-index.sh   |   1 +
  4 files changed, 172 insertions(+), 8 deletions(-)




diff --git a/t/README b/t/README
index 3ea6c85460..aa33ac4f26 100644
--- a/t/README
+++ b/t/README
@@ -327,6 +327,11 @@ GIT_TEST_COMMIT_GRAPH=, when true, forces the 
commit-graph to
  be written after every 'git commit' command, and overrides the
  'core.commitGraph' setting to true.
  
+GIT_TEST_DISABLE_EOIE= disables writing the EOIE extension.

+This is used to allow tests 1, 4-9 in t1700-split-index.sh to succeed
+as they currently hard code SHA values for the index which are no longer
+valid due to the addition of the EOIE extension.


Is this extension enabled by default?  The commit message doesn't
explicitly say so, but I don't see any way to turn it on or off, while
there is this new GIT_TEST environment variable to disable it for one
particular test, so it seems so.  If that's indeed the case, then
wouldn't it be better to update those hard-coded SHA1 values in t1700
instead?



Yes, it is enabled by default and the only way to disable it is the 
GIT_TEST_DISABLE_EOIE environment variable.


The tests in t1700-split-index.sh assume that there are no extensions in 
the index file so anything that adds an extension, will break one or 
more of the tests.


First in 'enable split index', they hard code SHA values assuming there 
are no extensions. If some option adds an extension, these hard coded 
values no longer match and the test fails.


Later in 'disable split index' they save off the SHA of the index with 
split-index turned off and then in later tests, compare it to the SHA of 
the shared index.  Because extensions are stripped when the shared index 
is written out this only works if there were not extensions in the 
original index.


I'll document this behavior and reasoning in the test directly.

This did cause me to reexamine how EOIE and IEOT behave when split index 
is turned on.  These two extensions help most with a large index.  When 
split index is turned on, the large index is actually the shared index 
as the index is now the smaller set of deltas.


Currently, the extensions are stripped out of the shared index which 
means they are not available when they are needed to quickly load the 
shared index.  I'll see if I can update the patch so that these 
extensions are still written out and available in the shared index to 
speed up when it is loaded.


Thanks!


  Naming Tests
  
  
diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh

index be22398a85..1f168378c8 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -7,6 +7,7 @@ test_description='split index mode tests'
  # We need total control of index splitting here
  sane_unset GIT_TEST_SPLIT_INDEX
  sane_unset GIT_FSMONITOR_TEST
+GIT_TEST_DISABLE_EOIE=true; export GIT_TEST_DISABLE_EOIE
  
  test_expect_success 'enable split index' '

git config splitIndex.maxPercentChange 100 &&
--
2.18.0.windows.1

[PATCH 4/4] subtree: improve decision on merges kept in split

2018-09-28 Thread Strain, Roger L

When multiple identical parents are detected for a commit being considered
for copying, explicitly check whether one is the common merge base between
the commits. If so, the other commit can be used as the identical parent;
if not, a merge must be performed to maintain history.

In some situations two parents of a merge commit may appear to both have
identical subtree content with each other and the current commit. However,
those parents can potentially come from different commit graphs.

Previous behavior would simply select one of the identical parents to
serve as the replacement for this commit, based on the order in which they
were processed.

New behavior compares the merge base between the commits to determine if
a new merge commit is necessary to maintain history despite the identical
content.

Signed-off-by: Strain, Roger L 
---
 contrib/subtree/git-subtree.sh | 21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
index 23dd04cbe..1c157dbd9 100755
--- a/contrib/subtree/git-subtree.sh
+++ b/contrib/subtree/git-subtree.sh
@@ -541,6 +541,7 @@ copy_or_skip () {
nonidentical=
p=
gotparents=
+   copycommit=
for parent in $newparents
do
ptree=$(toptree_for_commit $parent) || exit $?
@@ -548,7 +549,24 @@ copy_or_skip () {
if test "$ptree" = "$tree"
then
# an identical parent could be used in place of this 
rev.
-   identical="$parent"
+   if test -n "$identical"
+   then
+   # if a previous identical parent was found, 
check whether
+   # one is already an ancestor of the other
+   mergebase=$(git merge-base $identical $parent)
+   if test "$identical" = "$mergebase"
+   then
+   # current identical commit is an 
ancestor of parent
+   identical="$parent"
+   elif test "$parent" != "$mergebase"
+   then
+   # no common history; commit must be 
copied
+   copycommit=1
+   fi
+   else
+   # first identical parent detected
+   identical="$parent"
+   fi
else
nonidentical="$parent"
fi
@@ -571,7 +589,6 @@ copy_or_skip () {
fi
done
 
-   copycommit=
if test -n "$identical" && test -n "$nonidentical"
then
extras=$(git rev-list --count $identical..$nonidentical)
-- 
2.19.0.windows.1

[PATCH 1/4] subtree: refactor split of a commit into standalone method

2018-09-28 Thread Strain, Roger L

In a particularly complex repo, subtree split was not creating
compatible splits for pushing back to a separate repo. Addressing
one of the issues requires recursive handling of parent commits
that were not initially considered by the algorithm. This commit
makes no functional changes, but relocates the code to be called
recursively into a new method to simply comparisons of later
commits.

Signed-off-by: Strain, Roger L 
---
 contrib/subtree/git-subtree.sh | 78 ++
 1 file changed, 42 insertions(+), 36 deletions(-)

diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
index d3f39a862..2cd7b345b 100755
--- a/contrib/subtree/git-subtree.sh
+++ b/contrib/subtree/git-subtree.sh
@@ -598,6 +598,47 @@ ensure_valid_ref_format () {
die "'$1' does not look like a ref"
 }
 
+process_split_commit () {
+   local rev="$1"
+   local parents="$2"
+   revcount=$(($revcount + 1))
+   progress "$revcount/$revmax ($createcount)"
+   debug "Processing commit: $rev"
+   exists=$(cache_get "$rev")
+   if test -n "$exists"
+   then
+   debug "  prior: $exists"
+   return
+   fi
+   createcount=$(($createcount + 1))
+   debug "  parents: $parents"
+   newparents=$(cache_get $parents)
+   debug "  newparents: $newparents"
+
+   tree=$(subtree_for_commit "$rev" "$dir")
+   debug "  tree is: $tree"
+
+   check_parents $parents
+
+   # ugly.  is there no better way to tell if this is a subtree
+   # vs. a mainline commit?  Does it matter?
+   if test -z "$tree"
+   then
+   set_notree "$rev"
+   if test -n "$newparents"
+   then
+   cache_set "$rev" "$rev"
+   fi
+   return
+   fi
+
+   newrev=$(copy_or_skip "$rev" "$tree" "$newparents") || exit $?
+   debug "  newrev is: $newrev"
+   cache_set "$rev" "$newrev"
+   cache_set latest_new "$newrev"
+   cache_set latest_old "$rev"
+}
+
 cmd_add () {
if test -e "$dir"
then
@@ -706,42 +747,7 @@ cmd_split () {
eval "$grl" |
while read rev parents
do
-   revcount=$(($revcount + 1))
-   progress "$revcount/$revmax ($createcount)"
-   debug "Processing commit: $rev"
-   exists=$(cache_get "$rev")
-   if test -n "$exists"
-   then
-   debug "  prior: $exists"
-   continue
-   fi
-   createcount=$(($createcount + 1))
-   debug "  parents: $parents"
-   newparents=$(cache_get $parents)
-   debug "  newparents: $newparents"
-
-   tree=$(subtree_for_commit "$rev" "$dir")
-   debug "  tree is: $tree"
-
-   check_parents $parents
-
-   # ugly.  is there no better way to tell if this is a subtree
-   # vs. a mainline commit?  Does it matter?
-   if test -z "$tree"
-   then
-   set_notree "$rev"
-   if test -n "$newparents"
-   then
-   cache_set "$rev" "$rev"
-   fi
-   continue
-   fi
-
-   newrev=$(copy_or_skip "$rev" "$tree" "$newparents") || exit $?
-   debug "  newrev is: $newrev"
-   cache_set "$rev" "$newrev"
-   cache_set latest_new "$newrev"
-   cache_set latest_old "$rev"
+   process_split_commit "$rev" "$parents"
done || exit $?
 
latest_new=$(cache_get latest_new)
-- 
2.19.0.windows.1

[PATCH 3/4] subtree: use commits before rejoins for splits

2018-09-28 Thread Strain, Roger L

Adds recursive evaluation of parent commits which were not part of the
initial commit list when performing a split.

Split expects all relevant commits to be reachable from the target commit
but not reachable from any previous rejoins. However, a branch could be
based on a commit prior to a rejoin, then later merged back into the
current code. In this case, a parent to the commit will not be present in
the initial list of commits, trigging an "incorrect order" warning.

Previous behavior was to consider that commit to have no parent, creating
an original commit containing all subtree content. This commit is not
present in an existing subtree commit graph, changing commit hashes and
making pushing to a subtree repo impossible.

New behavior will recursively check these unexpected parent commits to
track them back to either an earlier rejoin, or a true original commit.
The generated synthetic commits will properly match previously-generated
commits, allowing successful pushing to a prior subtree repo.

Signed-off-by: Strain, Roger L 
---
 contrib/subtree/git-subtree.sh | 26 --
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
index d8861f306..23dd04cbe 100755
--- a/contrib/subtree/git-subtree.sh
+++ b/contrib/subtree/git-subtree.sh
@@ -231,12 +231,14 @@ cache_miss () {
 }
 
 check_parents () {
-   missed=$(cache_miss "$@")
+   missed=$(cache_miss "$1")
+   local indent=$(($2 + 1))
for miss in $missed
do
if ! test -r "$cachedir/notree/$miss"
then
debug "  incorrect order: $miss"
+   process_split_commit "$miss" "" "$indent"
fi
done
 }
@@ -606,8 +608,20 @@ ensure_valid_ref_format () {
 process_split_commit () {
local rev="$1"
local parents="$2"
-   revcount=$(($revcount + 1))
-   progress "$revcount/$revmax ($createcount)"
+   local indent=$3
+
+   if test $indent -eq 0
+   then
+   revcount=$(($revcount + 1))
+   else
+   # processing commit without normal parent information;
+   # fetch from repo
+   parents=$(git show -s --pretty=%P "$rev")
+   extracount=$(($extracount + 1))
+   fi
+
+   progress "$revcount/$revmax ($createcount) [$extracount]"
+
debug "Processing commit: $rev"
exists=$(cache_get "$rev")
if test -n "$exists"
@@ -617,14 +631,13 @@ process_split_commit () {
fi
createcount=$(($createcount + 1))
debug "  parents: $parents"
+   check_parents "$parents" "$indent"
newparents=$(cache_get $parents)
debug "  newparents: $newparents"
 
tree=$(subtree_for_commit "$rev" "$dir")
debug "  tree is: $tree"
 
-   check_parents $parents
-
# ugly.  is there no better way to tell if this is a subtree
# vs. a mainline commit?  Does it matter?
if test -z "$tree"
@@ -744,10 +757,11 @@ cmd_split () {
revmax=$(eval "$grl" | wc -l)
revcount=0
createcount=0
+   extracount=0
eval "$grl" |
while read rev parents
do
-   process_split_commit "$rev" "$parents"
+   process_split_commit "$rev" "$parents" 0
done || exit $?
 
latest_new=$(cache_get latest_new)
-- 
2.19.0.windows.1

[PATCH 2/4] subtree: make --ignore-joins pay attention to adds

2018-09-28 Thread Strain, Roger L

Changes the behavior of --ignore-joins to always consider a subtree add
commit, and ignore only splits and squashes.

The --ignore-joins option is documented to ignore prior --rejoin commits.
However, it additionally ignored subtree add commits generated when a
subtree was initially added to a repo.

Due to the logic which determines whether a commit is a mainline commit
or a subtree commit (namely, the presence or absence of content in the
subtree prefix) this causes commits before the initial add to appear to
be part of the subtree. An --ignore-joins split would therefore consider
those commits part of the subtree history and include them at the
beginning of the synthetic history, causing the resulting hashes to be
incorrect for all later commits.

Signed-off-by: Strain, Roger L 
---
 contrib/subtree/git-subtree.sh | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
index 2cd7b345b..d8861f306 100755
--- a/contrib/subtree/git-subtree.sh
+++ b/contrib/subtree/git-subtree.sh
@@ -340,7 +340,12 @@ find_existing_splits () {
revs="$2"
main=
sub=
-   git log --grep="^git-subtree-dir: $dir/*\$" \
+   local grep_format="^git-subtree-dir: $dir/*\$"
+   if test -n "$ignore_joins"
+   then
+   grep_format="^Add '$dir/' from commit '"
+   fi
+   git log --grep="$grep_format" \
--no-show-signature --pretty=format:'START %H%n%s%n%n%b%nEND%n' 
$revs |
while read a b junk
do
@@ -730,12 +735,7 @@ cmd_split () {
done
fi
 
-   if test -n "$ignore_joins"
-   then
-   unrevs=
-   else
-   unrevs="$(find_existing_splits "$dir" "$revs")"
-   fi
+   unrevs="$(find_existing_splits "$dir" "$revs")"
 
# We can't restrict rev-list to only $dir here, because some of our
# parents have the $dir contents the root, and those won't match.
-- 
2.19.0.windows.1

Re: [PATCH 1/1] roll wt_status_state into wt_status and populate in the collect phase

2018-09-28 Thread Junio C Hamano

Taylor Blau  writes:

> On Thu, Sep 27, 2018 at 09:49:36PM -0700, Stephen P. Smith wrote:
>> When updating the collect and print functions, it was found that
>> status variables were initialized in the collect phase and some
>> variables were later freed in the print functions.
>
> Nit: I think that in the past Eric Sunshine has recommended that I use
> active voice in patches, but "it was found" is passive.

Yeah, and when/how it was found is much less interesting backstory
than _why_ we are doing this follow-thru.  I think the first line
can just simply go without losing clarity.

>> Move the status state structure variables into the status state
>> structure and populate them in the collect functions.

On the other hand this one may deserve a bit more backstory.  If I
understand correctly, what happened over time was

 - A "struct wt_status" used to be sufficient for the output phase
   to work.  It was designed to be filled in the collect phase and
   consumed in the output phase, but over time some fields are added
   and output phase started filling it; we recently corrected it so
   that .committable field is filled in the collect phase.

   A "struct wt_status_state" that was used in other codepaths
   turned out to be useful in showing the "git status" output, so
   some output phase functions started taking it.  This is not tied
   to "struct wt_status", so the discipline of filling in the
   collect phase to be consumed in the output phase was never
   followed.

I am not suggesting to write that much in the log message, but and
with a backstory like that, embedding a wt_status_state inside
wt_status and fill it in the collect phase, which this patch does,
starts to make sense, I would think.

>> diff --git a/wt-status.c b/wt-status.c
>> index c7f76d4758..9977f0cdf2 100644
>> --- a/wt-status.c
>> +++ b/wt-status.c
>> @@ -744,21 +744,26 @@ static int has_unmerged(struct wt_status *s)
>>
>>  void wt_status_collect(struct wt_status *s)
>>  {
>> -struct wt_status_state state;
>>  wt_status_collect_changes_worktree(s);
>> -
>
> Nit: unnecessary diff, but I certainly don't think that this is worth a
> re-roll on its own.

I do not think it is unnecessary to remove the blank between three
things this function does (i.e. (1) inspect working tree, (2)
inspect index and (3) inspect untrackeed; if there is no blank line
between (2) and (3), we shouldn't have a blank between (1) and (2)).

I do agree with you it is an unrelated change.  Its correctness (not
to the compiler, but to the humans due to the above) is so trivial
that it probably is a good taste to include it in this patch.

>>  if (s->is_initial)
>>  wt_status_collect_changes_initial(s);
>>  else
>>  wt_status_collect_changes_index(s);
>>  wt_status_collect_untracked(s);
>>
>> -memset(, 0, sizeof(state));
>> -wt_status_get_state(, s->branch && !strcmp(s->branch, "HEAD"));
>> -if (state.merge_in_progress && !has_unmerged(s))
>> +wt_status_get_state(>state, s->branch && !strcmp(s->branch, "HEAD"));
>> +if (s->state.merge_in_progress && !has_unmerged(s))
>>  s->committable = 1;
>
> Should this line be de-dented to match the above?

I am not sure if I follow.

Thanks.

Request for examples on git log --format:tformat

2018-09-28 Thread Daniel Lo

Greetings,

I was reviewing the tformat parameters on:
https://git-scm.com/docs/git-log (middle of the page).

Specifically: %<|(): make the next placeholder take at least until
Nth columns, padding spaces on the right if necessary

I found the instructions regard space formatting to be very confusing.
An example would be helpful to illustrate what the proper space
formatting syntax is:

Ex:
git log --format="tformat:%h %<(15)%an %s"

0123456 Author Name Commit message - author name is formatted to be
padded with space to occupy at least 15 characters

All of the special symbols %<|(<>) made me confused to what was
required and what was describing the syntax.

Thank you!

-daniel


-- 

Percy Shelley's poem "Ozymandias" is also a good argument for refactoring code.
http://en.wikipedia.org/wiki/Ozymandias

Re: [PATCH] FYI / RFC: submodules: introduce repo-like workflow

2018-09-28 Thread Junio C Hamano

Jonathan Nieder  writes:

> (dropping cc-s to my internal address that I don't use on this list
>  and to git-c...@google.com which bounces)
> Hi,
>
> Stefan Beller wrote:
>
>> Internally we have rolled out this as an experiment for
>> "submodules replacing the repo tool[1]". The repo tool is described as:
>>
>> Repo unifies Git repositories when necessary, performs uploads to the
>> Gerrit revision control system, and automates parts of the Android
>> development workflow. Repo is not meant to replace Git, only to make
>> it easier to work with Git in the context of Android. The repo command
>> is an executable Python script that you can put anywhere in your path.
> [...]
>> Submodules can also be understood as unifying Git repositories, even more,
>> in the superproject the submodules have relationships between their
>> versions, which the repo tool only provides in releases.
>>
>> The repo tool does not provide these relationships between versions, but
>> all the repositories (in case of Android think of ~1000 git repositories)
>> are put in their place without depending on each other.
>>
>> This comes with a couple of advantages and disadvantages:
>
> Thanks for describing this background.

Thanks for this.  I probably won't be reading this before other
topics, but I've often found that changes from google were lacking
the backstory to make sense of them as a coherent whole.  

For example, a patch that says "Instead of detaching at the commit
recorded in the superproject, check out the branch with the same
name, even if it points at a different commit.  Here is the write up
of what it does in the Documentation/ and code" is hard to judge
beyond checking if the code does what it claims to do and if the
docs describe what the code does---without backstory like this that
talks about how individual small pieces fit within the plan for the
whole thing, judging if that "what it claims to do" is even sensible
is impossible.

Re: [PATCH] submodule: Alllow staged changes for get_superproject_working_tree

2018-09-28 Thread Junio C Hamano

Sam McKelvie  writes:

>> Ah, that, too.  I meant to correct triple ell, though ;-)
>>  ...
>
> I wholeheartedly approve of that plan and your tweaking commit below. Thank 
> you, Junio.

Thanks for a fix.  But now I re-read the title and think about it,
this is mistitled.  The word 'stage' in "ls-files --stage" is not
about 'stage' people use when they talk about "staged changes" at
all.  The latter is what "diff --cached" is about---what's different
between HEAD and the index.  The 'stage' "ls-files --stage" talks
about is "which parent the cache entry came from, among common
ancestor, us, or the other branch being merged".

Also we are not really "allowing" with this change; "allowing" makes
it sound as if we were earlier "forbidding" with a good reason and
the change is lifting the limitation.

We used to incorrectly die when superproject is resolving a conflict
for the submodule we are currently in, and that is what the patch
fixed.

submodule: parse output of conflicted ls-files in superproject correctly

is the shortest I could come up with, while touching all the points
that need to be touched and still being technically not-incorrect.

Or perhaps

rev-parse: --show-superproject-working-tree should work during a merge

may be more to the point.  It does not hint the root cause of the
bug like the other one, but is more direct how the breakage would
have been observed by the end users.

Re: [RFC PATCH v2 4/4] fetch: do not list refs if fetching only hashes

2018-09-28 Thread Jonathan Tan

> > +   /*
> > +* We can avoid listing refs if all of them are exact
> > +* OIDs
> > +*/
> > +   must_list_refs = 0;
> > +   for (i = 0; i < rs->nr; i++) {
> > +   if (!rs->items[i].exact_sha1) {
> > +   must_list_refs = 1;
> > +   break;
> > +   }
> > +   }
> 
> This seems to be a repeat pattern, Is it worth it to encapsulate it
> as a function in transport or refs?
> 
>   int must_list_refs(struct ref **to_fetch)
>   {
> // body as the loop above
>   }

The repetition is unfortunate - I tried to think of a better way to do
it but couldn't. We can't do what you suggest because this one loops
over refspecs but the other one loops over refs.

Re: [PATCH] git help: promote 'git help -av'

2018-09-28 Thread Taylor Blau

On Fri, Sep 28, 2018 at 09:30:51AM -0700, Junio C Hamano wrote:
> Taylor Blau  writes:
>
> > On Wed, Sep 26, 2018 at 10:28:31AM -0700, Junio C Hamano wrote:
> >> Duy Nguyen  writes:
> >>
> >> > Here's the patch that adds that external commands and aliases
> >> > sections. I feel that external commands section is definitely good to
> >> > have even if we don't replace "help -a". Aliases are more
> >> > subjective...
> >>
> >> I didn't apply this (so I didn't try running it), but a quick
> >> eyeballing tells me that the listing of external commands in -av
> >> output done in this patch seems reasonable.
> >>
> >> I do not think removing "-v" and making the current "help -a" output
> >> unavailable is a wise idea, though.
> >
> > I think that your point about having to react in a split-second in order
> > to ^C before we open the manual page for a command is a good one. So, I
> > agree with this.
>
> Responding to a wrong thread?

Ah, I certainly am. Thanks for catching my mistake :-).

> I thought "now I need ^C within a handful of deciseconds if I want
> only alias?" was not about the change to make "-v" default when
> "help -a" is asked, but about what to do with "git foo --help" that
> only gives "foo is aliased to ...".

Thanks,
Taylor

[PATCH] strbuf.h: format according to coding guidelines

2018-09-28 Thread Stefan Beller

The previous patch suggested the strbuf header to be the leading example
of how we would want our APIs to be documented. This may lead to some
scrutiny of that code and the coding style (which is different from the
API documentation style) and hence might be taken as an example on how
to format code as well.

So let's format strbuf.h in a way that we'd like to see:
* omit the extern keyword from function declarations
* name all parameters (usually the parameters are obvious from its type,
  but consider exceptions like
  `int strbuf_getwholeline_fd(struct strbuf *, int, int);`
* break overly long lines

Signed-off-by: Stefan Beller 
---

Maybe this on top of Junios guideline patch?

Thanks,
Stefan

 strbuf.h | 148 ++-
 1 file changed, 81 insertions(+), 67 deletions(-)

diff --git a/strbuf.h b/strbuf.h
index 60a35aef16..bf18fddb5b 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -87,7 +87,7 @@ struct object_id;
  * Initialize the structure. The second parameter can be zero or a bigger
  * number to allocate memory, in case you want to prevent further reallocs.
  */
-extern void strbuf_init(struct strbuf *, size_t);
+void strbuf_init(struct strbuf *sb, size_t alloc);
 
 /**
  * Release a string buffer and the memory it used. After this call, the
@@ -97,7 +97,7 @@ extern void strbuf_init(struct strbuf *, size_t);
  * To clear a strbuf in preparation for further use without the overhead
  * of free()ing and malloc()ing again, use strbuf_reset() instead.
  */
-extern void strbuf_release(struct strbuf *);
+void strbuf_release(struct strbuf *sb);
 
 /**
  * Detach the string from the strbuf and returns it; you now own the
@@ -107,7 +107,7 @@ extern void strbuf_release(struct strbuf *);
  * The strbuf that previously held the string is reset to `STRBUF_INIT` so
  * it can be reused after calling this function.
  */
-extern char *strbuf_detach(struct strbuf *, size_t *);
+char *strbuf_detach(struct strbuf *sb, size_t *sz);
 
 /**
  * Attach a string to a buffer. You should specify the string to attach,
@@ -117,7 +117,7 @@ extern char *strbuf_detach(struct strbuf *, size_t *);
  * malloc()ed, and after attaching, the pointer cannot be relied upon
  * anymore, and neither be free()d directly.
  */
-extern void strbuf_attach(struct strbuf *, void *, size_t, size_t);
+void strbuf_attach(struct strbuf *sb, void *str, size_t len, size_t mem);
 
 /**
  * Swap the contents of two string buffers.
@@ -148,7 +148,7 @@ static inline size_t strbuf_avail(const struct strbuf *sb)
  * This is never a needed operation, but can be critical for performance in
  * some cases.
  */
-extern void strbuf_grow(struct strbuf *, size_t);
+void strbuf_grow(struct strbuf *sb, size_t amount);
 
 /**
  * Set the length of the buffer to a given value. This function does *not*
@@ -183,30 +183,30 @@ static inline void strbuf_setlen(struct strbuf *sb, 
size_t len)
  * Strip whitespace from the beginning (`ltrim`), end (`rtrim`), or both side
  * (`trim`) of a string.
  */
-extern void strbuf_trim(struct strbuf *);
-extern void strbuf_rtrim(struct strbuf *);
-extern void strbuf_ltrim(struct strbuf *);
+void strbuf_trim(struct strbuf *sb);
+void strbuf_rtrim(struct strbuf *sb);
+void strbuf_ltrim(struct strbuf *sb);
 
 /* Strip trailing directory separators */
-extern void strbuf_trim_trailing_dir_sep(struct strbuf *);
+void strbuf_trim_trailing_dir_sep(struct strbuf *sb);
 
 /**
  * Replace the contents of the strbuf with a reencoded form.  Returns -1
  * on error, 0 on success.
  */
-extern int strbuf_reencode(struct strbuf *sb, const char *from, const char 
*to);
+int strbuf_reencode(struct strbuf *sb, const char *from, const char *to);
 
 /**
  * Lowercase each character in the buffer using `tolower`.
  */
-extern void strbuf_tolower(struct strbuf *sb);
+void strbuf_tolower(struct strbuf *sb);
 
 /**
  * Compare two buffers. Returns an integer less than, equal to, or greater
  * than zero if the first buffer is found, respectively, to be less than,
  * to match, or be greater than the second buffer.
  */
-extern int strbuf_cmp(const struct strbuf *, const struct strbuf *);
+int strbuf_cmp(const struct strbuf *first, const struct strbuf *second);
 
 
 /**
@@ -233,37 +233,38 @@ static inline void strbuf_addch(struct strbuf *sb, int c)
 /**
  * Add a character the specified number of times to the buffer.
  */
-extern void strbuf_addchars(struct strbuf *sb, int c, size_t n);
+void strbuf_addchars(struct strbuf *sb, int c, size_t n);
 
 /**
  * Insert data to the given position of the buffer. The remaining contents
  * will be shifted, not overwritten.
  */
-extern void strbuf_insert(struct strbuf *, size_t pos, const void *, size_t);
+void strbuf_insert(struct strbuf *sb, size_t pos, const void *, size_t);
 
 /**
  * Remove given amount of data from a given position of the buffer.
  */
-extern void strbuf_remove(struct strbuf *, size_t pos, size_t len);
+void strbuf_remove(struct strbuf *sb, size_t pos,

Re: [PATCH v2 1/5] split-index: add tests to demonstrate the racy split index problem

2018-09-28 Thread Junio C Hamano

SZEDER Gábor  writes:

> Junio,
>
> On Thu, Sep 27, 2018 at 02:44:30PM +0200, SZEDER Gábor wrote:
>> diff --git a/t/t1701-racy-split-index.sh b/t/t1701-racy-split-index.sh
>> new file mode 100755
>> index 00..ebde418d7e
>> --- /dev/null
>> +++ b/t/t1701-racy-split-index.sh
>> @@ -0,0 +1,218 @@
>> +#!/bin/sh
>> +
>> +# This test can give false success if your machine is sufficiently
>> +# slow or all trials happened to happen on second boundaries.
>> +
>> +test_description='racy split index'
>> +
>> +. ./test-lib.sh
>> +
>> +test_expect_success 'setup' '
>> +# Only split the index when the test explicitly says so.
>> +sane_unset GIT_TEST_SPLIT_INDEX GIT_FSMONITOR_TEST &&
>
> Please note that this patch adds another use of the environment
> variable GIT_FSMONITOR_TEST, while the topic 'bp/rename-test-env-var'
> is about to rename that variable to GIT_TEST_FSMONITOR.

I think the most sensible thing to do during transition is to set
(or unset) both consistently (and leave a note in t/test-lib.sh near
check_var_migration that t1701 has done so and the dup should be
corrected when transition is over---but that is optional).

Re: [PATCH] Documentation/CodingGuidelines: How to document new APIs

2018-09-28 Thread Stefan Beller

On Thu, Sep 27, 2018 at 6:11 PM Jeff King  wrote:
>
> On Thu, Sep 27, 2018 at 04:27:32PM -0700, Jonathan Nieder wrote:
>
> > > There are different opinions on how to document an API properly.
> > > Discussion turns out nobody disagrees with documenting new APIs on the
> > > function level in the header file and high level concepts in
> > > Documentation/technical, so let's state that in the guidelines.
> >
> > I disagree with this.  I think we should put all the API docs in the
> > header file, like we're already doing in e.g. strbuf.h.
>
> Me too.
>
> I think other high-level concepts that are _not_ APIs (e.g., file
> formats, protocol, etc) could go into technical/.

That is what I meant with high level concepts. Anything that talks
about or implies the existence of a function is clearly header level.

> (Though actually, those are the thing that I would not mind at all if
> they get formatted into real manpages and shipped to end users. We do
> not expect most users to dissect our file-formats, but they could at
> least be useful to somebody poking around).

Formats are sensible thing to present to the end user. I was also thinking
about partial-clone, which is a concept rather than a format.

>
> > Do you have a link to where in the discussion this split-docs strategy
> > was decided?
>
> I think the purpose of this patch is to spur people towards a decision.
> :)

For me it might end up in an exercise of writing clear and concise English.
Though after reading Junios patch below, I would hope that may find
consensus.

Stefan

Re: [PATCH v6 4/7] config: add new index.threads config setting

2018-09-28 Thread Junio C Hamano

Ben Peart  writes:

>> Why does multithreading have to be disabled in this test?
>
> If multi-threading is enabled, it will write out the IEOT extension
> which changes the SHA and causes the test to fail.

I think it is a design mistake to let the writing processes's
capability decide what is written in the file to be read later by a
different process, which possibly may have different capability.  If
you are not writing with multiple threads, it should not matter if
that writer process is capable of and configured to spawn 8 threads
if the process were reading the file---as it is not reading the file
it is writing right now.

I can understand if the design is to write IEOT only if the
resulting index is expected to become large enough (above an
arbitrary threshold like 100k entries) to matter.  I also can
understand if IEOT is omitted when the repository configuration says
that no process is allowed to read the index with multi-threaded
codepath in that repository.

Re: [PATCH] help: allow redirecting to help for aliased command

2018-09-28 Thread Junio C Hamano

Rasmus Villemoes  writes:

>>> +   if (follow_alias > 0) {
>>> +   fprintf_ln(stderr,
>>> +  _("Continuing to help for %s in %0.1f 
>>> seconds."),
>>> +  alias, follow_alias/10.0);
>>> +   sleep_millisec(follow_alias * 100);
>>> +   }
>>> +   return alias;
>> 
>> If you have "[alias] cp = cherry-pick -n", split_cmdline discards
>> "-n" and the follow-alias prompt does not even tell you that it did
>> so,
>
> That's not really true, as I deliberately did the split_cmdline after
> printing the "is an alias for", but before "continuing to help for", so
> this would precisely tell you
>
>   cp is an alias for 'cherry-pick -n'
>   continuing to help for 'cherry-pick' in 1.5 seconds

Yes, but notice that cherry-pick appears twice---I do not know about
you, but I know at least my eyes will be drawn to the last mention
that does not have '-n' stronger than the one before/above that
line.

In any case, I think Peff's "Let's teach 'git cp -h' to prefix what
'cp' is aliased to before invoking 'git cherry-pick -n -h' (and let
it fail)" approach is much more robust, so let's do that without
emulating that command-typo-correction codepath.

Re: [PATCH v2] git.txt: mention mailing list archive

2018-09-28 Thread Junio C Hamano

Martin Ågren  writes:

>> Hm.  I think this encourages a behavior that I want to discourage:
>> assuming that if a bug has already been reported then there's nothing
>> more for the new user to add.
>
> It was my hope that all of these could be inferred from the above text:
>
> "I'll just drop a mail anyway."
>
> "I wonder if there's a known solution to my issue."
>
> "I wonder if this is known and I can provide some more details compared
> to the original poster."
>
> "Maybe I can find some thread where I can just say '+1'."
>
> But what a language-lawyer reading says is of course a lot less relevant
> than what a fresh pair of eyes (yours) reads out of the text. Thanks.

I agree with your reading; the most neutral mention "archive is
here" is not very friendly because the readers do not know what we
want out of them being aware of the archive.  "Ah, I may find a
solution already there" was the reaction I wanted to draw by saying
"If you want to check if the issue has been reported", but any of
the above is a good reaction.

And from that point of view

>> See the list archive at https://public-inbox.org/git/ for
>> previous bug reports and other discussions.

is just as good, and there is not a big difference between that and

>   If you want to, you can see the list archive at, e.g.,
>    for bug reports and other discussions.

this one, at least to me.

> We might also conclude that trying to delicately word-smith something
> that doesn't scare off reports is tricky, and we're better off just
> avoiding doing anything which might raise someone's bar for reporting an
> issue. I'm leaning more and more towards "it's not broken, so don't fix
> it"...

Yup, in short I think any one of the above three is good enough.
Let's just pick one and move on.  Unless somebody sends in an
improvement that can be applied readily, by default I'll just "git
am" the one Martin sent, as that is the easiest thing to do ;-).

Re: [PATCH] Documentation/CodingGuidelines: How to document new APIs

2018-09-28 Thread Junio C Hamano

Jeff King  writes:

> On Thu, Sep 27, 2018 at 04:27:32PM -0700, Jonathan Nieder wrote:
>
>> > There are different opinions on how to document an API properly.
>> > Discussion turns out nobody disagrees with documenting new APIs on the
>> > function level in the header file and high level concepts in
>> > Documentation/technical, so let's state that in the guidelines.
>> 
>> I disagree with this.  I think we should put all the API docs in the
>> header file, like we're already doing in e.g. strbuf.h.
>
> Me too.
>
> I think other high-level concepts that are _not_ APIs (e.g., file
> formats, protocol, etc) could go into technical/.
>
> (Though actually, those are the thing that I would not mind at all if
> they get formatted into real manpages and shipped to end users. We do
> not expect most users to dissect our file-formats, but they could at
> least be useful to somebody poking around).
>
>> Do you have a link to where in the discussion this split-docs strategy
>> was decided?
>
> I think the purpose of this patch is to spur people towards a decision.
> :)

OK.

-- >8 --
Subject: CodingGuidelines: document the API in *.h files

It makes it harder to let the API description and the reality drift
apart if the doc is kept close to the implementation or the header
of the API.  We have been slowly migrating API docs out of the
Documentation/technical/api-* to *.h files, and the development
community generally considers that how inline docs in strbuf.h is
done the best current practice.

We recommend documenting in the header over documenting near the
implementation to encourage people to write the docs that are
readable without peeking at the implemention.

Signed-off-by: Junio C Hamano 
---
 Documentation/CodingGuidelines | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index 6d265327c9..e87090c849 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -385,7 +385,11 @@ For C programs:
string_list for sorted string lists, a hash map (mapping struct
objects) named "struct decorate", amongst other things.
 
- - When you come up with an API, document it.
+ - When you come up with an API, document it the functions and the
+   structures in the header file that exposes the API to its callers.
+   Use what is in "strbuf.h" as a model to decide the appropriate tone
+   in which the description is given, and the level of details to put
+   in the description.
 
  - The first #include in C files, except in platform specific compat/
implementations, must be either "git-compat-util.h", "cache.h" or

[PATCH] Improvement to only call Git Credential Helper once

2018-09-28 Thread Kyle Hubert

When calling the Git Credential Helper that is set in the git config,
the get command can return a credential. Git immediately turns around
and calls the store command, even though that credential was just
retrieved by the Helper. This creates two side effects. First of all,
if the Helper requires a passphrase, the user has to type it in
twice. Secondly, if the user has a number of helpers, this retrieves
the credential from one service and writes it to all services.

This commit introduces a new field in the credential struct that
detects when the credential was retrieved using the Helper, and early
exits when called to store the credential.
---
 credential.c | 8 +++-
 credential.h | 3 ++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/credential.c b/credential.c
index 62be651b0..79bf62d49 100644
--- a/credential.c
+++ b/credential.c
@@ -280,8 +280,10 @@ void credential_fill(struct credential *c)
 
for (i = 0; i < c->helpers.nr; i++) {
credential_do(c, c->helpers.items[i].string, "get");
-   if (c->username && c->password)
+   if (c->username && c->password) {
+   c->retrieved = 1;
return;
+   }
if (c->quit)
die("credential helper '%s' told us to quit",
c->helpers.items[i].string);
@@ -300,6 +302,10 @@ void credential_approve(struct credential *c)
return;
if (!c->username || !c->password)
return;
+   if (c->retrieved) {
+   c->approved = 1;
+   return;
+   }
 
credential_apply_config(c);
 
diff --git a/credential.h b/credential.h
index 6b0cd16be..d99df2f52 100644
--- a/credential.h
+++ b/credential.h
@@ -8,7 +8,8 @@ struct credential {
unsigned approved:1,
 configured:1,
 quit:1,
-use_http_path:1;
+use_http_path:1,
+retrieved:1;
 
char *username;
char *password;
-- 
2.11.0

Re: [PATCH] git help: promote 'git help -av'

2018-09-28 Thread Junio C Hamano

Taylor Blau  writes:

> On Wed, Sep 26, 2018 at 10:28:31AM -0700, Junio C Hamano wrote:
>> Duy Nguyen  writes:
>>
>> > Here's the patch that adds that external commands and aliases
>> > sections. I feel that external commands section is definitely good to
>> > have even if we don't replace "help -a". Aliases are more
>> > subjective...
>>
>> I didn't apply this (so I didn't try running it), but a quick
>> eyeballing tells me that the listing of external commands in -av
>> output done in this patch seems reasonable.
>>
>> I do not think removing "-v" and making the current "help -a" output
>> unavailable is a wise idea, though.
>
> I think that your point about having to react in a split-second in order
> to ^C before we open the manual page for a command is a good one. So, I
> agree with this.

Responding to a wrong thread?  

I thought "now I need ^C within a handful of deciseconds if I want
only alias?" was not about the change to make "-v" default when
"help -a" is asked, but about what to do with "git foo --help" that
only gives "foo is aliased to ...".

[PATCH v3 1/6] t1700-split-index: document why FSMONITOR is disabled in this test script

2018-09-28 Thread SZEDER Gábor

Signed-off-by: SZEDER Gábor 
---
 t/t1700-split-index.sh | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index be22398a85..822257ff7d 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -6,6 +6,9 @@ test_description='split index mode tests'
 
 # We need total control of index splitting here
 sane_unset GIT_TEST_SPLIT_INDEX
+# A couple of tests expect the index to have a specific checksum,
+# but the presence of the optional FSMN extension would interfere
+# with those checks, so disable it in this test script.
 sane_unset GIT_FSMONITOR_TEST
 
 test_expect_success 'enable split index' '
-- 
2.19.0.361.gafc87ffe72

[PATCH v3 0/6] Fix the racy split index problem

2018-09-28 Thread SZEDER Gábor

Third round of fixing occasional test failures when run with
'GIT_TEST_SPLIT_INDEX=yes', with only two rather minor changes since the
previous version; just look at the trivial interdiff below.

SZEDER Gábor (6):
  t1700-split-index: document why FSMONITOR is disabled in this test
script
  split-index: add tests to demonstrate the racy split index problem
  t1700-split-index: date back files to avoid racy situations
  split-index: count the number of deleted entries
  split-index: don't compare stat data of entries already marked for
split index
  split-index: smudge and add racily clean cache entries to split index

 cache.h |   2 +
 read-cache.c|   2 +-
 split-index.c   | 121 +---
 t/t1700-split-index.sh  |  52 +
 t/t1701-racy-split-index.sh | 214 
 5 files changed, 351 insertions(+), 40 deletions(-)
 create mode 100755 t/t1701-racy-split-index.sh

Interdiff:
diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index c8acbc50d0..f053bf83eb 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -6,6 +6,9 @@ test_description='split index mode tests'
 
 # We need total control of index splitting here
 sane_unset GIT_TEST_SPLIT_INDEX
+# A couple of tests expect the index to have a specific checksum,
+# but the presence of the optional FSMN extension would interfere
+# with those checks, so disable it in this test script.
 sane_unset GIT_FSMONITOR_TEST
 
 # Create a file named as $1 with content read from stdin.
diff --git a/t/t1701-racy-split-index.sh b/t/t1701-racy-split-index.sh
index 7f16f5f7a3..5dc221ef38 100755
--- a/t/t1701-racy-split-index.sh
+++ b/t/t1701-racy-split-index.sh
@@ -9,7 +9,7 @@ test_description='racy split index'
 
 test_expect_success 'setup' '
# Only split the index when the test explicitly says so.
-   sane_unset GIT_TEST_SPLIT_INDEX GIT_FSMONITOR_TEST &&
+   sane_unset GIT_TEST_SPLIT_INDEX &&
git config splitIndex.maxPercentChange 100 &&
 
echo "cached content" >racy-file &&
-- 
2.19.0.361.gafc87ffe72

[PATCH v3 3/6] t1700-split-index: date back files to avoid racy situations

2018-09-28 Thread SZEDER Gábor

't1700-split-index.sh' checks that the index was split correctly under
various circumstances and that all the different ways to turn the
split index feature on and off work correctly.  To do so, most of its
tests use 'test-tool dump-split-index' to see which files have their
cache entries in the split index.  All these tests assume that all
cache entries are written to the shared index (called "base"
throughout these tests) when a new shared index is created.  This is
an implementation detail: most git commands (basically all except 'git
update-index') don't care or know at all about split index or whether
a cache entry is stored in the split or shared index.

As demonstrated in the previous patch, refreshing a split index is
prone to a variant of the classic racy git issue.  The next patch will
fix this issue, but while doing so it will also slightly change this
behaviour: only cache entries with mtime in the past will be written
only to the newly created shared index, but racily clean cache entries
will be written to the new split index (with smudged stat data).

While this upcoming change won't at all affect any git commands, it
will violate the above mentioned assumption of 't1700's tests.  Since
these tests create or modify files and create or refresh the split
index in rapid succession, there are plenty of racily clean cache
entries to be dealt with, which will then be written to the new split
indexes, and, ultimately, will cause several tests in 't1700' to fail.

Let's prepare 't1700-split-index.sh' for this upcoming change and
modify its tests to avoid racily clean files by backdating the mtime
of any file modifications (and since a lot of tests create or modify
files, encapsulate it into a helper function).

Signed-off-by: SZEDER Gábor 
---
 t/t1700-split-index.sh | 49 --
 1 file changed, 28 insertions(+), 21 deletions(-)

diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index 822257ff7d..f053bf83eb 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -11,6 +11,13 @@ sane_unset GIT_TEST_SPLIT_INDEX
 # with those checks, so disable it in this test script.
 sane_unset GIT_FSMONITOR_TEST
 
+# Create a file named as $1 with content read from stdin.
+# Set the file's mtime to a few seconds in the past to avoid racy situations.
+create_non_racy_file () {
+   cat >"$1" &&
+   test-tool chmtime =-5 "$1"
+}
+
 test_expect_success 'enable split index' '
git config splitIndex.maxPercentChange 100 &&
git update-index --split-index &&
@@ -34,7 +41,7 @@ test_expect_success 'enable split index' '
 '
 
 test_expect_success 'add one file' '
-   : >one &&
+   create_non_racy_file one &&
git update-index --add one &&
git ls-files --stage >ls-files.actual &&
cat >ls-files.expect <<-EOF &&
@@ -86,7 +93,7 @@ test_expect_success 'enable split index again, "one" now 
belongs to base index"'
 '
 
 test_expect_success 'modify original file, base index untouched' '
-   echo modified >one &&
+   echo modified | create_non_racy_file one &&
git update-index one &&
git ls-files --stage >ls-files.actual &&
cat >ls-files.expect <<-EOF &&
@@ -105,7 +112,7 @@ test_expect_success 'modify original file, base index 
untouched' '
 '
 
 test_expect_success 'add another file, which stays index' '
-   : >two &&
+   create_non_racy_file two &&
git update-index --add two &&
git ls-files --stage >ls-files.actual &&
cat >ls-files.expect <<-EOF &&
@@ -158,7 +165,7 @@ test_expect_success 'remove file in base index' '
 '
 
 test_expect_success 'add original file back' '
-   : >one &&
+   create_non_racy_file one &&
git update-index --add one &&
git ls-files --stage >ls-files.actual &&
cat >ls-files.expect <<-EOF &&
@@ -177,7 +184,7 @@ test_expect_success 'add original file back' '
 '
 
 test_expect_success 'add new file' '
-   : >two &&
+   create_non_racy_file two &&
git update-index --add two &&
git ls-files --stage >actual &&
cat >expect <<-EOF &&
@@ -221,7 +228,7 @@ test_expect_success 'rev-parse --shared-index-path' '
 
 test_expect_success 'set core.splitIndex config variable to true' '
git config core.splitIndex true &&
-   : >three &&
+   create_non_racy_file three &&
git update-index --add three &&
git ls-files --stage >ls-files.actual &&
cat >ls-files.expect <<-EOF &&
@@ -256,9 +263,9 @@ test_expect_success 'set core.splitIndex config variable to 
false' '
test_cmp expect actual
 '
 
-test_expect_success 'set core.splitIndex config variable to true' '
+test_expect_success 'set core.splitIndex config variable back to true' '
git config core.splitIndex true &&
-   : >three &&
+   create_non_racy_file three &&
git update-index --add three &&
BASE=$(test-tool dump-split-index .git/index | grep "^base") &&

[PATCH v3 6/6] split-index: smudge and add racily clean cache entries to split index

2018-09-28 Thread SZEDER Gábor

Ever since the split index feature was introduced [1], refreshing a
split index is prone to a variant of the classic racy git problem.

Consider the following sequence of commands updating the split index
when the shared index contains a racily clean cache entry, i.e. an
entry whose cached stat data matches with the corresponding file in
the worktree and the cached mtime matches that of the index:

  echo "cached content" >file
  git update-index --split-index --add file
  echo "dirty worktree" >file# size stays the same!
  # ... wait ...
  git update-index --add other-file

Normally, when a non-split index is updated, then do_write_index()
(the function responsible for writing all kinds of indexes, "regular",
split, and shared) recognizes racily clean cache entries, and writes
them with smudged stat data, i.e. with file size set to 0.  When
subsequent git commands read the index, they will notice that the
smudged stat data doesn't match with the file in the worktree, and
then go on to check the file's content and notice its dirtiness.

In the above example, however, in the second 'git update-index'
prepare_to_write_split_index() decides which cache entries stored only
in the shared index should be replaced in the new split index.  Alas,
this function never looks out for racily clean cache entries, and
since the file's stat data in the worktree hasn't changed since the
shared index was written, it won't be replaced in the new split index.
Consequently, do_write_index() doesn't even get this racily clean
cache entry, and can't smudge its stat data.  Subsequent git commands
will then see that the index has more recent mtime than the file and
that the (not smudged) cached stat data still matches with the file in
the worktree, and, ultimately, will erroneously consider the file
clean.

Modify prepare_to_write_split_index() to recognize racily clean cache
entries, and mark them to be added to the split index.  Note that
there are two places where it should check raciness: first those cache
entries that are only stored in the shared index, and then those that
have been copied by unpack_trees() from the shared index while it
constructed a new index.  This way do_write_index() will get these
racily clean cache entries as well, and will then write them with
smudged stat data to the new split index.

Note that after this change if the index is split when it contains a
racily clean cache entry, then a smudged cache entry will be written
both to the new shared and to the new split indexes.  This doesn't
affect regular git commands: as far as they are concerned this is just
an entry in the split index replacing an outdated entry in the shared
index.  It did affect a few tests in 't1700-split-index.sh', though,
because they actually check which entries are stored in the split
index; a previous patch in this series has already made the necessary
adjustments in 't1700'.  And racily clean cache entries and index
splitting are rare enough to not worry about the resulting duplicated
smudged cache entries, and the additional complexity required to
prevent them is not worth it.

Several tests failed occasionally when the test suite was run with
'GIT_TEST_SPLIT_INDEX=yes'.  Here are those that I managed to trace
back to this racy split index problem, starting with those failing
more frequently, with a link to a failing Travis CI build job for
each.  The highlighted line [2] shows when the racy file was written,
which is not always in the failing test but in a preceeding setup
test.

  t3903-stash.sh:
https://travis-ci.org/git/git/jobs/385542084#L5858

  t4024-diff-optimize-common.sh:
https://travis-ci.org/git/git/jobs/386531969#L3174

  t4015-diff-whitespace.sh:
https://travis-ci.org/git/git/jobs/360797600#L8215

  t2200-add-update.sh:
https://travis-ci.org/git/git/jobs/382543426#L3051

  t0090-cache-tree.sh:
https://travis-ci.org/git/git/jobs/416583010#L3679

There might be others, e.g. perhaps 't1000-read-tree-m-3way.sh' and
others using 'lib-read-tree-m-3way.sh', but I couldn't confirm yet.

[1] In the branch leading to the merge commit v2.1.0-rc0~45 (Merge
branch 'nd/split-index', 2014-07-16).

[2] Note that those highlighted lines are in the 'after failure' fold,
and your browser might unhelpfully fold it up before you could
take a good look.

Signed-off-by: SZEDER Gábor 
---
 cache.h |  2 ++
 read-cache.c|  2 +-
 split-index.c   | 42 -
 t/t1701-racy-split-index.sh |  8 ++-
 4 files changed, 46 insertions(+), 8 deletions(-)

diff --git a/cache.h b/cache.h
index 4d014541ab..3f419b6c79 100644
--- a/cache.h
+++ b/cache.h
@@ -781,6 +781,8 @@ extern void *read_blob_data_from_index(const struct 
index_state *, const char *,
 #define CE_MATCH_REFRESH   0x10
 /* don't refresh_fsmonitor state or do stat comparison even if 
CE_FSMONITOR_VALID is true */
 #define CE_MATCH_IGNORE_FSMONITOR 0X20

[PATCH v3 5/6] split-index: don't compare stat data of entries already marked for split index

2018-09-28 Thread SZEDER Gábor

When unpack_trees() constructs a new index, it copies cache entries
from the original index [1].  prepare_to_write_split_index() has to
deal with this, and it has a dedicated code path for copied entries
that are present in the shared index, where it compares the cached
data in the corresponding copied and original entries.  If the cached
data matches, then they are considered the same; if it differs, then
the copied entry will be marked for inclusion as a replacement entry
in the just about to be written split index by setting the
CE_UPDATE_IN_BASE flag.

However, a cache entry already has its CE_UPDATE_IN_BASE flag set upon
reading the split index, if the entry already has a replacement entry
there, or upon refreshing the cached stat data, if the corresponding
file was modified.  The state of this flag is then preserved when
unpack_trees() copies a cache entry from the shared index.

So modify prepare_to_write_split_index() to check the copied cache
entries' CE_UPDATE_IN_BASE flag first, and skip the thorough
comparison of cached data if the flag is already set.

Note that comparing the cached data in copied and original entries in
the shared index might actually be entirely unnecessary.  In theory
all code paths refreshing the cached stat data of an entry in the
shared index should set the CE_UPDATE_IN_BASE flag in that entry, and
unpack_trees() should preserve this flag when copying cache entries.
This means that the cached data is only ever changed if the
CE_UPDATE_IN_BASE flag is set as well.  Our test suite seems to
confirm this: instrumenting the conditions in question and running the
test suite repeatedly with 'GIT_TEST_SPLIT_INDEX=yes' showed that the
cached data in a copied entry differs from the data in the shared
entry only if its CE_UPDATE_IN_BASE flag is indeed set.

In practice, however, our test suite doesn't have 100% coverage,
GIT_TEST_SPLIT_INDEX is inherently random, and I certainly can't claim
to possess complete understanding of what goes on in unpack_trees()...
Therefore I kept the comparison of the cached data when
CE_UPDATE_IN_BASE is not set, just in case that an unnoticed or future
code path were to accidentally miss setting this flag upon refreshing
the cached stat data or unpack_trees() were to drop this flag while
copying a cache entry.

[1] Note that when unpack_trees() constructs the new index and decides
that a cache entry should now refer to different content than what
was recorded in the original index (e.g. 'git read-tree -m
HEAD^'), then that can't really be considered a copy of the
original, but rather the creation of a new entry.  Notably and
pertinent to the split index feature, such a new entry doesn't
have a reference to the original's shared index entry anymore,
i.e. its 'index' field is set to 0.  Consequently, such an entry
is treated by prepare_to_write_split_index() as an entry not
present in the shared index and it will be added to the new split
index, while the original entry will be marked as deleted, and
neither the above discussion nor the changes in this patch apply
to them.

Signed-off-by: SZEDER Gábor 
---
 split-index.c | 79 ---
 1 file changed, 62 insertions(+), 17 deletions(-)

diff --git a/split-index.c b/split-index.c
index 548272ec33..7d8799f6b7 100644
--- a/split-index.c
+++ b/split-index.c
@@ -207,13 +207,28 @@ void prepare_to_write_split_index(struct index_state 
*istate)
 */
for (i = 0; i < istate->cache_nr; i++) {
struct cache_entry *base;
-   /* namelen is checked separately */
-   const unsigned int ondisk_flags =
-   CE_STAGEMASK | CE_VALID | CE_EXTENDED_FLAGS;
-   unsigned int ce_flags, base_flags, ret;
ce = istate->cache[i];
-   if (!ce->index)
+   if (!ce->index) {
+   /*
+* During simple update index operations this
+* is a cache entry that is not present in
+* the shared index.  It will be added to the
+* split index.
+*
+* However, it might also represent a file
+* that already has a cache entry in the
+* shared index, but a new index has just
+* been constructed by unpack_trees(), and
+* this entry now refers to different content
+* than what was recorded in the original
+* index, e.g. during 'read-tree -m HEAD^' or
+* 'checkout HEAD^'.  In this case the
+*

[PATCH v3 4/6] split-index: count the number of deleted entries

2018-09-28 Thread SZEDER Gábor

'struct split_index' contains the field 'nr_deletions', whose name
with the 'nr_' prefix suggests that it contains the number of deleted
cache entries.  However, barring its initialization to 0, this field
is only ever set to 1, indicating that there is at least one deleted
entry, but not the number of deleted entries.  Luckily, this doesn't
cause any issues (other than confusing the reader, that is), because
the only place reading this field uses it in the same sense, i.e.: 'if
(si->nr_deletions)'.

To avoid confusion, we could either rename this field to something
like 'has_deletions' to make its name match its role, or make it a
counter of deleted cache entries to match its name.

Let's make it a counter, to keep it in sync with the related field
'nr_replacements', which does contain the number of replaced cache
entries.  This will also give developers debugging the split index
code easy access to the number of deleted cache entries.

Signed-off-by: SZEDER Gábor 
---
 split-index.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/split-index.c b/split-index.c
index 84f067e10d..548272ec33 100644
--- a/split-index.c
+++ b/split-index.c
@@ -111,7 +111,7 @@ static void mark_entry_for_delete(size_t pos, void *data)
die("position for delete %d exceeds base index size %d",
(int)pos, istate->cache_nr);
istate->cache[pos]->ce_flags |= CE_REMOVE;
-   istate->split_index->nr_deletions = 1;
+   istate->split_index->nr_deletions++;
 }
 
 static void replace_entry(size_t pos, void *data)
-- 
2.19.0.361.gafc87ffe72

[PATCH v3 2/6] split-index: add tests to demonstrate the racy split index problem

2018-09-28 Thread SZEDER Gábor

Ever since the split index feature was introduced [1], refreshing a
split index is prone to a variant of the classic racy git problem.
There are a couple of unrelated tests in the test suite that
occasionally fail when run with 'GIT_TEST_SPLIT_INDEX=yes', but
't1700-split-index.sh', the only test script focusing solely on split
index, has never noticed this issue, because it only cares about how
the index is split under various circumstances and all the different
ways to turn the split index feature on and off.

Add a dedicated test script 't1701-racy-split-index.sh' to exercise
the split index feature in racy situations as well; kind of a
"t0010-racy-git.sh for split index" but with modern style (the tests
do everything in &&-chained list of commands in 'test_expect_...'
blocks, and use 'test_cmp' for more informative output on failure).

The tests cover the following sequences of index splitting, updating,
and racy file modifications, with the last two cases demonstrating the
racy split index problem:

  1. Split the index while adding a racily clean file:

   echo "cached content" >file
   git update-index --split-index --add file
   echo "dirty worktree" >file# size stays the same

 This case already works properly.  Even though the cache entry's
 stat data matches with the modifid file in the worktree,
 subsequent git commands will notice that the (split) index and
 the file have the same mtime, and then will go on to check the
 file's content and notice its dirtiness.

  2. Add a racily clean file to an already split index:

   git update-index --split-index
   echo "cached content" >file
   git update-index --add file
   echo "dirty worktree" >file

 This case already works properly.  After the second 'git
 update-index' writes the newly added file's cache entry to the
 new split index, it basically works in the same way as case #1.

  3. Split the index when it (i.e. the not yet splitted index)
 contains a racily clean cache entry, i.e. an entry whose cached
 stat data matches with the corresponding file in the worktree and
 the cached mtime matches that of the index:

   echo "cached content" >file
   git update-index --add file
   echo "dirty worktree" >file
   # ... wait ...
   git update-index --split-index --add other-file

 This case already works properly.  The shared index is written by
 do_write_index(), i.e. the same function that is responsible for
 writing "regular" and split indexes as well.  This function
 cleverly notices the racily clean cache entry, and writes the
 entry to the new shared index with smudged stat data, i.e. file
 size set to 0.  When subsequent git commands read the index, they
 will notice that the smudged stat data doesn't match with the
 file in the worktree, and then go on to check the file's content
 and notice its dirtiness.

  4. Update the split index when it contains a racily clean cache
 entry:

   git update-index --split-index
   echo "cached content" >file
   git update-index --add file
   echo "dirty worktree" >file
   # ... wait ...
   git update-index --add other-file

 This case already works properly.  After the second 'git
 update-index' the newly added file's cache entry is only stored
 in the split index.  If a cache entry is present in the split
 index (even if it is a replacement of an outdated entry in the
 shared index), then it will always be included in the new split
 index on subsequent split index updates (until the file is
 removed or a new shared index is written), independently from
 whether the entry is racily clean or not.  When do_write_index()
 writes the new split index, it notices the racily clean cache
 entry, and smudges its stat date.  Subsequent git commands
 reading the index will notice the smudged stat data and then go
 on to check the file's content and notice its dirtiness.

  5. Update the split index when a racily clean cache entry is stored
 only in the shared index:

   echo "cached content" >file
   git update-index --split-index --add file
   echo "dirty worktree" >file
   # ... wait ...
   git update-index --add other-file

 This case fails due to the racy split index problem.  In the
 second 'git update-index' prepare_to_write_split_index() decides,
 among other things, which cache entries stored only in the shared
 index should be replaced in the new split index.  Alas, this
 function never looks out for racily clean cache entries, and
 since the file's stat data in the worktree hasn't changed since
 the shared index was written, the entry won't be replaced in the
 new split index.  Consequently, do_write_index() doesn't even get
 this racily clean cache entry, and can't smudge its stat data.
 Subsequent git commands will then see that the

Re: [PATCH] Improvement to only call Git Credential Helper once

2018-09-28 Thread Kyle Hubert

Sorry, this patch is damaged. I'm moving to `git send-email` now.

-Kyle

On Fri, Sep 28, 2018 at 11:10 AM Kyle Hubert  wrote:
>
> When calling the Git Credential Helper that is set in the git config,
> the get command can return a credential. Git immediately turns around
> and calls the store command, even though that credential was just
> retrieved by the Helper. This creates two side effects. First of all,
> if the Helper requires a passphrase, the user has to type it in
> twice. Secondly, if the user has a number of helpers, this retrieves
> the credential from one service and writes it to all services.
>
> This commit introduces a new field in the credential struct that
> detects when the credential was retrieved using the Helper, and early
> exits when called to store the credential.
> ---
>  credential.c | 8 +++-
>  credential.h | 3 ++-
>  2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/credential.c b/credential.c
> index 62be651b0..79bf62d49 100644
> --- a/credential.c
> +++ b/credential.c
> @@ -280,8 +280,10 @@ void credential_fill(struct credential *c)
>
>   for (i = 0; i < c->helpers.nr; i++) {
>   credential_do(c, c->helpers.items[i].string, "get");
> - if (c->username && c->password)
> + if (c->username && c->password) {
> + c->retrieved = 1;
>   return;
> + }
>   if (c->quit)
>   die("credential helper '%s' told us to quit",
>   c->helpers.items[i].string);
> @@ -300,6 +302,10 @@ void credential_approve(struct credential *c)
>   return;
>   if (!c->username || !c->password)
>   return;
> + if (c->retrieved) {
> + c->approved = 1;
> + return;
> + }
>
>   credential_apply_config(c);
>
> diff --git a/credential.h b/credential.h
> index 6b0cd16be..d99df2f52 100644
> --- a/credential.h
> +++ b/credential.h
> @@ -8,7 +8,8 @@ struct credential {
>   unsigned approved:1,
>   configured:1,
>   quit:1,
> - use_http_path:1;
> + use_http_path:1,
> + retrieved:1;
>
>   char *username;
>   char *password;

[PATCH] t1400: drop debug `echo` to actually execute `test`

2018-09-28 Thread Martin Ågren

Instead of running `test "foo" = "$(bar)"`, we prefix the whole thing
with `echo`. Comparing to nearby tests makes it clear that this is just
debug leftover. This line has actually been modified four times since it
was introduced in e52290428b (General ref log reading improvements.,
2006-05-19) and the `echo` has always survived. Let's finally drop it.

This script could need some more cleanups. This is just an immediate fix
so that we actually test what we intend to.

All other hits for `git grep "\
---
 t/t1400-update-ref.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh
index 02493f14ba..b72beebe42 100755
--- a/t/t1400-update-ref.sh
+++ b/t/t1400-update-ref.sh
@@ -359,21 +359,21 @@ ld="Thu, 26 May 2005 18:43:00 -0500"
 test_expect_success 'Query "master@{May 25 2005}" (before history)' '
test_when_finished "rm -f o e" &&
git rev-parse --verify "master@{May 25 2005}" >o 2>e &&
test $C = $(cat o) &&
test "warning: Log for '\''master'\'' only goes back to $ed." = "$(cat 
e)"
 '
 test_expect_success 'Query master@{2005-05-25} (before history)' '
test_when_finished "rm -f o e" &&
git rev-parse --verify master@{2005-05-25} >o 2>e &&
test $C = $(cat o) &&
-   echo test "warning: Log for '\''master'\'' only goes back to $ed." = 
"$(cat e)"
+   test "warning: Log for '\''master'\'' only goes back to $ed." = "$(cat 
e)"
 '
 test_expect_success 'Query "master@{May 26 2005 23:31:59}" (1 second before 
history)' '
test_when_finished "rm -f o e" &&
git rev-parse --verify "master@{May 26 2005 23:31:59}" >o 2>e &&
test $C = $(cat o) &&
test "warning: Log for '\''master'\'' only goes back to $ed." = "$(cat 
e)"
 '
 test_expect_success 'Query "master@{May 26 2005 23:32:00}" (exactly history 
start)' '
test_when_finished "rm -f o e" &&
git rev-parse --verify "master@{May 26 2005 23:32:00}" >o 2>e &&
-- 
2.19.0.216.g2d3b1c576c

Re: [PATCH] read-cache: fix division by zero core-dump

2018-09-28 Thread Ramsay Jones

On 28/09/18 02:20, Ben Peart wrote:
> 
> 
> On 9/27/2018 6:24 PM, Ramsay Jones wrote:
>>
>> commit 225df8a468 ("ieot: add Index Entry Offset Table (IEOT)
>> extension", 2018-09-26) added a 'DIV_ROUND_UP(entries, ieot_blocks)
>> expression, where ieot_blocks was set to zero for a single cpu
>> platform. This caused an SIGFPE and a core dump in practically
>> every test in the test-suite, until test t4056-diff-order.sh, which
>> then went into an infinite loop!
>>
>> Signed-off-by: Ramsay Jones 
>> ---
>>
>> Hi Ben,
>>
>> Could you please squash this into the relevant commits on your
>> 'bp/read-cache-parallel' branch. (The first hunk fixes a sparse
>> warning about using an integer as a NULL pointer).
>>
> 
> Absolutely - thanks for the patch.
> 
> I don't know how long it's been since I've been on a single core CPU - I'm 
> sad for you. ;-)

Heh, don't be - whilst I do still have a single cpu laptop about
the place _somewhere_, I haven't booted it up in about 15 years! :-D

I used to regularly test git (and other software) on my old 32-bit
laptop (windows xp/Linux Mint 17.x, Core2 duo), but just lately
I have taken to using a 32-bit VM on my current laptop (4th gen i5)
instead. (The git test-suite would take approx 50 min to run on
the 32-bit hardware, whereas it only takes 8 min on the VM!).

I have configured the 32-bit VM with a single cpu, because when
the VM was configured with two cpus the git test-suite would take
longer to run (approx. 8 -> 10 min)! Taking more resources from
the host, but increasing the running time, didn't seem like a good
return. ;-)

Also, this is not the first time some multi-threaded code in git
has 'failed' by assuming more than one cpu, so ...

ATB,
Ramsay Jones

[PATCH] Improvement to only call Git Credential Helper once

2018-09-28 Thread Kyle Hubert

When calling the Git Credential Helper that is set in the git config,
the get command can return a credential. Git immediately turns around
and calls the store command, even though that credential was just
retrieved by the Helper. This creates two side effects. First of all,
if the Helper requires a passphrase, the user has to type it in
twice. Secondly, if the user has a number of helpers, this retrieves
the credential from one service and writes it to all services.

This commit introduces a new field in the credential struct that
detects when the credential was retrieved using the Helper, and early
exits when called to store the credential.
---
 credential.c | 8 +++-
 credential.h | 3 ++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/credential.c b/credential.c
index 62be651b0..79bf62d49 100644
--- a/credential.c
+++ b/credential.c
@@ -280,8 +280,10 @@ void credential_fill(struct credential *c)

  for (i = 0; i < c->helpers.nr; i++) {
  credential_do(c, c->helpers.items[i].string, "get");
- if (c->username && c->password)
+ if (c->username && c->password) {
+ c->retrieved = 1;
  return;
+ }
  if (c->quit)
  die("credential helper '%s' told us to quit",
  c->helpers.items[i].string);
@@ -300,6 +302,10 @@ void credential_approve(struct credential *c)
  return;
  if (!c->username || !c->password)
  return;
+ if (c->retrieved) {
+ c->approved = 1;
+ return;
+ }

  credential_apply_config(c);

diff --git a/credential.h b/credential.h
index 6b0cd16be..d99df2f52 100644
--- a/credential.h
+++ b/credential.h
@@ -8,7 +8,8 @@ struct credential {
  unsigned approved:1,
  configured:1,
  quit:1,
- use_http_path:1;
+ use_http_path:1,
+ retrieved:1;

  char *username;
  char *password;

Re: [PATCH v3 3/5] fsmonitor: update GIT_TEST_FSMONITOR support

2018-09-28 Thread Ben Peart





On 9/28/2018 10:21 AM, Ben Peart wrote:



On 9/28/2018 6:01 AM, SZEDER Gábor wrote:

On Tue, Sep 18, 2018 at 11:29:35PM +, Ben Peart wrote:

diff --git a/t/README b/t/README
index 56a417439c..47165f7eab 100644
--- a/t/README
+++ b/t/README
@@ -319,6 +319,10 @@ GIT_TEST_OE_DELTA_SIZE= exercises the 
uncommon pack-objects code

  path where deltas larger than this limit require extra memory
  allocation for bookkeeping.
+GIT_TEST_FSMONITOR=$PWD/t7519/fsmonitor-all exercises the fsmonitor
+code path for utilizing a file system monitor to speed up detecting
+new or changed files.


Here you tell us to set GIT_TEST_FSMONITOR to an absolute path, and we
are good to go.


diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh
index 756beb0d8e..d77012ea6d 100755
--- a/t/t7519-status-fsmonitor.sh
+++ b/t/t7519-status-fsmonitor.sh
@@ -8,7 +8,7 @@ test_description='git status with file system watcher'
  # To run the entire git test suite using fsmonitor:
  #
  # copy t/t7519/fsmonitor-all to a location in your path and then set
-# GIT_FSMONITOR_TEST=fsmonitor-all and run your tests.
+# GIT_TEST_FSMONITOR=fsmonitor-all and run your tests.


But this old comment is different, suggesting copying that script to
our $PATH.

I prefer your instructions above, because it's only a single step,
and, more importantly, it won't pollute my $PATH.  I think this
comment should be updated to make the advices in both places
consistent.  Or perhaps even removed, now that all GIT_TEST variables
are documented in the same place?



I prefer the suggestion to simply remove this text from the test script 
now that there is documentation for it in the t/README file.


Junio, can you squash in the following patch or would you prefer I 
reroll the entire series?


Thanks,

Ben

From 393007340dc1baf3539ab727e0a8128e7c408a27 Mon Sep 17 00:00:00 2001
From: Ben Peart 
Date: Fri, 28 Sep 2018 10:23:18 -0400
Subject: fixup! fsmonitor: remove outdated instructions from test

Remove the outdated instructions on how to run the test suite utilizing
fsmonitor now that it is properly documented in t/README.

Signed-off-by: Ben Peart 
---

Notes:
Base Ref: git-test-cleanup-v3
Web-Diff: https://github.com/benpeart/git/commit/393007340d
Checkout: git fetch https://github.com/benpeart/git 
git-test-cleanup-v1 && git checkout 393007340d


 t/t7519-status-fsmonitor.sh | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh
index 8308d6d5b1..3f0dd98010 100755
--- a/t/t7519-status-fsmonitor.sh
+++ b/t/t7519-status-fsmonitor.sh
@@ -4,13 +4,6 @@ test_description='git status with file system watcher'

 . ./test-lib.sh

-#
-# To run the entire git test suite using fsmonitor:
-#
-# copy t/t7519/fsmonitor-all to a location in your path and then set
-# GIT_TEST_FSMONITOR=fsmonitor-all and run your tests.
-#
-
 # Note, after "git reset --hard HEAD" no extensions exist other than 
'TREE'

 # "git update-index --fsmonitor" can be used to get the extension written
 # before testing the results.

base-commit: 043246d9369fb851c5c2b922466f77fc7ef0327b
--
2.18.0.windows.1

Re: [PATCH v3 1/4] transport: drop refnames from for_each_alternate_ref

2018-09-28 Thread Taylor Blau

On Fri, Sep 28, 2018 at 12:58:58AM -0400, Jeff King wrote:
> > From: Jeff King 
>
> Pretty sure that isn't right. :)

Indeed that isn't right :-). I try my best to review my patches
diligently before submitting them, but here's an interesting side-story
if you're interested:

I use a script 'git mail' which is essentially doing:

  git format-patch --stdout >mbox && mutt -f mbox

So, by the time that I've reviewed the diff via:

  $ git format-patch --stdout | less

I assume that the patches are ready to send (since, after all, running
'git format-patch' more than once shouldn't change anything.) So, I open
mutt with 'git mail', write my cover letter, and send each of the
patches to the list.

It was during that last phase that I ignored the From: Jeff King
, which I agree with you is certainly incorrect :-).

I was going to ask Junio to fix this up when queuing, but it seems (from
a quick skim of the rest of your review), that we will reach v4, so I'll
see if I can't teach 'git mail' to do the right thing for me.

> The patch itself is flawless, of course. ;)

Obviously ;-).

Thanks,
Taylor

Re: [PATCH v3 3/5] fsmonitor: update GIT_TEST_FSMONITOR support

2018-09-28 Thread Ben Peart





On 9/28/2018 6:01 AM, SZEDER Gábor wrote:

On Tue, Sep 18, 2018 at 11:29:35PM +, Ben Peart wrote:

diff --git a/t/README b/t/README
index 56a417439c..47165f7eab 100644
--- a/t/README
+++ b/t/README
@@ -319,6 +319,10 @@ GIT_TEST_OE_DELTA_SIZE= exercises the uncommon 
pack-objects code
  path where deltas larger than this limit require extra memory
  allocation for bookkeeping.
  
+GIT_TEST_FSMONITOR=$PWD/t7519/fsmonitor-all exercises the fsmonitor

+code path for utilizing a file system monitor to speed up detecting
+new or changed files.


Here you tell us to set GIT_TEST_FSMONITOR to an absolute path, and we
are good to go.


diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh
index 756beb0d8e..d77012ea6d 100755
--- a/t/t7519-status-fsmonitor.sh
+++ b/t/t7519-status-fsmonitor.sh
@@ -8,7 +8,7 @@ test_description='git status with file system watcher'
  # To run the entire git test suite using fsmonitor:
  #
  # copy t/t7519/fsmonitor-all to a location in your path and then set
-# GIT_FSMONITOR_TEST=fsmonitor-all and run your tests.
+# GIT_TEST_FSMONITOR=fsmonitor-all and run your tests.


But this old comment is different, suggesting copying that script to
our $PATH.

I prefer your instructions above, because it's only a single step,
and, more importantly, it won't pollute my $PATH.  I think this
comment should be updated to make the advices in both places
consistent.  Or perhaps even removed, now that all GIT_TEST variables
are documented in the same place?



I prefer the suggestion to simply remove this text from the test script 
now that there is documentation for it in the t/README file.

Re: [PATCH 1/1] roll wt_status_state into wt_status and populate in the collect phase

2018-09-28 Thread Taylor Blau

On Thu, Sep 27, 2018 at 09:49:36PM -0700, Stephen P. Smith wrote:
> When updating the collect and print functions, it was found that
> status variables were initialized in the collect phase and some
> variables were later freed in the print functions.

Nit: I think that in the past Eric Sunshine has recommended that I use
active voice in patches, but "it was found" is passive.

I tried to find the message that I was thinking of, but couldn't, so
perhaps I'm inventing it myself ;-).

I'm CC-ing Eric to check my judgement.

> Move the status state structure variables into the status state
> structure and populate them in the collect functions.
>
> Create a new funciton to free the buffers that were being freed in the
> print function.  Call this new function in commit.c where both the
> collect and print functions were being called.
>
> Based on a patch suggestion by Junio C Hamano. [1]
>
> [1] https://public-inbox.org/git/xmqqr2i5ueg4@gitster-ct.c.googlers.com/
>
> Signed-off-by: Stephen P. Smith 
> ---
>  builtin/commit.c |   3 ++
>  wt-status.c  | 135 +--
>  wt-status.h  |  38 ++---
>  3 files changed, 83 insertions(+), 93 deletions(-)
>
> diff --git a/builtin/commit.c b/builtin/commit.c
> index 51ecebbec1..e168321e49 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -506,6 +506,7 @@ static int run_status(FILE *fp, const char *index_file, 
> const char *prefix, int
>
>   wt_status_collect(s);
>   wt_status_print(s);
> + wt_status_collect_free_buffers(s);
>
>   return s->committable;
>  }
> @@ -1388,6 +1389,8 @@ int cmd_status(int argc, const char **argv, const char 
> *prefix)
>   s.prefix = prefix;
>
>   wt_status_print();
> + wt_status_collect_free_buffers();
> +
>   return 0;
>  }
>
> diff --git a/wt-status.c b/wt-status.c
> index c7f76d4758..9977f0cdf2 100644
> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -744,21 +744,26 @@ static int has_unmerged(struct wt_status *s)
>
>  void wt_status_collect(struct wt_status *s)
>  {
> - struct wt_status_state state;
>   wt_status_collect_changes_worktree(s);
> -

Nit: unnecessary diff, but I certainly don't think that this is worth a
re-roll on its own.

>   if (s->is_initial)
>   wt_status_collect_changes_initial(s);
>   else
>   wt_status_collect_changes_index(s);
>   wt_status_collect_untracked(s);
>
> - memset(, 0, sizeof(state));
> - wt_status_get_state(, s->branch && !strcmp(s->branch, "HEAD"));
> - if (state.merge_in_progress && !has_unmerged(s))
> + wt_status_get_state(>state, s->branch && !strcmp(s->branch, "HEAD"));
> + if (s->state.merge_in_progress && !has_unmerged(s))
>   s->committable = 1;

Should this line be de-dented to match the above?

>  }
>
> +void wt_status_collect_free_buffers(struct wt_status *s)
> +{
> + free(s->state.branch);
> + free(s->state.onto);
> + free(s->state.detached_from);
> +}
> +
> +

Nit: too much whitespace between 'wt_status_collect_free_buffers()' and
'wt_longstatus_print_unmerged()' below. I see that there are two
newlines above, but I think that there should just be one.

>  static void wt_longstatus_print_unmerged(struct wt_status *s)
>  {
>   int shown_header = 0;
> @@ -1087,8 +1092,7 @@ static void wt_longstatus_print_tracking(struct 
> wt_status *s)
>  }

The rest of this patch looks sensible to me, but I didn't follow the
original discussion in [1], so take my review with a grain of salt :-).

Thanks,
Taylor

Re: [PATCH v6 4/7] config: add new index.threads config setting

2018-09-28 Thread Ben Peart





On 9/27/2018 8:26 PM, SZEDER Gábor wrote:

On Wed, Sep 26, 2018 at 03:54:39PM -0400, Ben Peart wrote:

Add support for a new index.threads config setting which will be used to
control the threading code in do_read_index().  A value of 0 will tell the
index code to automatically determine the correct number of threads to use.
A value of 1 will make the code single threaded.  A value greater than 1
will set the maximum number of threads to use.

For testing purposes, this setting can be overwritten by setting the
GIT_TEST_INDEX_THREADS= environment variable to a value greater than 0.

Signed-off-by: Ben Peart 
---



diff --git a/t/README b/t/README
index aa33ac4f26..0fcecf4500 100644
--- a/t/README
+++ b/t/README
@@ -332,6 +332,11 @@ This is used to allow tests 1, 4-9 in t1700-split-index.sh 
to succeed
  as they currently hard code SHA values for the index which are no longer
  valid due to the addition of the EOIE extension.
  
+GIT_TEST_INDEX_THREADS= enables exercising the multi-threaded loading

+of the index for the whole test suite by bypassing the default number of
+cache entries and thread minimums. Settting this to 1 will make the


s/ttt/tt/


+index loading single threaded.
+
  Naming Tests
  
  
diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh

index 1f168378c8..ab205954cf 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -8,6 +8,7 @@ test_description='split index mode tests'
  sane_unset GIT_TEST_SPLIT_INDEX
  sane_unset GIT_FSMONITOR_TEST
  GIT_TEST_DISABLE_EOIE=true; export GIT_TEST_DISABLE_EOIE
+GIT_TEST_INDEX_THREADS=1; export GIT_TEST_INDEX_THREADS


Why does multithreading have to be disabled in this test?



If multi-threading is enabled, it will write out the IEOT extension 
which changes the SHA and causes the test to fail.  I will update the 
logic in this case to not write out the IEOT extension as it isn't needed.



  test_expect_success 'enable split index' '
git config splitIndex.maxPercentChange 100 &&
--
2.18.0.windows.1

Re: --skip-worktree and operations like checkout / stash /etc.

2018-09-28 Thread Raman Gupta

In the meantime I've created this simple script, but what a hack...

#!/bin/bash
git ls-files -v | grep "^S " | cut -c3- | readarray ignored
for x in "${ignored[@]}"; do
  git update-index --no-skip-worktree -- "$x"
done
git checkout -m $@
for x in "${ignored[@]}"; do
  git update-index --skip-worktree -- "$x"
done

Regards,
Raman
On Thu, Sep 27, 2018 at 6:29 PM Raman Gupta  wrote:
>
> The comand `update-index --skip-worktree` seems to be an ideal way to
> tell git to locally ignore some modified files. However, this seems
> not to play well with very common commands like `checkout` and
> `stash`.
>
> $ git checkout other-branch
> error: Your local changes to the following files would be overwritten
> by checkout:
> path/to/ignored/file
> Please commit your changes or stash them before you switch branches.
> Aborting
>
> Ok, well lets try stashing:
>
> $ git stash save
> No local changes to save
>
> Ok, lets try a checkout with a merge:
>
> $ git checkout -m other-branch
> error: Entry 'path/to/ignored/file' not uptodate. Cannot merge.
>
> Ok, lets force this sucker:
>
> $ git checkout -f other-branch
> error: Entry 'path/to/ignored/file' not uptodate. Cannot merge.
>
> Ok, at this point I'm wondering, do I really need to
> --no-skip-worktree all the ignored files, do my `checkout -m`, and
> then ignore them again? Umm, no, that ain't gonna work.
>
> I'd love for git to just check if my worktree-skipped changes will
> merge cleanly into the target branch, and if they do so, go ahead and
> do that merge (with perhaps a notification printed to the console) and
> keep the skip worktree status. If the merge ends up with a conflict,
> then feel free to no-worktree-skip it and show me merge conflicts.
>
> Regards,
> Raman

Re: [PATCH v5 7/8] t0410: test fetching from many promisor remotes

2018-09-28 Thread Christian Couder

On Fri, Sep 28, 2018 at 12:35 PM SZEDER Gábor  wrote:
>
> On Tue, Sep 25, 2018 at 01:53:40PM +0200, Christian Couder wrote:

> > + IDX=$(cat promisorlist | sed "s/promisor$/idx/") &&
>
> You could drop the unnecessary 'cat', 'sed' is capable to open a file
> on its own.
>
> > + git verify-pack --verbose "$IDX" | grep "$HASH2"
>
> Don't run a git command, especially one with "verify" in its name,
> upstream of a pipe, because the pipe hides the git command's exit
> code.

Yeah, I copied both of the above lines from the test just above the
one I added. So I will probably add a patch to fix those kinds of
issues in t0410 at the beginning of the series, and then of course
copy the fixed tests in the tests I add.

Thanks,
Christian.

Re: [PATCH v5 7/8] t0410: test fetching from many promisor remotes

2018-09-28 Thread SZEDER Gábor

On Tue, Sep 25, 2018 at 01:53:40PM +0200, Christian Couder wrote:
> From: Christian Couder 
> 
> Signed-off-by: Christian Couder 
> Signed-off-by: Junio C Hamano 
> ---
>  t/t0410-partial-clone.sh | 24 +++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
> index 8b32be6417..3fbd8d919e 100755
> --- a/t/t0410-partial-clone.sh
> +++ b/t/t0410-partial-clone.sh
> @@ -170,6 +170,28 @@ test_expect_success 'fetching of missing objects' '
>   git verify-pack --verbose "$IDX" | grep "$HASH"
>  '
>  
> +test_expect_success 'fetching of missing objects from another odb remote' '
> + git clone "file://$(pwd)/server" server2 &&
> + test_commit -C server2 bar &&
> + git -C server2 repack -a -d --write-bitmap-index &&
> + HASH2=$(git -C server2 rev-parse bar) &&
> +
> + git -C repo remote add server2 "file://$(pwd)/server2" &&
> + git -C repo config odb.magic2.promisorRemote server2 &&
> + git -C repo cat-file -p "$HASH2" &&
> +
> + git -C repo fetch server2 &&
> + rm -rf repo/.git/objects/* &&
> + git -C repo cat-file -p "$HASH2" &&
> +
> + # Ensure that the .promisor file is written, and check that its
> + # associated packfile contains the object
> + ls repo/.git/objects/pack/pack-*.promisor >promisorlist &&
> + test_line_count = 1 promisorlist &&
> + IDX=$(cat promisorlist | sed "s/promisor$/idx/") &&

You could drop the unnecessary 'cat', 'sed' is capable to open a file
on its own.

> + git verify-pack --verbose "$IDX" | grep "$HASH2"

Don't run a git command, especially one with "verify" in its name,
upstream of a pipe, because the pipe hides the git command's exit
code.

> +'

Re: [PATCH v2 0/5] Fix the racy split index problem

2018-09-28 Thread SZEDER Gábor

On Fri, Sep 28, 2018 at 08:57:53AM +0200, Ævar Arnfjörð Bjarmason wrote:
> > Thanks. I had ~400 runs of the tests I ran before and they were all
> > OK. Now trying also with t1701 (which I hadn't noticed was a new
> > test...).
> 
> Ran that overnight with the same conditions as before. 2683 OK runs and
> 0 failures (and counting). So it seems like the combination of the two
> fixed the split index bugs.

Yeah, I thought they would.  If you look at the first loop of
prepare_to_write_split_index() classifying which cache entries should
be included in the new split index, you'll see only two code paths
that could leave out an entry from the split index, i.e. where an
entry could be left with a non-zero 'ce->index' and without its
CE_UPDATE_IN_BASE flag set.  Now, with the fix in patch 5/5 both of
those code paths have the is_race_timestamp() check.

Re: [PATCH v3 3/5] fsmonitor: update GIT_TEST_FSMONITOR support

2018-09-28 Thread SZEDER Gábor

On Tue, Sep 18, 2018 at 11:29:35PM +, Ben Peart wrote:
> diff --git a/t/README b/t/README
> index 56a417439c..47165f7eab 100644
> --- a/t/README
> +++ b/t/README
> @@ -319,6 +319,10 @@ GIT_TEST_OE_DELTA_SIZE= exercises the uncommon 
> pack-objects code
>  path where deltas larger than this limit require extra memory
>  allocation for bookkeeping.
>  
> +GIT_TEST_FSMONITOR=$PWD/t7519/fsmonitor-all exercises the fsmonitor
> +code path for utilizing a file system monitor to speed up detecting
> +new or changed files.

Here you tell us to set GIT_TEST_FSMONITOR to an absolute path, and we
are good to go.

> diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh
> index 756beb0d8e..d77012ea6d 100755
> --- a/t/t7519-status-fsmonitor.sh
> +++ b/t/t7519-status-fsmonitor.sh
> @@ -8,7 +8,7 @@ test_description='git status with file system watcher'
>  # To run the entire git test suite using fsmonitor:
>  #
>  # copy t/t7519/fsmonitor-all to a location in your path and then set
> -# GIT_FSMONITOR_TEST=fsmonitor-all and run your tests.
> +# GIT_TEST_FSMONITOR=fsmonitor-all and run your tests.

But this old comment is different, suggesting copying that script to
our $PATH.

I prefer your instructions above, because it's only a single step,
and, more importantly, it won't pollute my $PATH.  I think this
comment should be updated to make the advices in both places
consistent.  Or perhaps even removed, now that all GIT_TEST variables
are documented in the same place?

Re: [PATCH] help: allow redirecting to help for aliased command

2018-09-28 Thread Rasmus Villemoes

On 2018-09-26 20:49, Jeff King wrote:
> On Wed, Sep 26, 2018 at 08:16:36AM -0700, Junio C Hamano wrote:
> 
>>
>> If we expect users to use "git cp --help" a lot more often than "git
>> help cp" (or the other way around), one way to give a nicer experience
>> may be to unconditionally make "git cp --help" to directly show the
>> manpage of cherry-pick, while keeping "git help cp" to never do
>> that.  Then those who want to remember what "co" is aliased to can
>> ask "git help co".
> 
> I like that direction much better. I also wondered if we could leverage
> the "-h" versus "--help" distinction. The problem with printing the
> alias definition along with "--help" is that the latter will start a
> pager that obliterates what we wrote before (and hence all of this delay
> trickery).
> 
> But for "-h" we generally expect the command to output a usage message.
> 
> So what if the rules were:
> 
>   - "git help cp" shows "cp is an alias for cherry-pick" (as it does
> now)

Sounds good.

>   - "git cp -h" shows "cp is an alias for cherry-pick", followed by
> actually running "cherry-pick -h", which will show the usage
> message. For a single-word command that does very little, since the
> usage message starts with "cherry-pick". But if your alias is
> actually "cp = cherry-pick -n", then it _is_ telling you extra
> information.

Funny, I never noticed this difference, and that '-h' for an alias would
actually give more information than '--help'. I sort-of knew that -h
would give the synopsis, so I guess I've just gotten used to always use
--help, and just noticed that for aliases that doesn't provide much help.

Adding the 'is an alias for' info to -h sounds quite sensible.

And this could even work with "!" aliases: we define
> it, and then it is up to the alias to handle "-h" sensibly.

I'd be nervous about doing this, though, especially if we introduce this
without a new opt-in config option (which seems to be the direction the
discussion is taking). There are lots of commands that don't respond
with a help message to -h, or that only recognize -h as the first word,
or... There are really too many ways this could cause headaches.

But, now that I test it, it seems that we already let the alias handle
-h (and any other following words, with --help as the first word
special-cased). So what you're suggesting is (correct me if I'm wrong)
to _also_ intercept -h as the first word, and then print the alias info,
in addition to spawning the alias with the entire argv as usual. The
alias info would probably need to go to stderr in this case.

>   - "git cp --help" opens the manpage for cherry-pick. We don't bother
> with the alias definition, as it's available through other means
> (and thus we skip the obliteration/timing thing totally).

It sounds like you suggest doing this unconditionally, and without any
opt-in via config option or a short wait? That would certainly work for
me. It is, in fact, how I expect 'git cp --help' to work, until I get
reminded that it does not... Also, as Junio noted, is consistent with
--help generally providing more information than -h - except that one
loses the 'is an alias for' part for --help.

> This really only works for non-! aliases. Those would continue to
> show the alias definition.

Yes.

Thanks,
Rasmus

Re: [PATCH] help: allow redirecting to help for aliased command

2018-09-28 Thread Rasmus Villemoes

On 2018-09-26 20:12, Taylor Blau wrote:
> 
> In the case where you are scripting (and want to know what 'git co'
> means for programmatic usage), I think that there are two options. One,
> which you note above, is the 'git -c help.followAlias=false ...'
> approach, which I don't think is so bad for callers, given the tradeoff.
> 
> Another way to go is 'git config alias.co', which should provide the
> same answer. I think that either would be fine.

The latter seems much more robust, since that will also tell you
precisely whether co is an alias at all, and you don't have to parse
-h/--help output (stripping out the 'is aliased to...' stuff, which
might be complicated by i18n etc. etc.). So I don't think we should
worry too much about scripted use of -h/--help.

Rasmus

Re: [PATCH] help: allow redirecting to help for aliased command

2018-09-28 Thread Rasmus Villemoes

On 2018-09-26 17:19, Duy Nguyen wrote:
> On Wed, Sep 26, 2018 at 4:42 PM Taylor Blau  wrote:
>>> +
>>> + /*
>>> +  * We use split_cmdline() to get the first word of the
>>> +  * alias, to ensure that we use the same rules as when
>>> +  * the alias is actually used. split_cmdline()
>>> +  * modifies alias in-place.
>>> +  */
>>> + count = split_cmdline(alias, );
>>> + if (count < 0)
>>> + die("Bad alias.%s string: %s", cmd,
>>> + split_cmdline_strerror(count));
>>
>> Please wrap this in _() so that translators can translate it.
> 
> Yes! And another nit. die(), error(), warning()... usually start the
> message with a lowercase letter because we already start the sentence
> with a prefix, like
> 
> fatal: bad alias.blah blah
> 

I'll keep these points in mind, but this was pure copy-paste from git.c.

Rasmus

Re: [PATCH] help: allow redirecting to help for aliased command

2018-09-28 Thread Rasmus Villemoes

On 2018-09-26 17:16, Junio C Hamano wrote:
> Rasmus Villemoes  writes:
> 
>> +/*
>> + * We use split_cmdline() to get the first word of the
>> + * alias, to ensure that we use the same rules as when
>> + * the alias is actually used. split_cmdline()
>> + * modifies alias in-place.
>> + */
>> +count = split_cmdline(alias, );
>> +if (count < 0)
>> +die("Bad alias.%s string: %s", cmd,
>> +split_cmdline_strerror(count));
>> +
>> +if (follow_alias > 0) {
>> +fprintf_ln(stderr,
>> +   _("Continuing to help for %s in %0.1f 
>> seconds."),
>> +   alias, follow_alias/10.0);
>> +sleep_millisec(follow_alias * 100);
>> +}
>> +return alias;
> 
> If you have "[alias] cp = cherry-pick -n", split_cmdline discards
> "-n" and the follow-alias prompt does not even tell you that it did
> so,

That's not really true, as I deliberately did the split_cmdline after
printing the "is an alias for", but before "continuing to help for", so
this would precisely tell you

  cp is an alias for 'cherry-pick -n'
  continuing to help for 'cherry-pick' in 1.5 seconds

> and you get "git help cherry-pick".  This code somehow expects
> you to know to jump to the section that describes the "--no-commit"
> option.  I do not think that is a reasonable expectation.

No, in that case I would not expect git cp --help to jump to that
section anymore than I would expect "git cherry-pick -n --help" to
magically do that (and that would be impossible in general, if more
options are bundled in the alias).

> When you have "[alias] cp = cherry-pick -n", "git cp --help" should
> not do "git help cherry-pick".  Only a single word that exactly
> matches a git command should get this treatment.

I considered that, and could certainly live with that. But it seems the
discussion took a different turn in another part of the thread, so I'll
continue there.

Rasmus

Re: [PATCH v2 0/5] Fix the racy split index problem

2018-09-28 Thread Ævar Arnfjörð Bjarmason



On Thu, Sep 27 2018, Ævar Arnfjörð Bjarmason wrote:

> On Thu, Sep 27 2018, SZEDER Gábor wrote:
>
>> On Thu, Sep 27, 2018 at 03:53:24PM +0200, Ævar Arnfjörð Bjarmason wrote:
>>>
>>> On Thu, Sep 27 2018, SZEDER Gábor wrote:
>>>
>>> > This is the second attempt to fix the racy split index problem, which
>>> > causes occasional failures in several random test scripts when run
>>> > with 'GIT_TEST_SPLIT_INDEX=yes'.  The important details are in patches
>>> > 1 and 5 (corresponding to v1's 3 and 5).
>>>
>>> Thanks. I'm running the same sorts of tests I noted in
>>> https://public-inbox.org/git/87va7ireuu@evledraar.gmail.com/ on
>>> this. The fix Jeff had that you noted in
>>> https://public-inbox.org/git/20180906151439.GA8016@localhost/ is now in
>>> "master".
>>>
>>> I take it your
>>> https://github.com/szeder/git/commits/racy-split-index-fix is the same
>>> as this submission?
>>
>> Yes.
>>
>>> Anyway, I'm testing that cherry-picked on top of the
>>> latest master.
>>>
>>> Unfortunate that we couldn't get the isolated test you made in
>>> https://public-inbox.org/git/20180907034942.GA10370@localhost/
>>
>> Nah, that's not an isolated test case, that's only a somewhat
>> narrowed-down, but rather reliable reproduction recipe while I still
>> had no idea what was going on :)
>>
>> The _real_ isolated test is the last test in t1701, that's what it
>> eventually boiled down to.
>
> Thanks. I had ~400 runs of the tests I ran before and they were all
> OK. Now trying also with t1701 (which I hadn't noticed was a new
> test...).

Ran that overnight with the same conditions as before. 2683 OK runs and
0 failures (and counting). So it seems like the combination of the two
fixed the split index bugs.

>>> but I
>>> don't see how it could be added without some very liberal
>>> getenv("GIT_TEST_blahblah"), so it's probably best to not add it,
>>> particularly with the C rewrite of git-stash in-flight.
>>>
>>> I'll report back when I have enough test data to say how these patches
>>> affect the intermittent test failures under GIT_TEST_SPLIT_INDEX=yes.

79 matches

Mail list logo