Re: [PATCHv5 7/7] builtin/describe.c: describe a blob

2017-11-21 Thread Junio C Hamano
Stefan Beller  writes:

> ...
> fixed.
> ...
> fixed the text...
> ...
> I am not fully convinced all descriptions are in recent history, but I
> tend to agree that most are, so probably the trade off is a wash.

So what do we want with this topic?  I think the "teach 'git log' to
highlight commits whose changes involve the given blob" is a more or
less an orthogonal thing, and I suspect that it is something users
may (although I personally do not) find valuable to have a related
but different feature in "git describe".




Re: [PATCH v3 0/4] Hash abstraction

2017-11-21 Thread Junio C Hamano
Junio C Hamano  writes:

> "brian m. carlson"  writes:
>
>> This is a series proposing a basic abstraction for hash functions.
>>
>> The full description of the series can be found here:
>> https://public-inbox.org/git/20171028181239.59458-1-sand...@crustytoothpaste.net/
>>
>> At the bottom of this message is the output of git tbdiff between v2 and
>> v3.
>>
>> Changes from v2:
>> * Remove inline.
>> * Add dummy functions that call die for unknown hash implementation.
>> * Move structure definitions to hash.h and include git-compat-util.h in
>>   hash.h.
>> * Rename current_hash to the_hash_algo.
>> * Use repo_set_hash_algo everywhere we set the hash algorithm for a
>>   struct repository.
>
> Change for all of the above in this series looked sensible to me.
> Thank, will queue.

So... is everybody happy enough with this version that nobody minds
more codebase to be adjusted on the infrastructure this series lays
out?  I think this is ready for 'next', but just in case I missed
some discussions...



Re: [PATCH 1/2] Git/Packet.pm: rename packet_required_key_val_read()

2017-11-21 Thread Christian Couder
On Wed, Nov 22, 2017 at 7:28 AM, Junio C Hamano  wrote:
> Jonathan Nieder  writes:
>
 This comment doesn't tell me how to use the function.  How do I detect
 whether it successfully read a line?  What do the return values
 represent?  What happens if the line it read doesn't match the key?
>>>
>>> Would this work for both of you?
>>>
>>> # Read a text packet, expecting that it is in the form "key=value" for
>>> # the given $key.  An EOF does not trigger any error and is reported
>>> # back to the caller (like packet_txt_read() does).  Die if the "key"
>>> # part of "key=value" does not match the given $key, or the value part
>>> # is empty.
>>
>> Yes, thank you.
>
> Heh.  I actually was expecting a different response: "that describes
> what the reader can easily read out of the implementation and is
> useless", though.

I was going to resend without the comment after Jonathan's first
email, but I am ok with either your improved comment or without any
comment. Thanks.


Re: [PATCH 1/2] Git/Packet.pm: rename packet_required_key_val_read()

2017-11-21 Thread Jonathan Nieder
Junio C Hamano wrote:
> Jonathan Nieder  writes:
>> Junio C Hamano wrote:
>>> Jonathan Nieder  writes:

 This comment doesn't tell me how to use the function.  How do I detect
 whether it successfully read a line?  What do the return values
 represent?  What happens if the line it read doesn't match the key?
>>>
>>> Would this work for both of you?
>>>
>>> # Read a text packet, expecting that it is in the form "key=value" for
>>> # the given $key.  An EOF does not trigger any error and is reported
>>> # back to the caller (like packet_txt_read() does).  Die if the "key"
>>> # part of "key=value" does not match the given $key, or the value part
>>> # is empty.
>>
>> Yes, thank you.
>
> Heh.  I actually was expecting a different response: "that describes
> what the reader can easily read out of the implementation and is
> useless", though.

The main context that I'm missing and that this function comment
doesn't answer is what this function is for.  When would I want to
read a line and exit if it doesn't match "key" but not exit if I hit
EOF?  It seems very strange.

The function comment does successfully capture the strangeness,
though, and that already helps.  When looking at the implementation, I
had a bit of a double-take and wondered what I was missing.  This doc
comment says "you weren't missing anything --- that is actually the
contract that this function intends to fulfill".

Thanks,
Jonathan


Re: [PATCH 1/2] Git/Packet.pm: rename packet_required_key_val_read()

2017-11-21 Thread Junio C Hamano
Jonathan Nieder  writes:

>>> This comment doesn't tell me how to use the function.  How do I detect
>>> whether it successfully read a line?  What do the return values
>>> represent?  What happens if the line it read doesn't match the key?
>>
>> Would this work for both of you?
>>
>> # Read a text packet, expecting that it is in the form "key=value" for
>> # the given $key.  An EOF does not trigger any error and is reported
>> # back to the caller (like packet_txt_read() does).  Die if the "key"
>> # part of "key=value" does not match the given $key, or the value part
>> # is empty.
>
> Yes, thank you.

Heh.  I actually was expecting a different response: "that describes
what the reader can easily read out of the implementation and is
useless", though.


Re: [PATCH] defer expensive load_ref_decorations until needed

2017-11-21 Thread Junio C Hamano
Junio C Hamano  writes:

> Other than that, I like what this patch attempts to do.  A nicely
> identified low-hanging fruit ;-).

Having said that, this will have a bad interaction with another
topic in flight: <20171121213341.13939-1-rafa.al...@gmail.com>

Perhaps this should wait until the other topic lands and stabilizes.
We'd need to rethink if the approach taken by this patch, i.e. to
still pass the info to load() but holding onto it until the time
lazy_load() actually uses it, is a sensible way forward, or we would
want to change the calling convention to help making it easier to
implement the lazy loading.

Thanks.





Re: [PATCH v3 5/5] Testing: provide tests requiring them with ellipses after SHA-1 values

2017-11-21 Thread Junio C Hamano
"Philip Oakley"  writes:

> From: "Junio C Hamano" 
>> Ann T Ropea  writes:
>>
>>> *1* We are being overly generous in t4013-diff-various.sh because we do
>>> not want to destroy/take apart the here-document.  Given that all this a
>>> temporary measure, we should get away with it.
>
> So, the need to reformat the test for the future post-deprecation
> period is being deferred to the time that the PRINT_SHA1_ELLIPSIS env
> variable, and all ellipis, is removed - is that the case? Maybe it
> just needs saying plainly.

And if we say it that way, it is clear that with this series, we are
shipping a new feature with a test that does not protect the output
format we claim to be the improved and preferred one.  That sounds
quite bad.

Having said that, I have already queued this to 'pu' and I do not
terribly mind to merge it down to 'next', leaving the test updates
to cover the new output format as well as the backward compatible
one at the same time for a later follow-up patch.  

I'd however hate it if I have to carry the topic in the current
shape in 'next' forever, waiting for such an update to come, that
may never materialize, and be forced to do it myself without being
explicitly asked by (and thanked for) anybody, especially because
this is not exactly my itch X-<.

> Or is the env variable being retained as a fallback 'forever'? I'm
> half guessing that it may tend toward the latter as it's an easier
> backward compatibility decision.

We do not know until this change is released to the wild, at which
time we will hear noises about the lack of expected ellipses their
(poorly written) scripts rely on and tell them to set the workaround
environment variable.  We may not hear from such people at all, in
which case we may be able to remove it within a year or so, but it
is too early to tell.


Re: [RFC PATCH v2 0/2] bisect: add a single command for editing logs

2017-11-21 Thread Junio C Hamano
Adam Dinwoodie  writes:

> - It's possible to start a bisect session with a command like `git
>   bisect @ @~10`.  This will lead to the bisect log including the `@`
>   and `@~10` literally, and the interpretation of those values changes
>   depending on the current HEAD.  As a result, if you do a `git bisect
>   edit` after starting a bisect like that, but don't actually edit the
>   file, you'll nonetheless be in a different state.

This is a tangent, but for writing to the general public, please do
spell out HEAD, not the line noise synonym "@" that confuses readers.

>   I can see a few ways of coping with that:
>
>   1. Change the existing `git bisect start` behaviour to run arguments
>  through `git rev-parse` before recording them.  It appears `git
>  bisect good` et al. already do that, but it is a change in
>  behaviour that I guess could impact badly on other people using
>  `git bisect log`-based workflows.

The issue is not just HEAD but also for anything fruid, i.e. the
name of a branch, a search result ":/pattern", etc., and if we want
to allow restarting a previously failed bisect session from a
midpoint, we should be recording things in absolute terms as early
as possible.  I'd think it was an oversight the "log" thing did not
do so.

>   2. Do a full `git bisect reset` before replaying the log, so the
>  revisions will be parsed in the same way as they were originally.
>  I'd be slightly sad about that, as it seems an unnecessary
>  inefficiency, but it may well be the simplest approach.

It is not just inefficient, but would require there is no a local
change; I thought that the current system allows you to have a local
modification to a path that is not involved in the bisect session
and losing that property would be sad.

> - There aren't yet any tests or documentation changes; I wanted to get
>   commentary on the initial code changes before I spent time on those
>   parts.

There are some chicken-and-egg around this area.  For some changes,
without a doc update and test addition, it is harder to judge if a
reviewer can agree with the proposed change, as there is only a high
level description "we allow editing" and the lowest level changes to
the actual code, without anything in between that describes the
guiding principle and design decision that lead to the patch.

I'll need to see if the changes in the patch is clear/trivial enough
to see where you are trying to go to see if it is the case for this
patch, though, so read the above paragraph as a general guideline.




Re: [PATCH v5 00/10] Partial clone part 2: fsck and promisors

2017-11-21 Thread Junio C Hamano
Jeff Hostetler  writes:

> From: Jeff Hostetler 
>
> This is V5 of part 2 of partial clone.  This assumes V5 of part 1
> is already present.  V5 includes minor cleanup over V4 and better
> separates the --exclude-promisor-objects and --missing arguments.
>
> Part 2 is concerned with fsck, gc, initial support for dynamic
> object fetching, and tracking promisor objects.  Jonathan Tan
> originally developed this code.  I have moved it on top of
> part 1 and updated it slightly.

Thanks, will replace/queue all three series.  I am getting a feeling
that the first one is already ready for 'next', while the other two
may want to see a bit more comments?


Re: [PATCH v5 0/6] Partial clone part 1: object filtering

2017-11-21 Thread Junio C Hamano
Jonathan Tan  writes:

>> Jeff Hostetler (6):
>>   dir: allow exclusions from blob in addition to file
>>   oidmap: add oidmap iterator methods
>>   oidset: add iterator methods to oidset
>>   list-objects: filter objects in traverse_commit_list
>>   rev-list: add list-objects filtering support
>>   pack-objects: add list-objects filtering
>
> I checked the diff against v4 and this looks good.
>
> I'm still unsure if pre-parsing the --filter argument into a struct
> list_objects_filter_options is the best approach API-wise in the case
> that we plan to send it to the server, but it does have the benefit of
> failing (and informing the user) early if the filter is in the wrong
> format, so I'm fine with this patch set as-is.

Thanks.  I share the same suspicion but as long as we keep the raw
version in addition to the parsed-out value, we could pass it around
without having to reconstruct it (and risking an incorrect
reconstruction), so it should be OK.

Will queue with your reviewed-by: unless you object ;-)

Thanks, both.



Re: [PATCH 1/3] Documentation: allow overriding timestamps of generated asciidoc

2017-11-21 Thread Jonathan Nieder
Anders Kaseorg wrote:
> On Tue, 21 Nov 2017, Jonathan Nieder wrote:

>> http://asciidoc.org/CHANGELOG.html is stale but asciidoc still seems
>> to be getting changes at https://github.com/asciidoc/asciidoc.  I
>> wonder how difficult it would be to add any required SOURCE_DATE_EPOCH
>> support there.
>
> In fact I already did (https://github.com/asciidoc/asciidoc/pull/106),
> which is why I’d been holding off on trying to upstream this Git patch.
> The trouble was, the AsciiDoc developers had not been cutting new releases
> “because nobody knows how”
> (https://github.com/asciidoc/asciidoc/issues/103#issuecomment-322077321).
> However, it looks like AsciiDoc 8.6.10 was recently tagged and Debian got
> a 8.6.10-1 package yesterday, so I guess that trouble has been quietly
> resolved.

Ah, lovely.  I'll add a build-time dependency on that version to the
Debian package.

Junio, please disregard this patch (patch 1/3).

Thanks,
Jonathan


Re: [PATCH 1/2] Git/Packet.pm: rename packet_required_key_val_read()

2017-11-21 Thread Jonathan Nieder
Junio C Hamano wrote:
> Jonathan Nieder  writes:
>> Christian Couder wrote:

>>> +# Read a text line and check that it is in the form "key=value"
>>> +sub packet_key_val_read {
>>
>> This comment doesn't tell me how to use the function.  How do I detect
>> whether it successfully read a line?  What do the return values
>> represent?  What happens if the line it read doesn't match the key?
>
> Would this work for both of you?
>
> # Read a text packet, expecting that it is in the form "key=value" for
> # the given $key.  An EOF does not trigger any error and is reported
> # back to the caller (like packet_txt_read() does).  Die if the "key"
> # part of "key=value" does not match the given $key, or the value part
> # is empty.

Yes, thank you.

Jonathan


Re: [PATCH 1/3] Documentation: allow overriding timestamps of generated asciidoc

2017-11-21 Thread Junio C Hamano
Anders Kaseorg  writes:

> That should make this Git patch unnecessary.  (You’re of course still 
> welcome to take it if you think build reproducibility with old AsciiDoc 
> versions is worthwhile.)

Thanks.  

I've queued these three only so that I won't lose track, but will
not hastily merge them down (yet) until I hear from people.



Re: [PATCH] defer expensive load_ref_decorations until needed

2017-11-21 Thread Junio C Hamano
Phil Hord  writes:

> With many thousands of references, a simple `git rev-parse HEAD` may take
> more than a second to return because it first loads all the refs into
> memory even though it will never use them.
>
> Defer loading any references until we actually need them.
>
> Signed-off-by: Phil Hord 
> ---
>  log-tree.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/log-tree.c b/log-tree.c
> index 3b904f037..c1509f8b9 100644
> --- a/log-tree.c
> +++ b/log-tree.c
> @@ -84,8 +84,10 @@ void add_name_decoration(enum decoration_type type, const 
> char *name, struct obj
>   res->next = add_decoration(_decoration, obj, res);
>  }
>  
> +static void maybe_load_ref_decorations();

I'll tweak that "()" and the other one we see below to "(void)"
while queuing.

I am not sure if "maybe_" is a good name here, though.  If anything,
you are making the semantics of "load_ref_decorations()" to "maybe"
(but I do not suggest renaming that one).

How about calling it to load_ref_decorations_lazily() or something?

I also wonder if decoration_loaded should now become function-scope
static in this new helper, but that can be left outside of the
topic.

Other than that, I like what this patch attempts to do.  A nicely
identified low-hanging fruit ;-).

Thanks.

>  const struct name_decoration *get_name_decoration(const struct object *obj)
>  {
> + maybe_load_ref_decorations();
>   return lookup_decoration(_decoration, obj);
>  }
>  
> @@ -150,10 +152,13 @@ static int add_graft_decoration(const struct 
> commit_graft *graft, void *cb_data)
>  
>  void load_ref_decorations(int flags)
>  {
> - if (!decoration_loaded) {
> + decoration_flags = flags;
> +}
>  
> +static void maybe_load_ref_decorations()
> +{
> + if (!decoration_loaded) {
>   decoration_loaded = 1;
> - decoration_flags = flags;
>   for_each_ref(add_ref_decoration, NULL);
>   head_ref(add_ref_decoration, NULL);
>   for_each_commit_graft(add_graft_decoration, NULL);


Re: [PATCH] stash: Learn to parse -m/--message like commit does

2017-11-21 Thread Junio C Hamano
Phil Hord  writes:

> `git stash push -m foo` uses "foo" as the message for the stash. But
> `git stash push -m"foo"` does not parse successfully.  Similarly
> `git stash push --message="My stash message"` also fails.  Nothing
> in the documentation suggests this syntax should work, but it does
> work for `git commit`, and my fingers have learned this pattern long
> ago.
>
> Teach `git stash` to parse -mFoo and --message=Foo the same as
> `git commit` would do.
>
> Signed-off-by: Phil Hord 
> ---
>  git-stash.sh | 18 ++
>  1 file changed, 18 insertions(+)

Makes sense.  Thanks.

I wonder if you want to add a trivial test or two for this, if "git
stash [save|push|nothing] -m foo" is already tested.  It appears
that t3903 already has a test that does 'push -m "test message"' and
sees if that appears in the output of "list", which looks like the
ideal place to do so.



Re: doc: prefer 'stash push' over 'stash save'

2017-11-21 Thread Junio C Hamano
Jonathan Nieder  writes:

> Phil Hord wrote:
>
>> Although `git stash save` was deprecated recently, some parts of the
>> documentation still refer to it instead of `push`.
>>
>> Signed-off-by: Phil Hord 
>> ---
>>  Documentation/git-stash.txt | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> Reviewed-by: Jonathan Nieder 
> Thanks.

Heh, this does not even apply to 8be661007 that it claims to apply
on top of, which is contained in fd2ebf14 ("stash: mark "git stash
save" deprecated in the man page", 2017-10-22).

I've wiggled it in, so there is no need to resend, but next time
please be careful when sending the patch and also when sending a
reviewed-by.


Re: [RFC PATCH] builtin/worktree: enhance worktree removal

2017-11-21 Thread Eric Sunshine
On Tue, Nov 21, 2017 at 10:55 PM, Junio C Hamano  wrote:
> OK, so the proposed log message was a bit confusing for those who
> are *not* the person who wrote it (who knew why existing behaviour
> was inadequate and did not describe how "worktree remove" would fail
> under such a scenario to illustrate it, incorrectly assuming that
> everybody who reads the proposed log message already *knows* how it
> would fail).

Correct. The log message is confusing, enough so that my knee-jerk
reaction was that the patch was trying to re-invent 'git worktree
prune'. That seemed odd, so I spent extra time trying to figure out
what it really meant, which led to my rather lengthy response asking
if I was understanding the situation correctly.

> "git worktree remove" removes both the named worktree
> directory and the administrative information for it after
> checking that there is no local modifications that would be
> lost (which is a handy safety measure).  It however refuses
> to work if the worktree directory is _already_ removed.

The "refusal" is by accident, so perhaps phrase it like this:

However, due to an oversight, it aborts with an error if the
worktree directory is _already_ removed.

or something.

> The user could use "git worktree prune" after seeing the
> error and realizing the situation, but at that point, there
> is nothing gained by leaving only the administrative data
> behind.  Teach "git worktree remove" to go ahead and remove
> the trace of the worktree in such a case.
>
> or soemthing like that?

Yes, I quite like it; it conveys the information necessary to
understand the issue.

Kaartic: Regarding the actual patch, rather than silencing
validate_worktree() (which seems an unfortunate thing to do), isn't it
possible simply to do a quick test to see if the worktree directory
exists before calling validate_worktree()? If it doesn't exist, then
just skip down to the part of the code which does the 'prune'
operation.


Re: [PATCH 1/3] Documentation: allow overriding timestamps of generated asciidoc

2017-11-21 Thread Anders Kaseorg
On Tue, 21 Nov 2017, Jonathan Nieder wrote:
> http://asciidoc.org/CHANGELOG.html is stale but asciidoc still seems
> to be getting changes at https://github.com/asciidoc/asciidoc.  I
> wonder how difficult it would be to add any required SOURCE_DATE_EPOCH
> support there.

In fact I already did (https://github.com/asciidoc/asciidoc/pull/106), 
which is why I’d been holding off on trying to upstream this Git patch.  
The trouble was, the AsciiDoc developers had not been cutting new releases 
“because nobody knows how” 
(https://github.com/asciidoc/asciidoc/issues/103#issuecomment-322077321). 
However, it looks like AsciiDoc 8.6.10 was recently tagged and Debian got 
a 8.6.10-1 package yesterday, so I guess that trouble has been quietly 
resolved.

That should make this Git patch unnecessary.  (You’re of course still 
welcome to take it if you think build reproducibility with old AsciiDoc 
versions is worthwhile.)

Anders


Re: [PATCH v2] log: add option to choose which refs to decorate

2017-11-21 Thread Junio C Hamano
Rafael Ascensão  writes:

> When `log --decorate` is used, git will decorate commits with all
> available refs. While in most cases this the desired effect, under some

Missing verb.  s/this the/this may give the/; perhaps.

> conditions it can lead to excessively verbose output.

Other than that, I didn't find anything questionable in the
implementation, tests or doc updates.  Nicely done.

Will queue.

Thanks.


Re: [RFC PATCH] builtin/worktree: enhance worktree removal

2017-11-21 Thread Junio C Hamano
Eric Sunshine  writes:

> On Tue, Nov 21, 2017 at 10:14 PM, Eric Sunshine  
> wrote:
>> The erroring out in this case looks like simple oversight. Most
>> likely, this particular case did not occur to Duy. The code does
>> intentionally check the directory to see if it is dirty so that it can
>> warn the user (in which case the user can re-run with --force or take
>> other corrective action), but erroring out if the directory is merely
>
> "...erroring out if the directory _is missing_ is merely..."
>
>> an indirect (and unintended) result of trying to check for dirtiness.
>>
>> So, Kaatic's patch is intended to address that oversight (though I
>> haven't examined the implementation closely; I was just trying to
>> understand the reason for the patch).

OK, so the proposed log message was a bit confusing for those who
are *not* the person who wrote it (who knew why existing behaviour
was inadequate and did not describe how "worktree remove" would fail
under such a scenario to illustrate it, incorrectly assuming that
everybody who reads the proposed log message already *knows* how it
would fail).

"git worktree remove" removes both the named worktree
directory and the administrative information for it after
checking that there is no local modifications that would be
lost (which is a handy safety measure).  It however refuses
to work if the worktree directory is _already_ removed.

The user could use "git worktree prune" after seeing the
error and realizing the situation, but at that point, there
is nothing gained by leaving only the administrative data
behind.  Teach "git worktree remove" to go ahead and remove
the trace of the worktree in such a case.

or soemthing like that?


Re: [PATCH 2/2] Git/Packet.pm: use 'if' instead of 'unless'

2017-11-21 Thread Junio C Hamano
Jonathan Nieder  writes:

> [...]
>> --- a/perl/Git/Packet.pm
>> +++ b/perl/Git/Packet.pm
>> @@ -68,16 +68,16 @@ sub packet_bin_read {
>>  
>>  sub remove_final_lf_or_die {
>>  my $buf = shift;
>> -unless ( $buf =~ s/\n$// ) {
>
> For readability, I find this whitespace within parens more distracting.

I personally find them distracting, too.  This file seems full of
them so it is OK to be consistent with the existing practice in this
step, leaving the eventual clean-up (if we agree that it is a good
idea) to a later step that does it for the entire file contents.

Thanks.


Re: [PATCH 1/2] Git/Packet.pm: rename packet_required_key_val_read()

2017-11-21 Thread Junio C Hamano
Christian Couder  writes:

> The function calls itself "required", but it does not die when it
> sees an unexpected EOF.
> Let's rename it to "packet_key_val_read()".
>
> Signed-off-by: Christian Couder 
> ---
>
> These 2 patches are a late follow up from:
>
> https://public-inbox.org/git/cap8ufd2vk4jv7jebx3axd-dhfcsgsjvfft+pumdt1j8gd_o...@mail.gmail.com/

Thanks for tying the loose end.



Re: [PATCH 1/2] Git/Packet.pm: rename packet_required_key_val_read()

2017-11-21 Thread Junio C Hamano
Jonathan Nieder  writes:

> nit: please wrap lines to a consistent width, to make the message
> easier to read.  In the above, it looks like the line break is
> intentional --- is it meant to be two paragraphs (i.e. is it missing
> another newline)?

I'd think so; will add a missing LF while queuing..

> optional, just noticed while I'm nitpicking: the description 'rename
> packet_required_key_val_read' doesn't tell why the function is being
> renamed.  Maybe something like
>
>   Git::Packet: clarify that packet_required_key_val_read allows EOF
>
> would do the trick.

Sounds good. 

>> +# Read a text line and check that it is in the form "key=value"
>> +sub packet_key_val_read {
>
> This comment doesn't tell me how to use the function.  How do I detect
> whether it successfully read a line?  What do the return values
> represent?  What happens if the line it read doesn't match the key?

Would this work for both of you?

# Read a text packet, expecting that it is in the form "key=value" for
# the given $key.  An EOF does not trigger any error and is reported
# back to the caller (like packet_txt_read() does).  Die if the "key"
# part of "key=value" does not match the given $key, or the value part
# is empty.


Re: [PATCH] prune: add "--progress" to man page and usage msg

2017-11-21 Thread Junio C Hamano
"Robert P. J. Day"  writes:

> Add mention of git prune's "--progress" option to the SYNOPSIS and
> DESCRIPTION sections of the man page, and to the usage message of "git
> prune" itself.
>
> While we're here, move the explanation of "--" toward the end of the
> DESCRIPTION section, where it belongs.

Thanks, both changes make sense.  Will queue.

>
> Signed-off-by: Robert P. J. Day 
>
> ---
>
> diff --git a/Documentation/git-prune.txt b/Documentation/git-prune.txt
> index 7a493c80f..a37c0af93 100644
> --- a/Documentation/git-prune.txt
> +++ b/Documentation/git-prune.txt
> @@ -9,7 +9,7 @@ git-prune - Prune all unreachable objects from the object 
> database
>  SYNOPSIS
>  
>  [verse]
> -'git prune' [-n] [-v] [--expire ] [--] [...]
> +'git prune' [-n] [-v] [--progress] [--expire ] [--] [...]
>
>  DESCRIPTION
>  ---
> @@ -42,12 +42,15 @@ OPTIONS
>  --verbose::
>   Report all removed objects.
>
> -\--::
> - Do not interpret any more arguments as options.
> +--progress::
> + Show progress.
>
>  --expire ::
>   Only expire loose objects older than .
>
> +\--::
> + Do not interpret any more arguments as options.
> +
>  ...::
>   In addition to objects
>   reachable from any of our references, keep objects
> diff --git a/builtin/prune.c b/builtin/prune.c
> index cddabf26a..d2fdae680 100644
> --- a/builtin/prune.c
> +++ b/builtin/prune.c
> @@ -8,7 +8,7 @@
>  #include "progress.h"
>
>  static const char * const prune_usage[] = {
> - N_("git prune [-n] [-v] [--expire ] [--] [...]"),
> + N_("git prune [-n] [-v] [--progress] [--expire ] [--] 
> [...]"),
>   NULL
>  };
>  static int show_only;


Re: [PATCH v2] doc: Add missing "-n" (dry-run) option to reflog man page

2017-11-21 Thread Junio C Hamano
"Robert P. J. Day"  writes:

> While the "git reflog" man page supports both "--dry-run" and "-n" for
> a dry run, the man page mentions only the former, not the latter.
>
> Signed-off-by: Robert P. J. Day 
>
> ---

I have a suspicion that this was deliberately omitted in order to
keep the lines in the description short and concise.  Just adding
5-columns may appear not to hurt too much, but these things tend to
accumulate, so...

Queued, so that I won't lose sight of it, but won't merge unless
somebody else strongly feels about it.

Thanks.

>
>   sorry, i accidentally chopped off the leading lines of the patch in
> the earlier post.
>
> diff --git a/Documentation/git-reflog.txt b/Documentation/git-reflog.txt
> index 44c736f1a..472a6808c 100644
> --- a/Documentation/git-reflog.txt
> +++ b/Documentation/git-reflog.txt
> @@ -20,9 +20,9 @@ depending on the subcommand:
>  'git reflog' ['show'] [log-options] []
>  'git reflog expire' [--expire=] [--expire-unreachable=]
>   [--rewrite] [--updateref] [--stale-fix]
> - [--dry-run] [--verbose] [--all | ...]
> + [--dry-run | -n] [--verbose] [--all | ...]
>  'git reflog delete' [--rewrite] [--updateref]
> - [--dry-run] [--verbose] ref@\{specifier\}...
> + [--dry-run | -n] [--verbose] ref@\{specifier\}...
>  'git reflog exists' 
>
>  Reference logs, or "reflogs", record when the tips of branches and


Re: [RFC PATCH] builtin/worktree: enhance worktree removal

2017-11-21 Thread Eric Sunshine
On Tue, Nov 21, 2017 at 10:14 PM, Eric Sunshine  wrote:
> The erroring out in this case looks like simple oversight. Most
> likely, this particular case did not occur to Duy. The code does
> intentionally check the directory to see if it is dirty so that it can
> warn the user (in which case the user can re-run with --force or take
> other corrective action), but erroring out if the directory is merely

"...erroring out if the directory _is missing_ is merely..."

> an indirect (and unintended) result of trying to check for dirtiness.
>
> So, Kaatic's patch is intended to address that oversight (though I
> haven't examined the implementation closely; I was just trying to
> understand the reason for the patch).


Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1

2017-11-21 Thread Jeff King
On Wed, Nov 22, 2017 at 10:49:25AM +0900, Junio C Hamano wrote:

> WRT existing codepaths that pass 0{40} and refuses to notice a
> potential repository corruption (from getting a NULL for a non null
> object name), I think we would need a sweep of the codebase and fix
> them in the longer term.  As long as that will happen someday, either
> approach between "we know 'no loose object? let's redo the packs' is
> the part that matters performance-wise, so let's do a short-cut only
> for that" and "we know that callers that comes with 0{40} want to get
> NULL back" should be OK, I would think.

I agree. Let's go with the "v2 5/5" I posted then.

I'll try to work up a patch for the fetch.c case I found tomorrow, but I
suspect there are many more. But that's largely orthogonal to the
series.

-Peff


Re: [RFC PATCH] builtin/worktree: enhance worktree removal

2017-11-21 Thread Eric Sunshine
On Tue, Nov 21, 2017 at 9:12 PM, Junio C Hamano  wrote:
> Eric Sunshine  writes:
>> Let me see if I understand. Sometimes you know that you've deleted the
>> worktree directory, in which case 'git worktree prune' is the obvious
>> choice. However, there may be cases when you've forgotten that you
>> deleted the worktree directory (or it got deleted some other way), yet
>> it still shows up in "git worktree list" output with no indication
>> that it has been deleted (though, ideally, it should tell you so[1]).
>> In this case, you see a worktree that you know you no longer want, so
>> you invoke "git worktree remove" but that errors out because the
>> actual directory is already gone. This patch make the operation
>> succeed, despite the missing directory. Is that correct?
>
> Hmph, so the user could be using "git worktree prune" after seeing
> the error?  Was there a reason behind "git worktree remove" to be
> extra careful to make sure both existed before going forward, or was
> this a simple oversight?

The erroring out in this case looks like simple oversight. Most
likely, this particular case did not occur to Duy. The code does
intentionally check the directory to see if it is dirty so that it can
warn the user (in which case the user can re-run with --force or take
other corrective action), but erroring out if the directory is merely
an indirect (and unintended) result of trying to check for dirtiness.

So, Kaatic's patch is intended to address that oversight (though I
haven't examined the implementation closely; I was just trying to
understand the reason for the patch).


Re: [PATCH 1/3] Documentation: allow overriding timestamps of generated asciidoc

2017-11-21 Thread Junio C Hamano
Jonathan Nieder  writes:

>> touch -d @SECONDS isn't POSIX compliant, and non-Linux systems don't
>> provide it.  POSIX only allows certain fixed format, and I assume that
>> non-Linux parties (maybe OpenBSD) will want to have reproducible builds
>> as well.
>
> Interesting.  My knee-jerk preference is still to go with this patch
> as-is for now, since the non-portability only triggers when
> SOURCE_DATE_EPOCH is set.

As long as this patch is kept Debian-only, that is a sensible stance
to take.  I am not sure (note: this is different from "I do not think")
if it is also OK for the wider public, though.

I wondered if this affects the dirtyness of the build, regardless of
how file timestamps are mucked with.  It turns out that we do not
use "describe --dirty" in the GIT-VERSION-GEN script, so perhaps
it would be OK.




Re: [PATCH v3 00/33] Add directory rename detection to git

2017-11-21 Thread Junio C Hamano
Elijah Newren  writes:

> Interesting; tbdiff looks cool.  Junio hasn't queued this series yet,
> but it's easy enough to reconstruct the old one.  It does weigh in
> pretty heavy, and I'm slighly worried about gmail mangling all the
> lines, but I'm going to give it a shot anyway.  If it's too mangled,
> I'll try to repost using git-send-email.  It does weigh in at over
> 1600 lines, so it's not small.

It seems that you have installed tbdiff correctly.  The below seems
to match what I saw when I queued this round, relative to the
previous one.

What I often do when I see a new round of patches is:

$ git checkout en/rename-directory
$ git checkout master... ;# detach at the base of the old
$ git am -s mbox ;# take the new 
$ git tbdiff ..@{-1} @{-1}..

to compare the old and new.  Often (but not with this topic) earlier
parts of the topic are identical between the old and the new, so I
may rebase the new to preserve the commit timestamp of the old one
when it happens after the above sequence of commands.  For example,
if I see these in tbdiff

1: 7893bf1720 = 1: f17207893b commit #1
2: c291293b2e = 2: 93b2ec2912 commit #2
3: a7d3c870a3 ! 3: 87b5e236a1 commit #3
@@ ... @@@

then we know up to commit #2 are the same as before, so I'd do

$ git rebase --onto c29129eb2e 93b2ec2912

by using the two commit object names on the last "=" line in the
output.  Then, running the same tbdiff again:

$ git tbdiff ..@{-1} @{-1}..

would now show the output starting from "commit #3".


Re: [RFC PATCH] xdiff/xpatience: support anchoring a line

2017-11-21 Thread Junio C Hamano
Jonathan Tan  writes:

> Teach the patience diff to support prohibiting a user-specified line
> from appearing as a deletion or addition in the end result.
>
> Signed-off-by: Jonathan Tan 
> ---
> I'm sending this out to see if a change similar to this would be
> welcome. It is useful to me as a reviewer (to check my own code, e.g.
> when checking [1]). Probably more design needs to go into this,
> including the best way to specify the "anchor" line, and the correct
> behavior when the anchor is either not found or appears more than once.
>
> Any thoughts?

This is a natural extension of the idea the patience algorithm is
built upon.  If this were a cumulative command line option that can
be given to specify multiple lines and can be used across the diff
family, it would make a welcome addition, I would think.


Re: [PATCH] doc: remove explanation of "--" from man pages

2017-11-21 Thread Junio C Hamano
Kevin Daudt  writes:

> Although I agree that common options don't need to be explained
> everytime again, this change might make '--' even more obscure. To be
> honest, I didn't even know about gitcli(7), let alone most new users.
>
> In the #git irc channel we often have to explain what '--' means and
> why it's sometimes necessary.
>
> I don't however know a better solution to it more clear.

I do not agree with the starting thought of this patch in the first
place.  With the same logic, "git help" showing the most commonly
used subcommands, as "git help -a" has all the information, is
redundant and unwanted.  So is the synopsis section and "git $cmd
-h" that shows only commonly used options but not necessarily all of
them.

There may be some git-$foo manual page that do not describe how '--'
would be useful for the specific $foo subcommand that would become
more helpful to new readers if they did, and I think updating them
would be a better approach if we wanted to have consistency across
manual pages.



Re: [PATCH] rebase: rebasing can also be done when HEAD is detached

2017-11-21 Thread Junio C Hamano
Kaartic Sivaraam  writes:

> In a repository when attempting to rebase when the HEAD is detached
> and it is already up to date with upstream (so there's nothing to do),
> the following message is shown
>
> Current branch HEAD is up to date.
>
> which is clearly wrong as HEAD is not a branch.
>
> Handle the special case of HEAD correctly to give a more precise
> error message.
>
> Signed-off-by: Kaartic Sivaraam 
> ---
>  In this patch, I basically assumed that there would be no
>  branch named "HEAD".

Perhaps time to learn "git symbolic-ref HEAD" and use it instead of
depending on the name?


Re: [RFC PATCH] builtin/worktree: enhance worktree removal

2017-11-21 Thread Junio C Hamano
Eric Sunshine  writes:

> Let me see if I understand. Sometimes you know that you've deleted the
> worktree directory, in which case 'git worktree prune' is the obvious
> choice. However, there may be cases when you've forgotten that you
> deleted the worktree directory (or it got deleted some other way), yet
> it still shows up in "git worktree list" output with no indication
> that it has been deleted (though, ideally, it should tell you so[1]).
> In this case, you see a worktree that you know you no longer want, so
> you invoke "git worktree remove" but that errors out because the
> actual directory is already gone. This patch make the operation
> succeed, despite the missing directory. Is that correct?

Hmph, so the user could be using "git worktree prune" after seeing
the error?  Was there a reason behind "git worktree remove" to be
extra careful to make sure both existed before going forward, or was
this a simple oversight?


Re: [PATCH v2 4/4] builtin/branch: strip refs/heads/ using skip_prefix

2017-11-21 Thread Junio C Hamano
Eric Sunshine  writes:

> On Tue, Nov 21, 2017 at 2:12 PM, Kaartic Sivaraam
>  wrote:
>> On Wednesday 22 November 2017 12:08 AM, Eric Sunshine wrote:
>>> The original code unconditionally uses "+ 11", which says that the
>>> prefix is _always_ present. This commit message muddies the waters [...]
>>
>> That muddiness of that statement is a consequence of my recent encounter[1]
>> in which the assumption (that the prefix(refs/heads/ always exists) of that
>> code failed. I had a little suspicion, when I wrote that commit message,
>> that there might be other cases in which assumption might fail. The issue
>> has been resolved only in 3/4 of jc/branch-name-sanity but that was only
>> after I wrote the commit message initially.  So, it does make sense to
>> remove that muddiness now. Thanks for noting that.
>>
>> [1]: Note the 'warning: ' message in the following mail. That ugly '|�?' was
>> a consequence of the assumption that the 'prefix' always existed!
>> https://public-inbox.org/git/1509209933.2256.4.ca...@gmail.com/
>
> Thanks for the explanation and history reference.

I have a suspicion that it wasn't the case.  The ugly uninitialized
byte sequence came from an error codepath of a function that is asked
to fill a strbuf with "refs/heads/$something" returning early when it
found an error in the input, without realizing that the caller still
looks at the strbuf even when it got an error.  That caller-callee pair
was updated.

It is just a bug and +11 is always correct; passing the data that
does not begin with "refs/heads/" there, including the case where an
uninitialized buffer is passed, is simply a bug.

In other words, skip_prefix() is a good change, but if we are to use
it, we should also check the result and error out if we do not see
"refs/heads/" there--you already bothered to spend extra cycles for
that.


Re: core.safecrlf warning is confusing[improvement suggestion?]

2017-11-21 Thread Junio C Hamano
Torsten Bögershausen  writes:

>> I want to have LF line endings in the repository and CRLF endings in
>> the working copy. (Because I use windows-exclusive tools to develop.)
>
> Side note: If you ever want to push your repository somewhere,
> it would be good practice to have a .gitattributes file:
> ...

Now we got your attention ;-)

What would be the BCP we would give if somebody has just a tarball
without .git that has LF endings?

$ git init a-project
$ cd a-project
$ tar xf ../a-project.tar
$ git add .
$ git commit -m 'Initial import'

would achieve one half of the original wish (i.e. "I want to end up
with repository data in LF eol"); disabling the "safe crlf" before
running that "git add ." step may also not be a bad idea, because it
reduces the number of things that can get in the way by one.

But the above also leaves the "working tree" files in LF eol
(i.e. it goes against "I want to work with CRLF in my working
tree").  What would be our recommendation?

One big-hammer way I can think of is

$ git rm -f .
$ git reset --hard

and that actually may be a good enough solution, given that you'd be
doing this just once at the beginning of "your" project that starts
from an inherited code drop.


Re: [PATCH v3 0/8] Coping with unrecognized ssh wrapper scripts in GIT_SSH

2017-11-21 Thread Junio C Hamano
Jonathan Nieder  writes:

> Jonathan Nieder (8):
>   ssh test: make copy_ssh_wrapper_as clean up after itself
>   connect: move no_fork fallback to git_tcp_connect
>   connect: split git:// setup into a separate function
>   connect: split ssh command line options into separate function
>   connect: split ssh option computation to its own function
>   ssh: 'auto' variant to select between 'ssh' and 'simple'
>   ssh: 'simple' variant does not support -4/-6
>   ssh: 'simple' variant does not support --port

Thanks.  All looked sensible.  With this, we can unblock both topics
;-)


Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1

2017-11-21 Thread Junio C Hamano
Jeff King  writes:

> Here's a re-roll of patch 5 that behaves this way (the patch should be
> unsurprising, but I've updated the commit message to match).
>
> I did notice one other thing: the function looks up replace objects, so
> we have both the original and the replaced sha1 available. My earlier
> patch used the original sha1, but this one uses the replaced value.
> I'm not sure if that's sane or not. It lets the fast-path kick in if you
> point a replace ref at 0{40}. And it lets you point refs/replace/0{40}
> to something else. I doubt that's useful, but it could perhaps be for
> debugging, etc.
>
> In most cases, of course, it wouldn't matter (since pointing to or from
> the null sha1 is vaguely crazy in the first place).

I tend to agree that those who go crazy/fancy with replace mechanism
can keep both halves when it breaks.

WRT existing codepaths that pass 0{40} and refuses to notice a
potential repository corruption (from getting a NULL for a non null
object name), I think we would need a sweep of the codebase and fix
them in the longer term.  As long as that will happen someday, either
approach between "we know 'no loose object? let's redo the packs' is
the part that matters performance-wise, so let's do a short-cut only
for that" and "we know that callers that comes with 0{40} want to get
NULL back" should be OK, I would think.





Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1

2017-11-21 Thread Junio C Hamano
Jeff King  writes:

> I'm not sure what the right behavior is, but I'm pretty sure that's not
> it. Probably one of:
>
>   - skip updating the ref when we see the breakage
>
>   - ditto, but terminate the whole operation, since we might be deleting
> other refs and in a broken repo we're probably best to make as few
> changes as possible
>
>   - behave as if it was a non-ff, which would allow "--force" to
> overwrite the broken ref. Maybe convenient for fixing things, but
> possibly surprising (and it's not that hard to just delete the
> broken refs manually before proceeding).

Perhaps the last one would be the ideal endgame, but the second one
may be a good stopping point in the shorter term.



Re: t3512 & t3513 'unexpected passes'

2017-11-21 Thread Junio C Hamano
Adam Dinwoodie  writes:

> I've seen the same unexpected passes, and had just completed the same
> bisect run myself, although I fixed the build failure by cherry-picking
> 82921316a ("SQUASH???", 2017-11-18) onto commits that wouldn't build,
> given that commit seems to exist entirely to fix that build breakage.

Your assumption is correct wrt "SQUASH???".  When I am in a hurry
and cannot spend extra cycles to pinpoint which one of the commits
in a multi-patch series breaks a build, I just fix the breakage on
top of the tip commit, and leave it to the original author of the
series to apply them down to the original breakage in individual
commits when the topic is rerolled.

It seems that the compilation breakage 82921316a fixes comes from
its immediate parent, so I'll squash them together, just in case a
rerolled topic does not materialize.

Thanks.



Re: Draft of Git Rev News edition 33

2017-11-21 Thread Jonathan Nieder
Hi,

Christian Couder wrote:
> On Tue, Nov 21, 2017 at 2:10 AM, Jonathan Nieder  wrote:

>> That said, I believe that the gitattributes(5) manpage does an okay
>> job of covering this and that that thread came to a clear conclusion:
>>
>>   
>> https://public-inbox.org/git/20171026203046.fu3z5ngnw7hck...@aiede.mtv.corp.google.com/
>>
>> I commented at [1] that I found the conclusion of the rev news entry
>> misleading and confusing but it doesn't appear that there is anything
>> I can do about that.
>
> Well, you could have provided a pull request or otherwise suggested
> what you think would be a better conclusion for the article and why.
>
> If you just say that the above email is the conclusion, when it seems
> to me that another email from someone else is also a conclusion with a
> quite different outcome, it does not help much come to an agreement
> about what should be reported as the conclusion of the thread.

This is something I suspect journalists have to deal with all the
time: when one of the subjects of an article feels misrepresented
(which happens inevitably when writing to a deadline), that comes with
a feeling of powerlessness that can lead to grumpiness and harsh
words.

In the end you ended up improving the text enough that I don't mind
any more.  Sorry for the bumpy road along the way.

Thanks,
Jonathan


Re: [PATCH v5 0/6] Partial clone part 1: object filtering

2017-11-21 Thread Jonathan Tan
On Tue, 21 Nov 2017 20:58:46 +
Jeff Hostetler  wrote:

> From: Jeff Hostetler 
> 
> Here is V5 of the list-object filtering, rev-list, and pack-objects.
> 
> This version addresses comments on the V4 series.  I removed the
> questionable character encoding scheme.  And I removed or clarified
> use of the term "partial clone" to refer to a future feature.
> 
> Jeff Hostetler (6):
>   dir: allow exclusions from blob in addition to file
>   oidmap: add oidmap iterator methods
>   oidset: add iterator methods to oidset
>   list-objects: filter objects in traverse_commit_list
>   rev-list: add list-objects filtering support
>   pack-objects: add list-objects filtering

I checked the diff against v4 and this looks good.

I'm still unsure if pre-parsing the --filter argument into a struct
list_objects_filter_options is the best approach API-wise in the case
that we plan to send it to the server, but it does have the benefit of
failing (and informing the user) early if the filter is in the wrong
format, so I'm fine with this patch set as-is.


Re: [PATCH] git-send-email: fix get_maintainer.pl regression

2017-11-21 Thread Junio C Hamano
Thomas Adam  writes:

> Trying to come up with a reinvention of regexps for email addresses is asking
> for trouble, not to mention a crappy rod for your own back.  Don't do that.
> This is why people use Mail::Address.
>
> https://metacpan.org/pod/distribution/MailTools/lib/Mail/Address.pod

Now we are coming back to cc907506 ("send-email: don't use
Mail::Address, even if available", 2017-08-23).  It argues

* Having this optional Mail::Address makes it tempting to anwser "please
  install Mail::Address" to users instead of fixing our own code. We've
  reached the stage where bugs in our parser should be fixed, not worked
  around.

but if it costs us maintaining our substitute that much, it seems to
me that depending on Mail::Address is not just tempting but may be a
sensible way forward.

Was there any reason why Mail::Address was _inadequate_?  I know we
had trouble with random garbage that are *not* addresses people put
on the in-body CC: trailer in the past, but I do not recall if they
are something Mail::Address would give worse result and we need our
workaround (hence our own substitute), or Mail::Address would handle
them just fine.


Re: Draft of Git Rev News edition 33

2017-11-21 Thread Junio C Hamano
Christian Couder  writes:

> On Tue, Nov 21, 2017 at 2:12 PM, Junio C Hamano  wrote:
>> I just re-read the sub-thread Jonathan pointed at, and to me it does
>> look like the original request was adequately addressed and talk has
>> concluded at around that article.
>>
>> I somehow was hoping that Jonathan's citing the above URL is a
>> suggestion enough for the editors, especially given that the
>> announcement for the draft invites a reply to this thread.
>
> DId you also read https://github.com/git/git.github.io/issues/262?

No, but I just did.  I didn't know the purpose of the Rev News
changed over time from helping those who do not have time to follow
the list by highlighting useful discussions and giving them concise
summary to just quoting pieces from messages in interesting
discussions and have the readers draw their own conclusions (or lack
thereof).



Re: [PATCH v3 3/3] worktree: make add dwim

2017-11-21 Thread Junio C Hamano
Thomas Gummerer  writes:

> I didn't consider that, I think you are right, and the flag should
> apply in that case as well.  I think at that point we may as well pass
> this flag through to the 'git branch' call, and let users set up
> tracking if they want to, the same way it works in 'git branch'.

OK, so tracking is set up by default in the current implementation
of "git worktree" (even without your proposed update), but we will
stop doing so, and instead take an explicit "--track" option (or
"--no-track" to countermand an earlier "--track" on the command line
and/or a default configured with branch.autosetupmerge) just like
"git branch" does?

I think that it is very sensible thing to make sure that "branch",
"checkout -b" and "worktree", i.e. the three ways to create a branch
to work on (the latter two being short-hands), behave consistently.

Thanks.


Re: [PATCH 1/3] Documentation: allow overriding timestamps of generated asciidoc

2017-11-21 Thread Jonathan Nieder
Hi,

brian m. carlson wrote:
> On Tue, Nov 21, 2017 at 03:34:32PM -0800, Jonathan Nieder wrote:

>> --- a/Documentation/Makefile
>> +++ b/Documentation/Makefile
>> @@ -410,6 +410,7 @@ $(patsubst %.txt,%.texi,$(MAN_TXT)): %.texi : %.xml
>>  howto-index.txt: howto-index.sh $(wildcard howto/*.txt)
>>  $(QUIET_GEN)$(RM) $@+ $@ && \
>>  '$(SHELL_PATH_SQ)' ./howto-index.sh $(sort $(wildcard howto/*.txt)) 
>> >$@+ && \
>> +$(if $(SOURCE_DATE_EPOCH),touch -d '@$(SOURCE_DATE_EPOCH)' $@+ &&) \
>
> touch -d @SECONDS isn't POSIX compliant, and non-Linux systems don't
> provide it.  POSIX only allows certain fixed format, and I assume that
> non-Linux parties (maybe OpenBSD) will want to have reproducible builds
> as well.

Interesting.  My knee-jerk preference is still to go with this patch
as-is for now, since the non-portability only triggers when
SOURCE_DATE_EPOCH is set.

> It's unfortunate for shell users that this variable is in seconds from
> the epoch, since there's no portable way to format such a time in shell.
> (POSIX doesn't allow date(1) to format anything but the current time.)
>
> My proposed solution was to use Perl to do so, and simply require that
> if you wanted a reproducible build, then you had to have Perl.  That
> would, of course, require a separate variable in the Makefile holding
> the formatted date.
>
> Maybe something like the following in the Makefile:
>
> ifndef NO_PERL
> SOURCE_DATE_TIMESTAMP=$(shell perl -MPOSIX -e 'print strftime("%FT%TZ", 
> gmtime($ENV{SOURCE_DATE_EPOCH}));')
> endif
>
> and then:
>
> + $(if $(SOURCE_DATE_TIMESTAMP),touch -d '$(SOURCE_DATE_TIMESTAMP)' $@+ 
> &&) \

Neat.  I can play with this a little.

http://asciidoc.org/CHANGELOG.html is stale but asciidoc still seems
to be getting changes at https://github.com/asciidoc/asciidoc.  I
wonder how difficult it would be to add any required SOURCE_DATE_EPOCH
support there.

Longer term, I wonder what it would take to move to a markup language
that is more widely known, like commonmark.

Thanks,
Jonathan


Re: Documentation of post-receive hook

2017-11-21 Thread Junio C Hamano
Christoph Michelbach  writes:

> On November 20, 2017 2:17:58 AM GMT+01:00, 
>>How about this rewrite?  Would it consider all the points raised and
>>make it easier to understand?
>>
>>This hook is invoked by 'git-receive-pack' when it reacts to
>>'git push' and updates reference(s) in its repository.
>
> I think it's much more intelligible but a hint as to when this
> happens wouldn't hurt. E.g.: "This does not happen if the user
> receives the message 'Already up-to-date'." That is if this is
> correct, of course.

Your suggesting to mention that particular message hints at me that
you feel that the users may not necessarily understand that push did
not result in any update of references on the other side when they
see it.  If the message was clear enough to them, "when it reacts to
push and updates" ought to be clear enough description, too.

And if that indeed is the case (and I would not be surprised if it
is, but I suspect that most users are clueful enough), it is not the
documentation, but the "Already up-to-date" message, that needs to
be clarified, no?

Besides, we'd rather not cast the end-user facing message in stone
in the documentation like that (especially when the message has
known room for improvement and will change).


Re: [PATCH v3 00/33] Add directory rename detection to git

2017-11-21 Thread Elijah Newren
On Tue, Nov 21, 2017 at 4:42 PM, Stefan Beller  wrote:
> On Tue, Nov 21, 2017 at 12:00 AM, Elijah Newren  wrote:
>> This patchset introduces directory rename detection to merge-recursive; I'm
>> resubmitting just a few hours after my PATCHv2 because I didn't know about
>> the DEVELOPER=1 flag previously, and my code had a number of
>> warnings/errors.  I would have just submitted fixup/squash patches, but
>> when I checked, there sadly they cause merge conflicts when rebasing
>>
>> See https://public-inbox.org/git/20171110190550.27059-1-new...@gmail.com/
>> for the first series, design considerations, etc, and
>> https://public-inbox.org/git/20171120220209.15111-1-new...@gmail.com/ for
>> v2.
>
> Thanks, I'll take a look!
>
> Protip: To make it easy for reviewers add an interdiff[1] between the 
> different
> versions of the patch series, this can be done via tbdiff[2] for example,
> or in case you still have the old branch around or Junio has it queued 
> already,
> you can do a diff against that branch.

Thanks!

Interesting; tbdiff looks cool.  Junio hasn't queued this series yet,
but it's easy enough to reconstruct the old one.  It does weigh in
pretty heavy, and I'm slighly worried about gmail mangling all the
lines, but I'm going to give it a shot anyway.  If it's too mangled,
I'll try to repost using git-send-email.  It does weigh in at over
1600 lines, so it's not small.


$ git tbdiff --no-color v2.15.0..detect-directory-renames-v1
origin/master..detect-directory-renames
 1: 054c9c1a76 !  1: a0abcb1378 Tighten and correct a few testcases
for merging and cherry-picking
@@ -2,18 +2,21 @@

 Tighten and correct a few testcases for merging and cherry-picking

-t3501 had a testcase originally added to ensure cherry-pick wouldn't
-segfault when working with a dirty file involved in a rename.  While
-the segfault was fixed, there was another problem this test
demonstrated:
-namely, that git would overwrite a dirty file involved in a rename.
-Further, the test encoded a "successful merge" and overwriting of this
-file as correct behavior.  Modify the test so that it would still catch
-the segfault, but to require the correct behavior.
+t3501 had a testcase originally added in 05f2dfb965 (cherry-pick:
+demonstrate a segmentation fault, 2016-11-26) to ensure cherry-pick
+wouldn't segfault when working with a dirty file involved in a rename.
+While the segfault was fixed, there was another problem this test
+demonstrated: namely, that git would overwrite a dirty file
involved in a
+rename.  Further, the test encoded a "successful merge" and
overwriting of
+this file as correct behavior.  Modify the test so that it would still
+catch the segfault, but to require the correct behavior.  Mark it as
+test_expect_failure for now too, since this second bug is not
yet fixed.

-t7607 had a test specific to looking for a merge overwriting
a dirty file
-involved in a rename, but it too actually encoded what I would term
-incorrect behavior: it expected the merge to succeed.  Fix
that, and add
-a few more checks to make sure that the merge really does produce the
+t7607 had a test added in 30fd3a5425 (merge overwrites
unstaged changes in
+renamed file, 2012-04-15) specific to looking for a merge overwriting a
+dirty file involved in a rename, but it too actually encoded
what I would
+term incorrect behavior: it expected the merge to succeed.
Fix that, and
+add a few more checks to make sure that the merge really does
produce the
 expected results.

 Signed-off-by: Elijah Newren 
 2: 80e8d435ad !  2: e8745c4f1b merge-recursive: fix logic ordering issue
@@ -1,6 +1,6 @@
 Author: Elijah Newren 

-merge-recursive: Fix logic ordering issue
+merge-recursive: fix logic ordering issue

 merge_trees() did a variety of work, including:
   * Calling get_unmerged() to get unmerged entries
 3: 2714320c37 !  3: 0ae4156994 merge-recursive: add explanation for
src_entry and dst_entry
@@ -1,6 +1,6 @@
 Author: Elijah Newren 

-merge-recursive: Add explanation for src_entry and dst_entry
+merge-recursive: add explanation for src_entry and dst_entry

 If I have to walk through the debugger and inspect the values found in
 here in order to figure out their meaning, despite having known these
@@ -17,17 +17,14 @@
  struct rename {
  struct diff_filepair *pair;
 +/*
-+ * Because I keep forgetting every few years what src_entry and
-+ * dst_entry are and have to walk through a debugger and puzzle
-+ * through it to remind myself...
++ * Purpose of src_entry and dst_entry:
 + *
 + * 

Re: [PATCH] recursive submodules: detach HEAD from new state

2017-11-21 Thread Junio C Hamano
Jonathan Nieder  writes:

> The thread I am replying to appears to be where the patch comes from
> but I have some memories of more recent discussion that I'm not
> finding.
>
> More context:
> https://public-inbox.org/git/xmqqshd8i3ze@gitster.mtv.corp.google.com/
> says
>
>  * sb/submodule-recursive-checkout-detach-head (2017-07-28) 2 commits
>   - Documentation/checkout: clarify submodule HEADs to be detached
>   - recursive submodules: detach HEAD from new state
>
>   "git checkout --recursive" may overwrite and rewind the history of
>   the branch that happens to be checked out in submodule
>   repositories, which might not be desirable.  Detach the HEAD but
>   still allow the recursive checkout to succeed in such a case.
>
>   Expecting a reroll.

Sorry, I should have done my usual "cf. " to help me
recalling the discussion that lead to the marking there.

We kicked it back to 'pu' after the discussion that had
, and the "send out a
plan" sort-of happened with <20171109001007.11894-1-sbel...@google.com>
which seemed to have converged to a conclusion in the subthread with

where a preferred way would be to detach and opportunistically reattach
to keep the illusion of submodule being on a branch as much as possible
(correct me if I am misunderstanding the consensus).  

I had a suspicion that such a random re-attachment may lead to an
unpredictable behaviour from end-user's point of view that is
confusing, but there wasn't a concrete patch to do so, so that was
why I was waiting for a reroll so that people can take a look at it
and see how well it fares.

The responses in this thread seems to indicate that now we are
punting on that re-attachment plan and want to give this "always
detach" to the end users, which I think is a lot more predictable
thing to do.  After such a recursive checkout from the top-level,
any work done in the submodule from that state and is referenced
from the top-level (recorded presumably with a recursive commit) is
not referenced by anything other than the reflog of HEAD in the
submodule repository, and I suspect many users are not used to and
are comfortable with working on a detached HEAD for extended period
of time, so this new behaviour might deserve a stronger warning to
help them, but I am OK with the stance "We'll learn as we go---let's
merge it as-is and see what happens".

Thanks for prodding.  One less topic that is stale but has to be
carried around in 'pu' is always a good thing ;-)





Re: [PATCH 1/3] Documentation: allow overriding timestamps of generated asciidoc

2017-11-21 Thread brian m. carlson
On Tue, Nov 21, 2017 at 03:34:32PM -0800, Jonathan Nieder wrote:
> From: Anders Kaseorg 
> Date: Wed, 30 Nov 2016 22:21:15 -0500
> 
> Allow overriding the timestamp in generated documentation by setting
> SOURCE_DATE_EPOCH to the number of seconds since 1970-01-01 00:00:00
> UTC to use.
> 
> This makes the generated documentation reproducible from the source
> code as long as that variable is set, without losing the last-modified
> dates in the default build.

Thanks for this.  I had planned on either submitting this patch myself,
or working on a similar one, but I ran into the issue I'll mention
below and hadn't finished looking at it.

My research on this determined that Asciidoctor 1.5.5 and newer handle
this properly, because they honor SOURCE_BUILD_EPOCH.  It's only with
AsciiDoc that this is an issue.

> diff --git a/Documentation/Makefile b/Documentation/Makefile
> index 2ab65561af..dfec29f36f 100644
> --- a/Documentation/Makefile
> +++ b/Documentation/Makefile
> @@ -410,6 +410,7 @@ $(patsubst %.txt,%.texi,$(MAN_TXT)): %.texi : %.xml
>  howto-index.txt: howto-index.sh $(wildcard howto/*.txt)
>   $(QUIET_GEN)$(RM) $@+ $@ && \
>   '$(SHELL_PATH_SQ)' ./howto-index.sh $(sort $(wildcard howto/*.txt)) 
> >$@+ && \
> + $(if $(SOURCE_DATE_EPOCH),touch -d '@$(SOURCE_DATE_EPOCH)' $@+ &&) \

touch -d @SECONDS isn't POSIX compliant, and non-Linux systems don't
provide it.  POSIX only allows certain fixed format, and I assume that
non-Linux parties (maybe OpenBSD) will want to have reproducible builds
as well.

It's unfortunate for shell users that this variable is in seconds from
the epoch, since there's no portable way to format such a time in shell.
(POSIX doesn't allow date(1) to format anything but the current time.)

My proposed solution was to use Perl to do so, and simply require that
if you wanted a reproducible build, then you had to have Perl.  That
would, of course, require a separate variable in the Makefile holding
the formatted date.

Maybe something like the following in the Makefile:

ifndef NO_PERL
SOURCE_DATE_TIMESTAMP=$(shell perl -MPOSIX -e 'print strftime("%FT%TZ", 
gmtime($ENV{SOURCE_DATE_EPOCH}));')
endif

and then:

+   $(if $(SOURCE_DATE_TIMESTAMP),touch -d '$(SOURCE_DATE_TIMESTAMP)' $@+ 
&&) \
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH v3 00/33] Add directory rename detection to git

2017-11-21 Thread Stefan Beller
On Tue, Nov 21, 2017 at 12:00 AM, Elijah Newren  wrote:
> This patchset introduces directory rename detection to merge-recursive; I'm
> resubmitting just a few hours after my PATCHv2 because I didn't know about
> the DEVELOPER=1 flag previously, and my code had a number of
> warnings/errors.  I would have just submitted fixup/squash patches, but
> when I checked, there sadly they cause merge conflicts when rebasing
>
> See https://public-inbox.org/git/20171110190550.27059-1-new...@gmail.com/
> for the first series, design considerations, etc, and
> https://public-inbox.org/git/20171120220209.15111-1-new...@gmail.com/ for
> v2.

Thanks, I'll take a look!

Protip: To make it easy for reviewers add an interdiff[1] between the different
versions of the patch series, this can be done via tbdiff[2] for example,
or in case you still have the old branch around or Junio has it queued already,
you can do a diff against that branch.

I just looked through recent cover letters and it doesn't seem to be common
(any more?), or just not suitable for the series currently in flight.

[1] for example
https://public-inbox.org/git/20171116020039.17810-1-sbel...@google.com/
[2] https://github.com/trast/tbdiff

Stefan


Re: [PATCH 1/5] p5550: factor our nonsense-pack creation

2017-11-21 Thread Stefan Beller
On Tue, Nov 21, 2017 at 7:58 AM, Jeff King  wrote:
> On Mon, Nov 20, 2017 at 06:55:51PM -0500, Eric Sunshine wrote:
>
>> On Mon, Nov 20, 2017 at 3:26 PM, Jeff King  wrote:
>> > p5550: factor our nonsense-pack creation
>>
>> s/our/out/, I guess.
>
> Heh, yes. I even fixed it once, but I have the funny habit of noticing
> such typos while reading the "todo" list of "rebase -i" and fixing them
> there. Which of course has no impact whatsoever on the commit. :-/
>

That happened to me a couple of times as well before.
This sounds like a UX bug on first thought.

On second thought the text after the abbreviated hash can be
user dependent IIRC, by setting some format option how to populate the
rebase instruction sheet using log, so it would not be easy to take
fixes from that line into the commit for a fixup.


Re: [PATCH] list-objects-filter-options: fix up some sparse warnings

2017-11-21 Thread Stefan Beller
> [It would probably be easier if I used git to output this for me, rather
> than typing it into my email client!]

git config alias.gcs "show --date=short -s --pretty='format:%h ("%s", %ad)'"

will make you the `git gcs` alias that I use to describe commits.


Re: [PATCH v3 0/8] Coping with unrecognized ssh wrapper scripts in GIT_SSH

2017-11-21 Thread Stefan Beller
On Mon, Nov 20, 2017 at 2:32 PM, Brandon Williams  wrote:
> On 11/20, Jonathan Nieder wrote:
>> Previously: [1].
>>
>> This version should be essentially identical to v2.  Changes:
>> - patch 1 is new and should fix the test failure on Windows
>> - patch 2 is new, discussed at [2]
>> - patch 5 split off from patch 6 as suggested at [3]
>> - patch 6 commit message got two new notes to address the worries
>>   from [3]
>>
>> Thanks for the helpful reviews, and sorry to take so long to get this
>> out.  Thoughts of all kinds welcome, as always.
>
> Just finished looking through the series.  Looks good overall!
>
> Thanks again for getting this out!

Same here,

Thanks,
Stefan


Re: Documentation of post-receive hook

2017-11-21 Thread Christoph Michelbach


On November 20, 2017 2:17:58 AM GMT+01:00, 
>How about this rewrite?  Would it consider all the points raised and
>make it easier to understand?
>
>This hook is invoked by 'git-receive-pack' when it reacts to
>'git push' and updates reference(s) in its repository.

I think it's much more intelligible but a hint as to when this happens wouldn't 
hurt. E.g.: "This does not happen if the user receives the message 'Already 
up-to-date'." That is if this is correct, of course.


[PATCH] defer expensive load_ref_decorations until needed

2017-11-21 Thread Phil Hord
With many thousands of references, a simple `git rev-parse HEAD` may take
more than a second to return because it first loads all the refs into
memory even though it will never use them.

Defer loading any references until we actually need them.

Signed-off-by: Phil Hord 
---
 log-tree.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/log-tree.c b/log-tree.c
index 3b904f037..c1509f8b9 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -84,8 +84,10 @@ void add_name_decoration(enum decoration_type type, const 
char *name, struct obj
res->next = add_decoration(_decoration, obj, res);
 }
 
+static void maybe_load_ref_decorations();
 const struct name_decoration *get_name_decoration(const struct object *obj)
 {
+   maybe_load_ref_decorations();
return lookup_decoration(_decoration, obj);
 }
 
@@ -150,10 +152,13 @@ static int add_graft_decoration(const struct commit_graft 
*graft, void *cb_data)
 
 void load_ref_decorations(int flags)
 {
-   if (!decoration_loaded) {
+   decoration_flags = flags;
+}
 
+static void maybe_load_ref_decorations()
+{
+   if (!decoration_loaded) {
decoration_loaded = 1;
-   decoration_flags = flags;
for_each_ref(add_ref_decoration, NULL);
head_ref(add_ref_decoration, NULL);
for_each_commit_graft(add_graft_decoration, NULL);
-- 
2.15.0.471.g17a719cfe.dirty



Re: [PATCH 1/8 v2] ssh test: make copy_ssh_wrapper_as clean up after itself

2017-11-21 Thread Stefan Beller
On Mon, Nov 20, 2017 at 5:49 PM, Jonathan Nieder  wrote:

>  @@ -317,7 +317,7 @@ test_expect_success 'set up ssh wrapper' '
>
>   copy_ssh_wrapper_as () {
> cp "$TRASH_DIRECTORY/ssh$X" "${1%$X}$X" &&
>  -  test_when_finished "rm -f ${1%$X}$X" &&
>  +  test_when_finished "rm $(git rev-parse --sq-quote "${1%$X}$X")" &&

I wondered why the line above doesn't need the same treatment, but there
the argument is quoted, in this line we cannot use quotation as we are already
using it for bundle up the argument for test_when_finished.

The patch looks good.
Stefan


[PATCH 3/3] generate-cmdlist: avoid non-deterministic output

2017-11-21 Thread Jonathan Nieder
Date: Fri, 1 Jul 2016 17:32:00 -0700

Non-determinism makes it harder for build tools to discover when a
target needs to be rebuilt.

generate-cmdlist.sh stores the full path in a comment:

 /* Automatically generated by /build/git-agojiD/git-2.15.0/generate-cmdlist.sh 
*/

Use the file name alone instead.

Signed-off-by: Jonathan Nieder 
---
That's the end of the series.  Thanks for reading.

 generate-cmdlist.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/generate-cmdlist.sh b/generate-cmdlist.sh
index ab0d1b0c06..eeea4b67ea 100755
--- a/generate-cmdlist.sh
+++ b/generate-cmdlist.sh
@@ -1,6 +1,6 @@
 #!/bin/sh
 
-echo "/* Automatically generated by $0 */
+echo "/* Automatically generated by generate-cmdlist.sh */
 struct cmdname_help {
char name[16];
char help[80];
-- 
2.15.0.448.gf294e3d99a



[PATCH 2/3] git-gui: sort entries in optimized tclIndex

2017-11-21 Thread Jonathan Nieder
From: Anders Kaseorg 
Date: Wed, 16 Nov 2016 16:37:17 -0500

auto_mkindex expands wildcards in directory order, which depends on
the underlying filesystem.  To improve build reproducibility, sort the
list of *.tcl files in the Makefile.

The unoptimized loading case was previously fixed in gitgui-0.21.0~14
(git-gui: sort entries in tclIndex, 2015-01-26).

Signed-off-by: Anders Kaseorg 
Signed-off-by: Jonathan Nieder 
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 918a8de369..f10caedaa7 100644
--- a/Makefile
+++ b/Makefile
@@ -254,7 +254,7 @@ $(ALL_MSGFILES): %.msg : %.po
 lib/tclIndex: $(ALL_LIBFILES) GIT-GUI-VARS
$(QUIET_INDEX)if echo \
  $(foreach p,$(PRELOAD_FILES),source $p\;) \
- auto_mkindex lib '*.tcl' \
+ auto_mkindex lib $(patsubst lib/%,%,$(sort $(ALL_LIBFILES))) \
| $(TCL_PATH) $(QUIET_2DEVNULL); then : ok; \
else \
 echo >&2 "* $(TCL_PATH) failed; using unoptimized loading"; \
-- 
2.15.0.448.gf294e3d99a



Re: [PATCH 6/6] grep: show non-empty lines before functions with -W

2017-11-21 Thread Stefan Beller
On Mon, Nov 20, 2017 at 2:07 PM, René Scharfe  wrote:
> Am 20.11.2017 um 21:39 schrieb Stefan Beller:
>> On Sat, Nov 18, 2017 at 10:08 AM, René Scharfe  wrote:
>>> Non-empty lines before a function definition are most likely comments
>>> for that function and thus relevant.  Include them in function context.
>>>
>>> Such a non-empty line might also belong to the preceding function if
>>> there is no separating blank line.  Stop extending the context upwards
>>> also at the next function line to make sure only one extra function body
>>> is shown at most.
>>
>> Can we add another heuristic, that checks for common function body ends, e.g.
>> if the preceding line contains '}' but is not commented (the line doesn't
>> contain '*/' '//', '#'), we have a strong hint that it is a function, not an
>> additional comment.
>
> C comments containing "}" as part of the text would only be shown
> partially, e.g:
>
> /*
>  * Not shown because of the curly closing brace in ${PATH}.
>  * Shown.
>  */
>
> Two examples in git's repo are in refs.h and sha1-lookup.c.
>
> Before diving deeper: Is it worth it?  Does the heuristic in this series
> produce excessive context often?  Enough to be annoying?

We'll find out... I was just spurting out my thought of the day.
Sorry for the noise.


[PATCH 1/3] Documentation: allow overriding timestamps of generated asciidoc

2017-11-21 Thread Jonathan Nieder
From: Anders Kaseorg 
Date: Wed, 30 Nov 2016 22:21:15 -0500

Allow overriding the timestamp in generated documentation by setting
SOURCE_DATE_EPOCH to the number of seconds since 1970-01-01 00:00:00
UTC to use.

This makes the generated documentation reproducible from the source
code as long as that variable is set, without losing the last-modified
dates in the default build.

With this change, the package passes Debian's build reproducibility
test (https://wiki.debian.org/ReproducibleBuilds/TimestampsProposal).

The goal is to make it easier to verify that binary packages of open
source projects were built from the source they were claimed to have
been built from.  https://reproducible-builds.org/ has more details.

Signed-off-by: Anders Kaseorg 
Signed-off-by: Jonathan Nieder 
---
Perhaps this should e.g. be taking the latest timestamp of all its
inputs.  That would be straightforward to do, but what's here is what
we've been running with for the past year, so I'd rather stick to it,
at least as a starting point.

Another tweak I'd be interested in is allowing asciidoc to take the
timestamp as a parameter instead of inferring it from mtimes.
Asciidoc accepts an "--attribute footer-style=none" parameter, but I'm
not aware of an "--attribute footer-date=" parameter to keep the
footer but override its date.

 Documentation/Makefile   | 7 +--
 Documentation/technical/api-index.sh | 5 +
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 2ab65561af..dfec29f36f 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -410,6 +410,7 @@ $(patsubst %.txt,%.texi,$(MAN_TXT)): %.texi : %.xml
 howto-index.txt: howto-index.sh $(wildcard howto/*.txt)
$(QUIET_GEN)$(RM) $@+ $@ && \
'$(SHELL_PATH_SQ)' ./howto-index.sh $(sort $(wildcard howto/*.txt)) 
>$@+ && \
+   $(if $(SOURCE_DATE_EPOCH),touch -d '@$(SOURCE_DATE_EPOCH)' $@+ &&) \
mv $@+ $@
 
 $(patsubst %,%.html,$(ARTICLES)) : %.html : %.txt
@@ -420,8 +421,10 @@ WEBDOC_DEST = /pub/software/scm/git/docs
 howto/%.html: ASCIIDOC_EXTRA += -a git-relative-html-prefix=../
 $(patsubst %.txt,%.html,$(wildcard howto/*.txt)): %.html : %.txt
$(QUIET_ASCIIDOC)$(RM) $@+ $@ && \
-   sed -e '1,/^$$/d' $< | \
-   $(TXT_TO_HTML) - >$@+ && \
+   sed -e '1,/^$$/d' $< > $<+ && \
+   $(if $(SOURCE_DATE_EPOCH),touch -d '@$(SOURCE_DATE_EPOCH)' $<+ &&) \
+   $(TXT_TO_HTML) -o $@+ $<+ && \
+   rm $<+ && \
mv $@+ $@
 
 install-webdoc : html
diff --git a/Documentation/technical/api-index.sh 
b/Documentation/technical/api-index.sh
index 9c3f4131b8..07b3909627 100755
--- a/Documentation/technical/api-index.sh
+++ b/Documentation/technical/api-index.sh
@@ -20,6 +20,11 @@
sed -n -e '/^\/\/ table of contents end/,$p' "$skel"
 ) >api-index.txt+
 
+if test "${SOURCE_DATE_EPOCH:+set}"
+then
+   touch -d "@$SOURCE_DATE_EPOCH" api-index.txt+
+fi
+
 if test -f api-index.txt && cmp api-index.txt api-index.txt+ >/dev/null
 then
rm -f api-index.txt+
-- 
2.15.0.448.gf294e3d99a



[PATCH 0/3] Improving build reproducibility

2017-11-21 Thread Jonathan Nieder
Hi,

The reproducible builds  project has
been working on making it possible to verify that binary packages of
open source projects were built from the source they were claimed to
have been built from.

To that end, Debian has been carrying patches 1-2 for a while.  Patch
3 is a Google-internal patch with a related but distinct goal of
making builds go faster.

I think these should be ready to apply.  Thoughts of all kinds welcome.

Sincerely,
Anders Kaseorg (2):
  Documentation: allow overriding timestamps of generated asciidoc
  git-gui: sort entries in optimized tclIndex

Jonathan Nieder (1):
  generate-cmdlist: avoid non-deterministic output

 Documentation/Makefile   | 7 +--
 Documentation/technical/api-index.sh | 5 +
 generate-cmdlist.sh  | 2 +-
 git-gui/Makefile | 2 +-
 4 files changed, 12 insertions(+), 4 deletions(-)


Re: [RFC PATCH] xdiff/xpatience: support anchoring a line

2017-11-21 Thread Stefan Beller
On Tue, Nov 21, 2017 at 2:17 PM, Jonathan Tan  wrote:
> Teach the patience diff to support prohibiting a user-specified line
> from appearing as a deletion or addition in the end result.
>
> Signed-off-by: Jonathan Tan 
> ---
> I'm sending this out to see if a change similar to this would be
> welcome. It is useful to me as a reviewer (to check my own code, e.g.
> when checking [1]). Probably more design needs to go into this,
> including the best way to specify the "anchor" line, and the correct
> behavior when the anchor is either not found or appears more than once.
>
> Any thoughts?

The background from this whole idea is that the Myers diff algorithm
may produce the shortest diff, which is good for computer consumption
or transport, but not ideal for human review. To accommodate human
review we need to couple the diff with higher level concepts, such as
move detection or ignoring blanks.

I would imagine that this anchor can be set by a user to a function
header or other significant line, such that the commit is more in line with
the diff itself ("move code out of function A into its helper" would not
want to have the "function A" line jump around, but the code should
be removed from the function, hence you'd anchor the function).

The solution you provide is a good thing to experiment with, but
longer term, I would want to have huge record of configs in which
humans selected the best diff, such that we can use that data
to reason about better automatic diff generation.
The diff heuristic was based on a lot of human generated data,
that was generated by Michael at the time. I wonder if we want to
permanently store the anchor so the data collection will happen
automatically over time.

I had a similar idea, which would affix a given coordinate of the map[1]
to be on the path. I imagine it similar to e.g. Google Maps in which you
can select intermittent way points on your route.

When having this rather abstract coordinate, which can be
given as line number in pre and post image, then we would not
need to think about questions whether a given line is found or
appears multiple times; however users like concise input the best,
so the idea of an "anchor line" might be the best representation for
the user, which is internally translated into way points.

[1] think of http://simplygenius.net/ArticleFiles/DiffTutorial/diagonals.png

Stefan

> [1]
> https://public-inbox.org/git/20171121221256.154741-1-jonathanta...@google.com/
> ---
>  t/t4033-diff-patience.sh | 13 +
>  xdiff/xpatience.c| 29 +++--
>  2 files changed, 40 insertions(+), 2 deletions(-)
>
> diff --git a/t/t4033-diff-patience.sh b/t/t4033-diff-patience.sh
> index 113304dc5..2147fd688 100755
> --- a/t/t4033-diff-patience.sh
> +++ b/t/t4033-diff-patience.sh
> @@ -13,6 +13,19 @@ test_expect_success '--ignore-space-at-eol with a single 
> appended character' '
> grep "^+.*X" diff
>  '
>
> +test_expect_success 'anchor' '
> +   printf "a\nb\nc\n" >pre &&
> +   printf "c\na\nb\n" >post &&
> +
> +   # without anchor, c is moved
> +   test_expect_code 1 git diff --no-index --patience pre post >diff &&
> +   grep "^+c" diff &&
> +
> +   # with anchor, a is moved
> +   DIFF_ANCHOR=c test_expect_code 1 git diff --no-index --patience pre 
> post >diff &&
> +   grep "^+a" diff

or rather: "c is not moved, we don't care how the diff actually looks like",
so maybe
  ! grep "+c" diff


> diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
> index a44e77632..195a60e57 100644
> --- a/xdiff/xpatience.c
> +++ b/xdiff/xpatience.c
> @@ -62,6 +62,8 @@ struct hashmap {
>  * initially, "next" reflects only the order in file1.
>  */
> struct entry *next, *previous;
> +
> +   unsigned anchor : 1;

While this is RFC, I should not expect comments, though it would
be nice to have them in the final series. ;-)

> } *entries, *first, *last;
> /* were common records found? */
> unsigned long has_matches;
> @@ -70,6 +72,14 @@ struct hashmap {
> xpparam_t const *xpp;
>  };
>
> +static int is_anchor(const char *line)
> +{
> +   char *anchor = getenv("DIFF_ANCHOR");
> +   if (!anchor)
> +   return 0;
> +   return !strncmp(line, anchor, strlen(anchor));
> +}
> +
>  /* The argument "pass" is 1 for the first file, 2 for the second. */
>  static void insert_record(int line, struct hashmap *map, int pass)
>  {
> @@ -110,6 +120,7 @@ static void insert_record(int line, struct hashmap *map, 
> int pass)
> return;
> map->entries[index].line1 = line;
> map->entries[index].hash = record->ha;
> +   map->entries[index].anchor = is_anchor(map->env->xdf1.recs[line - 
> 1]->ptr);
> if (!map->first)
> map->first = map->entries + index;
> if (map->last) {
> @@ -192,14 +203,28 @@ static 

[PATCH] doc: prefer 'stash push' instead of 'stash save'

2017-11-21 Thread Phil Hord
Although `git stash save` was deprecated recently, some parts of the
documentation still refer to it.

Signed-off-by: Phil Hord 
---
 Documentation/git-stash.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-stash.txt b/Documentation/git-stash.txt
index 8be661007..056dfb866 100644
--- a/Documentation/git-stash.txt
+++ b/Documentation/git-stash.txt
@@ -175,14 +175,14 @@ create::
return its object name, without storing it anywhere in the ref
namespace.
This is intended to be useful for scripts.  It is probably not
-   the command you want to use; see "save" above.
+   the command you want to use; see "push" above.
 
 store::
 
Store a given stash created via 'git stash create' (which is a
dangling merge commit) in the stash ref, updating the stash
reflog.  This is intended to be useful for scripts.  It is
-   probably not the command you want to use; see "save" above.
+   probably not the command you want to use; see "push" above.
 
 DISCUSSION
 --
-- 
2.15.0.471.g17a719cfe.dirty



[PATCH] stash: Learn to parse -m/--message like commit does

2017-11-21 Thread Phil Hord
`git stash push -m foo` uses "foo" as the message for the stash. But
`git stash push -m"foo"` does not parse successfully.  Similarly
`git stash push --message="My stash message"` also fails.  Nothing
in the documentation suggests this syntax should work, but it does
work for `git commit`, and my fingers have learned this pattern long
ago.

Teach `git stash` to parse -mFoo and --message=Foo the same as
`git commit` would do.

Signed-off-by: Phil Hord 
---
 git-stash.sh | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/git-stash.sh b/git-stash.sh
index 4b7495144..1114005ce 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -76,6 +76,12 @@ create_stash () {
shift
stash_msg=${1?"BUG: create_stash () -m requires an 
argument"}
;;
+   -m*)
+   stash_msg=${1#-m}
+   ;;
+   --message=*)
+   stash_msg=${1#--message=}
+   ;;
-u|--include-untracked)
shift
untracked=${1?"BUG: create_stash () -u requires an 
argument"}
@@ -193,6 +199,12 @@ store_stash () {
shift
stash_msg="$1"
;;
+   -m*)
+   stash_msg=${1#-m}
+   ;;
+   --message=*)
+   stash_msg=${1#--message=}
+   ;;
-q|--quiet)
quiet=t
;;
@@ -251,6 +263,12 @@ push_stash () {
test -z ${1+x} && usage
stash_msg=$1
;;
+   -m*)
+   stash_msg=${1#-m}
+   ;;
+   --message=*)
+   stash_msg=${1#--message=}
+   ;;
--help)
show_help
;;
-- 
2.15.0.471.g17a719cfe.dirty



Re: stash: learn to parse -m/--message like commit does

2017-11-21 Thread Jonathan Nieder
Hi,

Phil Hord wrote:

> `git stash push -m foo` uses "foo" as the message for the stash. But
> `git stash push -m"foo"` does not parse successfully.  Similarly
> `git stash push --message="My stash message"` also fails.  Nothing
> in the documentation suggests this syntax should work,

"git help cli" says it should work.  Thanks for working on it.

>but it does
> work for `git commit`, and my fingers have learned this pattern long
> ago.


Re: doc: prefer 'stash push' over 'stash save'

2017-11-21 Thread Jonathan Nieder
Phil Hord wrote:

> Although `git stash save` was deprecated recently, some parts of the
> documentation still refer to it instead of `push`.
>
> Signed-off-by: Phil Hord 
> ---
>  Documentation/git-stash.txt | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Jonathan Nieder 
Thanks.


Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1

2017-11-21 Thread Jeff King
On Tue, Nov 21, 2017 at 02:20:19PM +0900, Junio C Hamano wrote:

> After queuing this series to an earlier part of 'pu' and resolving a
> conflict with jh/fsck-promisors topic, I ended up with a code that
> rejects 0{40} a lot earlier, before we try to see if a pack entry
> for 0{40} exists, even though the patch that is queued on this topic
> is more conservative (i.e. the above one).
> 
> Perhaps we would want to use the alternate version that declares the
> 0{40} is a sentinel that signals that there is no such object in
> this topic---that would give us a consistent semantics without
> having to adjust jh/fsck-promisors when it becomes ready to be
> merged.

I think we could adjust the patches without too much effort. That said,
I am very on-the-fence about whether to just declare the null sha1 as
"not a real object" and automatically return from the lookup without
doing any work. But if you agree that it's not likely to hurt anything,
it's IMHO a lot easier to reason about.

Here's a re-roll of patch 5 that behaves this way (the patch should be
unsurprising, but I've updated the commit message to match).

I did notice one other thing: the function looks up replace objects, so
we have both the original and the replaced sha1 available. My earlier
patch used the original sha1, but this one uses the replaced value.
I'm not sure if that's sane or not. It lets the fast-path kick in if you
point a replace ref at 0{40}. And it lets you point refs/replace/0{40}
to something else. I doubt that's useful, but it could perhaps be for
debugging, etc.

In most cases, of course, it wouldn't matter (since pointing to or from
the null sha1 is vaguely crazy in the first place).

-- >8 --
Subject: [PATCH v2 5/5] sha1_file: fast-path null sha1 as a missing object

In theory nobody should ever ask the low-level object code
for a null sha1. It's used as a sentinel for "no such
object" in lots of places, so leaking through to this level
is a sign that the higher-level code is not being careful
about its error-checking.  In practice, though, quite a few
code paths seem to rely on the null sha1 lookup failing as a
way to quietly propagate non-existence (e.g., by feeding it
to lookup_commit_reference_gently(), which then returns
NULL).

When this happens, we do two inefficient things:

  1. We actually search for the null sha1 in packs and in
 the loose object directory.

  2. When we fail to find it, we re-scan the pack directory
 in case a simultaneous repack happened to move it from
 loose to packed. This can be very expensive if you have
 a large number of packs.

Only the second one actually causes noticeable performance
problems, so we could treat them independently. But for the
sake of simplicity (both of code and of reasoning about it),
it makes sense to just declare that the null sha1 cannot be
a real on-disk object, and looking it up will always return
"no such object".

There's no real loss of functionality to do so Its use as a
sentinel value means that anybody who is unlucky enough to
hit the 2^-160th chance of generating an object with that
sha1 is already going to find the object largely unusable.

In an ideal world, we'd simply fix all of the callers to
notice the null sha1 and avoid passing it to us. But a
simple experiment to catch this with a BUG() shows that
there are a large number of code paths that do so.

So in the meantime, let's fix the performance problem by
taking a fast exit from the object lookup when we see a null
sha1. p5551 shows off the improvement (when a fetched ref is
new, the "old" sha1 is 0{40}, which ends up being passed for
fast-forward checks, the status table abbreviations, etc):

  TestHEAD^ HEAD
  
  5551.4: fetch   5.51(5.03+0.48)   0.17(0.10+0.06) -96.9%

Signed-off-by: Jeff King 
---
 sha1_file.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sha1_file.c b/sha1_file.c
index 8a7c6b7eba..ae4b71f445 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1152,6 +1152,9 @@ int sha1_object_info_extended(const unsigned char *sha1, 
struct object_info *oi,
lookup_replace_object(sha1) :
sha1;
 
+   if (is_null_sha1(real))
+   return -1;
+
if (!oi)
oi = _oi;
 
-- 
2.15.0.578.g35b419775f



Re: stash: learn to parse -m/--message like commit does

2017-11-21 Thread Phil Hord
Hm..  Sorry about the formatting here.  It's been a while.  I'll try again.

On Tue, Nov 21, 2017 at 3:07 PM, Phil Hord  wrote:
> `git stash push -m foo` uses "foo" as the message for the stash. But
> `git stash push -m"foo"` does not parse successfully.  Similarly
> `git stash push --message="My stash message"` also fails.  Nothing
> in the documentation suggests this syntax should work, but it does
> work for `git commit`, and my fingers have learned this pattern long
> ago.
>
> Teach `git stash` to parse -mFoo and --message=Foo the same as
> `git commit` would do.
>
> Signed-off-by: Phil Hord 
> ---
>  git-stash.sh | 18 ++
>  1 file changed, 18 insertions(+)
>
> diff --git a/git-stash.sh b/git-stash.sh
> index 4b7495144..1114005ce 100755
> --- a/git-stash.sh
> +++ b/git-stash.sh
> @@ -76,6 +76,12 @@ create_stash () {
>   shift
>   stash_msg=${1?"BUG: create_stash () -m requires an argument"}
>   ;;
> + -m*)
> + stash_msg=${1#-m}
> + ;;
> + --message=*)
> + stash_msg=${1#--message=}
> + ;;
>   -u|--include-untracked)
>   shift
>   untracked=${1?"BUG: create_stash () -u requires an argument"}
> @@ -193,6 +199,12 @@ store_stash () {
>   shift
>   stash_msg="$1"
>   ;;
> + -m*)
> + stash_msg=${1#-m}
> + ;;
> + --message=*)
> + stash_msg=${1#--message=}
> + ;;
>   -q|--quiet)
>   quiet=t
>   ;;
> @@ -251,6 +263,12 @@ push_stash () {
>   test -z ${1+x} && usage
>   stash_msg=$1
>   ;;
> + -m*)
> + stash_msg=${1#-m}
> + ;;
> + --message=*)
> + stash_msg=${1#--message=}
> + ;;
>   --help)
>   show_help
>   ;;
> --
> 2.15.0.471.g17a719cfe.dirty


stash: learn to parse -m/--message like commit does

2017-11-21 Thread Phil Hord
`git stash push -m foo` uses "foo" as the message for the stash. But
`git stash push -m"foo"` does not parse successfully.  Similarly
`git stash push --message="My stash message"` also fails.  Nothing
in the documentation suggests this syntax should work, but it does
work for `git commit`, and my fingers have learned this pattern long
ago.

Teach `git stash` to parse -mFoo and --message=Foo the same as
`git commit` would do.

Signed-off-by: Phil Hord 
---
 git-stash.sh | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/git-stash.sh b/git-stash.sh
index 4b7495144..1114005ce 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -76,6 +76,12 @@ create_stash () {
  shift
  stash_msg=${1?"BUG: create_stash () -m requires an argument"}
  ;;
+ -m*)
+ stash_msg=${1#-m}
+ ;;
+ --message=*)
+ stash_msg=${1#--message=}
+ ;;
  -u|--include-untracked)
  shift
  untracked=${1?"BUG: create_stash () -u requires an argument"}
@@ -193,6 +199,12 @@ store_stash () {
  shift
  stash_msg="$1"
  ;;
+ -m*)
+ stash_msg=${1#-m}
+ ;;
+ --message=*)
+ stash_msg=${1#--message=}
+ ;;
  -q|--quiet)
  quiet=t
  ;;
@@ -251,6 +263,12 @@ push_stash () {
  test -z ${1+x} && usage
  stash_msg=$1
  ;;
+ -m*)
+ stash_msg=${1#-m}
+ ;;
+ --message=*)
+ stash_msg=${1#--message=}
+ ;;
  --help)
  show_help
  ;;
-- 
2.15.0.471.g17a719cfe.dirty


doc: prefer 'stash push' over 'stash save'

2017-11-21 Thread Phil Hord
Although `git stash save` was deprecated recently, some parts of the
documentation still refer to it instead of `push`.

Signed-off-by: Phil Hord 
---
 Documentation/git-stash.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-stash.txt b/Documentation/git-stash.txt
index 8be661007..056dfb866 100644
--- a/Documentation/git-stash.txt
+++ b/Documentation/git-stash.txt
@@ -175,14 +175,14 @@ create::
  return its object name, without storing it anywhere in the ref
  namespace.
  This is intended to be useful for scripts.  It is probably not
- the command you want to use; see "save" above.
+ the command you want to use; see "push" above.

 store::

  Store a given stash created via 'git stash create' (which is a
  dangling merge commit) in the stash ref, updating the stash
  reflog.  This is intended to be useful for scripts.  It is
- probably not the command you want to use; see "save" above.
+ probably not the command you want to use; see "push" above.

 DISCUSSION
 --
-- 
2.15.0.471.g17a719cfe.dirty


Re: [PATCH] recursive submodules: detach HEAD from new state

2017-11-21 Thread Stefan Beller
On Tue, Nov 21, 2017 at 2:47 PM, Jonathan Nieder  wrote:
> Hi,
>
> Stefan Beller wrote:
>> On Tue, Nov 21, 2017 at 2:34 PM, Jonathan Nieder  wrote:
>
>>> Stefan, do you know what thread I should look at to find the current
>>> state of this patch?  I've had it applied locally for a long time.
>>
>> It was "Undecided" for some time, then Junio kicked it back to pu, expecting 
>> a
>> reroll[1]. The "send out a plan" that was referenced is found in [2]
>> describing 6 plans for the future of submodules. The approach in [3]
>> which is different on the implementation level, but very similar on
>> the UX level sounds best currently.  I'll coordinate with JTan to
>> come up with patches for that.
>>
>> [1] 
>> https://public-inbox.org/git/CAGZ79kYUZv0g+3OEMrbT26A7mSLJzeS-yf5Knr-CnARHqVB=a...@mail.gmail.com/
>> [2] https://public-inbox.org/git/20171109001007.11894-1-sbel...@google.com/
>> [3] 
>> https://public-inbox.org/git/20171108172945.33c42a0e91b4ac494217b...@google.com/
>
> Thanks.  That thread appears to be about a long-term plan; what is the
> short-term plan?
>
> E.g. is there any additional documentation that should be added to the
> patch that detaches?

The second patch in that series has a tiny bit of information, see
"Documentation/checkout: clarify submodule HEADs to be detached".

I would think that this is sufficient for the short term to get into a
safe state.

> Or should it go in as-is?

That is what I would prefer and then we'll build on top of this once we
figured out the direction of the long term solution.

Thanks,
Stefan


Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1

2017-11-21 Thread Jeff King
On Tue, Nov 21, 2017 at 11:37:28AM +0900, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > In an ideal world, we'd simply fix all of the callers to
> > notice the null sha1 and avoid passing it to us. But a
> > simple experiment to catch this with a BUG() shows that
> > there are a large number of code paths.
> 
> Well, we can view this (or the alternative you sent later that does
> the same a bit earlier in the function) as "fixing the caller" but
> has already refactord the common logic to a helper function that all
> of these callers call into ;-).

Yes, I'm definitely tempted by that view. :)

What worries me, though, is that callers who lazily propagate the null
sha1 to the lookup functions cannot reasonably tell the difference
between "this object was corrupt or missing" and "we passed the null
sha1, and a missing object is expected".

For example, look at how fetch.c:update_local_ref() looks up objects.
It feeds the old and new sha1 to lookup_commit_reference_gently(), and
if either is NULL, it skips the fast-forward check. That makes sense if
we expect the null sha1 to get translated into a NULL commit. But it
also triggers for a broken ref, and we'd overwrite it (when the right
thing is probably refusing to update).

Here's a runnable example:

-- >8 --
git init parent
git -C parent commit --allow-empty -m base

git clone parent child
git -C parent commit --allow-empty -m more

cd child
rm -f .git/objects/??/*
git fetch
-- 8< --

That final fetch spews a bunch of errors about broken refs, and then
overwrites the value of origin/master, even though it's broken (in this
case it actually is a fast-forward, but the child repo doesn't even know
that).

I'm not sure what the right behavior is, but I'm pretty sure that's not
it. Probably one of:

  - skip updating the ref when we see the breakage

  - ditto, but terminate the whole operation, since we might be deleting
other refs and in a broken repo we're probably best to make as few
changes as possible

  - behave as if it was a non-ff, which would allow "--force" to
overwrite the broken ref. Maybe convenient for fixing things, but
possibly surprising (and it's not that hard to just delete the
broken refs manually before proceeding).

-Peff


Re: [PATCH] recursive submodules: detach HEAD from new state

2017-11-21 Thread Jonathan Nieder
Hi,

Stefan Beller wrote:
> On Tue, Nov 21, 2017 at 2:34 PM, Jonathan Nieder  wrote:

>> Stefan, do you know what thread I should look at to find the current
>> state of this patch?  I've had it applied locally for a long time.
>
> It was "Undecided" for some time, then Junio kicked it back to pu, expecting a
> reroll[1]. The "send out a plan" that was referenced is found in [2]
> describing 6 plans for the future of submodules. The approach in [3]
> which is different on the implementation level, but very similar on
> the UX level sounds best currently.  I'll coordinate with JTan to
> come up with patches for that.
>
> [1] 
> https://public-inbox.org/git/CAGZ79kYUZv0g+3OEMrbT26A7mSLJzeS-yf5Knr-CnARHqVB=a...@mail.gmail.com/
> [2] https://public-inbox.org/git/20171109001007.11894-1-sbel...@google.com/
> [3] 
> https://public-inbox.org/git/20171108172945.33c42a0e91b4ac494217b...@google.com/

Thanks.  That thread appears to be about a long-term plan; what is the
short-term plan?

E.g. is there any additional documentation that should be added to the
patch that detaches?

Or should it go in as-is?

Thanks,
Jonathan


Re: [PATCH] recursive submodules: detach HEAD from new state

2017-11-21 Thread Stefan Beller
On Tue, Nov 21, 2017 at 2:34 PM, Jonathan Nieder  wrote:
> Stefan Beller wrote:
>>> Junio C Hamano  writes:
>
 Also, while I do agree with you that the problem exists, it is
 unclear why this patch is a solution and not a hack that sweeps a
 problem under the rug.

 It is unclear why this "silently detach HEAD without telling the
 user" is a better solution than erroring out, for example [*1*].
> [...]
>> So I took a step back and wrote about different proposals where
>> we want to go long term. See below. This will help us
>> figuring out how to approach this bug correctly.
>
> Stefan, do you know what thread I should look at to find the current
> state of this patch?  I've had it applied locally for a long time.

It was "Undecided" for some time, then Junio kicked it back to pu, expecting a
reroll[1]. The "send out a plan" that was referenced is found in [2]
describing 6
plans for the future of submodules. The approach in [3] which is
different on the
implementation level, but very similar on the UX level sounds best currently.
I'll coordinate with JTan to come up with patches for that.

[1] 
https://public-inbox.org/git/CAGZ79kYUZv0g+3OEMrbT26A7mSLJzeS-yf5Knr-CnARHqVB=a...@mail.gmail.com/
[2] https://public-inbox.org/git/20171109001007.11894-1-sbel...@google.com/
[3] 
https://public-inbox.org/git/20171108172945.33c42a0e91b4ac494217b...@google.com/

> The thread I am replying to appears to be where the patch comes from
> but I have some memories of more recent discussion that I'm not
> finding.
>
> More context:
> https://public-inbox.org/git/xmqqshd8i3ze@gitster.mtv.corp.google.com/
> says
>
>  * sb/submodule-recursive-checkout-detach-head (2017-07-28) 2 commits
>   - Documentation/checkout: clarify submodule HEADs to be detached
>   - recursive submodules: detach HEAD from new state
>
>   "git checkout --recursive" may overwrite and rewind the history of
>   the branch that happens to be checked out in submodule
>   repositories, which might not be desirable.  Detach the HEAD but
>   still allow the recursive checkout to succeed in such a case.
>
>   Expecting a reroll.
>
> Thanks,
> Jonathan


Re: [PATCH] recursive submodules: detach HEAD from new state

2017-11-21 Thread Jonathan Nieder
Stefan Beller wrote:
>> Junio C Hamano  writes:

>>> Also, while I do agree with you that the problem exists, it is
>>> unclear why this patch is a solution and not a hack that sweeps a
>>> problem under the rug.
>>>
>>> It is unclear why this "silently detach HEAD without telling the
>>> user" is a better solution than erroring out, for example [*1*].
[...]
> So I took a step back and wrote about different proposals where
> we want to go long term. See below. This will help us
> figuring out how to approach this bug correctly.

Stefan, do you know what thread I should look at to find the current
state of this patch?  I've had it applied locally for a long time.

The thread I am replying to appears to be where the patch comes from
but I have some memories of more recent discussion that I'm not
finding.

More context:
https://public-inbox.org/git/xmqqshd8i3ze@gitster.mtv.corp.google.com/
says

 * sb/submodule-recursive-checkout-detach-head (2017-07-28) 2 commits
  - Documentation/checkout: clarify submodule HEADs to be detached
  - recursive submodules: detach HEAD from new state

  "git checkout --recursive" may overwrite and rewind the history of
  the branch that happens to be checked out in submodule
  repositories, which might not be desirable.  Detach the HEAD but
  still allow the recursive checkout to succeed in such a case.

  Expecting a reroll.

Thanks,
Jonathan


[RFC PATCH] xdiff/xpatience: support anchoring a line

2017-11-21 Thread Jonathan Tan
Teach the patience diff to support prohibiting a user-specified line
from appearing as a deletion or addition in the end result.

Signed-off-by: Jonathan Tan 
---
I'm sending this out to see if a change similar to this would be
welcome. It is useful to me as a reviewer (to check my own code, e.g.
when checking [1]). Probably more design needs to go into this,
including the best way to specify the "anchor" line, and the correct
behavior when the anchor is either not found or appears more than once.

Any thoughts?

[1]
https://public-inbox.org/git/20171121221256.154741-1-jonathanta...@google.com/
---
 t/t4033-diff-patience.sh | 13 +
 xdiff/xpatience.c| 29 +++--
 2 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/t/t4033-diff-patience.sh b/t/t4033-diff-patience.sh
index 113304dc5..2147fd688 100755
--- a/t/t4033-diff-patience.sh
+++ b/t/t4033-diff-patience.sh
@@ -13,6 +13,19 @@ test_expect_success '--ignore-space-at-eol with a single 
appended character' '
grep "^+.*X" diff
 '
 
+test_expect_success 'anchor' '
+   printf "a\nb\nc\n" >pre &&
+   printf "c\na\nb\n" >post &&
+
+   # without anchor, c is moved
+   test_expect_code 1 git diff --no-index --patience pre post >diff &&
+   grep "^+c" diff &&
+
+   # with anchor, a is moved
+   DIFF_ANCHOR=c test_expect_code 1 git diff --no-index --patience pre 
post >diff &&
+   grep "^+a" diff
+'
+
 test_diff_frobnitz "patience"
 
 test_diff_unique "patience"
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index a44e77632..195a60e57 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -62,6 +62,8 @@ struct hashmap {
 * initially, "next" reflects only the order in file1.
 */
struct entry *next, *previous;
+
+   unsigned anchor : 1;
} *entries, *first, *last;
/* were common records found? */
unsigned long has_matches;
@@ -70,6 +72,14 @@ struct hashmap {
xpparam_t const *xpp;
 };
 
+static int is_anchor(const char *line)
+{
+   char *anchor = getenv("DIFF_ANCHOR");
+   if (!anchor)
+   return 0;
+   return !strncmp(line, anchor, strlen(anchor));
+}
+
 /* The argument "pass" is 1 for the first file, 2 for the second. */
 static void insert_record(int line, struct hashmap *map, int pass)
 {
@@ -110,6 +120,7 @@ static void insert_record(int line, struct hashmap *map, 
int pass)
return;
map->entries[index].line1 = line;
map->entries[index].hash = record->ha;
+   map->entries[index].anchor = is_anchor(map->env->xdf1.recs[line - 
1]->ptr);
if (!map->first)
map->first = map->entries + index;
if (map->last) {
@@ -192,14 +203,28 @@ static struct entry *find_longest_common_sequence(struct 
hashmap *map)
int longest = 0, i;
struct entry *entry;
 
+   /*
+* If not -1, this entry in sequence must never be overridden. (Also,
+* do not override entries in sequence before this entry, since it is
+* useless.)
+*/
+   int anchor_i = -1;
+
for (entry = map->first; entry; entry = entry->next) {
if (!entry->line2 || entry->line2 == NON_UNIQUE)
continue;
i = binary_search(sequence, longest, entry);
entry->previous = i < 0 ? NULL : sequence[i];
-   sequence[++i] = entry;
-   if (i == longest)
+   ++i;
+   if (i <= anchor_i)
+   continue;
+   sequence[i] = entry;
+   if (anchor_i == -1 && entry->anchor) {
+   anchor_i = i;
+   longest = anchor_i + 1;
+   } else if (i == longest) {
longest++;
+   }
}
 
/* No common unique lines were found */
-- 
2.15.0.448.gf294e3d99a-goog



[PATCH] Tests: clean up submodule recursive helpers

2017-11-21 Thread Jonathan Tan
This continues the work in commit d3b5a49 ("Tests: clean up and document
submodule helpers", 2017-11-08).

Factor out the commonalities from
test_submodule_switch_recursing_with_args() and
test_submodule_forced_switch_recursing_with_args() in
lib-submodule-update.sh, and document their usage. Some tests differ
slightly in their test assertions; I have used the superset of those
assertions in that case.

Signed-off-by: Jonathan Tan 
---
I checked my work by using a custom diff patch to "anchor" one line into
never appearing as a minus/plus in the diff, so that I could more
clearly see the deletions and additions, then opening 3 terminals: one
showing the common part, one showing the non-forced part, and one
showing the forced part. I'll send that diff patch to the mailing list
shortly [1].

[1] With my diff patch, you can run:
DIFF_ANCHOR=test_submodule_switch_recursing_with_args git show
--patience --color-moved
---
 t/lib-submodule-update.sh | 343 +-
 1 file changed, 125 insertions(+), 218 deletions(-)

diff --git a/t/lib-submodule-update.sh b/t/lib-submodule-update.sh
index bb94c2320..f39ec3095 100755
--- a/t/lib-submodule-update.sh
+++ b/t/lib-submodule-update.sh
@@ -554,6 +554,10 @@ test_submodule_switch_common() {
 #  - if succeeds, once "git submodule update" is invoked, the contents of
 #submodule directories are updated
 #
+# If the command under test is known to not work with submodules in certain
+# conditions, set the appropriate KNOWN_FAILURE_* variable used in the tests
+# below to 1.
+#
 # Use as follows:
 #
 # my_func () {
@@ -622,19 +626,11 @@ test_submodule_forced_switch () {
 # - Removing a submodule with a git directory absorbs the submodules
 #   git directory first into the superproject.
 
-test_submodule_switch_recursing_with_args () {
-   cmd_args="$1"
-   command="git $cmd_args --recurse-submodules"
-   RESULTDS=success
-   if test "$KNOWN_FAILURE_DIRECTORY_SUBMODULE_CONFLICTS" = 1
-   then
-   RESULTDS=failure
-   fi
-   RESULTOI=success
-   if test "$KNOWN_FAILURE_SUBMODULE_OVERWRITE_IGNORED_UNTRACKED" = 1
-   then
-   RESULTOI=failure
-   fi
+# Internal function; use test_submodule_switch_recursing_with_args() or
+# test_submodule_forced_switch_recursing_with_args() instead.
+test_submodule_recursing_with_args_common() {
+   command="$1"
+
# Appearing submodule #
# Switching to a commit letting a submodule appear checks it out ...
test_expect_success "$command: added submodule is checked out" '
@@ -648,7 +644,7 @@ test_submodule_switch_recursing_with_args () {
test_submodule_content sub1 origin/add_sub1
)
'
-   # ... ignoring an empty existing directory ...
+   # ... ignoring an empty existing directory.
test_expect_success "$command: added submodule is checked out in empty 
dir" '
prolog &&
reset_work_tree_to_interested no_submodule &&
@@ -661,34 +657,6 @@ test_submodule_switch_recursing_with_args () {
test_submodule_content sub1 origin/add_sub1
)
'
-   # ... unless there is an untracked file in its place.
-   test_expect_success "$command: added submodule doesn't remove untracked 
file with same name" '
-   prolog &&
-   reset_work_tree_to_interested no_submodule &&
-   (
-   cd submodule_update &&
-   git branch -t add_sub1 origin/add_sub1 &&
-   : >sub1 &&
-   test_must_fail $command add_sub1 &&
-   test_superproject_content origin/no_submodule &&
-   test_must_be_empty sub1
-   )
-   '
-   # ... but an ignored file is fine.
-   test_expect_$RESULTOI "$command: added submodule removes an untracked 
ignored file" '
-   test_when_finished "rm submodule_update/.git/info/exclude" &&
-   prolog &&
-   reset_work_tree_to_interested no_submodule &&
-   (
-   cd submodule_update &&
-   git branch -t add_sub1 origin/add_sub1 &&
-   : >sub1 &&
-   echo sub1 >.git/info/exclude
-   $command add_sub1 &&
-   test_superproject_content origin/add_sub1 &&
-   test_submodule_content sub1 origin/add_sub1
-   )
-   '
# Replacing a tracked file with a submodule produces a checked out 
submodule
test_expect_success "$command: replace tracked file with submodule 
checks out submodule" '
prolog &&
@@ -742,33 +710,6 @@ test_submodule_switch_recursing_with_args () {
test_git_directory_exists sub1
  

Re: [PATCH v2] gitcli: tweak "man gitcli" for clarity

2017-11-21 Thread Robert P. J. Day
On Tue, 21 Nov 2017, Kevin Daudt wrote:

> On Tue, Nov 21, 2017 at 04:47:42PM -0500, Robert P. J. Day wrote:
> > On Tue, 21 Nov 2017, Kevin Daudt wrote:
> >
> > > On Tue, Nov 21, 2017 at 04:27:59PM -0500, Robert P. J. Day wrote:
> > > > No major changes, just some rewording and showing some variations of
> > > > general Git commands.
> > > >
> > > > Signed-off-by: Robert P. J. Day 
> > > >
> > > > ---
> > > >
> > > > diff --git a/Documentation/gitcli.txt b/Documentation/gitcli.txt
> > > > index 9f13266a6..d690d1ff0 100644
> > > > --- a/Documentation/gitcli.txt
> > > > +++ b/Documentation/gitcli.txt
> > > > @@ -13,7 +13,7 @@ gitcli
> > > >  DESCRIPTION
> > > >  ---
> > > >
> > > > -This manual describes the convention used throughout Git CLI.
> > > > +This manual describes the conventions used throughout Git CLI.
> > > >
> > > >  Many commands take revisions (most often "commits", but sometimes
> > > >  "tree-ish", depending on the context and command) and paths as their
> > > > @@ -32,32 +32,35 @@ arguments.  Here are the rules:
> > > > between the HEAD commit and the work tree as a whole".  You can say
> > > > `git diff HEAD --` to ask for the latter.
> > > >
> > > > - * Without disambiguating `--`, Git makes a reasonable guess, but 
> > > > errors
> > > > -   out and asking you to disambiguate when ambiguous.  E.g. if you 
> > > > have a
> > > > + * Without a disambiguating `--`, Git makes a reasonable guess, but can
> > > > +   error out, asking you to disambiguate when ambiguous.  E.g. if you 
> > > > have a
> > >
> > > 'Can' error out implies that it sometimes would not error out when
> > > there is ambiguity. Are there situation where git does not error out
> > > in that case?
> >
> >   i would say (based on my limited knowledge) that if the heuristic
> > kicks in and works fine, then things will work. i think it's fair to
> > say that git "can" error out if the heuristic fails.
> >
> > rday
>
> In most cases that I'm aware of, you have to be explicit. If for
> example you want to refer to a file that's not in the working tree,
> you have to use '--'.  Even with heuristics, it would still have to
> error out when it's ambiguous what the user meant.
>
> So the way you worded it implies that there are situations where git
> knows there are multiple things the user could have meant, but it
> would not error out in that case.

  all right, i will ponder this ... open to suggestions. i would have
to examine the heuristic itself, wondering if it can make the wrong
decision on occasion.

rday

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



Re: [PATCH v3 3/3] worktree: make add dwim

2017-11-21 Thread Thomas Gummerer
On 11/19, Eric Sunshine wrote:
> On Sun, Nov 19, 2017 at 2:04 PM, Eric Sunshine  
> wrote:
> > On Sat, Nov 18, 2017 at 5:47 PM, Thomas Gummerer  
> > wrote:
> >> +To disable the behaviour of trying to match the basename of  to
> >> +a remote, and always create a new branch from HEAD, the `--no-track`
> >
> > Does --[no-]track deserve to be documented in the OPTIONS section like
> > the other options are?
> 
> One other question: Since this is re-using the well-known option name
> --no-track, should it also get applied to the "git worktree add -b foo
> dir origin/foo" case, as well, which you pointed out (in the patch 2/3
> thread) already DWIMs tracking automatically? (I can easily see
> someone reporting it as a bug if "git worktree add --no-track -b foo
> dir origin/foo" does not suppress tracking.)

I didn't consider that, I think you are right, and the flag should
apply in that case as well.  I think at that point we may as well pass
this flag through to the 'git branch' call, and let users set up
tracking if they want to, the same way it works in 'git branch'.  Thanks!


Re: bash script to pull in branch B the changes from parent branch A

2017-11-21 Thread Paul Smith
On Wed, 2017-11-22 at 00:19 +0800, Laetitia Pfaender wrote:
> cd repo-in-branchB
> git branch --set-upstream-to=origin/branchB
> git pull
> 
> git branch --set-upstream-to=origin/branchA
> git pull
> git branch --set-
> upstream-to=origin/branchB
> 
> It does exactly what I want but, as I have
> many children branches B to update, I would like to prompt my
> username and password only once and then makes the script use them in
> all following git requests.

It would be nice if you explained in words exactly what it is you want
to do.

This seems like a lot more work than necessary.  A "git pull" consists
of two steps: first a "git fetch" which is the part that actually goes
out to the remote and pulls all the new content, and then a merge
operation to the remote's version of the current branch.

The "git fetch" is all that needs credentials, and it pulls the entire
contents of the repo including all branches, so you only need to do it
once.

Is there some reason why you can't do the following:

  cd repo
  git fetch (requires you to enter username/password)
  git merge origin/branchB
  git merge origin/branchA

and just continue to merge for each different branch (without re-
running git fetch)?

>  I came to the conclusion that I needed to update my script as
> follow:
> echo “pull from branchB"
> git pull https://$username:$password@g
> ithub.com/myrepo.git heads/branchB
> echo “pull from parent branchA"
> git
> pull https://$username:$passw...@github.com/myrepo.git heads/branchA
>  I came to the conclusion that I needed to update my script as
> follow:
> echo “pull from branchB"
> git pull https://$username:$password@g
> ithub.com/myrepo.git heads/branchB
> echo “pull from parent branchA"
> git
> pull https://$username:$passw...@github.com/myrepo.git heads/branchA

Don't know how well this works as I don't use HTTPS remotes very much. 
But note, this will make your username AND password for your GitHub
account visible to anyone one the system who happens to run "ps" while
your pull command is running.


Re: [PATCH v2] gitcli: tweak "man gitcli" for clarity

2017-11-21 Thread Kevin Daudt
On Tue, Nov 21, 2017 at 04:47:42PM -0500, Robert P. J. Day wrote:
> On Tue, 21 Nov 2017, Kevin Daudt wrote:
> 
> > On Tue, Nov 21, 2017 at 04:27:59PM -0500, Robert P. J. Day wrote:
> > > No major changes, just some rewording and showing some variations of
> > > general Git commands.
> > >
> > > Signed-off-by: Robert P. J. Day 
> > >
> > > ---
> > >
> > > diff --git a/Documentation/gitcli.txt b/Documentation/gitcli.txt
> > > index 9f13266a6..d690d1ff0 100644
> > > --- a/Documentation/gitcli.txt
> > > +++ b/Documentation/gitcli.txt
> > > @@ -13,7 +13,7 @@ gitcli
> > >  DESCRIPTION
> > >  ---
> > >
> > > -This manual describes the convention used throughout Git CLI.
> > > +This manual describes the conventions used throughout Git CLI.
> > >
> > >  Many commands take revisions (most often "commits", but sometimes
> > >  "tree-ish", depending on the context and command) and paths as their
> > > @@ -32,32 +32,35 @@ arguments.  Here are the rules:
> > > between the HEAD commit and the work tree as a whole".  You can say
> > > `git diff HEAD --` to ask for the latter.
> > >
> > > - * Without disambiguating `--`, Git makes a reasonable guess, but errors
> > > -   out and asking you to disambiguate when ambiguous.  E.g. if you have a
> > > + * Without a disambiguating `--`, Git makes a reasonable guess, but can
> > > +   error out, asking you to disambiguate when ambiguous.  E.g. if you 
> > > have a
> >
> > 'Can' error out implies that it sometimes would not error out when
> > there is ambiguity. Are there situation where git does not error out
> > in that case?
> 
>   i would say (based on my limited knowledge) that if the heuristic
> kicks in and works fine, then things will work. i think it's fair to
> say that git "can" error out if the heuristic fails.
> 
> rday

In most cases that I'm aware of, you have to be explicit. If for example
you want to refer to a file that's not in the working tree, you have to
use '--'.  Even with heuristics, it would still have to error out when
it's ambiguous what the user meant.

So the way you worded it implies that there are situations where git
knows there are multiple things the user could have meant, but it would
not error out in that case.

Kevin



Boat Owners List

2017-11-21 Thread Marvin Curtis


Hi,

Greeting of the day!

Would you be interested in acquiring an email list of "Boat Owners" from USA?

Our Databases:-1.RV Owners List  2.Sail and Power boat Owners List
   3.Travelers List  4.Fishing Enthusiasts List
   5.Cruise Travelers List   6.Motorcycle Owners List
   7.Camping Enthusiasts List8.Spa and Resort Visitors List
   9.Car Owners List 10.Outdoor Enthusiasts List and 
many more..,

We provide Data fields on each record contains: Name (First and Last), Address, 
City, State, Zip, County, Opt-in Email Address, Boat use, Boat length, Boat 
Propulsion, Boat Fuel, Boat Hull Material, Boat Make, Boat Hull Shape, Boat 
Size, Boat Year, Boat Transaction Date, Boat Transaction Type, Boat Validation 
Date and Registration date.

All the contacts are opt-in verified, complete permission based and can be used 
for unlimited multi-channel marketing.

Please let me know your thoughts towards procuring the Boat Owners List.

Waiting for your valuable and sincere reply.


Best Regards,
Marvin Curtis
Research Analyst



Re: [PATCH v2] gitcli: tweak "man gitcli" for clarity

2017-11-21 Thread Robert P. J. Day
On Tue, 21 Nov 2017, Kevin Daudt wrote:

> On Tue, Nov 21, 2017 at 04:27:59PM -0500, Robert P. J. Day wrote:
> > No major changes, just some rewording and showing some variations of
> > general Git commands.
> >
> > Signed-off-by: Robert P. J. Day 
> >
> > ---
> >
> > diff --git a/Documentation/gitcli.txt b/Documentation/gitcli.txt
> > index 9f13266a6..d690d1ff0 100644
> > --- a/Documentation/gitcli.txt
> > +++ b/Documentation/gitcli.txt
> > @@ -13,7 +13,7 @@ gitcli
> >  DESCRIPTION
> >  ---
> >
> > -This manual describes the convention used throughout Git CLI.
> > +This manual describes the conventions used throughout Git CLI.
> >
> >  Many commands take revisions (most often "commits", but sometimes
> >  "tree-ish", depending on the context and command) and paths as their
> > @@ -32,32 +32,35 @@ arguments.  Here are the rules:
> > between the HEAD commit and the work tree as a whole".  You can say
> > `git diff HEAD --` to ask for the latter.
> >
> > - * Without disambiguating `--`, Git makes a reasonable guess, but errors
> > -   out and asking you to disambiguate when ambiguous.  E.g. if you have a
> > + * Without a disambiguating `--`, Git makes a reasonable guess, but can
> > +   error out, asking you to disambiguate when ambiguous.  E.g. if you have 
> > a
>
> 'Can' error out implies that it sometimes would not error out when
> there is ambiguity. Are there situation where git does not error out
> in that case?

  i would say (based on my limited knowledge) that if the heuristic
kicks in and works fine, then things will work. i think it's fair to
say that git "can" error out if the heuristic fails.

rday

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



Re: [PATCH v2] gitcli: tweak "man gitcli" for clarity

2017-11-21 Thread Kevin Daudt
On Tue, Nov 21, 2017 at 04:27:59PM -0500, Robert P. J. Day wrote:
> No major changes, just some rewording and showing some variations of
> general Git commands.
> 
> Signed-off-by: Robert P. J. Day 
> 
> ---
> 
> diff --git a/Documentation/gitcli.txt b/Documentation/gitcli.txt
> index 9f13266a6..d690d1ff0 100644
> --- a/Documentation/gitcli.txt
> +++ b/Documentation/gitcli.txt
> @@ -13,7 +13,7 @@ gitcli
>  DESCRIPTION
>  ---
> 
> -This manual describes the convention used throughout Git CLI.
> +This manual describes the conventions used throughout Git CLI.
> 
>  Many commands take revisions (most often "commits", but sometimes
>  "tree-ish", depending on the context and command) and paths as their
> @@ -32,32 +32,35 @@ arguments.  Here are the rules:
> between the HEAD commit and the work tree as a whole".  You can say
> `git diff HEAD --` to ask for the latter.
> 
> - * Without disambiguating `--`, Git makes a reasonable guess, but errors
> -   out and asking you to disambiguate when ambiguous.  E.g. if you have a
> + * Without a disambiguating `--`, Git makes a reasonable guess, but can
> +   error out, asking you to disambiguate when ambiguous.  E.g. if you have a

'Can' error out implies that it sometimes would not error out when there
is ambiguity. Are there situation where git does not error out in that
case?

> file called HEAD in your work tree, `git diff HEAD` is ambiguous, and
> you have to say either `git diff HEAD --` or `git diff -- HEAD` to
> disambiguate.
>  +
>  When writing a script that is expected to handle random user-input, it is
>  a good practice to make it explicit which arguments are which by placing
> -disambiguating `--` at appropriate places.
> +a disambiguating `--` at appropriate places.
> 
>   * Many commands allow wildcards in paths, but you need to protect
> -   them from getting globbed by the shell.  These two mean different
> -   things:
> +   them from getting globbed by the shell.  The following commands have
> +   two different meanings:
>  +
>  
>  $ git checkout -- *.c
> +
>  $ git checkout -- \*.c
> +$ git checkout -- "*.c"
> +$ git checkout -- '*.c'
>  
>  +
> -The former lets your shell expand the fileglob, and you are asking
> -the dot-C files in your working tree to be overwritten with the version
> -in the index.  The latter passes the `*.c` to Git, and you are asking
> -the paths in the index that match the pattern to be checked out to your
> -working tree.  After running `git add hello.c; rm hello.c`, you will _not_
> -see `hello.c` in your working tree with the former, but with the latter
> -you will.
> +The first command lets your shell expand the fileglob, and you are asking
> +the dot-C files in your working tree to be overwritten with the version in
> +the index.  The latter three variations pass the `*.c` to Git, and you are
> +asking the paths in the index that match the pattern to be checked out to
> +your working tree.  After running `git add hello.c; rm hello.c`, you will
> +_not_ see `hello.c` in your working tree with the first command, but with
> +the latter three variations, you will.
> 
>   * Just as the filesystem '.' (period) refers to the current directory,
> using a '.' as a repository name in Git (a dot-repository) is a relative
> 
> -- 
> 
> 
> Robert P. J. Day Ottawa, Ontario, CANADA
> http://crashcourse.ca
> 
> Twitter:   http://twitter.com/rpjday
> LinkedIn:   http://ca.linkedin.com/in/rpjday
> 


Re: [RFC PATCH] builtin/worktree: enhance worktree removal

2017-11-21 Thread Eric Sunshine
On Tue, Nov 21, 2017 at 10:09 AM, Kaartic Sivaraam
 wrote:
> The new feature to 'remove' worktree was handy to remove specific
> worktrees. It didn't cover one particular case of removal. Specifically,
> if there is an "entry" (a directory in /.git/worktrees)
> for a worktree but the worktree repository itself does not exist then
> it means that the "entry" is stale and it could just be removed.
>
> So, in case there's a "worktree entry" but not "worktree direectory"
> then just remove the 'stale' entry.

Let me see if I understand. Sometimes you know that you've deleted the
worktree directory, in which case 'git worktree prune' is the obvious
choice. However, there may be cases when you've forgotten that you
deleted the worktree directory (or it got deleted some other way), yet
it still shows up in "git worktree list" output with no indication
that it has been deleted (though, ideally, it should tell you so[1]).
In this case, you see a worktree that you know you no longer want, so
you invoke "git worktree remove" but that errors out because the
actual directory is already gone. This patch make the operation
succeed, despite the missing directory. Is that correct?

[1]: Excerpt from:
https://public-inbox.org/git/capig+cttrv2c7jlu1dr4+n8xo+7yq+deiwlda835wbgd6fh...@mail.gmail.com/

Other information which would be nice to display for each worktree
[by the 'list' command] (possibly controlled by a --verbose flag):

   * the checked out branch or detached head
   * whether it is locked
- the lock reason (if available)
- and whether the worktree is currently accessible
* whether it can be pruned
- and the prune reason if so


[PATCH v2] log: add option to choose which refs to decorate

2017-11-21 Thread Rafael Ascensão
When `log --decorate` is used, git will decorate commits with all
available refs. While in most cases this the desired effect, under some
conditions it can lead to excessively verbose output.

Introduce two command line options, `--decorate-refs=` and
`--decorate-refs-exclude=` to allow the user to select which
refs are used in decoration.

When "--decorate-refs=" is given, only the refs that match the
pattern are used in decoration. The refs that match the pattern when
"--decorate-refs-exclude=" is given, are never used in
decoration.

These options follow the same convention for mixing negative and
positive patterns across the system, assuming that the inclusive default
is to match all refs available.

 (1) if there is no positive pattern given, pretend as if an
 inclusive default positive pattern was given;

 (2) for each candidate, reject it if it matches no positive
 pattern, or if it matches any one of the negative patterns.

The rules for what is considered a match are slightly different from the
rules used elsewhere.

Commands like `log --glob` assume a trailing '/*' when glob chars are
not present in the pattern. This makes it difficult to specify a single
ref.  On the other hand, commands like `describe --match --all` allow
specifying exact refs, but do not have the convenience of allowing
"shorthand refs" like 'refs/heads' or 'heads' to refer to
'refs/heads/*'.

The commands introduced in this patch consider a match if:

  (a) the pattern contains globs chars,
and regular pattern matching returns a match.

  (b) the pattern does not contain glob chars,
 and ref '' exists, or if ref exists under '/'

This allows both behaviours (allowing single refs and shorthand refs)
yet remaining compatible with existent commands.

Helped-by: Kevin Daudt 
Helped-by: Junio C Hamano 
Signed-off-by: Rafael Ascensão 
---

Notable changes since v1:

  * Do not change refs.c:for_each_glob_ref. Those changes were meant
  to address inconsistencies between commands, but they should be done
  in a separate topic.

  * Use new matching behaviour suggested by Junio for '--decorate-refs*'
  and change documentation/comments to reflect that.

  * Change help strings to clarify the commands expects a pattern
  instead of 'ref'.

  * Fix small inconsistencies on tests, and issues pointed by the
  feedback on the previous version.


 Documentation/git-log.txt |   7 
 builtin/log.c |  10 -
 log-tree.c|  24 ---
 log-tree.h|   6 ++-
 pretty.c  |   4 +-
 refs.c|  65 +
 refs.h|  24 +++
 revision.c|   2 +-
 t/t4202-log.sh| 101 ++
 9 files changed, 232 insertions(+), 11 deletions(-)

diff --git a/Documentation/git-log.txt b/Documentation/git-log.txt
index 32246fdb0..5437f8b0f 100644
--- a/Documentation/git-log.txt
+++ b/Documentation/git-log.txt
@@ -38,6 +38,13 @@ OPTIONS
are shown as if 'short' were given, otherwise no ref names are
shown. The default option is 'short'.
 
+--decorate-refs=::
+--decorate-refs-exclude=::
+   If no `--decorate-refs` is given, pretend as if all refs were
+   included.  For each candidate, do not use it for decoration if it
+   matches any patterns given to `--decorate-refs-exclude` or if it
+   doesn't match any of the patterns given to `--decorate-refs`.
+
 --source::
Print out the ref name given on the command line by which each
commit was reached.
diff --git a/builtin/log.c b/builtin/log.c
index 6c1fa896a..14fdf3916 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -142,11 +142,19 @@ static void cmd_log_init_finish(int argc, const char 
**argv, const char *prefix,
struct userformat_want w;
int quiet = 0, source = 0, mailmap = 0;
static struct line_opt_callback_data line_cb = {NULL, NULL, 
STRING_LIST_INIT_DUP};
+   static struct string_list decorate_refs_exclude = 
STRING_LIST_INIT_NODUP;
+   static struct string_list decorate_refs_include = 
STRING_LIST_INIT_NODUP;
+   struct decoration_filter decoration_filter = {_refs_include,
+ _refs_exclude};
 
const struct option builtin_log_options[] = {
OPT__QUIET(, N_("suppress diff output")),
OPT_BOOL(0, "source", , N_("show source")),
OPT_BOOL(0, "use-mailmap", , N_("Use mail map file")),
+   OPT_STRING_LIST(0, "decorate-refs", _refs_include,
+   N_("pattern"), N_("only decorate refs that 
match ")),
+   OPT_STRING_LIST(0, "decorate-refs-exclude", 
_refs_exclude,
+   N_("pattern"), N_("do not decorate refs that 
match ")),
{ OPTION_CALLBACK, 0, "decorate", NULL, NULL, 

[PATCH v2] gitcli: tweak "man gitcli" for clarity

2017-11-21 Thread Robert P. J. Day
No major changes, just some rewording and showing some variations of
general Git commands.

Signed-off-by: Robert P. J. Day 

---

diff --git a/Documentation/gitcli.txt b/Documentation/gitcli.txt
index 9f13266a6..d690d1ff0 100644
--- a/Documentation/gitcli.txt
+++ b/Documentation/gitcli.txt
@@ -13,7 +13,7 @@ gitcli
 DESCRIPTION
 ---

-This manual describes the convention used throughout Git CLI.
+This manual describes the conventions used throughout Git CLI.

 Many commands take revisions (most often "commits", but sometimes
 "tree-ish", depending on the context and command) and paths as their
@@ -32,32 +32,35 @@ arguments.  Here are the rules:
between the HEAD commit and the work tree as a whole".  You can say
`git diff HEAD --` to ask for the latter.

- * Without disambiguating `--`, Git makes a reasonable guess, but errors
-   out and asking you to disambiguate when ambiguous.  E.g. if you have a
+ * Without a disambiguating `--`, Git makes a reasonable guess, but can
+   error out, asking you to disambiguate when ambiguous.  E.g. if you have a
file called HEAD in your work tree, `git diff HEAD` is ambiguous, and
you have to say either `git diff HEAD --` or `git diff -- HEAD` to
disambiguate.
 +
 When writing a script that is expected to handle random user-input, it is
 a good practice to make it explicit which arguments are which by placing
-disambiguating `--` at appropriate places.
+a disambiguating `--` at appropriate places.

  * Many commands allow wildcards in paths, but you need to protect
-   them from getting globbed by the shell.  These two mean different
-   things:
+   them from getting globbed by the shell.  The following commands have
+   two different meanings:
 +
 
 $ git checkout -- *.c
+
 $ git checkout -- \*.c
+$ git checkout -- "*.c"
+$ git checkout -- '*.c'
 
 +
-The former lets your shell expand the fileglob, and you are asking
-the dot-C files in your working tree to be overwritten with the version
-in the index.  The latter passes the `*.c` to Git, and you are asking
-the paths in the index that match the pattern to be checked out to your
-working tree.  After running `git add hello.c; rm hello.c`, you will _not_
-see `hello.c` in your working tree with the former, but with the latter
-you will.
+The first command lets your shell expand the fileglob, and you are asking
+the dot-C files in your working tree to be overwritten with the version in
+the index.  The latter three variations pass the `*.c` to Git, and you are
+asking the paths in the index that match the pattern to be checked out to
+your working tree.  After running `git add hello.c; rm hello.c`, you will
+_not_ see `hello.c` in your working tree with the first command, but with
+the latter three variations, you will.

  * Just as the filesystem '.' (period) refers to the current directory,
using a '.' as a repository name in Git (a dot-repository) is a relative

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



Re: [PATCH] gitcli: tweak "man gitcli" for clarity

2017-11-21 Thread Robert P. J. Day
On Tue, 21 Nov 2017, Eric Sunshine wrote:

> On Tue, Nov 21, 2017 at 3:53 PM, Robert P. J. Day  
> wrote:
> > No major changes, just some rewording and showing some variations of
> > general Git commands.
> >
> > Signed-off-by: Robert P. J. Day 
> > ---
> > diff --git a/Documentation/gitcli.txt b/Documentation/gitcli.txt
> > @@ -13,7 +13,7 @@ gitcli
> >  DESCRIPTION
> >  ---
> >
> > -This manual describes the convention used throughout Git CLI.
> > +This manual describes the common conventions used throughout Git CLI.
>
> The Department of Redundancy department...

  oh, fine, reduced redundancy coming shortly ...

rday

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



Re: [PATCH] gitcli: tweak "man gitcli" for clarity

2017-11-21 Thread Eric Sunshine
On Tue, Nov 21, 2017 at 3:53 PM, Robert P. J. Day  wrote:
> No major changes, just some rewording and showing some variations of
> general Git commands.
>
> Signed-off-by: Robert P. J. Day 
> ---
> diff --git a/Documentation/gitcli.txt b/Documentation/gitcli.txt
> @@ -13,7 +13,7 @@ gitcli
>  DESCRIPTION
>  ---
>
> -This manual describes the convention used throughout Git CLI.
> +This manual describes the common conventions used throughout Git CLI.

The Department of Redundancy department...


Re: [PATCH] doc: remove explanation of "--" from man pages

2017-11-21 Thread Kevin Daudt
On Tue, Nov 21, 2017 at 04:12:02PM -0500, Robert P. J. Day wrote:
> "man gitcli" already explains the purpose of the "--" syntax, so there
> is no value to a small subset of Git commands explaining that in their
> man pages.
> 
> Signed-off-by: Robert P. J. Day 
> 
> ---
> 
>   i tried this once before, and i'm going to try to push it through
> again ... it's pointless and inconsistent for less than a dozen man
> pages to explicitly explain the purpose of "--" unless all of the man
> pages do. as long as the "--" appears in the command SYNOPSIS, that
> should be more than adequate.

Although I agree that common options don't need to be explained
everytime again, this change might make '--' even more obscure. To be
honest, I didn't even know about gitcli(7), let alone most new users.

In the #git irc channel we often have to explain what '--' means and
why it's sometimes necessary.

I don't however know a better solution to it more clear.

Kevin


Re: pedantry: is there a standard for what should be in the SYNOPSIS?

2017-11-21 Thread Robert P. J. Day
On Tue, 21 Nov 2017, Jonathan Nieder wrote:

> Hi,
>
> Robert P. J. Day wrote:
>
> > following up on an earlier question of mine, is there a standard
> > for what options should be listed in either the SYNOPSIS or the
> > DESCRIPTION sections of a man page? i ask since i'm seeing some
> > definite inconsistency.
>
> No standard.  Seems worth starting a discussion on what you'd like
> the standard to be.

  so i'm in charge? awesome. you won't regret it. :-)

rday

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



[PATCH v5 03/14] fetch: refactor calculation of remote list

2017-11-21 Thread Jeff Hostetler
From: Jonathan Tan 

Separate out the calculation of remotes to be fetched from and the
actual fetching. This will allow us to include an additional step before
the actual fetching in a subsequent commit.

Signed-off-by: Jonathan Tan 
---
 builtin/fetch.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 225c734..1b1f039 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1322,7 +1322,7 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
 {
int i;
struct string_list list = STRING_LIST_INIT_DUP;
-   struct remote *remote;
+   struct remote *remote = NULL;
int result = 0;
struct argv_array argv_gc_auto = ARGV_ARRAY_INIT;
 
@@ -1367,17 +1367,14 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
else if (argc > 1)
die(_("fetch --all does not make sense with refspecs"));
(void) for_each_remote(get_one_remote_for_fetch, );
-   result = fetch_multiple();
} else if (argc == 0) {
/* No arguments -- use default remote */
remote = remote_get(NULL);
-   result = fetch_one(remote, argc, argv);
} else if (multiple) {
/* All arguments are assumed to be remotes or groups */
for (i = 0; i < argc; i++)
if (!add_remote_or_group(argv[i], ))
die(_("No such remote or remote group: %s"), 
argv[i]);
-   result = fetch_multiple();
} else {
/* Single remote or group */
(void) add_remote_or_group(argv[0], );
@@ -1385,14 +1382,19 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
/* More than one remote */
if (argc > 1)
die(_("Fetching a group and specifying refspecs 
does not make sense"));
-   result = fetch_multiple();
} else {
/* Zero or one remotes */
remote = remote_get(argv[0]);
-   result = fetch_one(remote, argc-1, argv+1);
+   argc--;
+   argv++;
}
}
 
+   if (remote)
+   result = fetch_one(remote, argc, argv);
+   else
+   result = fetch_multiple();
+
if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
struct argv_array options = ARGV_ARRAY_INIT;
 
-- 
2.9.3



[PATCH v5 14/14] fetch-pack: restore save_commit_buffer after use

2017-11-21 Thread Jeff Hostetler
From: Jonathan Tan 

In fetch-pack, the global variable save_commit_buffer is set to 0, but
not restored to its original value after use.

In particular, if show_log() (in log-tree.c) is invoked after
fetch_pack() in the same process, show_log() will return before printing
out the commit message (because the invocation to
get_cached_commit_buffer() returns NULL, because the commit buffer was
not saved). I discovered this when attempting to run "git log -S" in a
partial clone, triggering the case where revision walking lazily loads
missing objects.

Therefore, restore save_commit_buffer to its original value after use.

An alternative to solve the problem I had is to replace
get_cached_commit_buffer() with get_commit_buffer(). That invocation was
introduced in commit a97934d ("use get_cached_commit_buffer where
appropriate", 2014-06-13) to replace "commit->buffer" introduced in
commit 3131b71 ("Add "--show-all" revision walker flag for debugging",
2008-02-13). In the latter commit, the commit author seems to be
deciding between not showing an unparsed commit at all and showing an
unparsed commit without the message (which is what the commit does), and
did not mention parsing the unparsed commit, so I prefer to preserve the
existing behavior.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 fetch-pack.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/fetch-pack.c b/fetch-pack.c
index 2b04251..33a36d1 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -717,6 +717,7 @@ static int everything_local(struct fetch_pack_args *args,
 {
struct ref *ref;
int retval;
+   int old_save_commit_buffer = save_commit_buffer;
timestamp_t cutoff = 0;
 
save_commit_buffer = 0;
@@ -784,6 +785,9 @@ static int everything_local(struct fetch_pack_args *args,
print_verbose(args, _("already have %s (%s)"), 
oid_to_hex(remote),
  ref->name);
}
+
+   save_commit_buffer = old_save_commit_buffer;
+
return retval;
 }
 
-- 
2.9.3



[PATCH v5 02/14] clone, fetch-pack, index-pack, transport: partial clone

2017-11-21 Thread Jeff Hostetler
From: Jeff Hostetler 

Signed-off-by: Jeff Hostetler 
---
 builtin/clone.c  |  9 +
 builtin/fetch-pack.c |  4 
 fetch-pack.c | 13 +
 fetch-pack.h |  2 ++
 transport-helper.c   |  5 +
 transport.c  |  4 
 transport.h  |  5 +
 7 files changed, 42 insertions(+)

diff --git a/builtin/clone.c b/builtin/clone.c
index dbddd98..0a8ac76 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -26,6 +26,7 @@
 #include "run-command.h"
 #include "connected.h"
 #include "packfile.h"
+#include "list-objects-filter-options.h"
 
 /*
  * Overall FIXMEs:
@@ -60,6 +61,7 @@ static struct string_list option_optional_reference = 
STRING_LIST_INIT_NODUP;
 static int option_dissociate;
 static int max_jobs = -1;
 static struct string_list option_recurse_submodules = STRING_LIST_INIT_NODUP;
+static struct list_objects_filter_options filter_options;
 
 static int recurse_submodules_cb(const struct option *opt,
 const char *arg, int unset)
@@ -135,6 +137,7 @@ static struct option builtin_clone_options[] = {
TRANSPORT_FAMILY_IPV4),
OPT_SET_INT('6', "ipv6", , N_("use IPv6 addresses only"),
TRANSPORT_FAMILY_IPV6),
+   OPT_PARSE_LIST_OBJECTS_FILTER(_options),
OPT_END()
 };
 
@@ -1073,6 +1076,8 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
warning(_("--shallow-since is ignored in local clones; 
use file:// instead."));
if (option_not.nr)
warning(_("--shallow-exclude is ignored in local 
clones; use file:// instead."));
+   if (filter_options.choice)
+   warning(_("--filter is ignored in local clones; use 
file:// instead."));
if (!access(mkpath("%s/shallow", path), F_OK)) {
if (option_local > 0)
warning(_("source repository is shallow, 
ignoring --local"));
@@ -1104,6 +1109,10 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
transport_set_option(transport, TRANS_OPT_UPLOADPACK,
 option_upload_pack);
 
+   if (filter_options.choice)
+   transport_set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER,
+filter_options.filter_spec);
+
if (transport->smart_options && !deepen)
transport->smart_options->check_self_contained_and_connected = 
1;
 
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 9a7ebf6..d0fdaa8 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -153,6 +153,10 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
args.no_haves = 1;
continue;
}
+   if (skip_prefix(arg, ("--" CL_ARG__FILTER "="), )) {
+   parse_list_objects_filter(_options, arg);
+   continue;
+   }
usage(fetch_pack_usage);
}
if (deepen_not.nr)
diff --git a/fetch-pack.c b/fetch-pack.c
index 4640b4e..2b04251 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -29,6 +29,7 @@ static int deepen_not_ok;
 static int fetch_fsck_objects = -1;
 static int transfer_fsck_objects = -1;
 static int agent_supported;
+static int server_supports_filtering;
 static struct lock_file shallow_lock;
 static const char *alternate_shallow_file;
 
@@ -379,6 +380,8 @@ static int find_common(struct fetch_pack_args *args,
if (deepen_not_ok)  strbuf_addstr(, " 
deepen-not");
if (agent_supported)strbuf_addf(, " agent=%s",

git_user_agent_sanitized());
+   if (args->filter_options.choice)
+   strbuf_addstr(, " filter");
packet_buf_write(_buf, "want %s%s\n", remote_hex, 
c.buf);
strbuf_release();
} else
@@ -407,6 +410,9 @@ static int find_common(struct fetch_pack_args *args,
packet_buf_write(_buf, "deepen-not %s", s->string);
}
}
+   if (server_supports_filtering && args->filter_options.choice)
+   packet_buf_write(_buf, "filter %s",
+args->filter_options.filter_spec);
packet_buf_flush(_buf);
state_len = req_buf.len;
 
@@ -967,6 +973,13 @@ static struct ref *do_fetch_pack(struct fetch_pack_args 
*args,
else
prefer_ofs_delta = 0;
 
+   if (server_supports("filter")) {
+   server_supports_filtering = 1;
+   print_verbose(args, _("Server supports filter"));
+   } else if (args->filter_options.choice) {
+   warning("filtering not recognized by server, 

[PATCH v5 13/14] unpack-trees: batch fetching of missing blobs

2017-11-21 Thread Jeff Hostetler
From: Jonathan Tan 

When running checkout, first prefetch all blobs that are to be updated
but are missing. This means that only one pack is downloaded during such
operations, instead of one per missing blob.

This operates only on the blob level - if a repository has a missing
tree, they are still fetched one at a time.

This does not use the delayed checkout mechanism introduced in commit
2841e8f ("convert: add "status=delayed" to filter process protocol",
2017-06-30) due to significant conceptual differences - in particular,
for partial clones, we already know what needs to be fetched based on
the contents of the local repo alone, whereas for status=delayed, it is
the filter process that tells us what needs to be checked in the end.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 fetch-object.c   | 26 ++
 fetch-object.h   |  5 +
 t/t5601-clone.sh | 52 
 unpack-trees.c   | 22 ++
 4 files changed, 101 insertions(+), 4 deletions(-)

diff --git a/fetch-object.c b/fetch-object.c
index fc086fc..21b4dfa 100644
--- a/fetch-object.c
+++ b/fetch-object.c
@@ -5,11 +5,10 @@
 #include "transport.h"
 #include "fetch-object.h"
 
-void fetch_object(const char *remote_name, const unsigned char *sha1)
+static void fetch_refs(const char *remote_name, struct ref *ref)
 {
struct remote *remote;
struct transport *transport;
-   struct ref *ref;
int original_fetch_if_missing = fetch_if_missing;
 
fetch_if_missing = 0;
@@ -18,10 +17,29 @@ void fetch_object(const char *remote_name, const unsigned 
char *sha1)
die(_("Remote with no URL"));
transport = transport_get(remote, remote->url[0]);
 
-   ref = alloc_ref(sha1_to_hex(sha1));
-   hashcpy(ref->old_oid.hash, sha1);
transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
transport_set_option(transport, TRANS_OPT_NO_HAVES, "1");
transport_fetch_refs(transport, ref);
fetch_if_missing = original_fetch_if_missing;
 }
+
+void fetch_object(const char *remote_name, const unsigned char *sha1)
+{
+   struct ref *ref = alloc_ref(sha1_to_hex(sha1));
+   hashcpy(ref->old_oid.hash, sha1);
+   fetch_refs(remote_name, ref);
+}
+
+void fetch_objects(const char *remote_name, const struct oid_array *to_fetch)
+{
+   struct ref *ref = NULL;
+   int i;
+
+   for (i = 0; i < to_fetch->nr; i++) {
+   struct ref *new_ref = alloc_ref(oid_to_hex(_fetch->oid[i]));
+   oidcpy(_ref->old_oid, _fetch->oid[i]);
+   new_ref->next = ref;
+   ref = new_ref;
+   }
+   fetch_refs(remote_name, ref);
+}
diff --git a/fetch-object.h b/fetch-object.h
index f371300..4b269d0 100644
--- a/fetch-object.h
+++ b/fetch-object.h
@@ -1,6 +1,11 @@
 #ifndef FETCH_OBJECT_H
 #define FETCH_OBJECT_H
 
+#include "sha1-array.h"
+
 extern void fetch_object(const char *remote_name, const unsigned char *sha1);
 
+extern void fetch_objects(const char *remote_name,
+ const struct oid_array *to_fetch);
+
 #endif
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 6d37c6d..13610b7 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -611,6 +611,58 @@ test_expect_success 'partial clone: warn if server does 
not support object filte
test_i18ngrep "filtering not recognized by server" err
 '
 
+test_expect_success 'batch missing blob request during checkout' '
+   rm -rf server client &&
+
+   test_create_repo server &&
+   echo a >server/a &&
+   echo b >server/b &&
+   git -C server add a b &&
+
+   git -C server commit -m x &&
+   echo aa >server/a &&
+   echo bb >server/b &&
+   git -C server add a b &&
+   git -C server commit -m x &&
+
+   test_config -C server uploadpack.allowfilter 1 &&
+   test_config -C server uploadpack.allowanysha1inwant 1 &&
+
+   git clone --filter=blob:limit=0 "file://$(pwd)/server" client &&
+
+   # Ensure that there is only one negotiation by checking that there is
+   # only "done" line sent. ("done" marks the end of negotiation.)
+   GIT_TRACE_PACKET="$(pwd)/trace" git -C client checkout HEAD^ &&
+   grep "git> done" trace >done_lines &&
+   test_line_count = 1 done_lines
+'
+
+test_expect_success 'batch missing blob request does not inadvertently try to 
fetch gitlinks' '
+   rm -rf server client &&
+
+   test_create_repo repo_for_submodule &&
+   test_commit -C repo_for_submodule x &&
+
+   test_create_repo server &&
+   echo a >server/a &&
+   echo b >server/b &&
+   git -C server add a b &&
+   git -C server commit -m x &&
+
+   echo aa >server/a &&
+   echo bb >server/b &&
+   # Also add a gitlink pointing to an arbitrary repository
+   git -C server submodule add 

[PATCH v5 11/14] t5601: test for partial clone

2017-11-21 Thread Jeff Hostetler
From: Jonathan Tan 

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 builtin/clone.c  | 15 ---
 t/t5601-clone.sh | 49 +
 2 files changed, 61 insertions(+), 3 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 0a8ac76..f519bd4 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -889,6 +889,8 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
struct refspec *refspec;
const char *fetch_pattern;
 
+   fetch_if_missing = 0;
+
packet_trace_identity("clone");
argc = parse_options(argc, argv, prefix, builtin_clone_options,
 builtin_clone_usage, 0);
@@ -1109,11 +,13 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
transport_set_option(transport, TRANS_OPT_UPLOADPACK,
 option_upload_pack);
 
-   if (filter_options.choice)
+   if (filter_options.choice) {
transport_set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER,
 filter_options.filter_spec);
+   transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
+   }
 
-   if (transport->smart_options && !deepen)
+   if (transport->smart_options && !deepen && !filter_options.choice)
transport->smart_options->check_self_contained_and_connected = 
1;
 
refs = transport_get_remote_refs(transport);
@@ -1173,13 +1177,17 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
write_refspec_config(src_ref_prefix, our_head_points_at,
remote_head_points_at, _top);
 
+   if (filter_options.choice)
+   partial_clone_register("origin", _options);
+
if (is_local)
clone_local(path, git_dir);
else if (refs && complete_refs_before_fetch)
transport_fetch_refs(transport, mapped_refs);
 
update_remote_refs(refs, mapped_refs, remote_head_points_at,
-  branch_top.buf, reflog_msg.buf, transport, 
!is_local);
+  branch_top.buf, reflog_msg.buf, transport,
+  !is_local && !filter_options.choice);
 
update_head(our_head_points_at, remote_head, reflog_msg.buf);
 
@@ -1200,6 +1208,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
}
 
junk_mode = JUNK_LEAVE_REPO;
+   fetch_if_missing = 1;
err = checkout(submodule_progress);
 
strbuf_release(_msg);
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 9c56f77..6d37c6d 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -571,4 +571,53 @@ test_expect_success 'GIT_TRACE_PACKFILE produces a usable 
pack' '
git -C replay.git index-pack -v --stdin  err &&
+
+   test_i18ngrep "filtering not recognized by server" err
+'
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'partial clone using HTTP' '
+   partial_clone "$HTTPD_DOCUMENT_ROOT_PATH/server" 
"$HTTPD_URL/smart/server"
+'
+
+stop_httpd
+
 test_done
-- 
2.9.3



[PATCH v5 12/14] t5500: more tests for partial clone and fetch

2017-11-21 Thread Jeff Hostetler
From: Jonathan Tan 

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 t/t5500-fetch-pack.sh | 60 +++
 1 file changed, 56 insertions(+), 4 deletions(-)

diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index 23702b5..c95bb7b 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -782,7 +782,7 @@ test_expect_success 'filtering by size has no effect if 
support for it is not ad
test_i18ngrep "filtering not recognized by server" err
 '
 
-fetch_blob_max_bytes () {
+setup_blob_max_bytes () {
  SERVER="$1"
  URL="$2"
 
@@ -794,7 +794,11 @@ fetch_blob_max_bytes () {
git clone "$URL" client &&
test_config -C client extensions.partialclone origin &&
 
-   test_commit -C "$SERVER" two &&
+   test_commit -C "$SERVER" two
+}
+
+do_blob_max_bytes() {
+   SERVER="$1" &&
 
git -C client fetch --filter=blob:limit=0 origin HEAD:somewhere &&
 
@@ -805,14 +809,62 @@ fetch_blob_max_bytes () {
 }
 
 test_expect_success 'fetch with filtering' '
-fetch_blob_max_bytes server server
+   setup_blob_max_bytes server server &&
+   do_blob_max_bytes server
+'
+
+test_expect_success 'fetch respects configured filtering' '
+   setup_blob_max_bytes server server &&
+
+   test_config -C client core.partialclonefilter blob:limit=0 &&
+
+   git -C client fetch origin HEAD:somewhere &&
+
+   # Ensure that commit is fetched, but blob is not
+   test_config -C client extensions.partialclone "arbitrary string" &&
+   git -C client cat-file -e $(git -C server rev-parse two) &&
+   test_must_fail git -C client cat-file -e $(git hash-object server/two.t)
+'
+
+test_expect_success 'pull respects configured filtering' '
+   setup_blob_max_bytes server server &&
+
+   # Hide two.t from tip so that client does not load it upon the
+   # automatic checkout that pull performs
+   git -C server rm two.t &&
+   test_commit -C server three &&
+
+   test_config -C server uploadpack.allowanysha1inwant 1 &&
+   test_config -C client core.partialclonefilter blob:limit=0 &&
+
+   git -C client pull origin &&
+
+   # Ensure that commit is fetched, but blob is not
+   test_config -C client extensions.partialclone "arbitrary string" &&
+   git -C client cat-file -e $(git -C server rev-parse two) &&
+   test_must_fail git -C client cat-file -e $(git hash-object server/two.t)
+'
+
+test_expect_success 'clone configures filtering' '
+   rm -rf server client &&
+   test_create_repo server &&
+   test_commit -C server one &&
+   test_commit -C server two &&
+   test_config -C server uploadpack.allowanysha1inwant 1 &&
+
+   git clone --filter=blob:limit=12345 server client &&
+
+   # Ensure that we can, for example, checkout HEAD^
+   rm -rf client/.git/objects/* &&
+   git -C client checkout HEAD^
 '
 
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 
 test_expect_success 'fetch with filtering and HTTP' '
-fetch_blob_max_bytes "$HTTPD_DOCUMENT_ROOT_PATH/server" 
"$HTTPD_URL/smart/server"
+   setup_blob_max_bytes "$HTTPD_DOCUMENT_ROOT_PATH/server" 
"$HTTPD_URL/smart/server" &&
+   do_blob_max_bytes "$HTTPD_DOCUMENT_ROOT_PATH/server"
 '
 
 stop_httpd
-- 
2.9.3



[PATCH v5 10/14] t5500: add fetch-pack tests for partial clone

2017-11-21 Thread Jeff Hostetler
From: Jonathan Tan 

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 t/t5500-fetch-pack.sh | 36 
 1 file changed, 36 insertions(+)

diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index c57916b..23702b5 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -782,4 +782,40 @@ test_expect_success 'filtering by size has no effect if 
support for it is not ad
test_i18ngrep "filtering not recognized by server" err
 '
 
+fetch_blob_max_bytes () {
+ SERVER="$1"
+ URL="$2"
+
+   rm -rf "$SERVER" client &&
+   test_create_repo "$SERVER" &&
+   test_commit -C "$SERVER" one &&
+   test_config -C "$SERVER" uploadpack.allowfilter 1 &&
+
+   git clone "$URL" client &&
+   test_config -C client extensions.partialclone origin &&
+
+   test_commit -C "$SERVER" two &&
+
+   git -C client fetch --filter=blob:limit=0 origin HEAD:somewhere &&
+
+   # Ensure that commit is fetched, but blob is not
+   test_config -C client extensions.partialclone "arbitrary string" &&
+   git -C client cat-file -e $(git -C "$SERVER" rev-parse two) &&
+   test_must_fail git -C client cat-file -e $(git hash-object 
"$SERVER/two.t")
+}
+
+test_expect_success 'fetch with filtering' '
+fetch_blob_max_bytes server server
+'
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'fetch with filtering and HTTP' '
+fetch_blob_max_bytes "$HTTPD_DOCUMENT_ROOT_PATH/server" 
"$HTTPD_URL/smart/server"
+'
+
+stop_httpd
+
+
 test_done
-- 
2.9.3



[PATCH v5 06/14] pack-objects: test support for blob filtering

2017-11-21 Thread Jeff Hostetler
From: Jonathan Tan 

As part of an effort to improve Git support for very large repositories
in which clients typically have only a subset of all version-controlled
blobs, test pack-objects support for --filter=blob:limit=, packing only
blobs not exceeding that size unless the blob corresponds to a file
whose name starts with ".git". upload-pack will eventually be taught to
use this new parameter if needed to exclude certain blobs during a fetch
or clone, potentially drastically reducing network consumption when
serving these very large repositories.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 t/t5300-pack-object.sh  | 26 ++
 t/test-lib-functions.sh | 12 
 2 files changed, 38 insertions(+)

diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
index 9c68b99..8e3db12 100755
--- a/t/t5300-pack-object.sh
+++ b/t/t5300-pack-object.sh
@@ -457,6 +457,32 @@ test_expect_success !PTHREADS,C_LOCALE_OUTPUT 
'pack-objects --threads=N or pack.
grep -F "no threads support, ignoring pack.threads" err
 '
 
+lcut () {
+   perl -e '$/ = undef; $_ = <>; s/^.{'$1'}//s; print $_'
+}
+
+test_expect_success 'filtering by size works with multiple excluded' '
+   rm -rf server &&
+   git init server &&
+   printf a > server/a &&
+   printf b > server/b &&
+   printf c-very-long-file > server/c &&
+   printf d-very-long-file > server/d &&
+   git -C server add a b c d &&
+   git -C server commit -m x &&
+
+   git -C server rev-parse HEAD >objects &&
+   git -C server pack-objects --revs --stdout --filter=blob:limit=10 
my.pack &&
+
+   # Ensure that only the small blobs are in the packfile
+   git index-pack my.pack &&
+   git verify-pack -v my.idx >objectlist &&
+   grep $(git hash-object server/a) objectlist &&
+   grep $(git hash-object server/b) objectlist &&
+   ! grep $(git hash-object server/c) objectlist &&
+   ! grep $(git hash-object server/d) objectlist
+'
+
 #
 # WARNING!
 #
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 1701fe2..07b79c7 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1020,3 +1020,15 @@ nongit () {
"$@"
)
 }
+
+# Converts big-endian pairs of hexadecimal digits into bytes. For example,
+# "printf 61620d0a | hex_pack" results in "ab\r\n".
+hex_pack () {
+   perl -e '$/ = undef; $input = <>; print pack("H*", $input)'
+}
+
+# Converts bytes into big-endian pairs of hexadecimal digits. For example,
+# "printf 'ab\r\n' | hex_unpack" results in "61620d0a".
+hex_unpack () {
+   perl -e '$/ = undef; $input = <>; print unpack("H2" x length($input), 
$input)'
+}
-- 
2.9.3



  1   2   3   >