date:20160905

Re: 2.10.0: multiple versionsort.prereleasesuffix buggy?

2016-09-05 Thread Jeff King

On Tue, Sep 06, 2016 at 03:07:59AM +0200, SZEDER Gábor wrote:

> > So that seems wrong. Even weirder, if I set _only_ "-beta", I get:
> > 
> >   $ git tag -l --sort=version:refname | grep -v ^2.6.0
> >   2.6.0-beta-2
> >   2.6.0-beta-3
> >   2.6.0-beta-4
> >   2.6.0
> >   2.6.0-RC1
> >   2.6.0-RC2
> >   2.6.0-beta-1
> > 
> > Umm...what? beta-1 is sorted away from its companions? That's weird.
> > 
> > I wondered if the presence of "-" after the suffix ("beta-1" rather than
> > "beta1") would matter. It looks like that shouldn't matter, though; it's
> > purely doing a prefix match on "do these names differ at a prerelease
> > suffix".
> > 
> > But something certainly seems wrong.
> 
> Some of the weirdness is caused by the '-' at the _beginning_ of the
> suffixes, because versioncmp() gets confused by suffixes starting with
> the same character(s).

Oh, right, that makes sense. So it's effectively not finding _any_
suffix between X-RC1 and X-beta-1, because we only start looking after
"X-", and none of them match.

I am still confused why "2.6.0-beta-1" doesn't get sorted with its
peers. I'd guess that the comparison function doesn't actually provide a
strict ordering, so the results depend on the actual sort algorithm, and
which pairs it ends up comparing.

-Peff

Re: [PATCHv3] diff.c: emit moved lines with a different color

2016-09-05 Thread Stefan Beller

On Mon, Sep 5, 2016 at 6:17 PM, Jacob Keller  wrote:
> On Sun, Sep 4, 2016 at 4:42 PM, Stefan Beller  wrote:
>> When we color the diff, we'll mark moved lines with a different color.
>>
>
> Excellent idea. This is a very neat way to show extra information
> without cluttering the diff output.
>
>> This is achieved by doing a two passes over the diff. The first pass
>> will inspect each line of the diff and store the removed lines and the
>> added lines in its own hash map.
>> The second pass will check for each added line if that is found in the
>> set of removed lines. If so it will color the added line differently as
>> with the new `moved-new` color mode. For each removed line we check the
>> set of added lines and if found emit that with the new color `moved-old`.
>>
>
> Makes sense.
>
>> When detecting the moved lines, we cannot just rely on a line being equal,
>> but we need to take the context into account to detect when the moves were
>> reordered as we do not want to color moved but per-mutated lines.
>> To do that we use the hash of the preceding line.
>
> Also makes sense.
>
>>
>> This patch was motivated by e.g. reviewing 3b0c4200 ("apply: move
>> libified code from builtin/apply.c to apply.{c,h}", 2016-08-08)
>>
>
> Yes, this would be quite helpful.
>
>> diff --git a/Documentation/config.txt b/Documentation/config.txt
>> index 0bcb679..f4f51c2 100644
>> --- a/Documentation/config.txt
>> +++ b/Documentation/config.txt
>> @@ -980,8 +980,9 @@ color.diff.::
>> of `context` (context text - `plain` is a historical synonym),
>> `meta` (metainformation), `frag`
>> (hunk header), 'func' (function in hunk header), `old` (removed 
>> lines),
>> -   `new` (added lines), `commit` (commit headers), or `whitespace`
>> -   (highlighting whitespace errors).
>> +   `new` (added lines), `commit` (commit headers), `whitespace`
>> +   (highlighting whitespace errors), `moved-old` (removed lines that
>> +   reappear), `moved-new` (added lines that were removed elsewhere).
>>
>
> I liked Junio's "Moved from" and "moved to" but I think moved old and
> moved new are ok as well.

as we do not want to see dashes ('moved-old'), I think I'l go with
"movedfrom" and "movedto".

>
>> @@ -287,6 +304,25 @@ int git_diff_basic_config(const char *var, const char 
>> *value, void *cb)
>> return git_default_config(var, value, cb);
>>  }
>>
>> +static int moved_entry_cmp(const struct moved_entry *a,
>> +  const struct moved_entry *b,
>> +  const void *unused)
>> +{
>> +   return strcmp(a->line, b->line) &&
>> +  a->hash_prev_line == b->hash_prev_line;
>
> So we're only comparing them if they match and have a matching
> previous line? That seems pretty reasonable to reduce the cost of
> computing exact copied sequences.
>
>> +   if (ecbdata->opt->color_moved) {
>> +   int h = memhash(line, len);
>> +   hash_prev_removed = h;
>> +   hash_prev_added = h;
>> +   }
>>  }
>
> If I understand, this is to ensure that we don't keep re-hashing each
> line right?

No, this is to ensure we have the context sensitivity of one prior line.

In the collection phase we look at each line of the patch and make a hash of it.
Then we store the hash temporarily (think of a state machine that goes line by
line and always keeps the hash of the last line)

What we store in the hashmaps is the hash(current line) ^
hash(previous applicable line).
With previous applicable line I mean any line starting with " " or "+"
when the current
line starts with "+" and " " or "-" when the current line starts with "-".

When going through the second pass and actually emitting colored lines
we only find matches in the hash map if the current line AND the previous line
match as we lookup by hash code, i.e. if we have a moved line, but the
previous line
changed we do not find it in the hashmap and we don't color it, such
that the reviewer
can spot a permutation.

So in the second phase we also need to have access to previous line, so maybe
we could also go with just taking the line with us instead of 2 hash codes.
But that implementation detail seems like a trade off to me, where I'd lean
on keeping the hashes around as lines may be very long in bad cases, whereas
the hashcode is short and it is a cheap hash.

(I am referring to http://i.imgur.com/MnaSZ1D.png where in the malicious
case all lines were moved to there as well, but permutated)

Re: [WIP PATCH v2] diff.c: emit moved lines with a different color

2016-09-05 Thread Stefan Beller

On Mon, Sep 5, 2016 at 6:09 PM, Jacob Keller  wrote:
> On Mon, Sep 5, 2016 at 11:57 AM, Junio C Hamano  wrote:
>> Stefan Beller  writes:
>>
>>> diff --git a/Documentation/config.txt b/Documentation/config.txt
>>> index 0bcb679..f4f51c2 100644
>>> --- a/Documentation/config.txt
>>> +++ b/Documentation/config.txt
>>> @@ -980,8 +980,9 @@ color.diff.::
>>>   of `context` (context text - `plain` is a historical synonym),
>>>   `meta` (metainformation), `frag`
>>>   (hunk header), 'func' (function in hunk header), `old` (removed 
>>> lines),
>>> - `new` (added lines), `commit` (commit headers), or `whitespace`
>>> - (highlighting whitespace errors).
>>> + `new` (added lines), `commit` (commit headers), `whitespace`
>>> + (highlighting whitespace errors), `moved-old` (removed lines that
>>> + reappear), `moved-new` (added lines that were removed elsewhere).
>>
>> Could we have a config to disable this rather costly new feature,
>> too?
>
> That seems entirely reasonable, though we *do* have a configuration
> for disabling color altogether.. is there any numbers on how much more
> this costs to compute?

This new coloring is linear to the size of the patch, i.e. O(number of
added/removed lines) in memory and for computational efforts I'd
think it is O(n log n) as inserting into the hashmap is an amortized
log n.

>
>>> +static struct hashmap *duplicates_added;
>>> +static struct hashmap *duplicates_removed;
>>> +static int hash_previous_line_added;
>>> +static int hash_previous_line_removed;
>>
>> I think these should be added as new fields to diff_options
>> structure.
>
> Agreed, those seem like good choices for diff_options.

yup.

Re: [WIP PATCH v2] diff.c: emit moved lines with a different color

2016-09-05 Thread Stefan Beller

On Mon, Sep 5, 2016 at 11:57 AM, Junio C Hamano  wrote:
>> + `new` (added lines), `commit` (commit headers), `whitespace`
>> + (highlighting whitespace errors), `moved-old` (removed lines that
>> + reappear), `moved-new` (added lines that were removed elsewhere).
>
> Could we have a config to disable this rather costly new feature,
> too?

And by config option you mean both a command line parameter
`--color=yes-but-no-move-detection` as well as a diff.color config option?

As it is currently `--color=` with when={always, never,auto},
I don't think we want to add it as another parameter there, so maybe

Re: [PATCHv3] diff.c: emit moved lines with a different color

2016-09-05 Thread Jacob Keller

On Sun, Sep 4, 2016 at 4:42 PM, Stefan Beller  wrote:
> When we color the diff, we'll mark moved lines with a different color.
>

Excellent idea. This is a very neat way to show extra information
without cluttering the diff output.

> This is achieved by doing a two passes over the diff. The first pass
> will inspect each line of the diff and store the removed lines and the
> added lines in its own hash map.
> The second pass will check for each added line if that is found in the
> set of removed lines. If so it will color the added line differently as
> with the new `moved-new` color mode. For each removed line we check the
> set of added lines and if found emit that with the new color `moved-old`.
>

Makes sense.

> When detecting the moved lines, we cannot just rely on a line being equal,
> but we need to take the context into account to detect when the moves were
> reordered as we do not want to color moved but per-mutated lines.
> To do that we use the hash of the preceding line.

Also makes sense.

>
> This patch was motivated by e.g. reviewing 3b0c4200 ("apply: move
> libified code from builtin/apply.c to apply.{c,h}", 2016-08-08)
>

Yes, this would be quite helpful.

> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 0bcb679..f4f51c2 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -980,8 +980,9 @@ color.diff.::
> of `context` (context text - `plain` is a historical synonym),
> `meta` (metainformation), `frag`
> (hunk header), 'func' (function in hunk header), `old` (removed 
> lines),
> -   `new` (added lines), `commit` (commit headers), or `whitespace`
> -   (highlighting whitespace errors).
> +   `new` (added lines), `commit` (commit headers), `whitespace`
> +   (highlighting whitespace errors), `moved-old` (removed lines that
> +   reappear), `moved-new` (added lines that were removed elsewhere).
>

I liked Junio's "Moved from" and "moved to" but I think moved old and
moved new are ok as well.

> @@ -287,6 +304,25 @@ int git_diff_basic_config(const char *var, const char 
> *value, void *cb)
> return git_default_config(var, value, cb);
>  }
>
> +static int moved_entry_cmp(const struct moved_entry *a,
> +  const struct moved_entry *b,
> +  const void *unused)
> +{
> +   return strcmp(a->line, b->line) &&
> +  a->hash_prev_line == b->hash_prev_line;

So we're only comparing them if they match and have a matching
previous line? That seems pretty reasonable to reduce the cost of
computing exact copied sequences.

> +   if (ecbdata->opt->color_moved) {
> +   int h = memhash(line, len);
> +   hash_prev_removed = h;
> +   hash_prev_added = h;
> +   }
>  }

If I understand, this is to ensure that we don't keep re-hashing each
line right?

Thanks,
Jake

Re: [WIP PATCH v2] diff.c: emit moved lines with a different color

2016-09-05 Thread Jacob Keller

On Mon, Sep 5, 2016 at 11:57 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> diff --git a/Documentation/config.txt b/Documentation/config.txt
>> index 0bcb679..f4f51c2 100644
>> --- a/Documentation/config.txt
>> +++ b/Documentation/config.txt
>> @@ -980,8 +980,9 @@ color.diff.::
>>   of `context` (context text - `plain` is a historical synonym),
>>   `meta` (metainformation), `frag`
>>   (hunk header), 'func' (function in hunk header), `old` (removed lines),
>> - `new` (added lines), `commit` (commit headers), or `whitespace`
>> - (highlighting whitespace errors).
>> + `new` (added lines), `commit` (commit headers), `whitespace`
>> + (highlighting whitespace errors), `moved-old` (removed lines that
>> + reappear), `moved-new` (added lines that were removed elsewhere).
>
> Could we have a config to disable this rather costly new feature,
> too?

That seems entirely reasonable, though we *do* have a configuration
for disabling color altogether.. is there any numbers on how much more
this costs to compute?

>> +static struct hashmap *duplicates_added;
>> +static struct hashmap *duplicates_removed;
>> +static int hash_previous_line_added;
>> +static int hash_previous_line_removed;
>
> I think these should be added as new fields to diff_options
> structure.

Agreed, those seem like good choices for diff_options.

Thanks,
Jake

Re: 2.10.0: multiple versionsort.prereleasesuffix buggy?

2016-09-05 Thread SZEDER Gábor

> On Tue, Sep 06, 2016 at 01:42:28AM +0300, Leho Kraav (Conversion Ready) wrote:
> 
> > Here's the testing tree https://github.com/woothemes/woocommerce
> > 
> > .git/config has:
> > 
> > [versionsort]
> > 
> > 
> > prereleasesuffix = -beta
> > prereleasesuffix = -RC
> > 
> > $ git tag -l --sort=version:refname
> > [...]
> > 2.6.0-RC1
> > 2.6.0-RC2
> > 2.6.0-beta-1
> > 2.6.0-beta-2
> > 2.6.0-beta-3
> > 2.6.0-beta-4
> 
> So that seems wrong. Even weirder, if I set _only_ "-beta", I get:
> 
>   $ git tag -l --sort=version:refname | grep -v ^2.6.0
>   2.6.0-beta-2
>   2.6.0-beta-3
>   2.6.0-beta-4
>   2.6.0
>   2.6.0-RC1
>   2.6.0-RC2
>   2.6.0-beta-1
> 
> Umm...what? beta-1 is sorted away from its companions? That's weird.
> 
> I wondered if the presence of "-" after the suffix ("beta-1" rather than
> "beta1") would matter. It looks like that shouldn't matter, though; it's
> purely doing a prefix match on "do these names differ at a prerelease
> suffix".
> 
> But something certainly seems wrong.

Some of the weirdness is caused by the '-' at the _beginning_ of the
suffixes, because versioncmp() gets confused by suffixes starting with
the same character(s).

versioncmp() consumes two tagnames up to the first different character
and then calls swap_prereleases() to try to match prerelease suffixes
starting at those characters.  This works fine when comparing a
release with a prerelease, e.g. "2.6.0" and "2.6.0-RC1", because
swap_prereleases() gets "" and "-RC1" and the latter does match one of
the configured suffixes.  However, when comparing two prereleases,
e.g. "2.6.0-beta1" and "2.6.0-RC1", then the '-' is consumed from both
tagnames because the first differing characters are 'b' and 'R', thus
swap_prereleases() gets "beta1" and "RC1", which, of course, don't
match any of the configured suffixes without the leading '-'.

It's way past my bedtime, so for the time being I can only come up
with a hacky configuration workaround that seems to deliver the
expected results:

[versionsort]
prereleasesuffix = beta
prereleasesuffix = -beta
prereleasesuffix = RC
prereleasesuffix = -RC

Best,
Gábor

Re: [PATCH] stash: allow ref of a stash by index

2016-09-05 Thread Jeff King

On Mon, Sep 05, 2016 at 11:46:34PM +0200, Øystein Walle wrote:

> The bash-specific code is a no-go, so here's a way to do it in a way
> that I think is in line with Git's code style for shell scripts. I took
> the liberty of removing the '|| exit 1' since the rev is verified later
> on anyway, as can be seen in the last piece of context. That way the
> argument munging can be done at a later stage where we don't have to
> loop over multiple ones. The first rev-parse's purpose is just to apply
> --sq.

I wondered how that would impact the error message when there is no such
stash. It looks like rev-parse will still return the bogus name on
stdout, so we do not run afoul of the "No stash found" code path. We do
get:

  $ git.compile stash show foobar
  foobar is not a valid reference

instead of:

  $ git stash show foobar
  fatal: ambiguous argument 'foobar': unknown revision or path not in
  the working tree.
  Use '--' to separate paths from revisions, like this:
  'git  [...] -- [...]'

But I do not see that as a big downside (I might even call it an
improvement).

-Peff

Great Offer

2016-09-05 Thread Mrs Julie Leach

You are a recipient to Mrs Julie Leach Donation of $2 million USD. Contact
(julieleach...@hotmail.com) for claims.

Re: 2.10.0: multiple versionsort.prereleasesuffix buggy?

2016-09-05 Thread Jeff King

On Tue, Sep 06, 2016 at 01:42:28AM +0300, Leho Kraav (Conversion Ready) wrote:

> Here's the testing tree https://github.com/woothemes/woocommerce
> 
> .git/config has:
> 
> [versionsort]
> 
> 
> prereleasesuffix = -beta
> prereleasesuffix = -RC
> 
> $ git tag -l --sort=version:refname
> [...]
> 2.6.0-RC1
> 2.6.0-RC2
> 2.6.0-beta-1
> 2.6.0-beta-2
> 2.6.0-beta-3
> 2.6.0-beta-4

So that seems wrong. Even weirder, if I set _only_ "-beta", I get:

  $ git tag -l --sort=version:refname | grep -v ^2.6.0
  2.6.0-beta-2
  2.6.0-beta-3
  2.6.0-beta-4
  2.6.0
  2.6.0-RC1
  2.6.0-RC2
  2.6.0-beta-1

Umm...what? beta-1 is sorted away from its companions? That's weird.

I wondered if the presence of "-" after the suffix ("beta-1" rather than
"beta1") would matter. It looks like that shouldn't matter, though; it's
purely doing a prefix match on "do these names differ at a prerelease
suffix".

But something certainly seems wrong.

-Peff

2.10.0: multiple versionsort.prereleasesuffix buggy?

2016-09-05 Thread Leho Kraav (Conversion Ready)


Hi all


Here's the testing tree https://github.com/woothemes/woocommerce

.git/config has:

[versionsort] 



prereleasesuffix = -beta
prereleasesuffix = -RC

$ git tag -l --sort=version:refname
...
2.5.0-RC1
2.5.0-RC2
2.5.0-RC3
2.5.0-beta-1
2.5.0-beta-2
2.5.0-beta-3
2.5.0
2.5.1
2.5.2
2.5.3
2.5.4
2.5.5
2.6.0-RC1
2.6.0-RC2
2.6.0-beta-1
2.6.0-beta-2
2.6.0-beta-3
2.6.0-beta-4
2.6.0
2.6.1
2.6.2
2.6.3
2.6.4

Per documentation, I'm supposed to see something like
...
2.5.0-beta-1
2.5.0-beta-2
2.5.0-beta-3
2.5.0-RC1
2.5.0-RC2
2.5.0-RC3
2.5.0
...


No matter what I do in `.git/config`, RC goes up front. What's going on?

(Yes, this project's tag capitalization is messed up.)

Re: [PATCH v1 1/2] sha1_file: open window into packfiles with CLOEXEC

2016-09-05 Thread Eric Wong

larsxschnei...@gmail.com wrote:
> All processes that the Git main process spawns inherit the open file
> descriptors of the main process. These leaked file descriptors can
> cause problems.

> -int git_open_noatime(const char *name)
> +int git_open_noatime_cloexec(const char *name)
>  {
> - static int sha1_file_open_flag = O_NOATIME;
> + static int sha1_file_open_flag = O_NOATIME | O_CLOEXEC;
>  
>   for (;;) {
>   int fd;

If there's real problems being caused by lack of cloexec
today, I think the F_SETFD fallback I proposed in
https://public-inbox.org/git/20160818173555.GA29253@starla/
will be necessary.

I question the need for the "_cloexec" suffixing in the
function name since the old function is going away entirely.

I prefer all FD-creating functions set cloexec by default
for FD > 2 to avoid inadvertantly leaking FDs.  So we
ought to use pipe2, accept4, socket(..., SOCK_CLOEXEC), etc...
and fallback to the racy+slower F_SETFD when not available.

Fwiw, Perl has been setting cloexec on FDs above $^F
(2, $SYSTEM_FD_MAX) for decades, and Ruby started
doing it a few years ago, too.

Re: [PATCH 5/5] pack-objects: walk tag chains for --include-tag

2016-09-05 Thread Jeff King

On Mon, Sep 05, 2016 at 05:52:26PM -0400, Jeff King wrote:

> When pack-objects is given --include-tag, it peels each tag
> ref down to a non-tag object, and if that non-tag object is
> going to be packed, we include the tag, too. But what
> happens if we have a chain of tags (e.g., tag "A" points to
> tag "B", which points to commit "C")?
> 
> We'll peel down to "C" and realize that we want to include
> tag "A", but we do not ever consider tag "B", leading to a
> broken pack (assuming "B" was not otherwise selected).
> Instead, we have to walk the whole chain, adding any tags we
> find to the pack.

So technically one might argue that this pack isn't "broken", in that it
_is_ a valid pack. It's just that it doesn't contain what the user was
asking for.

As explained further in the commit message, "fetch" is robust to this,
because it does a real connectivity check and follow-on fetch before
writing anything it thinks it got via include-tag. So perhaps one could
argue that pack-objects is correct; include-tag is best-effort, and it
is the client's job to make sure it has everything it needs. And that
would mean the bug is in git-clone, which should be doing the
connectivity check and follow-on fetch.

I dunno. This seems like the most elegant place to fix it, though it
does mean that pack-objects will go to some slight extra work when
auto-packing a tag (it has to parse the tag to find out whether it's a
chain). I'm doubt it matters much in practice.

-Peff

Re: [PATCH] stash: allow ref of a stash by index

2016-09-05 Thread Øystein Walle

Pasting text into Gmail's web interface is not conducive to sending tabs
through the intertubes. But you get the idea..

Øsse

Re: [PATCH 0/5] handle tag recursion in pack-objects --include-tag

2016-09-05 Thread Jeff King

On Mon, Sep 05, 2016 at 05:51:41PM -0400, Jeff King wrote:

> If you have a tag of a tag of a commit, and the middle tag isn't
> otherwise reachable, then "pack-objects --include-tag" will not include
> the middle tag, and certain invocations of "clone" will fail. See the
> final patch below for the gory details.
> 
> The first five patches are just test cleanup and modernization in
> preparation (though note that the 3rd one actually fixes a minor bug in
> the test script).
> 
>   [1/5]: t5305: move cleanup into test block
>   [2/5]: t5305: drop "dry-run" of unpack-objects
>   [3/5]: t5305: use "git -C"
>   [4/5]: t5305: simplify packname handling
>   [5/5]: pack-objects: walk tag chains for --include-tag

BTW, this is not a new regression; the problem dates back to the origin
of the "include-tag" feature. So there is no urgency to the fix, as it
has been with us for almost a decade. I built it on master, but it
should apply cleanly to maint.

-Peff

[PATCH 2/5] t5305: drop "dry-run" of unpack-objects

2016-09-05 Thread Jeff King

For each test we do a dry-run of unpack-objects, followed by
a real run, followed by confirming that it contained the
objects we expected. The dry-run is telling us nothing, as
any errors it encounters would be found in the real run.

Signed-off-by: Jeff King 
---
 t/t5305-include-tag.sh | 2 --
 1 file changed, 2 deletions(-)

diff --git a/t/t5305-include-tag.sh b/t/t5305-include-tag.sh
index 6018404..787fc83 100755
--- a/t/t5305-include-tag.sh
+++ b/t/t5305-include-tag.sh
@@ -37,7 +37,6 @@ test_expect_success 'unpack objects' '
GIT_DIR=clone.git &&
export GIT_DIR &&
git init &&
-   git unpack-objects -n

[PATCH 3/5] t5305: use "git -C"

2016-09-05 Thread Jeff King

This test unpacks objects into a separate repository, and
accesses it by setting GIT_DIR in a subshell. We can do the
same thing these days by using "git init " and "git
-C". In most cases this is shorter, though when there are
multiple commands, we may end up repeating the "-C".

However, this repetition can actually be a good thing. This
patch also fixes a bug introduced by 512477b (tests: use
"env" to run commands with temporary env-var settings,
2014-03-18). That commit essentially converted:

   (GIT_DIR=...; export GIT_DIR
cmd1 &&
cmd2)

into:

   (GIT_DIR=... cmd1 &&
cmd2)

which obviously loses the GIT_DIR setting for cmd2 (we never
noticed the bug because it simply runs "cmd2" in the parent
repo, which means we were simply failing to test anything
interesting). By using "git -C" rather than a subshell, it
becomes quite obvious where each command is supposed to be
running.

Signed-off-by: Jeff King 
---
 t/t5305-include-tag.sh | 28 +++-
 1 file changed, 7 insertions(+), 21 deletions(-)

diff --git a/t/t5305-include-tag.sh b/t/t5305-include-tag.sh
index 787fc83..089a3e9 100755
--- a/t/t5305-include-tag.sh
+++ b/t/t5305-include-tag.sh
@@ -33,20 +33,14 @@ test_expect_success 'pack without --include-tag' '
 
 test_expect_success 'unpack objects' '
rm -rf clone.git &&
-   (
-   GIT_DIR=clone.git &&
-   export GIT_DIR &&
-   git init &&
-   git unpack-objects list.expect &&
-   (
-   test_must_fail env GIT_DIR=clone.git git cat-file -e $tag &&
-   git rev-list --objects $commit
-   ) >list.actual &&
+   test_must_fail git -C clone.git cat-file -e $tag &&
+   git -C clone.git rev-list --objects $commit >list.actual &&
test_cmp list.expect list.actual
 '
 
@@ -59,21 +53,13 @@ test_expect_success 'pack with --include-tag' '
 
 test_expect_success 'unpack objects' '
rm -rf clone.git &&
-   (
-   GIT_DIR=clone.git &&
-   export GIT_DIR &&
-   git init &&
-   git unpack-objects list.expect &&
-   (
-   GIT_DIR=clone.git &&
-   export GIT_DIR &&
-   git rev-list --objects $tag
-   ) >list.actual &&
+   git -C clone.git rev-list --objects $tag >list.actual &&
test_cmp list.expect list.actual
 '
 
-- 
2.10.0.rc2.154.gb4a4b8b

[PATCH 4/5] t5305: simplify packname handling

2016-09-05 Thread Jeff King

We generate a series of packfiles test-1-$pack,
test-2-$pack, with different properties and then examine
them. However we always store the packname generated by
pack-objects in the variable packname_1. This probably was
meant to be packname_2 in the second test, but it turns out
that it doesn't matter: once we are done with the first
pack, we can just keep using the same $packname variable.

So let's drop the confusing "_1" parameter. At the same
time, let's give test-1 and test-2 more descriptive names,
which can help keep them straight (note that we _could_
likewise overwrite the packfiles in each test, but by using
separate filenames, we are sure that test 2 does not
accidentally use the packfile from test 1).

Signed-off-by: Jeff King 
---
 t/t5305-include-tag.sh | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/t/t5305-include-tag.sh b/t/t5305-include-tag.sh
index 089a3e9..e3c6c62 100755
--- a/t/t5305-include-tag.sh
+++ b/t/t5305-include-tag.sh
@@ -26,15 +26,15 @@ test_expect_success setup '
 '
 
 test_expect_success 'pack without --include-tag' '
-   packname_1=$(git pack-objects \
+   packname=$(git pack-objects \
--window=0 \
-   test-1

[PATCH 1/5] t5305: move cleanup into test block

2016-09-05 Thread Jeff King

We usually try to avoid doing any significant actions
outside of test blocks. Although "rm -rf" is unlikely to
either fail or to generate output, moving these to the
point of use makes it more clear that they are part of the
overall setup of "clone.git".

Signed-off-by: Jeff King 
---
 t/t5305-include-tag.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t5305-include-tag.sh b/t/t5305-include-tag.sh
index f314ad5..6018404 100755
--- a/t/t5305-include-tag.sh
+++ b/t/t5305-include-tag.sh
@@ -25,7 +25,6 @@ test_expect_success setup '
} >obj-list
 '
 
-rm -rf clone.git
 test_expect_success 'pack without --include-tag' '
packname_1=$(git pack-objects \
--window=0 \
@@ -33,6 +32,7 @@ test_expect_success 'pack without --include-tag' '
 '
 
 test_expect_success 'unpack objects' '
+   rm -rf clone.git &&
(
GIT_DIR=clone.git &&
export GIT_DIR &&
@@ -51,7 +51,6 @@ test_expect_success 'check unpacked result (have commit, no 
tag)' '
test_cmp list.expect list.actual
 '
 
-rm -rf clone.git
 test_expect_success 'pack with --include-tag' '
packname_1=$(git pack-objects \
--window=0 \
@@ -60,6 +59,7 @@ test_expect_success 'pack with --include-tag' '
 '
 
 test_expect_success 'unpack objects' '
+   rm -rf clone.git &&
(
GIT_DIR=clone.git &&
export GIT_DIR &&
-- 
2.10.0.rc2.154.gb4a4b8b

[PATCH 5/5] pack-objects: walk tag chains for --include-tag

2016-09-05 Thread Jeff King

When pack-objects is given --include-tag, it peels each tag
ref down to a non-tag object, and if that non-tag object is
going to be packed, we include the tag, too. But what
happens if we have a chain of tags (e.g., tag "A" points to
tag "B", which points to commit "C")?

We'll peel down to "C" and realize that we want to include
tag "A", but we do not ever consider tag "B", leading to a
broken pack (assuming "B" was not otherwise selected).
Instead, we have to walk the whole chain, adding any tags we
find to the pack.

Interestingly, it doesn't seem possible to trigger this
problem with "git fetch", but you can with "git clone
--single-branch". The reason is that we generate the correct
pack when the client explicitly asks for "A" (because we do
a real reachability analysis there), and "fetch" is more
willing to do so. There are basically two cases:

  1. If "C" is already a ref tip, then the client can deduce
 that it needs "A" itself (via find_non_local_tags), and
 will ask for it explicitly rather than relying on the
 include-tag capability. Everything works.

  2. If "C" is not already a ref tip, then we hope for
 include-tag to send us the correct tag. But it doesn't;
 it generates a broken pack. However, the next step is
 to do a follow-up run of find_non_local_tags(),
 followed by fetch_refs() to backfill any tags we
 learned about.

 In the normal case, fetch_refs() calls quickfetch(),
 which does a connectivity check and sees we have no
 new objects to fetch. We just write the refs.

 But for the broken-pack case, the connectivity check
 fails, and quickfetch will follow-up with the remote,
 asking explicitly for each of the ref tips. This picks
 up the missing object in a new pack.

For a regular "git clone", we are similarly OK, because we
explicitly request all of the tag refs, and get a correct
pack. But with "--single-branch", we kick in tag
auto-following via "include-tag", but do _not_ do a
follow-up backfill. We just take whatever the server sent us
via include-tag and write out tag refs for any tag objects
we were sent. So prior to c6807a4 (clone: open a shortcut
for connectivity check, 2013-05-26), we actually claimed the
clone was a success, but the result was silently
corrupted!  Since c6807a4, index-pack's connectivity
check catches this case, and we correctly complain.

The included test directly checks that pack-objects does not
generate a broken pack, but also confirms that "clone
--single-branch" does not hit the bug.

Note that tag chains introduce another interesting question:
if we are packing the tag "B" but not the commit "C", should
"A" be included?

Both before and after this patch, we do not include "A",
because the initial peel_ref() check only knows about the
bottom-most level, "C". To realize that "B" is involved at
all, we would have to switch to an incremental peel, in
which we examine each tagged object, asking if it is being
packed (and including the outer tag if so).

But that runs contrary to the optimizations in peel_ref(),
which avoid accessing the objects at all, in favor of using
the value we pull from packed-refs. It's OK to walk the
whole chain once we know we're going to include the tag (we
have to access it anyway, so the effort is proportional to
the pack we're generating). But for the initial selection,
we have to look at every ref. If we're only packing a few
objects, we'd still have to parse every single referenced
tag object just to confirm that it isn't part of a tag
chain.

This could be addressed if packed-refs stored the complete
tag chain for each peeled ref (in most cases, this would be
the same cost as now, as each "chain" is only a single
link). But given the size of that project, it's out of scope
for this fix (and probably nobody cares enough anyway, as
it's such an obscure situation). This commit limits itself
to just avoiding the creation of a broken pack.

Signed-off-by: Jeff King 
---
 builtin/pack-objects.c | 31 +-
 t/t5305-include-tag.sh | 52 ++
 2 files changed, 82 insertions(+), 1 deletion(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 4a63398..0954375 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2123,6 +2123,35 @@ static void ll_find_deltas(struct object_entry **list, 
unsigned list_size,
 #define ll_find_deltas(l, s, w, d, p)  find_deltas(l, , w, d, p)
 #endif
 
+static void add_tag_chain(const struct object_id *oid)
+{
+   struct tag *tag;
+
+   /*
+* We catch duplicates already in add_object_entry(), but we'd
+* prefer to do this extra check to avoid having to parse the
+* tag at all if we already know that it's being packed (e.g., if
+* it was included via bitmaps, we would not have parsed it
+* previously).
+*/
+   if (packlist_find(_pack, oid->hash, NULL))
+

[PATCH 0/5] handle tag recursion in pack-objects --include-tag

2016-09-05 Thread Jeff King

If you have a tag of a tag of a commit, and the middle tag isn't
otherwise reachable, then "pack-objects --include-tag" will not include
the middle tag, and certain invocations of "clone" will fail. See the
final patch below for the gory details.

The first five patches are just test cleanup and modernization in
preparation (though note that the 3rd one actually fixes a minor bug in
the test script).

  [1/5]: t5305: move cleanup into test block
  [2/5]: t5305: drop "dry-run" of unpack-objects
  [3/5]: t5305: use "git -C"
  [4/5]: t5305: simplify packname handling
  [5/5]: pack-objects: walk tag chains for --include-tag

-Peff

Re: [PATCH] stash: allow ref of a stash by index

2016-09-05 Thread Øystein Walle

Hi, guys

I like this idea. It makes stash easier and quicker to use, and it
"hides" the fact that it uses the reflog for keeping track of the made
stashes. *Not* to say I discourage interested people from peeking under
the hood. I just think it's nice to sometimes think of the stash as a
separate concept instead of being built on top of strange merge
commits constructed in temporary indexes :)

The bash-specific code is a no-go, so here's a way to do it in a way
that I think is in line with Git's code style for shell scripts. I took
the liberty of removing the '|| exit 1' since the rev is verified later
on anyway, as can be seen in the last piece of context. That way the
argument munging can be done at a later stage where we don't have to
loop over multiple ones. The first rev-parse's purpose is just to apply
--sq.

(Besides, the only way to do it at the top like in the original patch
was   for arg in bleh "$@"   where bleh is a marker to indicate that the
positional arguments should be cleared in a separate case branch so that
they can be rebuilt with multiple invocations of set later. Not pretty,
in my opinion.)

Regards,
Øsse

diff --git a/git-stash.sh b/git-stash.sh
index 826af18..b026288 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -384,7 +384,7 @@ parse_flags_and_rev()
 i_tree=
 u_tree=

-REV=$(git rev-parse --no-flags --symbolic --sq "$@") || exit 1
+REV=$(git rev-parse --no-flags --symbolic --sq "$@" 2>/dev/null)

 FLAGS=
 for opt
@@ -422,6 +422,15 @@ parse_flags_and_rev()
 ;;
 esac

+case "$1" in
+*[!0-9]*)
+:
+;;
+*)
+set -- "${ref_stash}@{$1}"
+;;
+esac
+
 REV=$(git rev-parse --symbolic --verify --quiet "$1") || {
 reference="$1"
 die "$(eval_gettext "\$reference is not a valid reference")"

[RFCv3] Proposed questions for "Git User's Survey 2016", take three

2016-09-05 Thread Jakub Narębski

Hello,

As I wrote previously, I am thinking about returning to doing the
Git User's Survey this year.

  Message-ID: <577e6f32.7020...@gmail.com>

  https://marc.info/?l=git=146790381602547=2
  https://public-inbox.org/git/577E6F32.7020007%40gmail.com/

In this email I'd like to propose the revised list of questions (and
answers) for Git User's Survey 2016, following comments to the previous
revisions, starting with

  Message-ID: <91a2ffbe-a73b-fbb9-96da-9aea4d439...@gmail.com>

  https://public-inbox.org/git/91a2ffbe-a73b-fbb9-96da-9aea4d439...@gmail.com/


Below is my third, and probably final proposal.  You can see the survey
in work at (there might be some slight differences from what's below)

  https://survs.com/survey/kn2bpu4p9d

NOTE: This is a *test channel*, all responses will be deleted.


The survey is currently planned to be open (after a bit of delay)
from  10 September 2016  to  20 October 2016.

I'd like to ask also about announcing the survey as widely as possible.
Please reply if you think of some channel, or of you could announce
it yourself.  Thanks in advance.


If you want to do this survey "in house", you can ask for a separate
channel (a separate survey URL).  After the survey ends, I'll send 
anonymous responses from that channel in Excel and CSV formats.

There also will be anonymous, JavaScript- and cookie-less channel;
at the cost that you cannot go back and change or add to your response.

There are 50 questions in this survey.


 About you

01. What country do you live in? (Country of residence, in English)
(free-form single line)

02. How old are you (in years)?
(single number)

03. What is your gender?
(single choice)

  *  Man
  *  Woman
  *  Other
  *  Prefer to not disclose

04. How would you describe your occupation / role as Git user?
(multiple choice with other, limit to 3 selections)

  + Developer
  + Programmer
  + Engineer
  + Analyst
  + Manager / Leader
  + Maintainer / Reviewer / Sub-maintainer
  + DevOps
  + Administrator
  + Designer
  + Artist / Writer
  + Tester / QA
  + Expert
  + Student
  + Researcher
  + Teacher
  + Other, please specify

05. Have you ever contributed to Git project (code, documentation, i18n, etc.)?
(single choice)

  * Yes
  * No
  * Maybe, don't remember
  * Plan to, soon

06. Have you ever reviewed contribution to Git project?
(single choice)

  * Yes
  * No
  * Maybe, don't remember
  * Plan to, soon
  * Only spelling corrections, or correctness of translation

07. What's stopping you from contributing to Git?
What was hardest / most difficult when contribution to Git?
(free-text essay question)


 Getting started with Git

08. Have you found Git easy to learn?
(single choice)

  * Very easy
  * Easy
  * Reasonably easy (average)
  * Hard
  * Very hard

09. Have you found Git easy to use?
(single choice)

  * Very easy
  * Easy
  * Reasonably easy (average)
  * Hard
  * Very hard

10. Rate your own proficiency with Git
(single choice, rating)

  1. novice
  2. casual, needs advice
  3. everyday use
  4. can offer advice
  5. know it very well

Description:

You can think of it as 1-5 numerical grade of your proficiency in Git.



11. Which Git version(s) are you using?
   (multiple choice, with other)

 + I don't remember and cannot check the version
 + pre 2.0
 + 2.0.x to 2.4.x
 + 2.5.x
 + 2.6.x
 + 2.7.x
 + 2.8.x
 + 2.9.x
 + 2.10.x or newer
 + 2.10.x-rcN version (release candidate)
 + minor (maintenance) release 1.x.y.z
 + 'master' branch of official git repository
 + 'next' branch of official git repository
 + version from msysGit / Git for Windows fork repository
 + alternate Git implementation (JGit and similar)

 + other, please specify


 Getting and updating Git

12. On which operating system(s) do you use Git?
(multiple choice, with other)

 + GNU/Linux
 + MS Windows
 + macOS
 + *BSD (FreeBSD, OpenBSD, NetBSD, etc.)
 + Android
 + iOS
 + other Unix

 + other operating system, please specify

13. How do/did you obtain Git (install and/or upgrade)?
(multiple choices, with other)

  + binary package
  + source package or script (automatic compiling)
  + source tarball/archive (extract, make, make install)
  + pull from repository, and compiled
  + bundled with GUI or other tool
  + preinstalled / sysadmin job / I don't know

  + other, please specify (if none of the above apply)

14. How often do you upgrade Git?
(multiple choice)

  + use 'master' or 'next' version, and/or pre-release
  + as soon as new version is released
  + when there is new binary package / distribution package
  + when updating distribution / system
  + around every month, or more often
  + around every 6 months or more often
  + update from time to time, cannot say how often
  + I use what is installed on system


 How do you use Git

15. What kind of projects etc. do you use Git for?

[PATCH v1 2/2] read-cache: make sure file handles are not inherited by child processes

2016-09-05 Thread larsxschneider

From: Lars Schneider 

This fix prepares a series with the goal to avoid launching a new
clean/smudge filter process for each file that is filtered. A new
long running filter process is introduced that is used to filter all
files in a single Git invocation.

Consider the case of a file that requires filtering and is present in
branch A but not in branch B. If A is the current HEAD and we checkout B
then the following happens:

1. ce_compare_data() opens the file
2.   index_fd() detects that the file requires to run a clean filter and
 calls index_stream_convert_blob()
4. index_stream_convert_blob() calls convert_to_git_filter_fd()
5.   convert_to_git_filter_fd() calls apply_filter() which creates a
 new long running filter process (in case it is the first file
 of this kind to be filtered)
6.   The new filter process inherits all file handles. This is the
 default on Linux/OSX and is explicitly defined in the
 `CreateProcessW` call in `mingw.c` on Windows.
7. ce_compare_data() closes the file
8. Git unlinks the file as it is not present in B

The unlink operation does not work on Windows because the filter process
has still an open handle to the file. On Linux/OSX the unlink operation
succeeds but the file descriptors still leak into the child process.

Fix this problem by opening files in read-cache with the CLOEXEC flag to
ensure that the file descriptor does not remain open in a newly spawned
process similar to 05d1ed61.

Signed-off-by: Lars Schneider 
---
 read-cache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index 491e52d..02f74d3 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -156,7 +156,7 @@ void fill_stat_cache_info(struct cache_entry *ce, struct 
stat *st)
 static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
 {
int match = -1;
-   int fd = open(ce->name, O_RDONLY);
+   int fd = open(ce->name, O_RDONLY | O_CLOEXEC);
 
if (fd >= 0) {
unsigned char sha1[20];
-- 
2.10.0

[PATCH v1 1/2] sha1_file: open window into packfiles with CLOEXEC

2016-09-05 Thread larsxschneider

From: Lars Schneider 

All processes that the Git main process spawns inherit the open file
descriptors of the main process. These leaked file descriptors can
cause problems.

Use the CLOEXEC flag similar to 05d1ed61 to fix the leaked file
descriptors.

Signed-off-by: Lars Schneider 
---
 builtin/pack-objects.c |  2 +-
 cache.h|  2 +-
 pack-bitmap.c  |  2 +-
 sha1_file.c| 14 +++---
 4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 4a63398..a2b1fb6 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -718,7 +718,7 @@ static off_t write_reused_pack(struct sha1file *f)
if (!is_pack_valid(reuse_packfile))
die("packfile is invalid: %s", reuse_packfile->pack_name);
 
-   fd = git_open_noatime(reuse_packfile->pack_name);
+   fd = git_open_noatime_cloexec(reuse_packfile->pack_name);
if (fd < 0)
die_errno("unable to open packfile for reuse: %s",
  reuse_packfile->pack_name);
diff --git a/cache.h b/cache.h
index b780a91..ae79747 100644
--- a/cache.h
+++ b/cache.h
@@ -1089,7 +1089,7 @@ extern int write_sha1_file(const void *buf, unsigned long 
len, const char *type,
 extern int hash_sha1_file_literally(const void *buf, unsigned long len, const 
char *type, unsigned char *sha1, unsigned flags);
 extern int pretend_sha1_file(void *, unsigned long, enum object_type, unsigned 
char *);
 extern int force_object_loose(const unsigned char *sha1, time_t mtime);
-extern int git_open_noatime(const char *name);
+extern int git_open_noatime_cloexec(const char *name);
 extern void *map_sha1_file(const unsigned char *sha1, unsigned long *size);
 extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, 
unsigned long mapsize, void *buffer, unsigned long bufsiz);
 extern int parse_sha1_header(const char *hdr, unsigned long *sizep);
diff --git a/pack-bitmap.c b/pack-bitmap.c
index b949e51..1b39e5d 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -266,7 +266,7 @@ static int open_pack_bitmap_1(struct packed_git *packfile)
return -1;
 
idx_name = pack_bitmap_filename(packfile);
-   fd = git_open_noatime(idx_name);
+   fd = git_open_noatime_cloexec(idx_name);
free(idx_name);
 
if (fd < 0)
diff --git a/sha1_file.c b/sha1_file.c
index 3045aea..c1701dc 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -356,7 +356,7 @@ void read_info_alternates(const char * relative_base, int 
depth)
int fd;
 
path = xstrfmt("%s/info/alternates", relative_base);
-   fd = git_open_noatime(path);
+   fd = git_open_noatime_cloexec(path);
free(path);
if (fd < 0)
return;
@@ -550,7 +550,7 @@ static int check_packed_git_idx(const char *path, struct 
packed_git *p)
struct pack_idx_header *hdr;
size_t idx_size;
uint32_t version, nr, i, *index;
-   int fd = git_open_noatime(path);
+   int fd = git_open_noatime_cloexec(path);
struct stat st;
 
if (fd < 0)
@@ -956,7 +956,7 @@ static int open_packed_git_1(struct packed_git *p)
while (pack_max_fds <= pack_open_fds && close_one_pack())
; /* nothing */
 
-   p->pack_fd = git_open_noatime(p->pack_name);
+   p->pack_fd = git_open_noatime_cloexec(p->pack_name);
if (p->pack_fd < 0 || fstat(p->pack_fd, ))
return -1;
pack_open_fds++;
@@ -1459,9 +1459,9 @@ int check_sha1_signature(const unsigned char *sha1, void 
*map,
return hashcmp(sha1, real_sha1) ? -1 : 0;
 }
 
-int git_open_noatime(const char *name)
+int git_open_noatime_cloexec(const char *name)
 {
-   static int sha1_file_open_flag = O_NOATIME;
+   static int sha1_file_open_flag = O_NOATIME | O_CLOEXEC;
 
for (;;) {
int fd;
@@ -1505,7 +1505,7 @@ static int open_sha1_file(const unsigned char *sha1)
struct alternate_object_database *alt;
int most_interesting_errno;
 
-   fd = git_open_noatime(sha1_file_name(sha1));
+   fd = git_open_noatime_cloexec(sha1_file_name(sha1));
if (fd >= 0)
return fd;
most_interesting_errno = errno;
@@ -1513,7 +1513,7 @@ static int open_sha1_file(const unsigned char *sha1)
prepare_alt_odb();
for (alt = alt_odb_list; alt; alt = alt->next) {
fill_sha1_path(alt->name, sha1);
-   fd = git_open_noatime(alt->base);
+   fd = git_open_noatime_cloexec(alt->base);
if (fd >= 0)
return fd;
if (most_interesting_errno == ENOENT)
-- 
2.10.0

[PATCH v1 0/2] Use CLOEXEC to avoid fd leaks

2016-09-05 Thread larsxschneider

From: Lars Schneider 

Use the CLOEXEC flag similar to 05d1ed61 to fix leaked file descriptors.

Patch 1/2 was contemplated by Junio here:
http://public-inbox.org/git/xmqq37lnbbpk@gitster.mtv.corp.google.com/

I followed Torsten's advice and renamed `git_open_noatime` to
`git_open_noatime_cloexec`:
http://public-inbox.org/git/20160830145429.GA11221@tb-raspi/

Patch 2/2 was part of my "Git filter protocol" series:
http://public-inbox.org/git/20160825110752.31581-14-larsxschnei...@gmail.com/
(I kept the patch as-is but changed the commit messages slightly)

Thanks,
Lars

Lars Schneider (2):
  sha1_file: open window into packfiles with CLOEXEC
  read-cache: make sure file handles are not inherited by child
processes

 builtin/pack-objects.c |  2 +-
 cache.h|  2 +-
 pack-bitmap.c  |  2 +-
 read-cache.c   |  2 +-
 sha1_file.c| 14 +++---
 5 files changed, 11 insertions(+), 11 deletions(-)

--
2.10.0

git commit -p with file arguments

2016-09-05 Thread Christian Neukirchen

Hi,

I noticed the following suprising behavior:

% git --version
git version 2.10.0

% git add bar
% git status -s 
A  bar
 M foo

% git commit -p foo
[stage a hunk]
...
# Explicit paths specified without -i or -o; assuming --only paths...
# On branch master
# Changes to be committed:
#   new file:   bar
#   modified:   foo
#

So why does it want to commit bar too, when I explicitly wanted to
commit foo only?

This is not how "git commit files..." works, and the man page says

3.by listing files as arguments to the commit command, in which
   case the commit will ignore changes staged in the index, and
   instead record the current content of the listed files (which must
   already be known to Git);

I'd expect "git commit -p files..." to work like
"git add -p files... && git commit files...".

Thanks,
-- 
Christian Neukirchen    http://chneukirchen.org

[PATCH v2 06/20] builtin/cat-file: convert some static functions to struct object_id

2016-09-05 Thread brian m. carlson

Convert all of the static functions that are not callbacks to use struct
object_id.

Signed-off-by: brian m. carlson 
---
 builtin/cat-file.c | 50 +-
 1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 16b0b8c9..8b773787 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -23,7 +23,7 @@ struct batch_options {
 static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
int unknown_type)
 {
-   unsigned char sha1[20];
+   struct object_id oid;
enum object_type type;
char *buf;
unsigned long size;
@@ -35,14 +35,14 @@ static int cat_one_file(int opt, const char *exp_type, 
const char *obj_name,
if (unknown_type)
flags |= LOOKUP_UNKNOWN_OBJECT;
 
-   if (get_sha1_with_context(obj_name, 0, sha1, _context))
+   if (get_sha1_with_context(obj_name, 0, oid.hash, _context))
die("Not a valid object name %s", obj_name);
 
buf = NULL;
switch (opt) {
case 't':
oi.typename = 
-   if (sha1_object_info_extended(sha1, , flags) < 0)
+   if (sha1_object_info_extended(oid.hash, , flags) < 0)
die("git cat-file: could not get object info");
if (sb.len) {
printf("%s\n", sb.buf);
@@ -53,24 +53,24 @@ static int cat_one_file(int opt, const char *exp_type, 
const char *obj_name,
 
case 's':
oi.sizep = 
-   if (sha1_object_info_extended(sha1, , flags) < 0)
+   if (sha1_object_info_extended(oid.hash, , flags) < 0)
die("git cat-file: could not get object info");
printf("%lu\n", size);
return 0;
 
case 'e':
-   return !has_sha1_file(sha1);
+   return !has_object_file();
 
case 'c':
if (!obj_context.path[0])
die("git cat-file --textconv %s:  must be 
",
obj_name);
 
-   if (textconv_object(obj_context.path, obj_context.mode, sha1, 
1, , ))
+   if (textconv_object(obj_context.path, obj_context.mode, 
oid.hash, 1, , ))
break;
 
case 'p':
-   type = sha1_object_info(sha1, NULL);
+   type = sha1_object_info(oid.hash, NULL);
if (type < 0)
die("Not a valid object name %s", obj_name);
 
@@ -83,8 +83,8 @@ static int cat_one_file(int opt, const char *exp_type, const 
char *obj_name,
}
 
if (type == OBJ_BLOB)
-   return stream_blob_to_fd(1, sha1, NULL, 0);
-   buf = read_sha1_file(sha1, , );
+   return stream_blob_to_fd(1, oid.hash, NULL, 0);
+   buf = read_sha1_file(oid.hash, , );
if (!buf)
die("Cannot read object %s", obj_name);
 
@@ -93,19 +93,19 @@ static int cat_one_file(int opt, const char *exp_type, 
const char *obj_name,
 
case 0:
if (type_from_string(exp_type) == OBJ_BLOB) {
-   unsigned char blob_sha1[20];
-   if (sha1_object_info(sha1, NULL) == OBJ_TAG) {
-   char *buffer = read_sha1_file(sha1, , 
);
+   struct object_id blob_oid;
+   if (sha1_object_info(oid.hash, NULL) == OBJ_TAG) {
+   char *buffer = read_sha1_file(oid.hash, , 
);
const char *target;
if (!skip_prefix(buffer, "object ", ) ||
-   get_sha1_hex(target, blob_sha1))
-   die("%s not a valid tag", 
sha1_to_hex(sha1));
+   get_oid_hex(target, _oid))
+   die("%s not a valid tag", 
oid_to_hex());
free(buffer);
} else
-   hashcpy(blob_sha1, sha1);
+   oidcpy(_oid, );
 
-   if (sha1_object_info(blob_sha1, NULL) == OBJ_BLOB)
-   return stream_blob_to_fd(1, blob_sha1, NULL, 0);
+   if (sha1_object_info(blob_oid.hash, NULL) == OBJ_BLOB)
+   return stream_blob_to_fd(1, blob_oid.hash, 
NULL, 0);
/*
 * we attempted to dereference a tag to a blob
 * and failed; there may be new dereference
@@ -113,7 +113,7 @@ static int cat_one_file(int opt, const char *exp_type, 
const char *obj_name,
 * fall-back to the usual case.
 */
}
-   buf =

[PATCH v2 01/20] cache: convert struct cache_entry to use struct object_id

2016-09-05 Thread brian m. carlson

Convert struct cache_entry to use struct object_id by applying the
following semantic patch and the object_id transforms from contrib, plus
the actual change to the struct:

@@
struct cache_entry E1;
@@
- E1.sha1
+ E1.oid.hash

@@
struct cache_entry *E1;
@@
- E1->sha1
+ E1->oid.hash

Signed-off-by: brian m. carlson 
---
 builtin/apply.c  | 10 +-
 builtin/blame.c  |  2 +-
 builtin/checkout.c   |  6 +++---
 builtin/fsck.c   |  2 +-
 builtin/grep.c   |  3 ++-
 builtin/ls-files.c   |  2 +-
 builtin/merge-index.c|  2 +-
 builtin/read-tree.c  |  2 +-
 builtin/rm.c |  2 +-
 builtin/submodule--helper.c  |  5 +++--
 builtin/update-index.c   | 10 +-
 cache-tree.c |  4 ++--
 cache.h  |  2 +-
 diff-lib.c   | 31 ++-
 diff.c   |  2 +-
 dir.c|  7 ---
 entry.c  |  9 +
 merge-recursive.c|  2 +-
 read-cache.c | 24 
 rerere.c |  3 ++-
 resolve-undo.c   |  2 +-
 revision.c   |  2 +-
 sha1_name.c  |  2 +-
 t/helper/test-dump-split-index.c |  2 +-
 tree.c   |  2 +-
 unpack-trees.c   |  8 
 26 files changed, 79 insertions(+), 69 deletions(-)

diff --git a/builtin/apply.c b/builtin/apply.c
index 1a488f9e..ba0e75bf 100644
--- a/builtin/apply.c
+++ b/builtin/apply.c
@@ -3220,7 +3220,7 @@ static int read_file_or_gitlink(const struct cache_entry 
*ce, struct strbuf *buf
 {
if (!ce)
return 0;
-   return read_blob_object(buf, ce->sha1, ce->ce_mode);
+   return read_blob_object(buf, ce->oid.hash, ce->ce_mode);
 }
 
 static struct patch *in_fn_table(struct apply_state *state, const char *name)
@@ -3959,7 +3959,7 @@ static int get_current_sha1(const char *path, unsigned 
char *sha1)
pos = cache_name_pos(path, strlen(path));
if (pos < 0)
return -1;
-   hashcpy(sha1, active_cache[pos]->sha1);
+   hashcpy(sha1, active_cache[pos]->oid.hash);
return 0;
 }
 
@@ -4211,7 +4211,7 @@ static void add_index_file(struct apply_state *state,
const char *s;
 
if (!skip_prefix(buf, "Subproject commit ", ) ||
-   get_sha1_hex(s, ce->sha1))
+   get_sha1_hex(s, ce->oid.hash))
die(_("corrupt patch for submodule %s"), path);
} else {
if (!state->cached) {
@@ -4220,7 +4220,7 @@ static void add_index_file(struct apply_state *state,
  path);
fill_stat_cache_info(ce, );
}
-   if (write_sha1_file(buf, size, blob_type, ce->sha1) < 0)
+   if (write_sha1_file(buf, size, blob_type, ce->oid.hash) < 0)
die(_("unable to create backing store for newly created 
file %s"), path);
}
if (add_cache_entry(ce, ADD_CACHE_OK_TO_ADD) < 0)
@@ -4335,7 +4335,7 @@ static void add_conflicted_stages_file(struct apply_state 
*state,
ce->ce_mode = create_ce_mode(mode);
ce->ce_flags = create_ce_flags(stage);
ce->ce_namelen = namelen;
-   hashcpy(ce->sha1, patch->threeway_stage[stage - 1].hash);
+   oidcpy(>oid, >threeway_stage[stage - 1]);
if (add_cache_entry(ce, ADD_CACHE_OK_TO_ADD) < 0)
die(_("unable to add cache entry for %s"), 
patch->new_name);
}
diff --git a/builtin/blame.c b/builtin/blame.c
index a5bbf91e..5eb3725d 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -2410,7 +2410,7 @@ static struct commit *fake_working_tree_commit(struct 
diff_options *opt,
}
size = cache_entry_size(len);
ce = xcalloc(1, size);
-   hashcpy(ce->sha1, origin->blob_sha1);
+   hashcpy(ce->oid.hash, origin->blob_sha1);
memcpy(ce->name, path, len);
ce->ce_flags = create_ce_flags(0);
ce->ce_namelen = len;
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 8672d072..a9523ffa 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -76,7 +76,7 @@ static int update_some(const unsigned char *sha1, struct 
strbuf *base,
 
len = base->len + strlen(pathname);
ce = xcalloc(1, cache_entry_size(len));
-   hashcpy(ce->sha1, sha1);
+   hashcpy(ce->oid.hash, sha1);
memcpy(ce->name, base->buf, base->len);
memcpy(ce->name + base->len, pathname, len - base->len);
ce->ce_flags = create_ce_flags(0) | CE_UPDATE;
@@ -92,7 +92,7 @@ static int update_some(const unsigned char *sha1, struct 
strbuf *base,
if (pos

[PATCH v2 07/20] builtin: convert textconv_object to use struct object_id

2016-09-05 Thread brian m. carlson

Since all of its callers have been updated, make textconv_object take a
struct object_id.

Signed-off-by: brian m. carlson 
---
 builtin.h  |  2 +-
 builtin/blame.c| 12 ++--
 builtin/cat-file.c |  2 +-
 builtin/log.c  |  2 +-
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/builtin.h b/builtin.h
index 6b95006a..b9122bc5 100644
--- a/builtin.h
+++ b/builtin.h
@@ -25,7 +25,7 @@ struct fmt_merge_msg_opts {
 extern int fmt_merge_msg(struct strbuf *in, struct strbuf *out,
 struct fmt_merge_msg_opts *);
 
-extern int textconv_object(const char *path, unsigned mode, const unsigned 
char *sha1, int sha1_valid, char **buf, unsigned long *buf_size);
+extern int textconv_object(const char *path, unsigned mode, const struct 
object_id *oid, int oid_valid, char **buf, unsigned long *buf_size);
 
 extern int is_builtin(const char *s);
 
diff --git a/builtin/blame.c b/builtin/blame.c
index 9c09d464..0e10e302 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -154,8 +154,8 @@ static int diff_hunks(mmfile_t *file_a, mmfile_t *file_b,
  */
 int textconv_object(const char *path,
unsigned mode,
-   const unsigned char *sha1,
-   int sha1_valid,
+   const struct object_id *oid,
+   int oid_valid,
char **buf,
unsigned long *buf_size)
 {
@@ -163,7 +163,7 @@ int textconv_object(const char *path,
struct userdiff_driver *textconv;
 
df = alloc_filespec(path);
-   fill_filespec(df, sha1, sha1_valid, mode);
+   fill_filespec(df, oid->hash, oid_valid, mode);
textconv = get_textconv(df);
if (!textconv) {
free_filespec(df);
@@ -188,7 +188,7 @@ static void fill_origin_blob(struct diff_options *opt,
 
num_read_blob++;
if (DIFF_OPT_TST(opt, ALLOW_TEXTCONV) &&
-   textconv_object(o->path, o->mode, o->blob_oid.hash, 1, 
>ptr, _size))
+   textconv_object(o->path, o->mode, >blob_oid, 1, 
>ptr, _size))
;
else
file->ptr = read_sha1_file(o->blob_oid.hash, ,
@@ -2366,7 +2366,7 @@ static struct commit *fake_working_tree_commit(struct 
diff_options *opt,
switch (st.st_mode & S_IFMT) {
case S_IFREG:
if (DIFF_OPT_TST(opt, ALLOW_TEXTCONV) &&
-   textconv_object(read_from, mode, null_sha1, 0, 
_ptr, _len))
+   textconv_object(read_from, mode, _oid, 0, 
_ptr, _len))
strbuf_attach(, buf_ptr, buf_len, buf_len + 
1);
else if (strbuf_read_file(, read_from, st.st_size) 
!= st.st_size)
die_errno("cannot open or read '%s'", 
read_from);
@@ -2793,7 +2793,7 @@ int cmd_blame(int argc, const char **argv, const char 
*prefix)
die("no such path %s in %s", path, final_commit_name);
 
if (DIFF_OPT_TST(>diffopt, ALLOW_TEXTCONV) &&
-   textconv_object(path, o->mode, o->blob_oid.hash, 1, (char 
**) _buf,
+   textconv_object(path, o->mode, >blob_oid, 1, (char **) 
_buf,
_buf_size))
;
else
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 8b773787..7b2e0537 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -66,7 +66,7 @@ static int cat_one_file(int opt, const char *exp_type, const 
char *obj_name,
die("git cat-file --textconv %s:  must be 
",
obj_name);
 
-   if (textconv_object(obj_context.path, obj_context.mode, 
oid.hash, 1, , ))
+   if (textconv_object(obj_context.path, obj_context.mode, , 
1, , ))
break;
 
case 'p':
diff --git a/builtin/log.c b/builtin/log.c
index 226212c9..48b9db51 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -479,7 +479,7 @@ static int show_blob_object(const struct object_id *oid, 
struct rev_info *rev, c
if (get_sha1_with_context(obj_name, 0, oidc.hash, _context))
die(_("Not a valid object name %s"), obj_name);
if (!obj_context.path[0] ||
-   !textconv_object(obj_context.path, obj_context.mode, oidc.hash, 1, 
, ))
+   !textconv_object(obj_context.path, obj_context.mode, , 1, 
, ))
return stream_blob_to_fd(1, oid->hash, NULL, 0);
 
if (!buf)

[PATCH v2 02/20] builtin/apply: convert static functions to struct object_id

2016-09-05 Thread brian m. carlson

There were several static functions using unsigned char arrays for SHA-1
values.  Convert them to use struct object_id.

Signed-off-by: brian m. carlson 
---
 builtin/apply.c | 96 -
 1 file changed, 48 insertions(+), 48 deletions(-)

diff --git a/builtin/apply.c b/builtin/apply.c
index ba0e75bf..76b16121 100644
--- a/builtin/apply.c
+++ b/builtin/apply.c
@@ -3101,16 +3101,16 @@ static int apply_binary(struct apply_state *state,
struct patch *patch)
 {
const char *name = patch->old_name ? patch->old_name : patch->new_name;
-   unsigned char sha1[20];
+   struct object_id oid;
 
/*
 * For safety, we require patch index line to contain
 * full 40-byte textual SHA1 for old and new, at least for now.
 */
-   if (strlen(patch->old_sha1_prefix) != 40 ||
-   strlen(patch->new_sha1_prefix) != 40 ||
-   get_sha1_hex(patch->old_sha1_prefix, sha1) ||
-   get_sha1_hex(patch->new_sha1_prefix, sha1))
+   if (strlen(patch->old_sha1_prefix) != GIT_SHA1_HEXSZ ||
+   strlen(patch->new_sha1_prefix) != GIT_SHA1_HEXSZ ||
+   get_oid_hex(patch->old_sha1_prefix, ) ||
+   get_oid_hex(patch->new_sha1_prefix, ))
return error("cannot apply binary patch to '%s' "
 "without full index line", name);
 
@@ -3119,12 +3119,12 @@ static int apply_binary(struct apply_state *state,
 * See if the old one matches what the patch
 * applies to.
 */
-   hash_sha1_file(img->buf, img->len, blob_type, sha1);
-   if (strcmp(sha1_to_hex(sha1), patch->old_sha1_prefix))
+   hash_sha1_file(img->buf, img->len, blob_type, oid.hash);
+   if (strcmp(oid_to_hex(), patch->old_sha1_prefix))
return error("the patch applies to '%s' (%s), "
 "which does not match the "
 "current contents.",
-name, sha1_to_hex(sha1));
+name, oid_to_hex());
}
else {
/* Otherwise, the old one must be empty. */
@@ -3133,19 +3133,19 @@ static int apply_binary(struct apply_state *state,
 "'%s' but it is not empty", name);
}
 
-   get_sha1_hex(patch->new_sha1_prefix, sha1);
-   if (is_null_sha1(sha1)) {
+   get_oid_hex(patch->new_sha1_prefix, );
+   if (is_null_oid()) {
clear_image(img);
return 0; /* deletion patch */
}
 
-   if (has_sha1_file(sha1)) {
+   if (has_sha1_file(oid.hash)) {
/* We already have the postimage */
enum object_type type;
unsigned long size;
char *result;
 
-   result = read_sha1_file(sha1, , );
+   result = read_sha1_file(oid.hash, , );
if (!result)
return error("the necessary postimage %s for "
 "'%s' cannot be read",
@@ -3164,10 +3164,10 @@ static int apply_binary(struct apply_state *state,
 name);
 
/* verify that the result matches */
-   hash_sha1_file(img->buf, img->len, blob_type, sha1);
-   if (strcmp(sha1_to_hex(sha1), patch->new_sha1_prefix))
+   hash_sha1_file(img->buf, img->len, blob_type, oid.hash);
+   if (strcmp(oid_to_hex(), patch->new_sha1_prefix))
return error(_("binary patch to '%s' creates incorrect 
result (expecting %s, got %s)"),
-   name, patch->new_sha1_prefix, 
sha1_to_hex(sha1));
+   name, patch->new_sha1_prefix, oid_to_hex());
}
 
return 0;
@@ -3197,17 +3197,17 @@ static int apply_fragments(struct apply_state *state, 
struct image *img, struct
return 0;
 }
 
-static int read_blob_object(struct strbuf *buf, const unsigned char *sha1, 
unsigned mode)
+static int read_blob_object(struct strbuf *buf, const struct object_id *oid, 
unsigned mode)
 {
if (S_ISGITLINK(mode)) {
strbuf_grow(buf, 100);
-   strbuf_addf(buf, "Subproject commit %s\n", sha1_to_hex(sha1));
+   strbuf_addf(buf, "Subproject commit %s\n", oid_to_hex(oid));
} else {
enum object_type type;
unsigned long sz;
char *result;
 
-   result = read_sha1_file(sha1, , );
+   result = read_sha1_file(oid->hash, , );
if (!result)
return -1;
/* XXX read_sha1_file NUL-terminates */
@@ -3220,7 +3220,7 @@ static int read_file_or_gitlink(const struct cache_entry 
*ce, struct strbuf *buf
 {
if

[PATCH v2 10/20] notes-merge: convert struct notes_merge_pair to struct object_id

2016-09-05 Thread brian m. carlson

Convert each of this structure's members from an unsigned char array to
a struct object_id.

Signed-off-by: brian m. carlson 
---
 notes-merge.c | 127 ++
 1 file changed, 65 insertions(+), 62 deletions(-)

diff --git a/notes-merge.c b/notes-merge.c
index 97fc42f6..cb36b43c 100644
--- a/notes-merge.c
+++ b/notes-merge.c
@@ -12,7 +12,7 @@
 #include "notes-utils.h"
 
 struct notes_merge_pair {
-   unsigned char obj[20], base[20], local[20], remote[20];
+   struct object_id obj, base, local, remote;
 };
 
 void init_notes_merge_options(struct notes_merge_options *o)
@@ -75,7 +75,7 @@ static struct notes_merge_pair *find_notes_merge_pair_pos(
int i = last_index < len ? last_index : len - 1;
int prev_cmp = 0, cmp = -1;
while (i >= 0 && i < len) {
-   cmp = hashcmp(obj, list[i].obj);
+   cmp = hashcmp(obj, list[i].obj.hash);
if (!cmp) /* obj belongs @ i */
break;
else if (cmp < 0 && prev_cmp <= 0) /* obj belongs < i */
@@ -108,9 +108,10 @@ static struct notes_merge_pair *find_notes_merge_pair_pos(
return list + i;
 }
 
-static unsigned char uninitialized[20] =
+static struct object_id uninitialized = {
"\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff" \
-   "\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff";
+   "\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff"
+};
 
 static struct notes_merge_pair *diff_tree_remote(struct notes_merge_options *o,
 const unsigned char *base,
@@ -149,25 +150,25 @@ static struct notes_merge_pair *diff_tree_remote(struct 
notes_merge_options *o,
mp = find_notes_merge_pair_pos(changes, len, obj, 1, );
if (occupied) {
/* We've found an addition/deletion pair */
-   assert(!hashcmp(mp->obj, obj));
+   assert(!hashcmp(mp->obj.hash, obj));
if (is_null_oid(>one->oid)) { /* addition */
-   assert(is_null_sha1(mp->remote));
-   hashcpy(mp->remote, p->two->oid.hash);
+   assert(is_null_oid(>remote));
+   oidcpy(>remote, >two->oid);
} else if (is_null_oid(>two->oid)) { /* deletion */
-   assert(is_null_sha1(mp->base));
-   hashcpy(mp->base, p->one->oid.hash);
+   assert(is_null_oid(>base));
+   oidcpy(>base, >one->oid);
} else
assert(!"Invalid existing change recorded");
} else {
-   hashcpy(mp->obj, obj);
-   hashcpy(mp->base, p->one->oid.hash);
-   hashcpy(mp->local, uninitialized);
-   hashcpy(mp->remote, p->two->oid.hash);
+   hashcpy(mp->obj.hash, obj);
+   oidcpy(>base, >one->oid);
+   oidcpy(>local, );
+   oidcpy(>remote, >two->oid);
len++;
}
trace_printf("\t\tStored remote change for %s: %.7s -> %.7s\n",
-  sha1_to_hex(mp->obj), sha1_to_hex(mp->base),
-  sha1_to_hex(mp->remote));
+  oid_to_hex(>obj), oid_to_hex(>base),
+  oid_to_hex(>remote));
}
diff_flush();
clear_pathspec();
@@ -216,7 +217,7 @@ static void diff_tree_local(struct notes_merge_options *o,
continue;
}
 
-   assert(!hashcmp(mp->obj, obj));
+   assert(!hashcmp(mp->obj.hash, obj));
if (is_null_oid(>two->oid)) { /* deletion */
/*
 * Either this is a true deletion (1), or it is part
@@ -227,8 +228,8 @@ static void diff_tree_local(struct notes_merge_options *o,
 * (3) mp->local is uninitialized; set it to null_sha1
 * (will be overwritten by following addition)
 */
-   if (!hashcmp(mp->local, uninitialized))
-   hashclr(mp->local);
+   if (!oidcmp(>local, ))
+   oidclr(>local);
} else if (is_null_oid(>one->oid)) { /* addition */
/*
 * Either this is a true addition (1), or it is part
@@ -238,22 +239,22 @@ static void diff_tree_local(struct notes_merge_options *o,
 * (2) mp->local is uninitialized; set to p->two->sha1
 * (3) mp->local is null_sha1; set to p->two->sha1
 */
-

[PATCH v2 03/20] builtin/blame: convert struct origin to use struct object_id

2016-09-05 Thread brian m. carlson

Convert struct origin to use struct object_id by applying the
following semantic patch and the object_id transforms from contrib,
plus the actual change to the struct:

@@
struct origin E1;
@@
- E1.blob_sha1
+ E1.blob_oid.hash

@@
struct origin *E1;
@@
- E1->blob_sha1
+ E1->blob_oid.hash

Signed-off-by: brian m. carlson 
---
 builtin/blame.c | 40 
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 5eb3725d..9c09d464 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -120,7 +120,7 @@ struct origin {
 */
struct blame_entry *suspects;
mmfile_t file;
-   unsigned char blob_sha1[20];
+   struct object_id blob_oid;
unsigned mode;
/* guilty gets set when shipping any suspects to the final
 * blame list instead of other commits
@@ -188,15 +188,16 @@ static void fill_origin_blob(struct diff_options *opt,
 
num_read_blob++;
if (DIFF_OPT_TST(opt, ALLOW_TEXTCONV) &&
-   textconv_object(o->path, o->mode, o->blob_sha1, 1, 
>ptr, _size))
+   textconv_object(o->path, o->mode, o->blob_oid.hash, 1, 
>ptr, _size))
;
else
-   file->ptr = read_sha1_file(o->blob_sha1, , 
_size);
+   file->ptr = read_sha1_file(o->blob_oid.hash, ,
+  _size);
file->size = file_size;
 
if (!file->ptr)
die("Cannot read blob %s for path %s",
-   sha1_to_hex(o->blob_sha1),
+   oid_to_hex(>blob_oid),
o->path);
o->file = *file;
}
@@ -508,17 +509,17 @@ static struct origin *get_origin(struct scoreboard *sb,
  */
 static int fill_blob_sha1_and_mode(struct origin *origin)
 {
-   if (!is_null_sha1(origin->blob_sha1))
+   if (!is_null_oid(>blob_oid))
return 0;
if (get_tree_entry(origin->commit->object.oid.hash,
   origin->path,
-  origin->blob_sha1, >mode))
+  origin->blob_oid.hash, >mode))
goto error_out;
-   if (sha1_object_info(origin->blob_sha1, NULL) != OBJ_BLOB)
+   if (sha1_object_info(origin->blob_oid.hash, NULL) != OBJ_BLOB)
goto error_out;
return 0;
  error_out:
-   hashclr(origin->blob_sha1);
+   oidclr(>blob_oid);
origin->mode = S_IFINVALID;
return -1;
 }
@@ -572,7 +573,7 @@ static struct origin *find_origin(struct scoreboard *sb,
if (!diff_queued_diff.nr) {
/* The path is the same as parent */
porigin = get_origin(sb, parent, origin->path);
-   hashcpy(porigin->blob_sha1, origin->blob_sha1);
+   oidcpy(>blob_oid, >blob_oid);
porigin->mode = origin->mode;
} else {
/*
@@ -598,7 +599,7 @@ static struct origin *find_origin(struct scoreboard *sb,
p->status);
case 'M':
porigin = get_origin(sb, parent, origin->path);
-   hashcpy(porigin->blob_sha1, p->one->oid.hash);
+   oidcpy(>blob_oid, >one->oid);
porigin->mode = p->one->mode;
break;
case 'A':
@@ -644,7 +645,7 @@ static struct origin *find_rename(struct scoreboard *sb,
if ((p->status == 'R' || p->status == 'C') &&
!strcmp(p->two->path, origin->path)) {
porigin = get_origin(sb, parent, p->one->path);
-   hashcpy(porigin->blob_sha1, p->one->oid.hash);
+   oidcpy(>blob_oid, >one->oid);
porigin->mode = p->one->mode;
break;
}
@@ -1308,7 +1309,7 @@ static void find_copy_in_parent(struct scoreboard *sb,
continue;
 
norigin = get_origin(sb, parent, p->one->path);
-   hashcpy(norigin->blob_sha1, p->one->oid.hash);
+   oidcpy(>blob_oid, >one->oid);
norigin->mode = p->one->mode;
fill_origin_blob(>revs->diffopt, norigin, _p);
if (!file_p.ptr)
@@ -1458,15 +1459,14 @@ static void pass_blame(struct scoreboard *sb, struct 
origin *origin, int opt)
porigin = find(sb, p, origin);
if (!porigin)
continue;
-   if (!hashcmp(porigin->blob_sha1, origin->blob_sha1)) {
+   if (!oidcmp(>blob_oid, >blob_oid)) {
pass_whole_blame(sb, origin, porigin);

[PATCH v2 00/20] object_id part 5

2016-09-05 Thread brian m. carlson


This is the fifth in a series of series to convert from unsigned char [20] to
struct object_id.

This series converts many of the files in the builtin directory to use struct
object_id.  This gets us almost to the point where we can convert get_tree_entry
to use struct object_id, but not quite.  That function is used indirectly by
get_sha1, meaning that get_oid would have to completely replace it in order for
get_tree_entry to be converted.

However, this series tackles one of two major sources of object ID values: the
command line (the other, of course, being the refs code).  Converting several of
the builtin commands to use struct object_id as much as possible makes it easier
to convert other functions down the line.

Changes from v1:
* More clearly state that the Coccinelle-using patches also have a manual struct
  update.
* Convert the uninitialized constant in notes-merge.c.
* Convert one additional use of update_ref into update_ref_oid.

brian m. carlson (20):
  cache: convert struct cache_entry to use struct object_id
  builtin/apply: convert static functions to struct object_id
  builtin/blame: convert struct origin to use struct object_id
  builtin/log: convert some static functions to use struct object_id
  builtin/cat-file: convert struct expand_data to use struct object_id
  builtin/cat-file: convert some static functions to struct object_id
  builtin: convert textconv_object to use struct object_id
  streaming: make stream_blob_to_fd take struct object_id
  builtin/checkout: convert some static functions to struct object_id
  notes-merge: convert struct notes_merge_pair to struct object_id
  Convert read_mmblob to take struct object_id.
  builtin/blame: convert file to use struct object_id
  builtin/rm: convert to use struct object_id
  notes: convert init_notes to use struct object_id
  builtin/update-index: convert file to struct object_id
  sha1_name: convert get_sha1_mb to struct object_id
  refs: add an update_ref_oid function.
  builtin/am: convert to struct object_id
  builtin/commit-tree: convert to struct object_id
  builtin/reset: convert to use struct object_id

 builtin.h|   2 +-
 builtin/am.c | 140 +++
 builtin/apply.c  |  94 +-
 builtin/blame.c  |  76 ++---
 builtin/cat-file.c   |  70 ++--
 builtin/checkout.c   |  70 ++--
 builtin/commit-tree.c|  20 +++---
 builtin/fsck.c   |   4 +-
 builtin/grep.c   |   3 +-
 builtin/log.c|  44 ++--
 builtin/ls-files.c   |   2 +-
 builtin/merge-index.c|   2 +-
 builtin/read-tree.c  |   2 +-
 builtin/reset.c  |  52 +++
 builtin/rm.c |  18 ++---
 builtin/submodule--helper.c  |   5 +-
 builtin/update-index.c   |  67 ++-
 cache-tree.c |   4 +-
 cache.h  |   4 +-
 diff-lib.c   |  31 +
 diff.c   |   2 +-
 dir.c|   7 +-
 entry.c  |   9 +--
 merge-recursive.c|   8 +--
 notes-merge.c| 127 ++-
 notes.c  |  12 ++--
 read-cache.c |  24 +++
 refs.c   |   8 +++
 refs.h   |   3 +
 rerere.c |   3 +-
 resolve-undo.c   |   2 +-
 revision.c   |   2 +-
 sha1_name.c  |  20 +++---
 streaming.c  |   4 +-
 streaming.h  |   2 +-
 t/helper/test-dump-split-index.c |   2 +-
 tree.c   |   2 +-
 unpack-trees.c   |   8 +--
 xdiff-interface.c|   8 +--
 xdiff-interface.h|   3 +-
 40 files changed, 497 insertions(+), 469 deletions(-)

[PATCH v2 09/20] builtin/checkout: convert some static functions to struct object_id

2016-09-05 Thread brian m. carlson

Convert all the static functions that are not callbacks to struct
object_id.

Signed-off-by: brian m. carlson 
---
 builtin/checkout.c | 66 +++---
 1 file changed, 33 insertions(+), 33 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index a9523ffa..ec85af56 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -175,9 +175,9 @@ static int checkout_merged(int pos, struct checkout *state)
const char *path = ce->name;
mmfile_t ancestor, ours, theirs;
int status;
-   unsigned char sha1[20];
+   struct object_id oid;
mmbuffer_t result_buf;
-   unsigned char threeway[3][20];
+   struct object_id threeway[3];
unsigned mode = 0;
 
memset(threeway, 0, sizeof(threeway));
@@ -186,18 +186,18 @@ static int checkout_merged(int pos, struct checkout 
*state)
stage = ce_stage(ce);
if (!stage || strcmp(path, ce->name))
break;
-   hashcpy(threeway[stage - 1], ce->oid.hash);
+   oidcpy([stage - 1], >oid);
if (stage == 2)
mode = create_ce_mode(ce->ce_mode);
pos++;
ce = active_cache[pos];
}
-   if (is_null_sha1(threeway[1]) || is_null_sha1(threeway[2]))
+   if (is_null_oid([1]) || is_null_oid([2]))
return error(_("path '%s' does not have necessary versions"), 
path);
 
-   read_mmblob(, threeway[0]);
-   read_mmblob(, threeway[1]);
-   read_mmblob(, threeway[2]);
+   read_mmblob(, threeway[0].hash);
+   read_mmblob(, threeway[1].hash);
+   read_mmblob(, threeway[2].hash);
 
/*
 * NEEDSWORK: re-create conflicts from merges with
@@ -226,9 +226,9 @@ static int checkout_merged(int pos, struct checkout *state)
 * object database even when it may contain conflicts).
 */
if (write_sha1_file(result_buf.ptr, result_buf.size,
-   blob_type, sha1))
+   blob_type, oid.hash))
die(_("Unable to add merge result for '%s'"), path);
-   ce = make_cache_entry(mode, sha1, path, 2, 0);
+   ce = make_cache_entry(mode, oid.hash, path, 2, 0);
if (!ce)
die(_("make_cache_entry failed for path '%s'"), path);
status = checkout_entry(ce, state, NULL);
@@ -241,7 +241,7 @@ static int checkout_paths(const struct checkout_opts *opts,
int pos;
struct checkout state;
static char *ps_matched;
-   unsigned char rev[20];
+   struct object_id rev;
struct commit *head;
int errs = 0;
struct lock_file *lock_file;
@@ -374,8 +374,8 @@ static int checkout_paths(const struct checkout_opts *opts,
if (write_locked_index(_index, lock_file, COMMIT_LOCK))
die(_("unable to write new index file"));
 
-   read_ref_full("HEAD", 0, rev, NULL);
-   head = lookup_commit_reference_gently(rev, 1);
+   read_ref_full("HEAD", 0, rev.hash, NULL);
+   head = lookup_commit_reference_gently(rev.hash, 1);
 
errs |= post_checkout_hook(head, head, 0);
return errs;
@@ -808,11 +808,11 @@ static int switch_branches(const struct checkout_opts 
*opts,
int ret = 0;
struct branch_info old;
void *path_to_free;
-   unsigned char rev[20];
+   struct object_id rev;
int flag, writeout_error = 0;
memset(, 0, sizeof(old));
-   old.path = path_to_free = resolve_refdup("HEAD", 0, rev, );
-   old.commit = lookup_commit_reference_gently(rev, 1);
+   old.path = path_to_free = resolve_refdup("HEAD", 0, rev.hash, );
+   old.commit = lookup_commit_reference_gently(rev.hash, 1);
if (!(flag & REF_ISSYMREF))
old.path = NULL;
 
@@ -860,7 +860,7 @@ static int git_checkout_config(const char *var, const char 
*value, void *cb)
 struct tracking_name_data {
/* const */ char *src_ref;
char *dst_ref;
-   unsigned char *dst_sha1;
+   struct object_id *dst_oid;
int unique;
 };
 
@@ -871,7 +871,7 @@ static int check_tracking_name(struct remote *remote, void 
*cb_data)
memset(, 0, sizeof(struct refspec));
query.src = cb->src_ref;
if (remote_find_tracking(remote, ) ||
-   get_sha1(query.dst, cb->dst_sha1)) {
+   get_oid(query.dst, cb->dst_oid)) {
free(query.dst);
return 0;
}
@@ -884,13 +884,13 @@ static int check_tracking_name(struct remote *remote, 
void *cb_data)
return 0;
 }
 
-static const char *unique_tracking_name(const char *name, unsigned char *sha1)
+static const char *unique_tracking_name(const char *name, struct object_id 
*oid)
 {
struct tracking_name_data cb_data = { NULL, NULL, NULL, 1 };
char src_ref[PATH_MAX];
snprintf(src_ref, PATH_MAX, "refs/heads/%s", name);

[PATCH v2 08/20] streaming: make stream_blob_to_fd take struct object_id

2016-09-05 Thread brian m. carlson

Since all of its callers have been updated, modify stream_blob_to_fd to
take a struct object_id.

Signed-off-by: brian m. carlson 
---
 builtin/cat-file.c | 6 +++---
 builtin/fsck.c | 2 +-
 builtin/log.c  | 4 ++--
 entry.c| 2 +-
 streaming.c| 4 ++--
 streaming.h| 2 +-
 6 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 7b2e0537..49b8fa8e 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -83,7 +83,7 @@ static int cat_one_file(int opt, const char *exp_type, const 
char *obj_name,
}
 
if (type == OBJ_BLOB)
-   return stream_blob_to_fd(1, oid.hash, NULL, 0);
+   return stream_blob_to_fd(1, , NULL, 0);
buf = read_sha1_file(oid.hash, , );
if (!buf)
die("Cannot read object %s", obj_name);
@@ -105,7 +105,7 @@ static int cat_one_file(int opt, const char *exp_type, 
const char *obj_name,
oidcpy(_oid, );
 
if (sha1_object_info(blob_oid.hash, NULL) == OBJ_BLOB)
-   return stream_blob_to_fd(1, blob_oid.hash, 
NULL, 0);
+   return stream_blob_to_fd(1, _oid, NULL, 0);
/*
 * we attempted to dereference a tag to a blob
 * and failed; there may be new dereference
@@ -240,7 +240,7 @@ static void print_object_or_die(struct batch_options *opt, 
struct expand_data *d
if (data->type == OBJ_BLOB) {
if (opt->buffer_output)
fflush(stdout);
-   if (stream_blob_to_fd(1, oid->hash, NULL, 0) < 0)
+   if (stream_blob_to_fd(1, oid, NULL, 0) < 0)
die("unable to stream %s to stdout", oid_to_hex(oid));
}
else {
diff --git a/builtin/fsck.c b/builtin/fsck.c
index f604adff..055dfdcf 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -268,7 +268,7 @@ static void check_unreachable_object(struct object *obj)
if (!(f = fopen(filename, "w")))
die_errno("Could not open '%s'", filename);
if (obj->type == OBJ_BLOB) {
-   if (stream_blob_to_fd(fileno(f), obj->oid.hash, 
NULL, 1))
+   if (stream_blob_to_fd(fileno(f), >oid, 
NULL, 1))
die_errno("Could not write '%s'", 
filename);
} else
fprintf(f, "%s\n", describe_object(obj));
diff --git a/builtin/log.c b/builtin/log.c
index 48b9db51..0b427b67 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -474,13 +474,13 @@ static int show_blob_object(const struct object_id *oid, 
struct rev_info *rev, c
fflush(rev->diffopt.file);
if (!DIFF_OPT_TOUCHED(>diffopt, ALLOW_TEXTCONV) ||
!DIFF_OPT_TST(>diffopt, ALLOW_TEXTCONV))
-   return stream_blob_to_fd(1, oid->hash, NULL, 0);
+   return stream_blob_to_fd(1, oid, NULL, 0);
 
if (get_sha1_with_context(obj_name, 0, oidc.hash, _context))
die(_("Not a valid object name %s"), obj_name);
if (!obj_context.path[0] ||
!textconv_object(obj_context.path, obj_context.mode, , 1, 
, ))
-   return stream_blob_to_fd(1, oid->hash, NULL, 0);
+   return stream_blob_to_fd(1, oid, NULL, 0);
 
if (!buf)
die(_("git show %s: bad file"), obj_name);
diff --git a/entry.c b/entry.c
index ce80d292..c6eea240 100644
--- a/entry.c
+++ b/entry.c
@@ -127,7 +127,7 @@ static int streaming_write_entry(const struct cache_entry 
*ce, char *path,
if (fd < 0)
return -1;
 
-   result |= stream_blob_to_fd(fd, ce->oid.hash, filter, 1);
+   result |= stream_blob_to_fd(fd, >oid, filter, 1);
*fstat_done = fstat_output(fd, state, statbuf);
result |= close(fd);
 
diff --git a/streaming.c b/streaming.c
index 811fcc24..3c48f049 100644
--- a/streaming.c
+++ b/streaming.c
@@ -497,7 +497,7 @@ static open_method_decl(incore)
  * Users of streaming interface
  /
 
-int stream_blob_to_fd(int fd, unsigned const char *sha1, struct stream_filter 
*filter,
+int stream_blob_to_fd(int fd, const struct object_id *oid, struct 
stream_filter *filter,
  int can_seek)
 {
struct git_istream *st;
@@ -506,7 +506,7 @@ int stream_blob_to_fd(int fd, unsigned const char *sha1, 
struct stream_filter *f
ssize_t kept = 0;
int result = -1;
 
-   st = open_istream(sha1, , , filter);
+   st = open_istream(oid->hash, , , filter);
if (!st) {
if (filter)
free_stream_filter(filter);
diff --git a/streaming.h b/streaming.h

[PATCH v2 04/20] builtin/log: convert some static functions to use struct object_id

2016-09-05 Thread brian m. carlson

Signed-off-by: brian m. carlson 
---
 builtin/log.c | 44 ++--
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 92dc34dc..226212c9 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -464,9 +464,9 @@ static void show_tagger(char *buf, int len, struct rev_info 
*rev)
strbuf_release();
 }
 
-static int show_blob_object(const unsigned char *sha1, struct rev_info *rev, 
const char *obj_name)
+static int show_blob_object(const struct object_id *oid, struct rev_info *rev, 
const char *obj_name)
 {
-   unsigned char sha1c[20];
+   struct object_id oidc;
struct object_context obj_context;
char *buf;
unsigned long size;
@@ -474,13 +474,13 @@ static int show_blob_object(const unsigned char *sha1, 
struct rev_info *rev, con
fflush(rev->diffopt.file);
if (!DIFF_OPT_TOUCHED(>diffopt, ALLOW_TEXTCONV) ||
!DIFF_OPT_TST(>diffopt, ALLOW_TEXTCONV))
-   return stream_blob_to_fd(1, sha1, NULL, 0);
+   return stream_blob_to_fd(1, oid->hash, NULL, 0);
 
-   if (get_sha1_with_context(obj_name, 0, sha1c, _context))
+   if (get_sha1_with_context(obj_name, 0, oidc.hash, _context))
die(_("Not a valid object name %s"), obj_name);
if (!obj_context.path[0] ||
-   !textconv_object(obj_context.path, obj_context.mode, sha1c, 1, 
, ))
-   return stream_blob_to_fd(1, sha1, NULL, 0);
+   !textconv_object(obj_context.path, obj_context.mode, oidc.hash, 1, 
, ))
+   return stream_blob_to_fd(1, oid->hash, NULL, 0);
 
if (!buf)
die(_("git show %s: bad file"), obj_name);
@@ -489,15 +489,15 @@ static int show_blob_object(const unsigned char *sha1, 
struct rev_info *rev, con
return 0;
 }
 
-static int show_tag_object(const unsigned char *sha1, struct rev_info *rev)
+static int show_tag_object(const struct object_id *oid, struct rev_info *rev)
 {
unsigned long size;
enum object_type type;
-   char *buf = read_sha1_file(sha1, , );
+   char *buf = read_sha1_file(oid->hash, , );
int offset = 0;
 
if (!buf)
-   return error(_("Could not read object %s"), sha1_to_hex(sha1));
+   return error(_("Could not read object %s"), oid_to_hex(oid));
 
assert(type == OBJ_TAG);
while (offset < size && buf[offset] != '\n') {
@@ -574,7 +574,7 @@ int cmd_show(int argc, const char **argv, const char 
*prefix)
const char *name = objects[i].name;
switch (o->type) {
case OBJ_BLOB:
-   ret = show_blob_object(o->oid.hash, , name);
+   ret = show_blob_object(>oid, , name);
break;
case OBJ_TAG: {
struct tag *t = (struct tag *)o;
@@ -585,7 +585,7 @@ int cmd_show(int argc, const char **argv, const char 
*prefix)
diff_get_color_opt(, 
DIFF_COMMIT),
t->tag,
diff_get_color_opt(, 
DIFF_RESET));
-   ret = show_tag_object(o->oid.hash, );
+   ret = show_tag_object(>oid, );
rev.shown_one = 1;
if (ret)
break;
@@ -1248,11 +1248,11 @@ static struct commit *get_base_commit(const char 
*base_commit,
if (upstream) {
struct commit_list *base_list;
struct commit *commit;
-   unsigned char sha1[20];
+   struct object_id oid;
 
-   if (get_sha1(upstream, sha1))
+   if (get_oid(upstream, ))
die(_("Failed to resolve '%s' as a valid 
ref."), upstream);
-   commit = lookup_commit_or_die(sha1, "upstream base");
+   commit = lookup_commit_or_die(oid.hash, "upstream 
base");
base_list = get_merge_bases_many(commit, total, list);
/* There should be one and only one merge base. */
if (!base_list || base_list->next)
@@ -1339,15 +1339,15 @@ static void prepare_bases(struct base_tree_info *bases,
 * and stuff them in bases structure.
 */
while ((commit = get_revision()) != NULL) {
-   unsigned char sha1[20];
+   struct object_id oid;
struct object_id *patch_id;
if (commit->util)
continue;
-   if (commit_patch_id(commit, , sha1, 0))
+   if (commit_patch_id(commit, , oid.hash, 0))
die(_("cannot get patch id"));
ALLOC_GROW(bases->patch_id, bases->nr_patch_id + 1, 
bases->alloc_patch_id);

[PATCH v2 19/20] builtin/commit-tree: convert to struct object_id

2016-09-05 Thread brian m. carlson

Signed-off-by: brian m. carlson 
---
 builtin/commit-tree.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/builtin/commit-tree.c b/builtin/commit-tree.c
index 8a674bc9..60501726 100644
--- a/builtin/commit-tree.c
+++ b/builtin/commit-tree.c
@@ -40,8 +40,8 @@ int cmd_commit_tree(int argc, const char **argv, const char 
*prefix)
 {
int i, got_tree = 0;
struct commit_list *parents = NULL;
-   unsigned char tree_sha1[20];
-   unsigned char commit_sha1[20];
+   struct object_id tree_oid;
+   struct object_id commit_oid;
struct strbuf buffer = STRBUF_INIT;
 
git_config(commit_tree_config, NULL);
@@ -52,13 +52,13 @@ int cmd_commit_tree(int argc, const char **argv, const char 
*prefix)
for (i = 1; i < argc; i++) {
const char *arg = argv[i];
if (!strcmp(arg, "-p")) {
-   unsigned char sha1[20];
+   struct object_id oid;
if (argc <= ++i)
usage(commit_tree_usage);
-   if (get_sha1_commit(argv[i], sha1))
+   if (get_sha1_commit(argv[i], oid.hash))
die("Not a valid object name %s", argv[i]);
-   assert_sha1_type(sha1, OBJ_COMMIT);
-   new_parent(lookup_commit(sha1), );
+   assert_sha1_type(oid.hash, OBJ_COMMIT);
+   new_parent(lookup_commit(oid.hash), );
continue;
}
 
@@ -105,7 +105,7 @@ int cmd_commit_tree(int argc, const char **argv, const char 
*prefix)
continue;
}
 
-   if (get_sha1_tree(arg, tree_sha1))
+   if (get_sha1_tree(arg, tree_oid.hash))
die("Not a valid object name %s", arg);
if (got_tree)
die("Cannot give more than one trees");
@@ -117,13 +117,13 @@ int cmd_commit_tree(int argc, const char **argv, const 
char *prefix)
die_errno("git commit-tree: failed to read");
}
 
-   if (commit_tree(buffer.buf, buffer.len, tree_sha1, parents,
-   commit_sha1, NULL, sign_commit)) {
+   if (commit_tree(buffer.buf, buffer.len, tree_oid.hash, parents,
+   commit_oid.hash, NULL, sign_commit)) {
strbuf_release();
return 1;
}
 
-   printf("%s\n", sha1_to_hex(commit_sha1));
+   printf("%s\n", oid_to_hex(_oid));
strbuf_release();
return 0;
 }

[PATCH v2 05/20] builtin/cat-file: convert struct expand_data to use struct object_id

2016-09-05 Thread brian m. carlson

Convert struct cache_entry to use struct object_id by applying the
following semantic patch and the object_id transforms from contrib,
plus the actual change to the struct:

@@
struct expand_data E1;
@@
- E1.sha1
+ E1.oid.hash

@@
struct expand_data *E1;
@@
- E1->sha1
+ E1->oid.hash

@@
struct expand_data E1;
@@
- E1.delta_base_sha1
+ E1.delta_base_oid.hash

@@
struct expand_data *E1;
@@
- E1->delta_base_sha1
+ E1->delta_base_oid.hash

Signed-off-by: brian m. carlson 
---
 builtin/cat-file.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 2dfe6265..16b0b8c9 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -128,12 +128,12 @@ static int cat_one_file(int opt, const char *exp_type, 
const char *obj_name,
 }
 
 struct expand_data {
-   unsigned char sha1[20];
+   struct object_id oid;
enum object_type type;
unsigned long size;
off_t disk_size;
const char *rest;
-   unsigned char delta_base_sha1[20];
+   struct object_id delta_base_oid;
 
/*
 * If mark_query is true, we do not expand anything, but rather
@@ -176,7 +176,7 @@ static void expand_atom(struct strbuf *sb, const char 
*atom, int len,
 
if (is_atom("objectname", atom, len)) {
if (!data->mark_query)
-   strbuf_addstr(sb, sha1_to_hex(data->sha1));
+   strbuf_addstr(sb, oid_to_hex(>oid));
} else if (is_atom("objecttype", atom, len)) {
if (data->mark_query)
data->info.typep = >type;
@@ -199,9 +199,10 @@ static void expand_atom(struct strbuf *sb, const char 
*atom, int len,
strbuf_addstr(sb, data->rest);
} else if (is_atom("deltabase", atom, len)) {
if (data->mark_query)
-   data->info.delta_base_sha1 = data->delta_base_sha1;
+   data->info.delta_base_sha1 = data->delta_base_oid.hash;
else
-   strbuf_addstr(sb, sha1_to_hex(data->delta_base_sha1));
+   strbuf_addstr(sb,
+ oid_to_hex(>delta_base_oid));
} else
die("unknown format element: %.*s", len, atom);
 }
@@ -232,7 +233,7 @@ static void batch_write(struct batch_options *opt, const 
void *data, int len)
 
 static void print_object_or_die(struct batch_options *opt, struct expand_data 
*data)
 {
-   const unsigned char *sha1 = data->sha1;
+   const unsigned char *sha1 = data->oid.hash;
 
assert(data->info.typep);
 
@@ -266,8 +267,9 @@ static void batch_object_write(const char *obj_name, struct 
batch_options *opt,
struct strbuf buf = STRBUF_INIT;
 
if (!data->skip_object_info &&
-   sha1_object_info_extended(data->sha1, >info, 
LOOKUP_REPLACE_OBJECT) < 0) {
-   printf("%s missing\n", obj_name ? obj_name : 
sha1_to_hex(data->sha1));
+   sha1_object_info_extended(data->oid.hash, >info, 
LOOKUP_REPLACE_OBJECT) < 0) {
+   printf("%s missing\n",
+  obj_name ? obj_name : oid_to_hex(>oid));
fflush(stdout);
return;
}
@@ -290,7 +292,7 @@ static void batch_one_object(const char *obj_name, struct 
batch_options *opt,
int flags = opt->follow_symlinks ? GET_SHA1_FOLLOW_SYMLINKS : 0;
enum follow_symlinks_result result;
 
-   result = get_sha1_with_context(obj_name, flags, data->sha1, );
+   result = get_sha1_with_context(obj_name, flags, data->oid.hash, );
if (result != FOUND) {
switch (result) {
case MISSING_OBJECT:
@@ -336,7 +338,7 @@ struct object_cb_data {
 static void batch_object_cb(const unsigned char sha1[20], void *vdata)
 {
struct object_cb_data *data = vdata;
-   hashcpy(data->expand->sha1, sha1);
+   hashcpy(data->expand->oid.hash, sha1);
batch_object_write(NULL, data->opt, data->expand);
 }

[PATCH v2 20/20] builtin/reset: convert to use struct object_id

2016-09-05 Thread brian m. carlson

Signed-off-by: brian m. carlson 
---
 builtin/reset.c | 52 ++--
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/builtin/reset.c b/builtin/reset.c
index 9020ec66..5aa86079 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -39,7 +39,7 @@ static inline int is_merge(void)
return !access(git_path_merge_head(), F_OK);
 }
 
-static int reset_index(const unsigned char *sha1, int reset_type, int quiet)
+static int reset_index(const struct object_id *oid, int reset_type, int quiet)
 {
int nr = 1;
struct tree_desc desc[2];
@@ -69,22 +69,22 @@ static int reset_index(const unsigned char *sha1, int 
reset_type, int quiet)
read_cache_unmerged();
 
if (reset_type == KEEP) {
-   unsigned char head_sha1[20];
-   if (get_sha1("HEAD", head_sha1))
+   struct object_id head_oid;
+   if (get_oid("HEAD", _oid))
return error(_("You do not have a valid HEAD."));
-   if (!fill_tree_descriptor(desc, head_sha1))
+   if (!fill_tree_descriptor(desc, head_oid.hash))
return error(_("Failed to find tree of HEAD."));
nr++;
opts.fn = twoway_merge;
}
 
-   if (!fill_tree_descriptor(desc + nr - 1, sha1))
-   return error(_("Failed to find tree of %s."), 
sha1_to_hex(sha1));
+   if (!fill_tree_descriptor(desc + nr - 1, oid->hash))
+   return error(_("Failed to find tree of %s."), oid_to_hex(oid));
if (unpack_trees(nr, desc, ))
return -1;
 
if (reset_type == MIXED || reset_type == HARD) {
-   tree = parse_tree_indirect(sha1);
+   tree = parse_tree_indirect(oid->hash);
prime_cache_tree(_index, tree);
}
 
@@ -143,7 +143,7 @@ static void update_index_from_diff(struct diff_queue_struct 
*q,
 }
 
 static int read_from_tree(const struct pathspec *pathspec,
- unsigned char *tree_sha1,
+ struct object_id *tree_oid,
  int intent_to_add)
 {
struct diff_options opt;
@@ -154,7 +154,7 @@ static int read_from_tree(const struct pathspec *pathspec,
opt.format_callback = update_index_from_diff;
opt.format_callback_data = _to_add;
 
-   if (do_diff_cache(tree_sha1, ))
+   if (do_diff_cache(tree_oid->hash, ))
return 1;
diffcore_std();
diff_flush();
@@ -191,7 +191,7 @@ static void parse_args(struct pathspec *pathspec,
   const char **rev_ret)
 {
const char *rev = "HEAD";
-   unsigned char unused[20];
+   struct object_id unused;
/*
 * Possible arguments are:
 *
@@ -216,8 +216,8 @@ static void parse_args(struct pathspec *pathspec,
 * has to be unambiguous. If there is a single argument, it
 * can not be a tree
 */
-   else if ((!argv[1] && !get_sha1_committish(argv[0], unused)) ||
-(argv[1] && !get_sha1_treeish(argv[0], unused))) {
+   else if ((!argv[1] && !get_sha1_committish(argv[0], 
unused.hash)) ||
+(argv[1] && !get_sha1_treeish(argv[0], unused.hash))) {
/*
 * Ok, argv[0] looks like a commit/tree; it should not
 * be a filename.
@@ -241,24 +241,24 @@ static void parse_args(struct pathspec *pathspec,
   prefix, argv);
 }
 
-static int reset_refs(const char *rev, const unsigned char *sha1)
+static int reset_refs(const char *rev, const struct object_id *oid)
 {
int update_ref_status;
struct strbuf msg = STRBUF_INIT;
-   unsigned char *orig = NULL, sha1_orig[20],
-   *old_orig = NULL, sha1_old_orig[20];
+   struct object_id *orig = NULL, oid_orig,
+   *old_orig = NULL, oid_old_orig;
 
-   if (!get_sha1("ORIG_HEAD", sha1_old_orig))
-   old_orig = sha1_old_orig;
-   if (!get_sha1("HEAD", sha1_orig)) {
-   orig = sha1_orig;
+   if (!get_oid("ORIG_HEAD", _old_orig))
+   old_orig = _old_orig;
+   if (!get_oid("HEAD", _orig)) {
+   orig = _orig;
set_reflog_message(, "updating ORIG_HEAD", NULL);
-   update_ref(msg.buf, "ORIG_HEAD", orig, old_orig, 0,
+   update_ref_oid(msg.buf, "ORIG_HEAD", orig, old_orig, 0,
   UPDATE_REFS_MSG_ON_ERR);
} else if (old_orig)
-   delete_ref("ORIG_HEAD", old_orig, 0);
+   delete_ref("ORIG_HEAD", old_orig->hash, 0);
set_reflog_message(, "updating HEAD", rev);
-   update_ref_status = update_ref(msg.buf, "HEAD", sha1, orig, 0,
+   update_ref_status = update_ref_oid(msg.buf, "HEAD", oid, orig, 0,

[PATCH v2 13/20] builtin/rm: convert to use struct object_id

2016-09-05 Thread brian m. carlson

Signed-off-by: brian m. carlson 
---
 builtin/rm.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/builtin/rm.c b/builtin/rm.c
index 109969d5..3f3e24eb 100644
--- a/builtin/rm.c
+++ b/builtin/rm.c
@@ -107,7 +107,7 @@ static int check_submodules_use_gitfiles(void)
return errs;
 }
 
-static int check_local_mod(unsigned char *head, int index_only)
+static int check_local_mod(struct object_id *head, int index_only)
 {
/*
 * Items in list are already sorted in the cache order,
@@ -123,13 +123,13 @@ static int check_local_mod(unsigned char *head, int 
index_only)
struct string_list files_submodule = STRING_LIST_INIT_NODUP;
struct string_list files_local = STRING_LIST_INIT_NODUP;
 
-   no_head = is_null_sha1(head);
+   no_head = is_null_oid(head);
for (i = 0; i < list.nr; i++) {
struct stat st;
int pos;
const struct cache_entry *ce;
const char *name = list.entry[i].name;
-   unsigned char sha1[20];
+   struct object_id oid;
unsigned mode;
int local_changes = 0;
int staged_changes = 0;
@@ -197,9 +197,9 @@ static int check_local_mod(unsigned char *head, int 
index_only)
 * way as changed from the HEAD.
 */
if (no_head
-|| get_tree_entry(head, name, sha1, )
+|| get_tree_entry(head->hash, name, oid.hash, )
 || ce->ce_mode != create_ce_mode(mode)
-|| hashcmp(ce->oid.hash, sha1))
+|| oidcmp(>oid, ))
staged_changes = 1;
 
/*
@@ -351,10 +351,10 @@ int cmd_rm(int argc, const char **argv, const char 
*prefix)
 * report no changes unless forced.
 */
if (!force) {
-   unsigned char sha1[20];
-   if (get_sha1("HEAD", sha1))
-   hashclr(sha1);
-   if (check_local_mod(sha1, index_only))
+   struct object_id oid;
+   if (get_oid("HEAD", ))
+   oidclr();
+   if (check_local_mod(, index_only))
exit(1);
} else if (!index_only) {
if (check_submodules_use_gitfiles())

[PATCH v2 16/20] sha1_name: convert get_sha1_mb to struct object_id

2016-09-05 Thread brian m. carlson

All of the callers of this function use struct object_id, so rename it
to get_oid_mb and make it take struct object_id instead of
unsigned char *.

Signed-off-by: brian m. carlson 
---
 builtin/checkout.c |  2 +-
 cache.h|  2 +-
 sha1_name.c| 18 +-
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 13169221..8013a1b8 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -973,7 +973,7 @@ static int parse_branchname_arg(int argc, const char **argv,
if (!strcmp(arg, "-"))
arg = "@{-1}";
 
-   if (get_sha1_mb(arg, rev->hash)) {
+   if (get_oid_mb(arg, rev)) {
/*
 * Either case (3) or (4), with  not being
 * a commit, or an attempt to use case (1) with an
diff --git a/cache.h b/cache.h
index a679484e..e40165d1 100644
--- a/cache.h
+++ b/cache.h
@@ -1204,7 +1204,7 @@ extern char *sha1_to_hex(const unsigned char *sha1);  
/* static buffer result! */
 extern char *oid_to_hex(const struct object_id *oid);  /* same static buffer 
as sha1_to_hex */
 
 extern int interpret_branch_name(const char *str, int len, struct strbuf *);
-extern int get_sha1_mb(const char *str, unsigned char *sha1);
+extern int get_oid_mb(const char *str, struct object_id *oid);
 
 extern int validate_headref(const char *ref);
 
diff --git a/sha1_name.c b/sha1_name.c
index e4404391..faf873cf 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -995,35 +995,35 @@ static int interpret_nth_prior_checkout(const char *name, 
int namelen,
return retval;
 }
 
-int get_sha1_mb(const char *name, unsigned char *sha1)
+int get_oid_mb(const char *name, struct object_id *oid)
 {
struct commit *one, *two;
struct commit_list *mbs;
-   unsigned char sha1_tmp[20];
+   struct object_id oid_tmp;
const char *dots;
int st;
 
dots = strstr(name, "...");
if (!dots)
-   return get_sha1(name, sha1);
+   return get_oid(name, oid);
if (dots == name)
-   st = get_sha1("HEAD", sha1_tmp);
+   st = get_oid("HEAD", _tmp);
else {
struct strbuf sb;
strbuf_init(, dots - name);
strbuf_add(, name, dots - name);
-   st = get_sha1_committish(sb.buf, sha1_tmp);
+   st = get_sha1_committish(sb.buf, oid_tmp.hash);
strbuf_release();
}
if (st)
return st;
-   one = lookup_commit_reference_gently(sha1_tmp, 0);
+   one = lookup_commit_reference_gently(oid_tmp.hash, 0);
if (!one)
return -1;
 
-   if (get_sha1_committish(dots[3] ? (dots + 3) : "HEAD", sha1_tmp))
+   if (get_sha1_committish(dots[3] ? (dots + 3) : "HEAD", oid_tmp.hash))
return -1;
-   two = lookup_commit_reference_gently(sha1_tmp, 0);
+   two = lookup_commit_reference_gently(oid_tmp.hash, 0);
if (!two)
return -1;
mbs = get_merge_bases(one, two);
@@ -1031,7 +1031,7 @@ int get_sha1_mb(const char *name, unsigned char *sha1)
st = -1;
else {
st = 0;
-   hashcpy(sha1, mbs->item->object.oid.hash);
+   oidcpy(oid, >item->object.oid);
}
free_commit_list(mbs);
return st;

[PATCH v2 14/20] notes: convert init_notes to use struct object_id

2016-09-05 Thread brian m. carlson

Signed-off-by: brian m. carlson 
---
 notes.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/notes.c b/notes.c
index df4660fe..2bab961a 100644
--- a/notes.c
+++ b/notes.c
@@ -993,7 +993,7 @@ const char *default_notes_ref(void)
 void init_notes(struct notes_tree *t, const char *notes_ref,
combine_notes_fn combine_notes, int flags)
 {
-   unsigned char sha1[20], object_sha1[20];
+   struct object_id oid, object_oid;
unsigned mode;
struct leaf_node root_tree;
 
@@ -1017,16 +1017,16 @@ void init_notes(struct notes_tree *t, const char 
*notes_ref,
t->dirty = 0;
 
if (flags & NOTES_INIT_EMPTY || !notes_ref ||
-   get_sha1_treeish(notes_ref, object_sha1))
+   get_sha1_treeish(notes_ref, object_oid.hash))
return;
-   if (flags & NOTES_INIT_WRITABLE && read_ref(notes_ref, object_sha1))
+   if (flags & NOTES_INIT_WRITABLE && read_ref(notes_ref, object_oid.hash))
die("Cannot use notes ref %s", notes_ref);
-   if (get_tree_entry(object_sha1, "", sha1, ))
+   if (get_tree_entry(object_oid.hash, "", oid.hash, ))
die("Failed to read notes tree referenced by %s (%s)",
-   notes_ref, sha1_to_hex(object_sha1));
+   notes_ref, oid_to_hex(_oid));
 
hashclr(root_tree.key_sha1);
-   hashcpy(root_tree.val_sha1, sha1);
+   hashcpy(root_tree.val_sha1, oid.hash);
load_subtree(t, _tree, t->root, 0);
 }

[PATCH v2 11/20] Convert read_mmblob to take struct object_id.

2016-09-05 Thread brian m. carlson

Since all of its callers have been updated, convert read_mmblob to take
a pointer to struct object_id.

Signed-off-by: brian m. carlson 
---
 builtin/apply.c| 6 +++---
 builtin/checkout.c | 6 +++---
 merge-recursive.c  | 6 +++---
 notes-merge.c  | 6 +++---
 xdiff-interface.c  | 8 
 xdiff-interface.h  | 3 ++-
 6 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/builtin/apply.c b/builtin/apply.c
index 76b16121..df2c95d3 100644
--- a/builtin/apply.c
+++ b/builtin/apply.c
@@ -3435,9 +3435,9 @@ static int three_way_merge(struct image *image,
mmbuffer_t result = { NULL };
int status;
 
-   read_mmblob(_file, base->hash);
-   read_mmblob(_file, ours->hash);
-   read_mmblob(_file, theirs->hash);
+   read_mmblob(_file, base);
+   read_mmblob(_file, ours);
+   read_mmblob(_file, theirs);
status = ll_merge(, path,
  _file, "base",
  _file, "ours",
diff --git a/builtin/checkout.c b/builtin/checkout.c
index ec85af56..13169221 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -195,9 +195,9 @@ static int checkout_merged(int pos, struct checkout *state)
if (is_null_oid([1]) || is_null_oid([2]))
return error(_("path '%s' does not have necessary versions"), 
path);
 
-   read_mmblob(, threeway[0].hash);
-   read_mmblob(, threeway[1].hash);
-   read_mmblob(, threeway[2].hash);
+   read_mmblob(, [0]);
+   read_mmblob(, [1]);
+   read_mmblob(, [2]);
 
/*
 * NEEDSWORK: re-create conflicts from merges with
diff --git a/merge-recursive.c b/merge-recursive.c
index e3db594d..3750d253 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -910,9 +910,9 @@ static int merge_3way(struct merge_options *o,
name2 = mkpathdup("%s", branch2);
}
 
-   read_mmblob(, one->oid.hash);
-   read_mmblob(, a->oid.hash);
-   read_mmblob(, b->oid.hash);
+   read_mmblob(, >oid);
+   read_mmblob(, >oid);
+   read_mmblob(, >oid);
 
merge_status = ll_merge(result_buf, a->path, , base_name,
, name1, , name2, _opts);
diff --git a/notes-merge.c b/notes-merge.c
index cb36b43c..b3536284 100644
--- a/notes-merge.c
+++ b/notes-merge.c
@@ -344,9 +344,9 @@ static int ll_merge_in_worktree(struct notes_merge_options 
*o,
mmfile_t base, local, remote;
int status;
 
-   read_mmblob(, p->base.hash);
-   read_mmblob(, p->local.hash);
-   read_mmblob(, p->remote.hash);
+   read_mmblob(, >base);
+   read_mmblob(, >local);
+   read_mmblob(, >remote);
 
status = ll_merge(_buf, oid_to_hex(>obj), , NULL,
  , o->local_ref, , o->remote_ref, NULL);
diff --git a/xdiff-interface.c b/xdiff-interface.c
index f34ea762..3bfc69ca 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -178,20 +178,20 @@ int read_mmfile(mmfile_t *ptr, const char *filename)
return 0;
 }
 
-void read_mmblob(mmfile_t *ptr, const unsigned char *sha1)
+void read_mmblob(mmfile_t *ptr, const struct object_id *oid)
 {
unsigned long size;
enum object_type type;
 
-   if (!hashcmp(sha1, null_sha1)) {
+   if (!oidcmp(oid, _oid)) {
ptr->ptr = xstrdup("");
ptr->size = 0;
return;
}
 
-   ptr->ptr = read_sha1_file(sha1, , );
+   ptr->ptr = read_sha1_file(oid->hash, , );
if (!ptr->ptr || type != OBJ_BLOB)
-   die("unable to read blob object %s", sha1_to_hex(sha1));
+   die("unable to read blob object %s", oid_to_hex(oid));
ptr->size = size;
 }
 
diff --git a/xdiff-interface.h b/xdiff-interface.h
index fbb5a1c3..6f6ba909 100644
--- a/xdiff-interface.h
+++ b/xdiff-interface.h
@@ -1,6 +1,7 @@
 #ifndef XDIFF_INTERFACE_H
 #define XDIFF_INTERFACE_H
 
+#include "cache.h"
 #include "xdiff/xdiff.h"
 
 /*
@@ -20,7 +21,7 @@ int parse_hunk_header(char *line, int len,
  int *ob, int *on,
  int *nb, int *nn);
 int read_mmfile(mmfile_t *ptr, const char *filename);
-void read_mmblob(mmfile_t *ptr, const unsigned char *sha1);
+void read_mmblob(mmfile_t *ptr, const struct object_id *oid);
 int buffer_is_binary(const char *ptr, unsigned long size);
 
 extern void xdiff_set_find_func(xdemitconf_t *xecfg, const char *line, int 
cflags);

[PATCH v2 18/20] builtin/am: convert to struct object_id

2016-09-05 Thread brian m. carlson

Convert uses of unsigned char [20] to struct object_id.  Rename the
generically-named "ptr" to "old_oid" and make it const.

Signed-off-by: brian m. carlson 
---
 builtin/am.c | 140 +--
 1 file changed, 70 insertions(+), 70 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 739b34dc..1b4829a1 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -108,7 +108,7 @@ struct am_state {
size_t msg_len;
 
/* when --rebasing, records the original commit the patch came from */
-   unsigned char orig_commit[GIT_SHA1_RAWSZ];
+   struct object_id orig_commit;
 
/* number of digits in patch filename */
int prec;
@@ -428,8 +428,8 @@ static void am_load(struct am_state *state)
read_commit_msg(state);
 
if (read_state_file(, state, "original-commit", 1) < 0)
-   hashclr(state->orig_commit);
-   else if (get_sha1_hex(sb.buf, state->orig_commit) < 0)
+   oidclr(>orig_commit);
+   else if (get_oid_hex(sb.buf, >orig_commit) < 0)
die(_("could not parse %s"), am_path(state, "original-commit"));
 
read_state_file(, state, "threeway", 1);
@@ -555,14 +555,14 @@ static int copy_notes_for_rebase(const struct am_state 
*state)
fp = xfopen(am_path(state, "rewritten"), "r");
 
while (!strbuf_getline_lf(, fp)) {
-   unsigned char from_obj[GIT_SHA1_RAWSZ], to_obj[GIT_SHA1_RAWSZ];
+   struct object_id from_obj, to_obj;
 
if (sb.len != GIT_SHA1_HEXSZ * 2 + 1) {
ret = error(invalid_line, sb.buf);
goto finish;
}
 
-   if (get_sha1_hex(sb.buf, from_obj)) {
+   if (get_oid_hex(sb.buf, _obj)) {
ret = error(invalid_line, sb.buf);
goto finish;
}
@@ -572,14 +572,14 @@ static int copy_notes_for_rebase(const struct am_state 
*state)
goto finish;
}
 
-   if (get_sha1_hex(sb.buf + GIT_SHA1_HEXSZ + 1, to_obj)) {
+   if (get_oid_hex(sb.buf + GIT_SHA1_HEXSZ + 1, _obj)) {
ret = error(invalid_line, sb.buf);
goto finish;
}
 
-   if (copy_note_for_rewrite(c, from_obj, to_obj))
+   if (copy_note_for_rewrite(c, from_obj.hash, to_obj.hash))
ret = error(_("Failed to copy notes from '%s' to '%s'"),
-   sha1_to_hex(from_obj), 
sha1_to_hex(to_obj));
+   oid_to_hex(_obj), 
oid_to_hex(_obj));
}
 
 finish:
@@ -985,7 +985,7 @@ static int split_mail(struct am_state *state, enum 
patch_format patch_format,
 static void am_setup(struct am_state *state, enum patch_format patch_format,
const char **paths, int keep_cr)
 {
-   unsigned char curr_head[GIT_SHA1_RAWSZ];
+   struct object_id curr_head;
const char *str;
struct strbuf sb = STRBUF_INIT;
 
@@ -1053,10 +1053,10 @@ static void am_setup(struct am_state *state, enum 
patch_format patch_format,
else
write_state_text(state, "applying", "");
 
-   if (!get_sha1("HEAD", curr_head)) {
-   write_state_text(state, "abort-safety", sha1_to_hex(curr_head));
+   if (!get_oid("HEAD", _head)) {
+   write_state_text(state, "abort-safety", oid_to_hex(_head));
if (!state->rebasing)
-   update_ref("am", "ORIG_HEAD", curr_head, NULL, 0,
+   update_ref_oid("am", "ORIG_HEAD", _head, NULL, 0,
UPDATE_REFS_DIE_ON_ERR);
} else {
write_state_text(state, "abort-safety", "");
@@ -1081,7 +1081,7 @@ static void am_setup(struct am_state *state, enum 
patch_format patch_format,
  */
 static void am_next(struct am_state *state)
 {
-   unsigned char head[GIT_SHA1_RAWSZ];
+   struct object_id head;
 
free(state->author_name);
state->author_name = NULL;
@@ -1099,11 +1099,11 @@ static void am_next(struct am_state *state)
unlink(am_path(state, "author-script"));
unlink(am_path(state, "final-commit"));
 
-   hashclr(state->orig_commit);
+   oidclr(>orig_commit);
unlink(am_path(state, "original-commit"));
 
-   if (!get_sha1("HEAD", head))
-   write_state_text(state, "abort-safety", sha1_to_hex(head));
+   if (!get_oid("HEAD", ))
+   write_state_text(state, "abort-safety", oid_to_hex());
else
write_state_text(state, "abort-safety", "");
 
@@ -1145,17 +1145,17 @@ static void refresh_and_write_cache(void)
  */
 static int index_has_changes(struct strbuf *sb)
 {
-   unsigned char head[GIT_SHA1_RAWSZ];
+   struct object_id head;
int i;
 
-   if

[PATCH v2 17/20] refs: add an update_ref_oid function.

2016-09-05 Thread brian m. carlson

Several places around the codebase want to pass update_ref data from
struct object_id, but update_ref may also be passed NULL pointers.
Instead of checking and dereferencing in every caller, create an
update_ref_oid which wraps update_ref and provides this functionality.

Signed-off-by: brian m. carlson 
---
 refs.c | 8 
 refs.h | 3 +++
 2 files changed, 11 insertions(+)

diff --git a/refs.c b/refs.c
index b4e7cac7..f567a78e 100644
--- a/refs.c
+++ b/refs.c
@@ -858,6 +858,14 @@ int ref_transaction_verify(struct ref_transaction 
*transaction,
  flags, NULL, err);
 }
 
+int update_ref_oid(const char *msg, const char *refname,
+  const struct object_id *new_oid, const struct object_id *old_oid,
+  unsigned int flags, enum action_on_err onerr)
+{
+   return update_ref(msg, refname, new_oid ? new_oid->hash : NULL,
+   old_oid ? old_oid->hash : NULL, flags, onerr);
+}
+
 int update_ref(const char *msg, const char *refname,
   const unsigned char *new_sha1, const unsigned char *old_sha1,
   unsigned int flags, enum action_on_err onerr)
diff --git a/refs.h b/refs.h
index 1b020437..7a77f3ef 100644
--- a/refs.h
+++ b/refs.h
@@ -477,6 +477,9 @@ void ref_transaction_free(struct ref_transaction 
*transaction);
 int update_ref(const char *msg, const char *refname,
   const unsigned char *new_sha1, const unsigned char *old_sha1,
   unsigned int flags, enum action_on_err onerr);
+int update_ref_oid(const char *msg, const char *refname,
+  const struct object_id *new_oid, const struct object_id *old_oid,
+  unsigned int flags, enum action_on_err onerr);
 
 int parse_hide_refs_config(const char *var, const char *value, const char *);

[PATCH v2 12/20] builtin/blame: convert file to use struct object_id

2016-09-05 Thread brian m. carlson

Convert this file to use struct object_id, and additionally convert some
uses of the constant 40 to GIT_SHA1_HEXSZ.

Signed-off-by: brian m. carlson 
---
 builtin/blame.c | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 0e10e302..ccaf8be5 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -1941,7 +1941,7 @@ static void emit_other(struct scoreboard *sb, struct 
blame_entry *ent, int opt)
cp = nth_line(sb, ent->lno);
for (cnt = 0; cnt < ent->num_lines; cnt++) {
char ch;
-   int length = (opt & OUTPUT_LONG_OBJECT_NAME) ? 40 : abbrev;
+   int length = (opt & OUTPUT_LONG_OBJECT_NAME) ? GIT_SHA1_HEXSZ : 
abbrev;
 
if (suspect->commit->object.flags & UNINTERESTING) {
if (blank_boundary)
@@ -2232,12 +2232,12 @@ static void verify_working_tree_path(struct commit 
*work_tree, const char *path)
int pos;
 
for (parents = work_tree->parents; parents; parents = parents->next) {
-   const unsigned char *commit_sha1 = 
parents->item->object.oid.hash;
-   unsigned char blob_sha1[20];
+   const struct object_id *commit_oid = >item->object.oid;
+   struct object_id blob_oid;
unsigned mode;
 
-   if (!get_tree_entry(commit_sha1, path, blob_sha1, ) &&
-   sha1_object_info(blob_sha1, NULL) == OBJ_BLOB)
+   if (!get_tree_entry(commit_oid->hash, path, blob_oid.hash, 
) &&
+   sha1_object_info(blob_oid.hash, NULL) == OBJ_BLOB)
return;
}
 
@@ -2251,13 +2251,13 @@ static void verify_working_tree_path(struct commit 
*work_tree, const char *path)
die("no such path '%s' in HEAD", path);
 }
 
-static struct commit_list **append_parent(struct commit_list **tail, const 
unsigned char *sha1)
+static struct commit_list **append_parent(struct commit_list **tail, const 
struct object_id *oid)
 {
struct commit *parent;
 
-   parent = lookup_commit_reference(sha1);
+   parent = lookup_commit_reference(oid->hash);
if (!parent)
-   die("no such commit %s", sha1_to_hex(sha1));
+   die("no such commit %s", oid_to_hex(oid));
return _list_insert(parent, tail)->next;
 }
 
@@ -2274,10 +2274,10 @@ static void append_merge_parents(struct commit_list 
**tail)
}
 
while (!strbuf_getwholeline_fd(, merge_head, '\n')) {
-   unsigned char sha1[20];
-   if (line.len < 40 || get_sha1_hex(line.buf, sha1))
+   struct object_id oid;
+   if (line.len < GIT_SHA1_HEXSZ || get_oid_hex(line.buf, ))
die("unknown line in '%s': %s", git_path_merge_head(), 
line.buf);
-   tail = append_parent(tail, sha1);
+   tail = append_parent(tail, );
}
close(merge_head);
strbuf_release();
@@ -2306,7 +2306,7 @@ static struct commit *fake_working_tree_commit(struct 
diff_options *opt,
struct commit *commit;
struct origin *origin;
struct commit_list **parent_tail, *parent;
-   unsigned char head_sha1[20];
+   struct object_id head_oid;
struct strbuf buf = STRBUF_INIT;
const char *ident;
time_t now;
@@ -2322,10 +2322,10 @@ static struct commit *fake_working_tree_commit(struct 
diff_options *opt,
commit->date = now;
parent_tail = >parents;
 
-   if (!resolve_ref_unsafe("HEAD", RESOLVE_REF_READING, head_sha1, NULL))
+   if (!resolve_ref_unsafe("HEAD", RESOLVE_REF_READING, head_oid.hash, 
NULL))
die("no such ref: HEAD");
 
-   parent_tail = append_parent(parent_tail, head_sha1);
+   parent_tail = append_parent(parent_tail, _oid);
append_merge_parents(parent_tail);
verify_working_tree_path(commit, path);

[PATCH v2 15/20] builtin/update-index: convert file to struct object_id

2016-09-05 Thread brian m. carlson

Convert all functions to use struct object_id, and replace instances of
hardcoded 40, 41, and 42 with appropriate references to GIT_SHA1_HEXSZ.

Signed-off-by: brian m. carlson 
---
 builtin/update-index.c | 61 +-
 1 file changed, 31 insertions(+), 30 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index a60a30a8..44fb2e4f 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -312,7 +312,7 @@ static int add_one_path(const struct cache_entry *old, 
const char *path, int len
  */
 static int process_directory(const char *path, int len, struct stat *st)
 {
-   unsigned char sha1[20];
+   struct object_id oid;
int pos = cache_name_pos(path, len);
 
/* Exact match: file or existing gitlink */
@@ -321,7 +321,7 @@ static int process_directory(const char *path, int len, 
struct stat *st)
if (S_ISGITLINK(ce->ce_mode)) {
 
/* Do nothing to the index if there is no HEAD! */
-   if (resolve_gitlink_ref(path, "HEAD", sha1) < 0)
+   if (resolve_gitlink_ref(path, "HEAD", oid.hash) < 0)
return 0;
 
return add_one_path(ce, path, len, st);
@@ -347,7 +347,7 @@ static int process_directory(const char *path, int len, 
struct stat *st)
}
 
/* No match - should we add it as a gitlink? */
-   if (!resolve_gitlink_ref(path, "HEAD", sha1))
+   if (!resolve_gitlink_ref(path, "HEAD", oid.hash))
return add_one_path(NULL, path, len, st);
 
/* Error out. */
@@ -390,7 +390,7 @@ static int process_path(const char *path)
return add_one_path(ce, path, len, );
 }
 
-static int add_cacheinfo(unsigned int mode, const unsigned char *sha1,
+static int add_cacheinfo(unsigned int mode, const struct object_id *oid,
 const char *path, int stage)
 {
int size, len, option;
@@ -403,7 +403,7 @@ static int add_cacheinfo(unsigned int mode, const unsigned 
char *sha1,
size = cache_entry_size(len);
ce = xcalloc(1, size);
 
-   hashcpy(ce->oid.hash, sha1);
+   oidcpy(>oid, oid);
memcpy(ce->name, path, len);
ce->ce_flags = create_ce_flags(stage);
ce->ce_namelen = len;
@@ -487,7 +487,7 @@ static void read_index_info(int nul_term_line)
while (getline_fn(, stdin) != EOF) {
char *ptr, *tab;
char *path_name;
-   unsigned char sha1[20];
+   struct object_id oid;
unsigned int mode;
unsigned long ul;
int stage;
@@ -516,7 +516,7 @@ static void read_index_info(int nul_term_line)
mode = ul;
 
tab = strchr(ptr, '\t');
-   if (!tab || tab - ptr < 41)
+   if (!tab || tab - ptr < GIT_SHA1_HEXSZ + 1)
goto bad_line;
 
if (tab[-2] == ' ' && '0' <= tab[-1] && tab[-1] <= '3') {
@@ -529,7 +529,8 @@ static void read_index_info(int nul_term_line)
ptr = tab + 1; /* point at the head of path */
}
 
-   if (get_sha1_hex(tab - 40, sha1) || tab[-41] != ' ')
+   if (get_oid_hex(tab - GIT_SHA1_HEXSZ, ) ||
+   tab[-(GIT_SHA1_HEXSZ + 1)] != ' ')
goto bad_line;
 
path_name = ptr;
@@ -557,8 +558,8 @@ static void read_index_info(int nul_term_line)
 * ptr[-1] points at tab,
 * ptr[-41] is at the beginning of sha1
 */
-   ptr[-42] = ptr[-1] = 0;
-   if (add_cacheinfo(mode, sha1, path_name, stage))
+   ptr[-(GIT_SHA1_HEXSZ + 2)] = ptr[-1] = 0;
+   if (add_cacheinfo(mode, , path_name, stage))
die("git update-index: unable to update %s",
path_name);
}
@@ -576,19 +577,19 @@ static const char * const update_index_usage[] = {
NULL
 };
 
-static unsigned char head_sha1[20];
-static unsigned char merge_head_sha1[20];
+static struct object_id head_oid;
+static struct object_id merge_head_oid;
 
 static struct cache_entry *read_one_ent(const char *which,
-   unsigned char *ent, const char *path,
+   struct object_id *ent, const char *path,
int namelen, int stage)
 {
unsigned mode;
-   unsigned char sha1[20];
+   struct object_id oid;
int size;
struct cache_entry *ce;
 
-   if (get_tree_entry(ent, path, sha1, )) {
+   if (get_tree_entry(ent->hash, path, oid.hash, )) {
if (which)
error("%s: not in %s branch.", path, which);
return

Re: [PATCH v6 12/13] convert: add filter..process option

2016-09-05 Thread Lars Schneider


> On 30 Aug 2016, at 22:46, Jakub Narębski  wrote:
> 
>>> The filter can exit right after the "error-all". If the filter does
>>> not exit then Git will kill the filter. I'll add this to the docs.
>> 
>> OK.
> 
> Is it explicit kill, or implicit kill: close pipe and end process?

I close the pipe and call "finish_command".


>>> "abort" could be ambiguous because it could be read as "abort only
>>> this file". "abort-all" would work, though. Would you prefer to see
>>> "error" replaced by "abort" and "error-all" by "abort-all"?
>> 
>> No.
>> 
>> I was primarily reacting to "-all" part, so anything that ends with
>> "-all" is equally ugly from my point of view and not an improvement.
>> 
>> As I said, "error-all" as long as other reviewers are happy with is
>> OK by me, too.
> 
> I'm rather partial to "abort" instead of "error-all", or "quit"/"exit"
> (but I prefer "abort" or "bail-out"), as it better reflects what this
> response is about - ending filter process.

After some thinking I agree with "abort" instead of "error-all".


>>> A filter that dies during communication or does not adhere to the protocol
>>> is a faulty filter. Feeding the faulty filter after restart with the same 
>>> blob would likely cause the same error. 
>> 
>> Why does Git restart it and continue with the rest of the blobs
>> then?  Is there a reason to believe it may produce correct results
>> for other blobs if you run it again?
> 
> I guess the idea is that single filter can be run on different types
> of blobs, and it could fail on some types (some files) and not others.
> Like LFS-type filter failing on files with size larger than 2 GiB,
> or iconv-like filter to convert UTF-16 to UTF-8 failing on invalid
> byte sequences.

This mimics the v1 behavior and I will explain that in the documentation.


>>> B) If we communicate "shutdown" to the filter then we need to give the
>>>   filter some time to perform the exit before the filter is killed on
>>>   Git exit. I wasn't able to come up with a good answer how long Git 
>>>   should wait for the exit.
>> 
>> Would that be an issue?  A filter is buggy if told to shutdown,
>> ignores the request and hangs around forever.  I do not think we
>> even need to kill it.  It is not Git's problem.
> 
> I think the idea was for Git to request shutdown, and filter to tell
> when it finished (or just exiting, and Git getting SIGCHLD, I guess).
> It is hard to tell how much to wait, for example for filter process
> to send "sync" command, or finish unmounting, or something...

I like Junios approach because then we don't need to wait at all...


>> I personally would be reluctant to become a filter process writer if
>> the system it will be talking to can kill it without giving it a
>> chance to do a graceful exit, but perhaps that is just me.  I don't
>> know if closing the pipe going there where you are having Git to
>> kill the process on the other end is any more involved than what you
>> have without extra patches.
> 
> Isn't it the same problem with v1 filters being killed on Git process
> exit?  Unless v2 filter wants to do something _after_ writing output
> to Git, it should be safe... unless Git process is killed, but it
> is not different from filter being stray-killed.

Yes, it should be safe. However, I think it is probably nicer if the filter
process can shutdown gracefully instead of being killed.


> +Please note that you cannot use an existing `filter..clean`
> +or `filter..smudge` command with `filter..process`
> +because the former two use a different inter process communication
> +protocol than the latter one.
 
 Would it be a useful sample program we can ship in contrib/ if you
 created a "filter adapter" that reads these two configuration
 variables and act as a filter..process?
>>> 
>>> You mean a v2 filter that would use v1 filters under the hood?
>>> If we would drop v1, then this would be useful. Otherwise I don't
>>> see any need for such a thing.
>> 
>> I meant it as primarily an example people can learn from when they
>> want to write their own.
> 
> Errr... if it were any v1 filter, it would be useless (as bad or
> worse performance than ordinary v1 clean or smudge filter).  It
> might make sense if v1 filter and v2 wrapper were written in the
> same [dynamic] language, and wrapper could just load / require
> filter as a function / procedure, [compile it], and use it.
> For example smudge/clean filter in Perl (e.g. rot13), and v2 wrapper
> in Perl too.

I'll provide a simple Perl rot13 v2 filter in contrib.

Thanks,
Lars

Re: [PATCH] Avoid hard-coded perl path in sha-bang

2016-09-05 Thread brian m. carlson

On Mon, Sep 05, 2016 at 11:28:18PM +0800, iblis wrote:
> Signed-off-by: Iblis Lin 
> 
> ---
>  contrib/diff-highlight/diff-highlight | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/contrib/diff-highlight/diff-highlight 
> b/contrib/diff-highlight/diff-highlight
> index ffefc31..b57b0fd 100755
> --- a/contrib/diff-highlight/diff-highlight
> +++ b/contrib/diff-highlight/diff-highlight
> @@ -1,4 +1,4 @@
> -#!/usr/bin/perl
> +#!/usr/bin/env perl
>  use 5.008;
>  use warnings FATAL => 'all';

This isn't consistent with what the other contrib scripts do, and
/usr/bin/env perl isn't always the right perl for all situations.  What
you probably want is to do something like this, which is listed in the
Makefile.

# Individual rules to allow e.g.
# "make -C ../.. SCRIPT_PERL=contrib/foo/bar.perl build-perl-script"
# from subdirectories like contrib/*/
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature

Re: [PATCH 1/4] add: document the chmod option

2016-09-05 Thread Thomas Gummerer

Hi Johannes,

On 09/05, Johannes Schindelin wrote:
> Hi Thomas,
> 
> On Sun, 4 Sep 2016, Thomas Gummerer wrote:
> 
> > +--chmod=(+|-)::
> 
> There is an "x" missing ;-)

Ugh, thanks for catching.  I'll wait a few days for more comments and
address it in a re-roll.

> Ciao,
> Dscho

[PATCHv1 3/4] t5550-http-fetch-dumb.sh: use the GIT_TRACE_CURL environment var

2016-09-05 Thread Elia Pinto

Use the new GIT_TRACE_CURL environment variable instead
of the deprecated GIT_CURL_VERBOSE.

Signed-off-by: Elia Pinto 
---
Drop leftover debugging junk from previous patch

 t/t5550-http-fetch-dumb.sh | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 3484b6f..dc9b87d 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -263,15 +263,15 @@ check_language () {
>expect
;;
?*)
-   echo "Accept-Language: $1" >expect
+   echo "=> Send header: Accept-Language: $1" >expect
;;
esac &&
-   GIT_CURL_VERBOSE=1 \
+   GIT_TRACE_CURL=true \
LANGUAGE=$2 \
git ls-remote "$HTTPD_URL/dumb/repo.git" >output 2>&1 &&
tr -d '\015' actual &&
+   sed -ne '/^=> Send header: Accept-Language:/ p' >actual &&
test_cmp expect actual
 }
 
@@ -295,8 +295,8 @@ ja;q=0.95, zh;q=0.94, sv;q=0.93, pt;q=0.92, nb;q=0.91, 
*;q=0.90" \
 '
 
 test_expect_success 'git client does not send an empty Accept-Language' '
-   GIT_CURL_VERBOSE=1 LANGUAGE= git ls-remote "$HTTPD_URL/dumb/repo.git" 
2>stderr &&
-   ! grep "^Accept-Language:" stderr
+   GIT_TRACE_CURL=true LANGUAGE= git ls-remote "$HTTPD_URL/dumb/repo.git" 
2>stderr &&
+   ! grep "^=> Send header: Accept-Language:" stderr
 '
 
 stop_httpd
-- 
2.10.0.308.gf73994d

Re: [PATCH 0/3] Fix a segfault caused by regexec() being called on mmap()ed data

2016-09-05 Thread Junio C Hamano

Johannes Schindelin  writes:

> This patch series addresses a problem where `git diff` is called using
> `-G` or `-S --pickaxe-regex` on new-born files that are configured
> without user diff drivers, and that hence get mmap()ed into memory.

Good spotting.  This has been with us almost forever; I do not think
the original pickaxe had it, but I am sure it is broken after the
"--pickaxe-regex" enhancement.

I am somehow surprised that this is a problem on Windows, though.
Wouldn't we be at least running CRLF conversions, and causing diff
or grep machinery to work on a NUL-terminated buffer?  The convesion
code would have to look at mmap'ed memory but I do not think it
assumes NUL-termination.  Perhaps people on Windows do not usually
use straight-through and that is why this was discovered after many
years, or something?  In any case, that is a digression.

> Windows (the bug does not trigger a segmentation fault for me on Linux,
> strangely enough, but it is really a problem on Windows).

I think it is an issue on all platforms that lets us use mmap().
When the size of a file is multiple of pagesize, the byte past the
end of the file can very well fall on an unmapped address.

> So at least I have a workaround in place. Ideally, though, we would
> NUL-terminate the buffers only when needed, or somehow call regexec() on
> ptr/size parameters instead of passing a supposedly NUL-terminated
> string to it?

I see two reasonable approaches.

 * We may want to revisit the performance numbers to see if using
   mmap() to read from the working tree files still buys us much.
   If not, we should stop borrowing from the working tree using
   mmap(); instead just slurp in and NUL-terminate it.

 * We could use  variant of regexp engine as you proposed,
   which I think is a preferrable solution.  Do people know of a
   widely accepted implementation that we can throw into compat/ as
   fallback that is compatible with GPLv2?

Re: [PATCH 3/4] t5550-http-fetch-dumb.sh: use the GIT_TRACE_CURL environment var

2016-09-05 Thread Elia Pinto

2016-09-05 15:43 GMT+02:00 Eric Sunshine :
> On Mon, Sep 5, 2016 at 6:24 AM, Elia Pinto  wrote:
>> Use the new GIT_TRACE_CURL environment variable instead
>> of the deprecated GIT_CURL_VERBOSE.
>>
>> Signed-off-by: Elia Pinto 
>> ---
>> diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
>> @@ -263,15 +263,18 @@ check_language () {
>> >expect
>> ;;
>> ?*)
>> -   echo "Accept-Language: $1" >expect
>> +   echo "=> Send header: Accept-Language: $1" >expect
>> ;;
>> esac &&
>> -   GIT_CURL_VERBOSE=1 \
>> +   GIT_TRACE_CURL=true \
>> LANGUAGE=$2 \
>> git ls-remote "$HTTPD_URL/dumb/repo.git" >output 2>&1 &&
>> tr -d '\015' > sort -u |
>> -   sed -ne '/^Accept-Language:/ p' >actual &&
>> +   sed -ne '/^=> Send header: Accept-Language:/ p' >actual &&
>> +   cp expect expect.$$ &&
>> +   cp actual actual.$$ &&
>> +   cp output output.$$ &&
>
> What are these three cp's about? They don't seem to be related to the
> stated changes. Are they leftover debugging gunk?
Yes, i am very sorry. My bad. I will repost. Thanks


>
>> test_cmp expect actual
>>  }
>>
>> @@ -295,8 +298,8 @@ ja;q=0.95, zh;q=0.94, sv;q=0.93, pt;q=0.92, nb;q=0.91, 
>> *;q=0.90" \
>>  '
>>
>>  test_expect_success 'git client does not send an empty Accept-Language' '
>> -   GIT_CURL_VERBOSE=1 LANGUAGE= git ls-remote 
>> "$HTTPD_URL/dumb/repo.git" 2>stderr &&
>> -   ! grep "^Accept-Language:" stderr
>> +   GIT_TRACE_CURL=true LANGUAGE= git ls-remote 
>> "$HTTPD_URL/dumb/repo.git" 2>stderr &&
>> +   ! grep "^=> Send header: Accept-Language:" stderr
>>  '
>>
>>  stop_httpd

[PATCH] t6026-merge-attr: wait for process to release trash directory

2016-09-05 Thread Johannes Sixt

The process spawned in the hook uses the test's trash directory as CWD.
As long as it is alive, the directory cannot be removed on Windows.
Although the test succeeds, the 'test_done' that follows produces an
error message and leaves the trash directory around. Insert a delay to
give the hook time to go away.

Signed-off-by: Johannes Sixt 
---
 t/t6026-merge-attr.sh | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/t/t6026-merge-attr.sh b/t/t6026-merge-attr.sh
index dd8f88d..2c5b138 100755
--- a/t/t6026-merge-attr.sh
+++ b/t/t6026-merge-attr.sh
@@ -191,7 +191,10 @@ test_expect_success 'custom merge does not lock index' '
"* merge=ours" "text merge=sleep-one-second" &&
test_config merge.ours.driver true &&
test_config merge.sleep-one-second.driver ./sleep-one-second.sh &&
-   git merge master
+   git merge master &&
+   # the trash directory cannot be removed on Windows until the hook
+   # does not occupy it for its CWD anymore; wait for it to quit:
+   sleep 1
 '
 
 test_done
-- 
2.10.0.85.gea34e30

[PATCH] t9903: fix broken && chain

2016-09-05 Thread Johannes Sixt


We might wonder why our && chain check does not catch this case:
The && chain check uses a strange exit code with the expectation that
the second or later part of a broken && chain would not exit with this
particular code.

This expectation does not work in this case because __git_ps1, being
the first command in the second part of the broken && chain, records
the current exit code, does its work, and finally returns to the caller
with the recorded exit code. This fools our && chain check.

Signed-off-by: Johannes Sixt 
---
 t/t9903-bash-prompt.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t9903-bash-prompt.sh b/t/t9903-bash-prompt.sh
index 0db4469..97c9b32 100755
--- a/t/t9903-bash-prompt.sh
+++ b/t/t9903-bash-prompt.sh
@@ -177,7 +177,7 @@ test_expect_success 'prompt - interactive rebase' '
git checkout b1 &&
test_when_finished "git checkout master" &&
git rebase -i HEAD^ &&
-   test_when_finished "git rebase --abort"
+   test_when_finished "git rebase --abort" &&
__git_ps1 >"$actual" &&
test_cmp expected "$actual"
 '
--
2.10.0.85.gea34e30

Re: [WIP PATCH v2] diff.c: emit moved lines with a different color

2016-09-05 Thread Junio C Hamano

Stefan Beller  writes:

> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 0bcb679..f4f51c2 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -980,8 +980,9 @@ color.diff.::
>   of `context` (context text - `plain` is a historical synonym),
>   `meta` (metainformation), `frag`
>   (hunk header), 'func' (function in hunk header), `old` (removed lines),
> - `new` (added lines), `commit` (commit headers), or `whitespace`
> - (highlighting whitespace errors).
> + `new` (added lines), `commit` (commit headers), `whitespace`
> + (highlighting whitespace errors), `moved-old` (removed lines that
> + reappear), `moved-new` (added lines that were removed elsewhere).

Could we have a config to disable this rather costly new feature,
too?

Also the first and the third level configuration names (the 
is at the third level) used by the core-git do not use dashed-words
format.  Please adhere to the current convention.

> diff --git a/diff.c b/diff.c
> index 534c12e..d37cb4f 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -18,6 +18,7 @@
>  #include "ll-merge.h"
>  #include "string-list.h"
>  #include "argv-array.h"
> +#include "git-compat-util.h"
>  
>  #ifdef NO_FAST_WORKING_DIRECTORY
>  #define FAST_WORKING_DIRECTORY 0
> @@ -42,6 +43,11 @@ static int diff_dirstat_permille_default = 30;
>  static struct diff_options default_diff_options;
>  static long diff_algorithm;
>  
> +static struct hashmap *duplicates_added;
> +static struct hashmap *duplicates_removed;
> +static int hash_previous_line_added;
> +static int hash_previous_line_removed;

I think these should be added as new fields to diff_options
structure.

Re: [PATCH 1/6] git-gui: The term unified for remote in Japanese

2016-09-05 Thread Satoshi Yasushima

On Tue, 6 Sep 2016 2:16 +0900 Satoshi Yasushima wrote:
>But now, doesn't make empty string by apply msgmerge this commit.

Oops.

Now, I was a rebase work, msgmerge changed with empty string.
Hence I think this commit keeps with empty string.

Re: [PATCH 1/6] git-gui: The term unified for remote in Japanese

2016-09-05 Thread Satoshi Yasushima


On Sun, 4 Sep 2016 14:10:45 +0200 Jakub Narębski wrote:
1>>  msgid "Tracking branch %s is not a branch in the remote repository."
2>> -msgstr "トラッキング・ブランチ %s は遠隔リポジトリのブランチではありません。"
3>> +msgstr ""
4>
5>What for is the above part of change (empty string)?
6>
7>> +"トラッキング・ブランチ %s はリモートリポジトリのブランチではありません。"

I had been changed for "remote" from line 2 to line 7 above.
And previously apply msgmerge because I hope this change will not change again 
finaly apply msgmerge.
When July with MSYS2.

But now, doesn't make empty string by apply msgmerge this commit.
I will remake this patch.

Re: [PATCH 1/6] git-gui: The term unified for remote in Japanese

2016-09-05 Thread Satoshi Yasushima


On Sat, 03 Sep 2016 23:54:36 -0700 Junio C Hamano wrote:

I couldn't quite read/parse the title, but luckily I read Japanese ;-)

You saw different Japanese words used to translate the same original
word "remote" in different message strings, and you chose one of
them and use it everywhere.  And you did the same for "blame" in
your patch 2/6.

I would have described them like so:

  Subject: git-gui: consistently use the same word for "remote" in Japanese
  Subject: git-gui: consistently use the same word for "blame" in Japanese


Thank you for your title patch.
My English is so poor as with many of Japanese.

Resend patches in a little while.

Re: [PATCH v2 05/38] refs: create a base class "ref_store" for files_ref_store

2016-09-05 Thread David Turner

On Mon, 2016-09-05 at 05:53 +0200, Michael Haggerty wrote:
> On 09/04/2016 10:40 PM, David Turner wrote:
> > On Sun, 2016-09-04 at 18:08 +0200, Michael Haggerty wrote:
> > 
> >> +/* A linked list of ref_stores for submodules: */
> >> +static struct ref_store *submodule_ref_stores;
> > 
> > I don't like the per-submodule stores being in a linked list, which
> > requires a linear search.  Stefan has, I think, been doing a bunch of
> > work to scale git to support on the order of thousands of submodules,
> > which this potentially breaks.  What about a hashmap?
> 
> I agree it's not ideal, but this linked list is not new. Before this
> patch the same code was in `files-backend.c`, and before patch [03/38]
> essentially the same linked list was stored as `submodule_ref_caches`.
> So this is not a regression, and I'd rather not address it in this patch
> series.

Fair enough!

[PATCH 3/3] diff_grep: add assertions verifying that the buffers are NUL-terminated

2016-09-05 Thread Johannes Schindelin

Before calling regexec() on the file contents, we better be certain that
the strings fulfill the contract of C strings assumed by said function.

Signed-off-by: Johannes Schindelin 
---
 diffcore-pickaxe.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
index 55067ca..88820b6 100644
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -49,6 +49,8 @@ static int diff_grep(mmfile_t *one, mmfile_t *two,
xpparam_t xpp;
xdemitconf_t xecfg;
 
+   assert(!one || one->ptr[one->size] == '\0');
+   assert(!two || two->ptr[two->size] == '\0');
if (!one)
return !regexec(regexp, two->ptr, 1, , 0);
if (!two)
-- 
2.10.0.windows.1.2.g732a511

[PATCH 1/3] Demonstrate a problem: our pickaxe code assumes NUL-terminated buffers

2016-09-05 Thread Johannes Schindelin

When our pickaxe code feeds file contents to regexec(), it implicitly
assumes that the file contents are read into implicitly NUL-terminated
buffers (i.e. that we overallocate by 1, appending a single '\0').

This is not so.

In particular when the file contents are simply mmap()ed, we can be
virtually certain that the buffer is preceding uninitialized bytes, or
invalid pages.

Note that the test we add here is known to be flakey: we simply cannot
know whether the byte following the mmap()ed ones is a NUL or not.

Typically, on Linux the test passes. On Windows, it fails virtually
every time due to an access violation (that's a segmentation fault for
you Unix-y people out there). And Windows would be correct: the
regexec() call wants to operate on a regular, NUL-terminated string,
there is no NUL in the mmap()ed memory range, and it is undefined
whether the next byte is even legal to access.

When run with --valgrind it demonstrates quite clearly the breakage, of
course.

So we simply mark it with `test_expect_success` for now.

This test case represents a Minimal, Complete and Verifiable Example of
a breakage reported by Chris Sidi.

Signed-off-by: Johannes Schindelin 
---
 t/t4059-diff-pickaxe.sh | 22 ++
 1 file changed, 22 insertions(+)
 create mode 100755 t/t4059-diff-pickaxe.sh

diff --git a/t/t4059-diff-pickaxe.sh b/t/t4059-diff-pickaxe.sh
new file mode 100755
index 000..f0bf50b
--- /dev/null
+++ b/t/t4059-diff-pickaxe.sh
@@ -0,0 +1,22 @@
+#!/bin/sh
+#
+# Copyright (c) 2016 Johannes Schindelin
+#
+
+test_description='Pickaxe options'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+   test_commit initial &&
+   printf "%04096d" 0 >4096-zeroes.txt &&
+   git add 4096-zeroes.txt &&
+   test_tick &&
+   git commit -m "A 4k file"
+'
+test_expect_success '-G matches' '
+   git diff --name-only -G "^0{4096}$" HEAD^ >out &&
+   test 4096-zeroes.txt = "$(cat out)"
+'
+
+test_done
-- 
2.10.0.windows.1.2.g732a511

[PATCH 0/3] Fix a segfault caused by regexec() being called on mmap()ed data

2016-09-05 Thread Johannes Schindelin

This patch series addresses a problem where `git diff` is called using
`-G` or `-S --pickaxe-regex` on new-born files that are configured
without user diff drivers, and that hence get mmap()ed into memory.

The problem with that: mmap()ed memory is *not* NUL-terminated, yet the
pickaxe code calls regexec() on it just the same.

This problem has been reported by my colleague Chris Sidi.

Please note that this patch series is a hot fix I applied to Git for
Windows (the bug does not trigger a segmentation fault for me on Linux,
strangely enough, but it is really a problem on Windows).

So at least I have a workaround in place. Ideally, though, we would
NUL-terminate the buffers only when needed, or somehow call regexec() on
ptr/size parameters instead of passing a supposedly NUL-terminated
string to it?


Johannes Schindelin (3):
  Demonstrate a problem: our pickaxe code assumes NUL-terminated buffers
  diff_populate_filespec: NUL-terminate buffers
  diff_grep: add assertions verifying that the buffers are
NUL-terminated

 diff.c  |  9 +
 diffcore-pickaxe.c  |  2 ++
 t/t4059-diff-pickaxe.sh | 22 ++
 3 files changed, 33 insertions(+)
 create mode 100755 t/t4059-diff-pickaxe.sh

Published-As: https://github.com/dscho/git/releases/tag/mmap-regexec-v1
Fetch-It-Via: git fetch https://github.com/dscho/git mmap-regexec-v1

-- 
2.10.0.windows.1.2.g732a511

base-commit: 6ebdac1bab966b720d776aa43ca188fe378b1f4b

[PATCH 2/3] diff_populate_filespec: NUL-terminate buffers

2016-09-05 Thread Johannes Schindelin

While the xdiff machinery is quite capable of working with strings given
as pointer and size, Git's add-on functionality simply assumes that we
are operating on NUL-terminated strings, e.g. y running regexec() on the
provided pointer, with no way to pass the size, too.

In general, this assumption is wrong.

It is true that many code paths populate the mmfile_t structure silently
appending a NUL, e.g. when running textconv on a temporary file and
reading the results back into an strbuf.

The assumption is most definitely wrong, however, when mmap()ing a file.

Practically, we seemed to be lucky that the bytes after mmap()ed memory
were 1) accessible and 2) somehow contained NUL bytes *somewhere*.

In a use case reported by Chris Sidi, it turned out that the mmap()ed
file had the precise size of a memory page, and on Windows the bytes
after memory-mapped pages are in general not valid.

This patch works around that issue, giving us time to discuss the best
course how to fix this problem more generally.

Signed-off-by: Johannes Schindelin 
---
 diff.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/diff.c b/diff.c
index 534c12e..32f7f46 100644
--- a/diff.c
+++ b/diff.c
@@ -2826,6 +2826,15 @@ int diff_populate_filespec(struct diff_filespec *s, 
unsigned int flags)
s->data = strbuf_detach(, );
s->size = size;
s->should_free = 1;
+   } else {
+   /* data must be NUL-terminated so e.g. for regexec() */
+   char *data = xmalloc(s->size + 1);
+   memcpy(data, s->data, s->size);
+   data[s->size] = '\0';
+   munmap(s->data, s->size);
+   s->should_munmap = 0;
+   s->data = data;
+   s->should_free = 1;
}
}
else {
-- 
2.10.0.windows.1.2.g732a511

[PATCH] Avoid hard-coded perl path in sha-bang

2016-09-05 Thread iblis


Signed-off-by: Iblis Lin 

---
 contrib/diff-highlight/diff-highlight | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/diff-highlight/diff-highlight 
b/contrib/diff-highlight/diff-highlight
index ffefc31..b57b0fd 100755
--- a/contrib/diff-highlight/diff-highlight
+++ b/contrib/diff-highlight/diff-highlight
@@ -1,4 +1,4 @@
-#!/usr/bin/perl
+#!/usr/bin/env perl
 
 use 5.008;

 use warnings FATAL => 'all';


--

Iblis Lin

Re: [PATCH 3/4] t5550-http-fetch-dumb.sh: use the GIT_TRACE_CURL environment var

2016-09-05 Thread Eric Sunshine

On Mon, Sep 5, 2016 at 6:24 AM, Elia Pinto  wrote:
> Use the new GIT_TRACE_CURL environment variable instead
> of the deprecated GIT_CURL_VERBOSE.
>
> Signed-off-by: Elia Pinto 
> ---
> diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
> @@ -263,15 +263,18 @@ check_language () {
> >expect
> ;;
> ?*)
> -   echo "Accept-Language: $1" >expect
> +   echo "=> Send header: Accept-Language: $1" >expect
> ;;
> esac &&
> -   GIT_CURL_VERBOSE=1 \
> +   GIT_TRACE_CURL=true \
> LANGUAGE=$2 \
> git ls-remote "$HTTPD_URL/dumb/repo.git" >output 2>&1 &&
> tr -d '\015'  sort -u |
> -   sed -ne '/^Accept-Language:/ p' >actual &&
> +   sed -ne '/^=> Send header: Accept-Language:/ p' >actual &&
> +   cp expect expect.$$ &&
> +   cp actual actual.$$ &&
> +   cp output output.$$ &&

What are these three cp's about? They don't seem to be related to the
stated changes. Are they leftover debugging gunk?

> test_cmp expect actual
>  }
>
> @@ -295,8 +298,8 @@ ja;q=0.95, zh;q=0.94, sv;q=0.93, pt;q=0.92, nb;q=0.91, 
> *;q=0.90" \
>  '
>
>  test_expect_success 'git client does not send an empty Accept-Language' '
> -   GIT_CURL_VERBOSE=1 LANGUAGE= git ls-remote "$HTTPD_URL/dumb/repo.git" 
> 2>stderr &&
> -   ! grep "^Accept-Language:" stderr
> +   GIT_TRACE_CURL=true LANGUAGE= git ls-remote 
> "$HTTPD_URL/dumb/repo.git" 2>stderr &&
> +   ! grep "^=> Send header: Accept-Language:" stderr
>  '
>
>  stop_httpd

[PATCH 4/4] t5551-http-fetch-smart.sh: use the GIT_TRACE_CURL environment var

2016-09-05 Thread Elia Pinto

Use the new GIT_TRACE_CURL environment variable instead
of the deprecated GIT_CURL_VERBOSE.

Signed-off-by: Elia Pinto 
---
 t/t5551-http-fetch-smart.sh | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
index 2f375eb..1ec5b27 100755
--- a/t/t5551-http-fetch-smart.sh
+++ b/t/t5551-http-fetch-smart.sh
@@ -43,12 +43,21 @@ cat >exp err &&
test_cmp file clone/file &&
tr '\''\015'\'' Q  Send header, /d
+   /^=> Send header:$/d
+   /^<= Recv header, /d
+   /^<= Recv header:$/d
+   s/=> Send header: //
+   s/= Recv header://
+   /^<= Recv data/d
+   /^=> Send data/d
/^$/d
/^< $/d
 
@@ -261,9 +270,9 @@ test_expect_success CMDLINE_LIMIT \
 '
 
 test_expect_success 'large fetch-pack requests can be split across POSTs' '
-   GIT_CURL_VERBOSE=1 git -c http.postbuffer=65536 \
+   GIT_TRACE_CURL=true git -c http.postbuffer=65536 \
clone --bare "$HTTPD_URL/smart/repo.git" split.git 2>err &&
-   grep "^> POST" err >posts &&
+   grep "^=> Send header: POST" err >posts &&
test_line_count = 2 posts
 '
 
-- 
2.10.0.308.gf73994d

[PATCH 2/4] test-lib.sh: preserve GIT_TRACE_CURL from the environment

2016-09-05 Thread Elia Pinto

Turning on this variable can be useful when debugging http
tests. It can break a few tests in t5541 if not set
to an absolute path but it is not a variable
that the user is likely to have enabled accidentally.

Signed-off-by: Elia Pinto 
---
 t/test-lib.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index d731d66..986eba1 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -89,6 +89,7 @@ unset VISUAL EMAIL LANGUAGE COLUMNS $("$PERL_PATH" -e '
UNZIP
PERF_
CURL_VERBOSE
+   TRACE_CURL
));
my @vars = grep(/^GIT_/ && !/^GIT_($ok)/o, @env);
print join("\n", @vars);
-- 
2.10.0.308.gf73994d

[PATCH 1/4] t5541-http-push-smart.sh: use the GIT_TRACE_CURL environment var

2016-09-05 Thread Elia Pinto

Use the new GIT_TRACE_CURL environment variable instead
of the deprecated GIT_CURL_VERBOSE.

Signed-off-by: Elia Pinto 
---
 t/t5541-http-push-smart.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t5541-http-push-smart.sh b/t/t5541-http-push-smart.sh
index 4840c71..d38bf32 100755
--- a/t/t5541-http-push-smart.sh
+++ b/t/t5541-http-push-smart.sh
@@ -74,7 +74,7 @@ test_expect_success 'push to remote repository (standard)' '
test_tick &&
git commit -m path2 &&
HEAD=$(git rev-parse --verify HEAD) &&
-   GIT_CURL_VERBOSE=1 git push -v -v 2>err &&
+   GIT_TRACE_CURL=true git push -v -v 2>err &&
! grep "Expect: 100-continue" err &&
grep "POST git-receive-pack ([0-9]* bytes)" err &&
(cd "$HTTPD_DOCUMENT_ROOT_PATH"/test_repo.git &&
-- 
2.10.0.308.gf73994d

[PATCH 0/4] test suite: use the GIT_TRACE_CURL environment var

2016-09-05 Thread Elia Pinto

Use the new GIT_TRACE_CURL environment variable in the test suite 
instead of the deprecated GIT_CURL_VERBOSE.

Elia Pinto (4):
  t5541-http-push-smart.sh: use the GIT_TRACE_CURL environment var
  test-lib.sh: preserve GIT_TRACE_CURL from the environment
  t5550-http-fetch-dumb.sh: use the GIT_TRACE_CURL environment var
  t5551-http-fetch-smart.sh: use the GIT_TRACE_CURL environment var

 t/t5541-http-push-smart.sh  |  2 +-
 t/t5550-http-fetch-dumb.sh  | 13 -
 t/t5551-http-fetch-smart.sh | 15 ---
 t/test-lib.sh   |  1 +
 4 files changed, 22 insertions(+), 9 deletions(-)

-- 
2.10.0.308.gf73994d

[PATCH 3/4] t5550-http-fetch-dumb.sh: use the GIT_TRACE_CURL environment var

2016-09-05 Thread Elia Pinto

Use the new GIT_TRACE_CURL environment variable instead
of the deprecated GIT_CURL_VERBOSE.

Signed-off-by: Elia Pinto 
---
 t/t5550-http-fetch-dumb.sh | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 3484b6f..75694ad 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -263,15 +263,18 @@ check_language () {
>expect
;;
?*)
-   echo "Accept-Language: $1" >expect
+   echo "=> Send header: Accept-Language: $1" >expect
;;
esac &&
-   GIT_CURL_VERBOSE=1 \
+   GIT_TRACE_CURL=true \
LANGUAGE=$2 \
git ls-remote "$HTTPD_URL/dumb/repo.git" >output 2>&1 &&
tr -d '\015' actual &&
+   sed -ne '/^=> Send header: Accept-Language:/ p' >actual &&
+   cp expect expect.$$ &&
+   cp actual actual.$$ &&
+   cp output output.$$ &&
test_cmp expect actual
 }
 
@@ -295,8 +298,8 @@ ja;q=0.95, zh;q=0.94, sv;q=0.93, pt;q=0.92, nb;q=0.91, 
*;q=0.90" \
 '
 
 test_expect_success 'git client does not send an empty Accept-Language' '
-   GIT_CURL_VERBOSE=1 LANGUAGE= git ls-remote "$HTTPD_URL/dumb/repo.git" 
2>stderr &&
-   ! grep "^Accept-Language:" stderr
+   GIT_TRACE_CURL=true LANGUAGE= git ls-remote "$HTTPD_URL/dumb/repo.git" 
2>stderr &&
+   ! grep "^=> Send header: Accept-Language:" stderr
 '
 
 stop_httpd
-- 
2.10.0.308.gf73994d

[PATCH v3 8/8] blame: honor the diff heuristic options and config

2016-09-05 Thread Michael Haggerty

Teach "git blame" and "git annotate" the --compaction-heuristic and
--indent-heuristic options that are now supported by "git diff".

Also teach them to honor the `diff.compactionHeuristic` and
`diff.indentHeuristic` configuration options.

It would be conceivable to introduce separate configuration options for
"blame" and "annotate"; for example `blame.compactionHeuristic` and
`blame.indentHeuristic`. But it would be confusing to users if blame
output is inconsistent with diff output, so it makes more sense for them
to respect the same configuration.

Signed-off-by: Michael Haggerty 
---
 Documentation/diff-heuristic-options.txt |  7 +++
 Documentation/diff-options.txt   |  8 +---
 Documentation/git-annotate.txt   |  1 +
 Documentation/git-blame.txt  |  2 ++
 builtin/blame.c  | 12 
 diff.c   | 29 +
 diff.h   |  1 +
 t/t4059-diff-indent.sh   | 29 +
 8 files changed, 70 insertions(+), 19 deletions(-)
 create mode 100644 Documentation/diff-heuristic-options.txt

diff --git a/Documentation/diff-heuristic-options.txt 
b/Documentation/diff-heuristic-options.txt
new file mode 100644
index 000..36cb549
--- /dev/null
+++ b/Documentation/diff-heuristic-options.txt
@@ -0,0 +1,7 @@
+--indent-heuristic::
+--no-indent-heuristic::
+--compaction-heuristic::
+--no-compaction-heuristic::
+   These are to help debugging and tuning experimental heuristics
+   (which are off by default) that shift diff hunk boundaries to
+   make patches easier to read.
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index ffc2b60..d6d41e4 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -63,13 +63,7 @@ ifndef::git-format-patch[]
Synonym for `-p --raw`.
 endif::git-format-patch[]
 
---indent-heuristic::
---no-indent-heuristic::
---compaction-heuristic::
---no-compaction-heuristic::
-   These are to help debugging and tuning experimental heuristics
-   (which are off by default) that shift diff hunk boundaries to
-   make patches easier to read.
+include::diff-heuristic-options.txt[]
 
 --minimal::
Spend extra time to make sure the smallest possible
diff --git a/Documentation/git-annotate.txt b/Documentation/git-annotate.txt
index 05fd482..94be4b8 100644
--- a/Documentation/git-annotate.txt
+++ b/Documentation/git-annotate.txt
@@ -23,6 +23,7 @@ familiar command name for people coming from other SCM 
systems.
 OPTIONS
 ---
 include::blame-options.txt[]
+include::diff-heuristic-options.txt[]
 
 SEE ALSO
 
diff --git a/Documentation/git-blame.txt b/Documentation/git-blame.txt
index ba54175..9dccb33 100644
--- a/Documentation/git-blame.txt
+++ b/Documentation/git-blame.txt
@@ -89,6 +89,8 @@ include::blame-options.txt[]
abbreviated object name, use +1 digits. Note that 1 column
is used for a caret to mark the boundary commit.
 
+include::diff-heuristic-options.txt[]
+
 
 THE PORCELAIN FORMAT
 
diff --git a/builtin/blame.c b/builtin/blame.c
index 7ec7823..f7b67f8 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -2220,6 +2220,8 @@ static int git_blame_config(const char *var, const char 
*value, void *cb)
return 0;
}
 
+   if (git_diff_heuristic_config(var, value, cb) < 0)
+   return -1;
if (userdiff_config(var, value) < 0)
return -1;
 
@@ -2549,6 +2551,15 @@ int cmd_blame(int argc, const char **argv, const char 
*prefix)
OPT_BIT('s', NULL, _option, N_("Suppress author name and 
timestamp (Default: off)"), OUTPUT_NO_AUTHOR),
OPT_BIT('e', "show-email", _option, N_("Show author 
email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL),
OPT_BIT('w', NULL, _opts, N_("Ignore whitespace 
differences"), XDF_IGNORE_WHITESPACE),
+
+   /*
+* The following two options are parsed by parse_revision_opt()
+* and are only included here to get included in the "-h"
+* output:
+*/
+   { OPTION_LOWLEVEL_CALLBACK, 0, "indent-heuristic", NULL, NULL, 
N_("Use an experimental indent-based heuristic to improve diffs"), 
PARSE_OPT_NOARG, parse_opt_unknown_cb },
+   { OPTION_LOWLEVEL_CALLBACK, 0, "compaction-heuristic", NULL, 
NULL, N_("Use an experimental blank-line-based heuristic to improve diffs"), 
PARSE_OPT_NOARG, parse_opt_unknown_cb },
+
OPT_BIT(0, "minimal", _opts, N_("Spend extra cycles to find 
better match"), XDF_NEED_MINIMAL),
OPT_STRING('S', NULL, _file, N_("file"), N_("Use revisions 
from  instead of calling git-rev-list")),
OPT_STRING(0, "contents", _from, N_("file"), N_("Use 
's contents as the final image")),
@@

[PATCH v3 2/8] xdl_change_compact(): only use heuristic if group can't be matched

2016-09-05 Thread Michael Haggerty

If the changed group of lines can be matched to a group in the other
file, then that positioning should take precedence over the compaction
heuristic.

The old code tried the heuristic unconditionally, which cost redundant
effort and also was broken if the matching code had already shifted the
group higher than the blank line.

Signed-off-by: Michael Haggerty 
---
 xdiff/xdiffi.c | 38 +++---
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 95b037e..61deed8 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -504,25 +504,25 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, 
long flags) {
}
} while (grpsiz != ix - ixs);
 
-   /*
-* Try to move back the possibly merged group of changes, to 
match
-* the recorded position in the other file.
-*/
-   while (ixref < ix) {
-   rchg[--ixs] = 1;
-   rchg[--ix] = 0;
-   while (rchgo[--ixo]);
-   }
-
-   /*
-* If a group can be moved back and forth, see if there is a
-* blank line in the moving space. If there is a blank line,
-* make sure the last blank line is the end of the group.
-*
-* As we already shifted the group forward as far as possible
-* in the earlier loop, we need to shift it back only if at all.
-*/
-   if ((flags & XDF_COMPACTION_HEURISTIC) && blank_lines) {
+   if (ixref < ix) {
+   /*
+* Try to move back the possibly merged group of 
changes, to match
+* the recorded position in the other file.
+*/
+   while (ixref < ix) {
+   rchg[--ixs] = 1;
+   rchg[--ix] = 0;
+   while (rchgo[--ixo]);
+   }
+   } else if ((flags & XDF_COMPACTION_HEURISTIC) && blank_lines) {
+   /*
+* The group can be slid up to make its last line a
+* blank line. Do so.
+*
+* As we already shifted the group forward as far as
+* possible in the earlier loop, we need to shift it
+* back only if at all.
+*/
while (ixs > 0 &&
   !is_blank_line(recs, ix - 1, flags) &&
   recs_match(recs, ixs - 1, ix - 1, flags)) {
-- 
2.9.3

[PATCH v3 7/8] parse-options: add parse_opt_unknown_cb()

2016-09-05 Thread Michael Haggerty

Add a new callback function, parse_opt_unknown_cb(), which returns -2 to
indicate that the corresponding option is unknown. This can be used to
add "-h" documentation for an option that will be handled externally to
parse_options().

Signed-off-by: Michael Haggerty 
---
See the next patch for the reason I implemented this change. Let me
know if there is an easier way to do this.

 parse-options-cb.c | 12 
 parse-options.h|  1 +
 2 files changed, 13 insertions(+)

diff --git a/parse-options-cb.c b/parse-options-cb.c
index 9667bc7..b5d9209 100644
--- a/parse-options-cb.c
+++ b/parse-options-cb.c
@@ -159,6 +159,18 @@ int parse_opt_noop_cb(const struct option *opt, const char 
*arg, int unset)
 }
 
 /**
+ * Report that the option is unknown, so that other code can handle
+ * it. This can be used as a callback together with
+ * OPTION_LOWLEVEL_CALLBACK to allow an option to be documented in the
+ * "-h" output even if it's not being handled directly by
+ * parse_options().
+ */
+int parse_opt_unknown_cb(const struct option *opt, const char *arg, int unset)
+{
+   return -2;
+}
+
+/**
  * Recreates the command-line option in the strbuf.
  */
 static int recreate_opt(struct strbuf *sb, const struct option *opt,
diff --git a/parse-options.h b/parse-options.h
index 78f8384..dcd8a09 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -228,6 +228,7 @@ extern int parse_opt_commits(const struct option *, const 
char *, int);
 extern int parse_opt_tertiary(const struct option *, const char *, int);
 extern int parse_opt_string_list(const struct option *, const char *, int);
 extern int parse_opt_noop_cb(const struct option *, const char *, int);
+extern int parse_opt_unknown_cb(const struct option *, const char *, int);
 extern int parse_opt_passthru(const struct option *, const char *, int);
 extern int parse_opt_passthru_argv(const struct option *, const char *, int);
 
-- 
2.9.3

[PATCH v3 3/8] is_blank_line(): take a single xrecord_t as argument

2016-09-05 Thread Michael Haggerty

There is no reason for it to take an array and index as argument, as it
only accesses a single element of the array.

Signed-off-by: Michael Haggerty 
---
 xdiff/xdiffi.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 61deed8..ed2df64 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -400,9 +400,9 @@ static xdchange_t *xdl_add_change(xdchange_t *xscr, long 
i1, long i2, long chg1,
 }
 
 
-static int is_blank_line(xrecord_t **recs, long ix, long flags)
+static int is_blank_line(xrecord_t *rec, long flags)
 {
-   return xdl_blankline(recs[ix]->ptr, recs[ix]->size, flags);
+   return xdl_blankline(rec->ptr, rec->size, flags);
 }
 
 static int recs_match(xrecord_t **recs, long ixs, long ix, long flags)
@@ -485,7 +485,7 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long 
flags) {
 * the group.
 */
while (ix < nrec && recs_match(recs, ixs, ix, flags)) {
-   blank_lines += is_blank_line(recs, ix, flags);
+   blank_lines += is_blank_line(recs[ix], flags);
 
rchg[ixs++] = 0;
rchg[ix++] = 1;
@@ -524,7 +524,7 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long 
flags) {
 * back only if at all.
 */
while (ixs > 0 &&
-  !is_blank_line(recs, ix - 1, flags) &&
+  !is_blank_line(recs[ix - 1], flags) &&
   recs_match(recs, ixs - 1, ix - 1, flags)) {
rchg[--ixs] = 1;
rchg[--ix] = 0;
-- 
2.9.3

[PATCH v3 5/8] xdl_change_compact(): introduce the concept of a change group

2016-09-05 Thread Michael Haggerty

The idea of xdl_change_compact() is fairly simple:

* Proceed through groups of changed lines in the file to be compacted,
  keeping track of the corresponding location in the "other" file.

* If possible, slide the group up and down to try to give the most
  aesthetically pleasing diff. Whenever it is slid, the current location
  in the other file needs to be adjusted.

But these simple concepts are obfuscated by a lot of index handling that
is written in terse, subtle, and varied patterns. I found it very hard
to convince myself that the function was correct.

So introduce a "struct group" that represents a group of changed lines
in a file. Add some functions that perform elementary operations on
groups:

* Initialize a group to the first group in a file
* Move to the next or previous group in a file
* Slide a group up or down

Even though the resulting code is longer, I think it is easier to
understand and review. Its performance is not changed
appreciably (though it would be if `group_next()` and `group_previous()`
were not inlined).

...and in fact, the rewriting helped me discover another bug in the
--compaction-heuristic code: The update of blank_lines was never done
for the highest possible position of the group. This means that it could
fail to slide the group to its highest possible position, even if that
position had a blank line as its last line. So for example, it yielded
the following diff:

$ git diff --no-index --compaction-heuristic a.txt b.txt
diff --git a/a.txt b/b.txt
index e53969f..0d60c5fe 100644
--- a/a.txt
+++ b/b.txt
@@ -1,3 +1,7 @@
 1
 A
+
+B
+
+A
 2

when in fact the following diff is better (according to the rules of
--compaction-heuristic):

$ git diff --no-index --compaction-heuristic a.txt b.txt
diff --git a/a.txt b/b.txt
index e53969f..0d60c5fe 100644
--- a/a.txt
+++ b/b.txt
@@ -1,3 +1,7 @@
 1
+A
+
+B
+
 A
 2

The new code gives the bottom answer.

Signed-off-by: Michael Haggerty 
---
 xdiff/xdiffi.c | 293 +++--
 1 file changed, 203 insertions(+), 90 deletions(-)

diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 8a5832a..44fded6 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -413,126 +413,239 @@ static int recs_match(xrecord_t *rec1, xrecord_t *rec2, 
long flags)
 flags));
 }
 
-int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
-   long ix, ixo, ixs, ixref, grpsiz, nrec = xdf->nrec;
-   char *rchg = xdf->rchg, *rchgo = xdfo->rchg;
-   unsigned int blank_lines;
-   xrecord_t **recs = xdf->recs;
+/*
+ * Represent a group of changed lines in an xdfile_t (i.e., a contiguous group
+ * of lines that was inserted or deleted from the corresponding version of the
+ * file). We consider there to be such a group at the beginning of the file, at
+ * the end of the file, and between any two unchanged lines, though most such
+ * groups will usually be empty.
+ *
+ * If the first line in a group is equal to the line following the group, then
+ * the group can be slid down. Similarly, if the last line in a group is equal
+ * to the line preceding the group, then the group can be slid up. See
+ * group_slide_down() and group_slide_up().
+ *
+ * Note that loops that are testing for changed lines in xdf->rchg do not need
+ * index bounding since the array is prepared with a zero at position -1 and N.
+ */
+struct group {
+   /*
+* The index of the first changed line in the group, or the index of
+* the unchanged line above which the (empty) group is located.
+*/
+   long start;
 
/*
-* This is the same of what GNU diff does. Move back and forward
-* change groups for a consistent and pretty diff output. This also
-* helps in finding joinable change groups and reduce the diff size.
+* The index of the first unchanged line after the group. For an empty
+* group, end is equal to start.
 */
-   for (ix = ixo = 0;;) {
-   /*
-* Find the first changed line in the to-be-compacted file.
-* We need to keep track of both indexes, so if we find a
-* changed lines group on the other file, while scanning the
-* to-be-compacted file, we need to skip it properly. Note
-* that loops that are testing for changed lines on rchg* do
-* not need index bounding since the array is prepared with
-* a zero at position -1 and N.
-*/
-   for (; ix < nrec && !rchg[ix]; ix++)
-   while (rchgo[ixo++]);
-   if (ix == nrec)
-   break;
+   long end;
+};
+
+/*
+ * Initialize g to point at the first group in xdf.
+ */
+static void group_init(xdfile_t *xdf, struct group *g)
+{
+

[PATCH v3 4/8] recs_match(): take two xrecord_t pointers as arguments

2016-09-05 Thread Michael Haggerty

There is no reason for it to take an array and two indexes as argument,
as it only accesses two elements of the array.

Signed-off-by: Michael Haggerty 
---
 xdiff/xdiffi.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index ed2df64..8a5832a 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -405,11 +405,11 @@ static int is_blank_line(xrecord_t *rec, long flags)
return xdl_blankline(rec->ptr, rec->size, flags);
 }
 
-static int recs_match(xrecord_t **recs, long ixs, long ix, long flags)
+static int recs_match(xrecord_t *rec1, xrecord_t *rec2, long flags)
 {
-   return (recs[ixs]->ha == recs[ix]->ha &&
-   xdl_recmatch(recs[ixs]->ptr, recs[ixs]->size,
-recs[ix]->ptr, recs[ix]->size,
+   return (rec1->ha == rec2->ha &&
+   xdl_recmatch(rec1->ptr, rec1->size,
+rec2->ptr, rec2->size,
 flags));
 }
 
@@ -457,7 +457,7 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long 
flags) {
 * the last line of the current change group, shift 
backward
 * the group.
 */
-   while (ixs > 0 && recs_match(recs, ixs - 1, ix - 1, 
flags)) {
+   while (ixs > 0 && recs_match(recs[ixs - 1], recs[ix - 
1], flags)) {
rchg[--ixs] = 1;
rchg[--ix] = 0;
 
@@ -484,7 +484,7 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long 
flags) {
 * the line next of the current change group, shift 
forward
 * the group.
 */
-   while (ix < nrec && recs_match(recs, ixs, ix, flags)) {
+   while (ix < nrec && recs_match(recs[ixs], recs[ix], 
flags)) {
blank_lines += is_blank_line(recs[ix], flags);
 
rchg[ixs++] = 0;
@@ -525,7 +525,7 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long 
flags) {
 */
while (ixs > 0 &&
   !is_blank_line(recs[ix - 1], flags) &&
-  recs_match(recs, ixs - 1, ix - 1, flags)) {
+  recs_match(recs[ixs - 1], recs[ix - 1], flags)) {
rchg[--ixs] = 1;
rchg[--ix] = 0;
while (rchgo[--ixo]);
-- 
2.9.3

[PATCH v3 6/8] diff: improve positioning of add/delete blocks in diffs

2016-09-05 Thread Michael Haggerty

Some groups of added/deleted lines in diffs can be slid up or down,
because lines at the edges of the group are not unique. Picking good
shifts for such groups is not a matter of correctness but definitely has
a big effect on aesthetics. For example, consider the following two
diffs. The first is what standard Git emits:

--- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl
+++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl
@@ -231,6 +231,9 @@ if (!defined $initial_reply_to && $prompting) {
 }

 if (!$smtp_server) {
+   $smtp_server = $repo->config('sendemail.smtpserver');
+}
+if (!$smtp_server) {
foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {
if (-x $_) {
$smtp_server = $_;

The following diff is equivalent, but is obviously preferable from an
aesthetic point of view:

--- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl
+++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl
@@ -230,6 +230,9 @@ if (!defined $initial_reply_to && $prompting) {
$initial_reply_to =~ s/(^\s+|\s+$)//g;
 }

+if (!$smtp_server) {
+   $smtp_server = $repo->config('sendemail.smtpserver');
+}
 if (!$smtp_server) {
foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {
if (-x $_) {

This patch teaches Git to pick better positions for such "diff sliders"
using heuristics that take the positions of nearby blank lines and the
indentation of nearby lines into account.

The existing Git code basically always shifts such "sliders" as far down
in the file as possible. The only exception is when the slider can be
aligned with a group of changed lines in the other file, in which case
Git favors depicting the change as one add+delete block rather than one
add and a slightly offset delete block. This naive algorithm often
yields ugly diffs.

Commit d634d61ed6 improved the situation somewhat by preferring to
position add/delete groups to make their last line a blank line, when
that is possible. This heuristic does more good than harm, but (1) it
can only help if there are blank lines in the right places, and (2)
always picks the last blank line, even if there are others that might be
better. The end result is that it makes perhaps 1/3 as many errors as
the default Git algorithm, but that still leaves a lot of ugly diffs.

This commit implements a new and much better heuristic for picking
optimal "slider" positions using the following approach: First observe
that each hypothetical positioning of a diff slider introduces two
splits: one between the context lines preceding the group and the first
added/deleted line, and the other between the last added/deleted line
and the first line of context following it. It tries to find the
positioning that creates the least bad splits.

Splits are evaluated based only on the presence and locations of nearby
blank lines, and the indentation of lines near the split. Basically, it
prefers to introduce splits adjacent to blank lines, between lines that
are indented less, and between lines with the same level of indentation.
In more detail:

1. It measures the following characteristics of a proposed splitting
   position in a `struct split_measurement`:

   * the number of blank lines above the proposed split
   * whether the line directly after the split is blank
   * the number of blank lines following that line
   * the indentation of the nearest non-blank line above the split
   * the indentation of the line directly below the split
   * the indentation of the nearest non-blank line after that line

2. It combines the measured attributes using a bunch of
   empirically-optimized weighting factors to derive a `struct
   split_score` that measures the "badness" of splitting the text at
   that position.

3. It combines the `split_score` for the top and the bottom of the
   slider at each of its possible positions, and selects the position
   that has the best `split_score`.

I determined the initial set of weighting factors by collecting a corpus
of Git histories from 29 open-source software projects in various
programming languages. I generated many diffs from this corpus, and
determined the best positioning "by eye" for about 6600 diff sliders. I
used about half of the repositories in the corpus (corresponding to
about 2/3 of the sliders) as a training set, and optimized the weights
against this corpus using a crude automated search of the parameter
space to get the best agreement with the manually-determined values.
Then I tested the resulting heuristic against the full corpus. The
results are summarized in the following table, in column `indent-1`:

| repository| count |  Git 2.9.0 | compaction | 
compaction-fixed |   indent-1 |   indent-2 |
| - | - | -- | -- | 
 | -- |

[PATCH v3 1/8] xdl_change_compact(): fix compaction heuristic to adjust ixo

2016-09-05 Thread Michael Haggerty

The code branch used for the compaction heuristic forgot to keep ixo in
sync while the group was shifted. This is certainly wrong, as it causes
the two counters to get out of sync.

I *think* that this bug could also have caused the function to read past
the end of the rchgo array, though I haven't done the work to prove it
for sure. Here is my reasoning:

If ixo is not decremented correctly during one iteration of the outer
while loop, then it will loose sync with the ix counter. In particular,
ixo will be too large.

Suppose that the next iterations of the outer while loop (i.e.,
processing the next block of add/delete lines) don't have any sliders.
Then the ixo counter would be incremented by the number of non-changed
lines in xdf, which is the same as the number of non-changed lines in
xdfo that *should have* followed the group that experienced the
malfunction. But since ixo was too large at the end of that iteration,
it will be incremented past the end of the xdfo->rchg array, and will
try to read that memory illegally.

Signed-off-by: Michael Haggerty 
---
 xdiff/xdiffi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index b3c6848..95b037e 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -528,6 +528,7 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long 
flags) {
   recs_match(recs, ixs - 1, ix - 1, flags)) {
rchg[--ixs] = 1;
rchg[--ix] = 0;
+   while (rchgo[--ixo]);
}
}
}
-- 
2.9.3

[PATCH v3 0/8] Better heuristics make prettier diffs

2016-09-05 Thread Michael Haggerty

This is v3 of the diff-indent-heuristic patch series. Thanks to René,
Junio, and Ramsay for their comments about v2 [1,2].

The heuristic itself hasn't changed since v2. Most of the changes are
in the tests and in how `git blame` is wired up to support the new
options. Changes since v1:

* In "diff: improve positioning of add/delete blocks in diffs":

  * Make `score_cmp()` static.

  * Add test t4059 as part of this commit, not as part of its
successor.

  * t4059: Create commits to use for the tests (rather than diffing
loose files with `--no-cache`), so that the same prep can be used
to test `git blame`. Split the tests into multiple smaller tests.

  * Add a test that `--no-indent-heuristic` overrides the config.

* Add a new commit "parse-options: add parse_opt_unknown_cb()":

  * Allow "-h" help text to be generated for options even if they are
being handled by code external to `parse_options()`.

  If there's already a way to do this, I didn't find it.

* In "blame: honor the diff heuristic options and config":

  * In v2, I suggested making `blame` honor all diff-related options.
Junio explained why this was a bad idea. So this version only
makes `blame` honor `--indent-heuristic` and
`--compaction-heuristic`.

  * Add some smoke tests.

This patch series is also available from my GitHub fork [3], branch
"diff-indent-heuristic"

[1] http://public-inbox.org/git/cover.1471864378.git.mhag...@alum.mit.edu/
[2] 
http://public-inbox.org/git/a27aa17e-f602-fc49-92b3-2198e4772...@ramsayjones.plus.com/
[3] https://github.com/mhagger/git

Michael Haggerty (8):
  xdl_change_compact(): fix compaction heuristic to adjust ixo
  xdl_change_compact(): only use heuristic if group can't be matched
  is_blank_line(): take a single xrecord_t as argument
  recs_match(): take two xrecord_t pointers as arguments
  xdl_change_compact(): introduce the concept of a change group
  diff: improve positioning of add/delete blocks in diffs
  parse-options: add parse_opt_unknown_cb()
  blame: honor the diff heuristic options and config

 Documentation/diff-config.txt|   7 +-
 Documentation/diff-heuristic-options.txt |   7 +
 Documentation/diff-options.txt   |   7 +-
 Documentation/git-annotate.txt   |   1 +
 Documentation/git-blame.txt  |   2 +
 builtin/blame.c  |  12 +
 diff.c   |  36 +-
 diff.h   |   1 +
 git-add--interactive.perl|   5 +-
 parse-options-cb.c   |  12 +
 parse-options.h  |   1 +
 t/t4059-diff-indent.sh   | 216 +++
 xdiff/xdiff.h|   1 +
 xdiff/xdiffi.c   | 635 ++-
 14 files changed, 828 insertions(+), 115 deletions(-)
 create mode 100644 Documentation/diff-heuristic-options.txt
 create mode 100755 t/t4059-diff-indent.sh

-- 
2.9.3

Re: Fixup of a fixup not working right

2016-09-05 Thread Johannes Schindelin

Hi Philip,

On Sun, 4 Sep 2016, Philip Oakley wrote:

> From: "Johannes Schindelin" 
> 
> > The point is that fixup! messages are really special, and are always
> > intended to be squashed into the referenced commit *before* the latter
> > hits `master`.
> 
> I think it's here that we have the hidden use case. I agree that all fixups
> should be squashed before they hit the blessed golden  repository.
> 
> I suspect that some use cases have intermediate repositories that
> contain a 'master' branch (it's just a name ;-) that isn't blessed and
> golden, e.g. at the team review repo level. In such cases it is possible
> for a fixup! to be passed up as part of the review, though it's not the
> current norm/expectation.

In such a case (which can totally arise when criss-crossing Pull Requests
on GitHub, for example, where a Pull Request's purpose may be to fix up
commits in another Pull Request before the latter is merged), the most
appropriate course of action is... to not reorder the fixup!s prematurely.

> > In short, I am opposed to this change.
> 
> It's not like G4W doesn't need fixup!s on the side branches e.g. 5eaffe9
> ("fixup! Handle new t1501 test case properly with MinGW", 2016-07-12)

Yeah, well, Git for Windows' `master` branch is special, in that it is
constantly rebased (as "merging rebases", to keep fast-forwardability). I
would not necessarily use Git for Windows as a role model in this respect.

Ciao,
Dscho

Re: [PATCH 1/4] add: document the chmod option

2016-09-05 Thread Johannes Schindelin

Hi Thomas,

On Sun, 4 Sep 2016, Thomas Gummerer wrote:

> +--chmod=(+|-)::

There is an "x" missing ;-)

Ciao,
Dscho

83 matches

Mail list logo