Re: What's cooking in git.git (Mar 2018, #02; Tue, 6)

2018-03-08 Thread Martin Ågren
On 7 March 2018 at 00:34, Junio C Hamano  wrote:

> * ma/config-page-only-in-list-mode (2018-02-21) 3 commits
>  - config: change default of `pager.config` to "on"
>  - config: respect `pager.config` in list/get-mode only
>  - t7006: add tests for how git config paginates
>
>  In a way similar to how "git tag" learned to honor the pager
>  setting only in the list mode, "git config" learned to ignore the
>  pager setting when it is used for setting values (i.e. when the
>  purpose of the operation is not to "show").
>
>  Is this ready for 'next'?

I am not aware of any open questions or issues. You thought out loud
about how the series was structured, in particular about introducing a
successful test, then redefining it, as opposed to introducing it as a
failing test, then making it succeed. I hope I managed to motivate my
choice better in v2 (which is what you have picked up).

Duy wondered if it was sane to use a pager when we know that we are
"--get"-ing at most one config item. In v2, I addressed this by turning
on paging for a more careful selection of "--get"-ters.
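To make that concrete, here is a rough sketch of the intended
behavior (illustrative invocations only; the exact set of paged
"--get"-ters is the one chosen in v2, not this list):

    git config --list                      # list mode: paged by default now
    git config --get-regexp 'alias\..*'    # potentially long "get" output: paged
    git config user.email me@example.org   # set mode: the pager is never used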

Martin


Re: feature request: git-config: Add conditional include for gitbranch

2018-03-08 Thread Duy Nguyen
On Thu, Mar 08, 2018 at 07:23:00PM -0500, Jeremy Bicha wrote:
> Use Case
> ==
> Jeremy is a developer for Debian and Ubuntu. The same repository is
> used for both Debian and Ubuntu packaging but with different branches.
> For commits to the debian/master branch, Jeremy wants to use his
> @debian.org email address. For commits to the ubuntu/master branch,
> Jeremy wants to use his @ubuntu.com email.
> 
> Proposal
> ===
> The includeIf feature of git-config could be extended to offer a
> gitbranch conditional include in addition to the gitdir conditional
> include it already offers.

Interesting. It looks quite simple to do this. My prototype looks like
this.

-- 8< --
Subject: [PATCH] config: support conditional include by matching ref pattern

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/config.txt  | 12 ++++++++++++
 config.c                  | 30 ++++++++++++++++++++++++++++++
 t/t1305-config-include.sh | 26 ++++++++++++++++++++++++++
 3 files changed, 68 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index ce9102cea8..4e8fb6d99c 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -143,6 +143,18 @@ refer to linkgit:gitignore[5] for details. For convenience:
This is the same as `gitdir` except that matching is done
case-insensitively (e.g. on case-insensitive file sytems)
 
+`ref`::
+   The data that follows the keyword `ref:` is used as a glob
+   pattern that matches against the current branch. `*` in the
+   pattern does not match `/` and `**` matches multiple levels,
+   the same as .gitignore syntax. The branch is a full reference
+   (i.e. with `refs/heads/` prefix). If HEAD is detached, the
+   pattern will be matched against the value "HEAD".
+
+`ref/i`::
+   This is the same as `ref` except that matching is done
+   case-insensitively
+
 A few more notes on matching via `gitdir` and `gitdir/i`:
 
  * Symlinks in `$GIT_DIR` are not resolved before matching.
diff --git a/config.c b/config.c
index b0c20e6cb8..72ff2da667 100644
--- a/config.c
+++ b/config.c
@@ -16,6 +16,7 @@
 #include "string-list.h"
 #include "utf8.h"
 #include "dir.h"
+#include "refs.h"
 
 struct config_source {
struct config_source *prev;
@@ -202,6 +203,31 @@ static int prepare_include_condition_pattern(struct strbuf *pat)
return prefix;
 }
 
+static int include_by_ref(const struct config_options *opts,
+ const char *cond, size_t cond_len, int icase)
+{
+   struct strbuf pattern = STRBUF_INIT;
+   char *branch;
+   unsigned flags = WM_PATHNAME;
+   int ret;
+
+   if (!opts->git_dir)
+   return 0;
+
+   branch = resolve_refdup("HEAD", 0, NULL, NULL);
+   if (!branch)
+   return 0;
+
+   if (icase)
+   flags |= WM_CASEFOLD;
+   strbuf_add(&pattern, cond, cond_len);
+   ret = !wildmatch(pattern.buf, branch, flags);
+
+   free(branch);
+   strbuf_release(&pattern);
+   return ret;
+}
+
 static int include_by_gitdir(const struct config_options *opts,
 const char *cond, size_t cond_len, int icase)
 {
@@ -268,6 +294,10 @@ static int include_condition_is_true(const struct config_options *opts,
return include_by_gitdir(opts, cond, cond_len, 0);
else if (skip_prefix_mem(cond, cond_len, "gitdir/i:", &cond, &cond_len))
return include_by_gitdir(opts, cond, cond_len, 1);
+   else if (skip_prefix_mem(cond, cond_len, "ref:", &cond, &cond_len))
+   return include_by_ref(opts, cond, cond_len, 0);
+   else if (skip_prefix_mem(cond, cond_len, "ref/i:", &cond, &cond_len))
+   return include_by_ref(opts, cond, cond_len, 1);
 
/* unknown conditionals are always false */
return 0;
diff --git a/t/t1305-config-include.sh b/t/t1305-config-include.sh
index d9d2f545a4..27ecfc74b7 100755
--- a/t/t1305-config-include.sh
+++ b/t/t1305-config-include.sh
@@ -296,6 +296,32 @@ test_expect_success SYMLINKS 'conditional include, gitdir matching symlink, icase
)
 '
 
+test_expect_success 'conditional include by refs' '
+   git init inc-by-ref &&
+   (
+   check() {
+   echo "ref: $1" >.git/HEAD &&
+   echo "[includeIf \"ref:$2\"]path=bar8" >.git/config &&
+   git config test.var >actual &&
+   test_cmp expect actual
+   }
+   cd inc-by-ref &&
+   echo "[test]var=matched" >.git/bar8 &&
+   echo matched >expect &&
+
+   check refs/heads/foo refs/heads/foo &&
+   check refs/heads/foo "refs/heads/*" &&
+   check refs/heads/foo "refs/heads/f*" &&
+   check refs/heads/deep/in/foo "refs/heads/**/foo" &&
+
+   test_commit one &&
+   git checkout --detach &&
+   echo "[includeIf \"ref:HEAD\"]path=bar8" 

feature request: git-config: Add conditional include for gitbranch

2018-03-08 Thread Jeremy Bicha
Use Case
==
Jeremy is a developer for Debian and Ubuntu. The same repository is
used for both Debian and Ubuntu packaging but with different branches.
For commits to the debian/master branch, Jeremy wants to use his
@debian.org email address. For commits to the ubuntu/master branch,
Jeremy wants to use his @ubuntu.com email.

Proposal
===
The includeIf feature of git-config could be extended to offer a
gitbranch conditional include in addition to the gitdir conditional
include it already offers.
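
For illustration, the configuration might end up looking something
like this (hypothetical syntax, assuming a "gitbranch:" keyword; the
exact spelling and matching rules would be up to the implementation):

    [includeIf "gitbranch:refs/heads/debian/*"]
        path = ~/.gitconfig-debian    ; sets the @debian.org address
    [includeIf "gitbranch:refs/heads/ubuntu/*"]
        path = ~/.gitconfig-ubuntu    ; sets the @ubuntu.com address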

Thanks,
Jeremy Bicha


Re: [PATCH v1.1 1/2] t9400-git-cvsserver-server: don't rely on the output of 'test_cmp'

2018-03-08 Thread Junio C Hamano
SZEDER Gábor  writes:

> On Fri, Mar 9, 2018 at 12:33 AM, Junio C Hamano  wrote:
>> SZEDER Gábor  writes:
>>
>>> diff --git a/t/t9400-git-cvsserver-server.sh b/t/t9400-git-cvsserver-server.sh
>>> index c30660d606..5ff3789199 100755
>>> --- a/t/t9400-git-cvsserver-server.sh
>>> +++ b/t/t9400-git-cvsserver-server.sh
>>> @@ -449,10 +449,9 @@ test_expect_success 'cvs update (-p)' '
>>>  GIT_CONFIG="$git_config" cvs update &&
>>>  rm -f failures &&
>>>  for i in merge no-lf empty really-empty; do
>>> -GIT_CONFIG="$git_config" cvs update -p "$i" >$i.out
>>> - test_cmp $i.out ../$i >>failures 2>&1
>>> -done &&
>>> -test -z "$(cat failures)"
>>> + GIT_CONFIG="$git_config" cvs update -p "$i" >$i.out &&
>>> + test_cmp $i.out ../$i || return 1
>>> +done
>>>  '
>>
>> This makes "rm -f failures &&" unnecessary, no?
>
> Indeed, it does.

OK, no need to resend, as I'll amend it locally before queuing
(unless there is something else that needs updating, that is).

Thanks.


Re: [PATCH v1.1 1/2] t9400-git-cvsserver-server: don't rely on the output of 'test_cmp'

2018-03-08 Thread SZEDER Gábor
On Fri, Mar 9, 2018 at 12:33 AM, Junio C Hamano  wrote:
> SZEDER Gábor  writes:
>
>> diff --git a/t/t9400-git-cvsserver-server.sh b/t/t9400-git-cvsserver-server.sh
>> index c30660d606..5ff3789199 100755
>> --- a/t/t9400-git-cvsserver-server.sh
>> +++ b/t/t9400-git-cvsserver-server.sh
>> @@ -449,10 +449,9 @@ test_expect_success 'cvs update (-p)' '
>>  GIT_CONFIG="$git_config" cvs update &&
>>  rm -f failures &&
>>  for i in merge no-lf empty really-empty; do
>> -GIT_CONFIG="$git_config" cvs update -p "$i" >$i.out
>> - test_cmp $i.out ../$i >>failures 2>&1
>> -done &&
>> -test -z "$(cat failures)"
>> + GIT_CONFIG="$git_config" cvs update -p "$i" >$i.out &&
>> + test_cmp $i.out ../$i || return 1
>> +done
>>  '
>
> This makes "rm -f failures &&" unnecessary, no?

Indeed, it does.


Re: [PATCH v1.1 1/2] t9400-git-cvsserver-server: don't rely on the output of 'test_cmp'

2018-03-08 Thread Junio C Hamano
SZEDER Gábor  writes:

> diff --git a/t/t9400-git-cvsserver-server.sh b/t/t9400-git-cvsserver-server.sh
> index c30660d606..5ff3789199 100755
> --- a/t/t9400-git-cvsserver-server.sh
> +++ b/t/t9400-git-cvsserver-server.sh
> @@ -449,10 +449,9 @@ test_expect_success 'cvs update (-p)' '
>  GIT_CONFIG="$git_config" cvs update &&
>  rm -f failures &&
>  for i in merge no-lf empty really-empty; do
> -GIT_CONFIG="$git_config" cvs update -p "$i" >$i.out
> - test_cmp $i.out ../$i >>failures 2>&1
> -done &&
> -test -z "$(cat failures)"
> + GIT_CONFIG="$git_config" cvs update -p "$i" >$i.out &&
> + test_cmp $i.out ../$i || return 1
> +done
>  '

This makes "rm -f failures &&" unnecessary, no?


Re: [PATCH 0/4] Speed up git tag --contains

2018-03-08 Thread csilvers
} I had a few proposals over the years, but I won't even bother to dig
} them up, because there's quite recent and promising work in this
} area from Derrick Stolee:

It sounds like the best thing to do is to wait for this, then.

We managed to convert a bunch of our branches to tags, so our
immediate problem has been resolved.  But I'm sure it will come up
again as more branches are created...

craig


Re: [RFC v2] Rebasing merges: a jorney to the ultimate solution (Road Clear)

2018-03-08 Thread Igor Djordjevic
On 08/03/2018 20:58, Igor Djordjevic wrote:
> 
> > Phillip's method is essentially merging the new tips into the original
> > merge, pretending that the new tips were not rebased but merged into
> > upstream.
> 
> [...]
> 
> Here`s a starting point, two commits A and B, merged into M:
> 
> (3) ---A
>         \
>          M
>         /
>     ---B
> 
> 
> According the "patch theory"[1] (which might not be too popular 
> around here, but should serve the purpose for what I`m trying to 
> explain), each merge commit can be "transformed" to two non-merge 
> commits, one on top of each of the merge parents, where new commit 
> brings its original merge parent commit tree to the state of the 
> merge commit tree:
> 
> (4) ---A---U1
> 
> 
> 
>     ---B---U2
> 
> 
> Now, we have two new commits, U1 and U2, each having the same tree as 
> previous merge commit M, but representing changes in regards to 
> specific parents - and this is essentially what Sergey`s original 
> approach was using (whether he knew it, or not).
> 
> When it comes to rebasing, it`s pretty simple, too. As this:
> 
> (5) ---X1---o---o---o---o---o---X2 (master)
>|\
>| A1---A2---A3
>| \
>|  M
>| /
>\-B1---B2---B3
> 
> ... actually equals this:
> 
> (6) ---X1---o---o---o---o---o---X2 (master)
>|\
>| A1---A2---A3---U1
>|
>|
>|
>\-B1---B2---B3---U2
> 
> ... where trees of M, U1 and U2 are same, and we can use the regular 
> rebase semantics and rebase it to this:
> 
> (7) ---X1---o---o---o---o---o---X2 (master)
> |\
> | A1'--A2'--A3'--U1'
> |
> |
> |
> \-B1'--B2'--B3'--U2'
> 
> ... which is essentially this again:
> 
> (8) ---X1---o---o---o---o---o---X2 (master)
> |\
> | A1'--A2'--A3'
> |\
> | M'
> |/
> \-B1'--B2'--B3'
> 

Having explained all this, I realized this is the same "essentially 
merging the new tips into the original pretending that the new tips 
were not rebased but merged into upstream" as Phillip`s one, just 
that we have additional temporary commits U1 and U2 (as per mentioned 
"patch theory") :)

Merging U1' and U2' with M as a base can initially be presented like 
this as well (what Phillip`s approach would yield):

git merge-recursive U1 -- M U1'
tree="$(git write-tree)"
git merge-recursive U2 -- $tree U2'
tree="$(git write-tree)"

..., where we know U1 = U2 = M (in regards to trees), so this is the 
same as:

git merge-recursive M -- M U1'
tree="$(git write-tree)"
git merge-recursive M -- $tree U2'
tree="$(git write-tree)"

..., which can be further simplified, being the same as:

git merge-recursive M -- U1' U2'
tree="$(git write-tree)"

... which is exactly what Sergey`s (updated) approach suggests, 
merging U1' and U2' with M as merge-base (and shown inside that 
sample implementation script I provided[1]) :)
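
For completeness, here is a minimal sketch of how such U1/U2 helper
commits could be produced in the first place, assuming A, B and M are
resolvable commits (illustrative only, not part of any patch):

    # U1: a commit on top of A whose tree is exactly M's tree
    U1=$(git commit-tree -p A -m "U1: bring A to the state of M" "M^{tree}")
    # U2: the same thing on top of B
    U2=$(git commit-tree -p B -m "U2: bring B to the state of M" "M^{tree}")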

With all this said, I think it`s safe to say these two approaches are 
exactly the same, just Sergey`s being simplified (thus harder to 
initially understand), and the only actual question is whether we 
value U1' and U2' enough, as possible "something suspicious happened" 
indicators, to use them, or not.

I would think yes, but I would be open for more samples of where do 
they become useful for reporting "suspicious activity", too.

Regards, Buga

[1] https://public-inbox.org/git/b11785bd-5c96-43c1-95d8-b28eccfd1...@gmail.com/


Re: [PATCH v10 3/9] strbuf: add a case insensitive starts_with()

2018-03-08 Thread Junio C Hamano
Duy Nguyen  writes:

>>  extern int starts_with(const char *str, const char *prefix);
>> +extern int startscase_with(const char *str, const char *prefix);
>
> This name is a bit hard to read. Boost [1] goes with istarts_with. I
> wonder if it's better. If not I guess either starts_with_case or
> starts_case_with will improve readability.

starts_with_case() sounds quite strange even though
starts_with_icase() may make it clear that it is "a variant of
starts_with() function that ignores case".  I dunno.
dir.c::cmp_icase() takes the _icase suffix for not quite the way
that is consistent with that understanding, though.



[PATCH v1.1 1/2] t9400-git-cvsserver-server: don't rely on the output of 'test_cmp'

2018-03-08 Thread SZEDER Gábor
The test 'cvs update (-p)' redirects and checks 'test_cmp's stdout and
even its stderr.  The commit introducing this test in 6e8937a084
(cvsserver: Add test for update -p, 2008-03-27) doesn't discuss why,
in fact its log message only consists of that subject line.  Anyway,
weird as it is, it kind of made sense due to the way that test was
structured:

After a bit of preparation, this test updates four files via CVS and
checks their contents using 'test_cmp', but it does so in a for loop
iterating over the names of those four files.  Now, the exit status of
a for loop is the exit status of the last command executed in the
loop, meaning that the test can't simply rely on the exit code of
'test_cmp' in the loop's body.  Instead, the test works it around by
relying on the stdout of 'test_cmp' being silent on success and
showing the diff on failure, as it appends the stdout of all four
'test_cmp' invocations to a single file and checks that file's
emptiness after the loop (with 'test -z "$(cat ...)"', no less; there
was no 'test_must_be_empty' back then).  Furthermore, the test
redirects the stderr of those 'test_cmp' invocations to this file,
too: while 'test_cmp' itself doesn't output anything to stderr, the
invoked 'diff' or 'cmp' commands do send their error messages there,
e.g. if they can't open a file because its name was misspelled.
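
To illustrate the underlying issue with a trivial example:

    for i in 1 2
    do
        test "$i" = 2   # fails in the first iteration, but...
    done
    echo $?             # ...prints 0: only the status of the last
                        # iteration survives the loop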

This also makes this test fail when the test script is run with '-x'
tracing (and using a shell other than a Bash version supporting
BASH_XTRACEFD), because 'test_cmp's stderr contains the trace of the
'diff' command executed inside the helper function, throwing off the
subsequent emptiness check.

Stop relying on 'test_cmp's output and instead run 'test_cmp a b ||
return 1' in the for loop in order to make 'test_cmp's error code fail
the test.  Furthermore, add the missing && after the cvs command to
create a && chain in the loop's body.

After this change t9400 passes with '-x', even when running with
/bin/sh.

Signed-off-by: SZEDER Gábor 
---

Changes:
Use Eric's suggestion and run 'test_cmp a b || return 1' in the for
loop to fail the test.

 t/t9400-git-cvsserver-server.sh | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/t/t9400-git-cvsserver-server.sh b/t/t9400-git-cvsserver-server.sh
index c30660d606..5ff3789199 100755
--- a/t/t9400-git-cvsserver-server.sh
+++ b/t/t9400-git-cvsserver-server.sh
@@ -449,10 +449,9 @@ test_expect_success 'cvs update (-p)' '
 	GIT_CONFIG="$git_config" cvs update &&
 	rm -f failures &&
 	for i in merge no-lf empty really-empty; do
-		GIT_CONFIG="$git_config" cvs update -p "$i" >$i.out
-		test_cmp $i.out ../$i >>failures 2>&1
-	done &&
-	test -z "$(cat failures)"
+		GIT_CONFIG="$git_config" cvs update -p "$i" >$i.out &&
+		test_cmp $i.out ../$i || return 1
+	done
 '
 
 cd "$WORKDIR"
-- 
2.16.2.603.g180c1428f0



Re: [PATCH 1/2] t9400-git-cvsserver-server: don't rely on the output of 'test_cmp'

2018-03-08 Thread SZEDER Gábor
On Thu, Mar 8, 2018 at 11:01 PM, Eric Sunshine  wrote:
> On Thu, Mar 8, 2018 at 4:49 PM, SZEDER Gábor  wrote:
>>> On Thu, Mar 8, 2018 at 7:51 PM, Eric Sunshine  wrote:
>>> An alternative approach used elsewhere in the test suite[1] would be
>>> simply to 'exit' if test_cmp fails:
>>>
>>> for i in merge no-lf empty really-empty; do
>>> GIT_CONFIG="$git_config" cvs update -p "$i" >$i.out
>>> test_cmp $i.out ../$i || exit 1
>>> done &&

> Sorry for the confusion. I meant "return 1" as used elsewhere in the
> test suite[1].

Oh, I see, that's indeed better.  Thanks.


Re: [PATCH 1/2] t9400-git-cvsserver-server: don't rely on the output of 'test_cmp'

2018-03-08 Thread Eric Sunshine
On Thu, Mar 8, 2018 at 5:01 PM, Eric Sunshine  wrote:
> Sorry for the confusion. I meant "return 1" as used elsewhere in the
> test suite[1].
> [1]: For example, the "setup" test of t4151-am-abort.sh.

Additional context: e6821d09e4 (t: fix some trivial cases of ignored
exit codes in loops, 2015-03-25)


Re: [RFC v2] Rebasing merges: a jorney to the ultimate solution (Road Clear)

2018-03-08 Thread Igor Djordjevic
On 08/03/2018 21:27, Igor Djordjevic wrote:
> 
> > git merge-recursive U1' -- M U2'
> > tree="$(git write-tree)"
> > # in case of original merge being octopus, we would continue like:
> > # git merge-recursive $tree -- M U3'
> > # tree="$(git write-tree)"
> > # git merge-recursive $tree -- M U4'
> > # ... and so on, then finally:
> > git merge-recursive $tree -- "$(git merge-base U1' U2' B1')" B1'
> > # in more general case, it would be:
> > # git merge-recursive $tree -- "$(git merge-base <new merge parents> B1')" B1'
> > tree="$(git write-tree)"
> > git tag M' "$(git log --pretty=%B -1 M | git commit-tree $tree -p B3' -p B4 -p B1')"
> 
> That last line should obviously read just:
> 
>   git log --pretty=%B -1 M | git commit-tree $tree -p B3' -p B4 -p B1'
> 
> ..., above mentioned `git tag M'` part being a leftover from my other test 
> script.

Eh, pardon me, I managed to mess up all the merge-recursive lines, 
too, in regards to where the merge-base commit goes... Here`s a 
complete (and corrected) sample:

git merge-recursive M -- U1' U2'
tree="$(git write-tree)"
# in case of original merge being octopus, we would continue like:
# git merge-recursive M -- $tree U3'
# tree="$(git write-tree)"
# git merge-recursive M -- $tree U4'
# ... and so on, then finally:
git merge-recursive "$(git merge-base U1' U2' B1')" -- $tree B1'
# ... or even:
# git merge-recursive "$(git merge-base B3' B4 B1')" -- $tree B1'
# as in more general case, it would be:
# git merge-recursive "$(git merge-base 
)" -- $tree B1'
tree="$(git write-tree)"
git log --pretty=%B -1 M | git commit-tree $tree -p B3' -p B4 -p B1'


Re: recent glob expansion breakage on Windows?

2018-03-08 Thread Jonathan Nieder
+git-for-windows
Hi,

Laszlo Ersek wrote:

> Jaben reports that git-send-email is suddenly failing to expand the
> "*.patch" glob for him, at the Windows CMD prompt:
>
> -
> E:\...>git send-email --suppress-cc=author --suppress-cc=self 
> --suppress-cc=cc --suppress-cc=sob --dry-run *.patch
>
> No patch files specified!
> -
>
> Whereas, moving the same patch files to another subdir, and then passing
> the subdir to git-send-email, works fine.
>
> I seem to have found some $^O based perl code in the git tree that
> expands the glob manually (in place of the shell) on Windows -- however,
> that code looks ancient ^W very stable, and doesn't seem to explain the
> sudden breakage.
>
> Is it possible that a separate perl update on Jaben's Windows box is
> causing this? Or does the breakage look consistent with a recent git change?
>
> Has anyone else reported something similar recently?

This reminds me of https://github.com/git-for-windows/git/issues/339.
There, Johannes Schindelin writes (about a different command):

| This is expected because neither PowerShell nor cmd.exe nor git.exe
| expand wildcards. Those examples you found were written with a shell
| in mind, and the shell expands wildcards (hence Git does not think
| it needs to).

That may or may not also apply to send-email.  In what version did it
work?

Thanks,
Jonathan

> Thanks (and sorry about the noise; this list might not be the best place
> to ask)!
> Laszlo


Re: [PATCH 1/2] t9400-git-cvsserver-server: don't rely on the output of 'test_cmp'

2018-03-08 Thread Eric Sunshine
On Thu, Mar 8, 2018 at 4:49 PM, SZEDER Gábor  wrote:
> On Thu, Mar 8, 2018 at 7:51 PM, Eric Sunshine  wrote:
>> An alternative approach used elsewhere in the test suite[1] would be
>> simply to 'exit' if test_cmp fails:
>>
>> for i in merge no-lf empty really-empty; do
>> GIT_CONFIG="$git_config" cvs update -p "$i" >$i.out
>> test_cmp $i.out ../$i || exit 1
>> done &&
>
> And it's right: 'exit' terminates the shell process it's invoked in,
> i.e. the whole test script (well, unless it's invoked in a subshell)
> without executing the remaining tests and the usual housekeeping and
> reporting.
>
> Consider the following test script:
>
>   $ ./t-exit.sh
>   FATAL: Unexpected exit with code 1

Sorry for the confusion. I meant "return 1" as used elsewhere in the
test suite[1].

--- >8 ---
#!/bin/sh
test_description='return 1?'
. ./test-lib.sh

test_expect_success 'return 1' '
return 1
'

test_expect_success 'second test' '
true
'

test_done
--- >8 ---

$ ./t9977.sh
not ok 1 - return 1
#
# return 1
#
ok 2 - second test
# failed 1 among 2 test(s)
1..2
$

[1]: For example, the "setup" test of t4151-am-abort.sh.


Re: [PATCH 1/2] t9400-git-cvsserver-server: don't rely on the output of 'test_cmp'

2018-03-08 Thread SZEDER Gábor
On Thu, Mar 8, 2018 at 7:51 PM, Eric Sunshine  wrote:
> On Thu, Mar 8, 2018 at 7:38 AM, SZEDER Gábor  wrote:

>> Unroll that for loop, so we can check the files' contents the usual
>> way and rely on 'test_cmp's exit code failing the && chain.  Extract
>> updating a file via CVS and verifying its contents using 'test_cmp'
>> into a helper function requiring the file's name as parameter to avoid
>> much of the repetition resulting from unrolling the loop.
>
> An alternative approach used elsewhere in the test suite[1] would be
> simply to 'exit' if test_cmp fails:
>
> for i in merge no-lf empty really-empty; do
> GIT_CONFIG="$git_config" cvs update -p "$i" >$i.out
> test_cmp $i.out ../$i || exit 1
> done &&
>
> (And, like the existing patch, this removes the need for capturing
> test_cmp's output into a "failures" file.)
>
> [1]: For example, the "setup" test of t2204-add-ignored.sh.

But 't/README' sayeth:

Don't:

 - exit() within a <script> part.

recent glob expansion breakage on Windows?

2018-03-08 Thread Laszlo Ersek
Hi,

Jaben reports that git-send-email is suddenly failing to expand the
"*.patch" glob for him, at the Windows CMD prompt:

-
E:\...>git send-email --suppress-cc=author --suppress-cc=self
--suppress-cc=cc --suppress-cc=sob --dry-run *.patch

No patch files specified!
-

Whereas, moving the same patch files to another subdir, and then passing
the subdir to git-send-email, works fine.

I seem to have found some $^O based perl code in the git tree that
expands the glob manually (in place of the shell) on Windows -- however,
that code looks ancient ^W very stable, and doesn't seem to explain the
sudden breakage.

Is it possible that a separate perl update on Jaben's Windows box is
causing this? Or does the breakage look consistent with a recent git change?

Has anyone else reported something similar recently?

Thanks (and sorry about the noise; this list might not be the best place
to ask)!
Laszlo


Re: [RFC v2] Rebasing merges: a jorney to the ultimate solution (Road Clear)

2018-03-08 Thread Igor Djordjevic
On 08/03/2018 20:58, Igor Djordjevic wrote:
> 
>   git merge-recursive U1' -- M U2'
>   tree="$(git write-tree)"
>   # in case of original merge being octopus, we would continue like:
>   # git merge-recursive $tree -- M U3'
>   # tree="$(git write-tree)"
>   # git merge-recursive $tree -- M U4'
>   # ... and so on, then finally:
>   git merge-recursive $tree -- "$(git merge-base U1' U2' B1')" B1'
>   # in more general case, it would be:
>   # git merge-recursive $tree -- "$(git merge-base 
> )" B1'
>   tree="$(git write-tree)"
>   git tag M' "$(git log --pretty=%B -1 M | git commit-tree $tree -p B3' 
> -p B4 -p B1')"

That last line should obviously read just:

git log --pretty=%B -1 M | git commit-tree $tree -p B3' -p B4 -p B1'

..., above mentioned `git tag M'` part being a leftover from my other test 
script.


Re: [RFC v2] Rebasing merges: a jorney to the ultimate solution (Road Clear)

2018-03-08 Thread Igor Djordjevic
Hi Johannes,

On 07/03/2018 15:08, Johannes Schindelin wrote:
> 
> > > Didn't we settle on Phillip's "perform successive three-way merges
> > > between the original merge commit and the new tips with the old tips
> > > as base" strategy?
> >
> > It seems you did, dunno exactly why.
> 
> That is not true. You make it sound like I was the only one who liked
> this, and not Phillip and Buga, too.

For myself, I do actually favor Sergey`s approach in general, but 
_implemented_ through what Phillip described (or a mixture of both, to 
be precise). But, let me explain... :)

> > The main problem with this decision is that we still don't see how and
> > when to stop for user amendment using this method. OTOH, the original
> > has this issue carefully discussed.
> 
> Why would we want to stop, unless there are merge conflicts?

Because we can reliably know that something "unusual" happened - and 
by that I don`t necessarily mean "wrong", but just might be worth 
user inspection.

For example, situation like this (M is made on A3 with `-s ours`, 
obsoleting Bx commits):

(1) ---X8--X9 (master)
   |\
   | A1---A2---A3
   | \
   |  M (topic)
   | /
   \-B1---B2---B3

... where we want to rebase M onto X9 is what I would call "usual 
stuff", but this situation (M is still made on A3 with `-s ours`, 
obsoleting Bx commits, but note cherry-picked B2'):

(2) ---X8--B2'--X9 (master)
   |\
   | A1---A2---A3
   | \
   |  M (topic)
   | /
   \-B1---B2---B3

... where we still want to rebase M onto X9 is what we might consider 
"unusual", because we noticed that something that shouldn`t be part 
of the rebased merge commit (due to previous `-s ours`) actually got 
in there (due to later cherry-pick), and just wanting the user to 
check and confirm.
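
(A minimal sketch of how history (2) could come about, with
hypothetical refs standing in for the commits above:)

    git checkout -b topic A3
    git merge -s ours B3    # M: keeps A3's tree, obsoleting B1..B3
    git checkout master
    git cherry-pick B2      # B2' lands on master behind our back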

This is the major reason why I would prefer Sergey`s approach in 
general... and might be I also have a good explanation on *why* it 
works, but let`s get there ;)

(p.s. This is only one, but certainly not the only case)

> > "rebase sides of the merge commit and then three-way merge them back
> > using original merge commit as base"
> 
> And that is also wrong, as I had proved already! Only Buga's addition made
> it robust against dropping/modifying commits, and that addition also makes
> it more complicated.

No, this is actually right, that sentence nicely describing _how_ it 
works. That addition of mine was just including the correct merge 
base (being the original merge commit that we are "rebasing"), and 
that`s what Sergey is talking about.

> And it still has no satisfactory simple explanation why it works.

Getting there... :)

> > > - it is *very easy* to reason about, once it is pointed out that
> > > rebases and merges result in the same trees.
> >
> > The original is as easy to reason about, if not easier, especially as
> > recursive merge strategy is not being used there in new ways.
> 
> So do it. I still have to hear a single-sentence, clear and obvious
> explanation why it works.
> 
> And please do not describe why your original version works, because it
> does not work. Describe why the one amended with Buga's hack works.
> 
> > I honestly don't see any advantages of Phillip's method over the
> > original, except personal preferences. At the same time, I have no
> > objection of using it either, provided consistency check problem is
> > solved there as well.
> 
> Okay, let me reiterate then, because I do not want this point to be
> missed:
> 
> Phillip's method is essentially merging the new tips into the original
> merge, pretending that the new tips were not rebased but merged into
> upstream.
> 
> So it exploits the duality of the rebase and merge operation, which both
> result in identical trees (potentially after resolving merge conflicts).
> 
> I cannot think of any such interpretation for your proposal augmented by
> Buga's fix-ups. And I haven't heard any such interpretation from your
> side, either.

Ok here it goes (I might be still wrong, but please bare with me).

What Sergey originally described also uses the "duality of the rebase 
and merge operation", too ;) Just in a bit different way (and might 
be a more straightforward one?).

Here`s a starting point, two commits A and B, merged into M:

(3) ---A
        \
         M
        /
    ---B


According the "patch theory"[1] (which might not be too popular 
around here, but should serve the purpose for what I`m trying to 
explain), each merge commit can be "transformed" to two non-merge 
commits, one on top of each of the merge parents, where new commit 
brings its original merge parent commit tree to the state of the 
merge commit tree:

(4) ---A---U1



    ---B---U2


Now, we have two new commits, U1 and U2, each having the same tree as 
previous merge commit M, but representing changes in regards to 
specific parents - and this is essentially what Sergey`s original 

Case sensitivity when deleting a checked out branch

2018-03-08 Thread guillaume weybrecht

Hello,

I came across an odd behavior in Git related to case sensitivity when 
deleting a checked out branch.

I was not able to find much information about it.

$ git checkout -b foo
$ git branch -d foo   # error: Cannot delete branch 'foo' checked out => this is expected, nothing happens since you are on branch 'foo'
$ git branch -d Foo   # Deleted branch Foo => this is not expected, Git removed 'foo' from .git/refs/heads


Git removed the 'foo' file from .git/refs/heads, but did not update the 
.git/HEAD file which still contains "ref: refs/heads/foo".


In fact, everything looks like a "git checkout --orphan foo":
$ git status  # On branch foo. No commits yet (the working tree is staged)
$ git log     # fatal: your current branch 'foo' does not have any commits yet

$ git rev-parse HEAD --   # fatal: bad revision 'HEAD'
$ git fsck    # notice: HEAD points to an unborn branch (foo)

You can run "git checkout " to get back on your feet as 
you would after "git checkout --orphan foo".
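
A possible way back, sketched with a placeholder for the lost tip:

$ git checkout master            # leave the unborn 'foo' first
$ git branch foo <old-commit>    # recreate 'foo' if you still know its tip
                                 # (e.g. from .git/logs/HEAD, which survives)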


The thing is, you get there unexpectedly without any warning, just with 
a case sensitivity mistake in "git branch -d".

Even if it seems unlikely to happen, someone ran into this at my job.

Tested on Windows 2.15.1.windows.2 and 2.16.2.windows.1, and on Mac OS 
(I think it was a 2.14 version).


Best regards,
Guillaume


Re: [RFC PATCH] git-submodule.sh:cmd_update: if submodule branch exists, fetch that instead of default

2018-03-08 Thread Eddy Petrișor
2018-03-08 21:29 GMT+02:00 Eddy Petrișor :
> 2018-03-06 22:21 GMT+02:00 Jonathan Nieder :
>>
>> (cc list snipped)
>> Hi,
>>
>> Eddy Petrișor wrote:
>>
>> > Cc: [a lot of people]
>>
>> Can you say a little about how this cc list was created?  E.g. should
>> "git send-email" get a feature to warn about long cc lists?
>
>
> I did it as advised by the documentation, using git blame:
>
> https://github.com/git/git/blob/master/Documentation/SubmittingPatches#L264
>
> I was surprised that there is no specialized script that does this, as
> it is for the kernel or u-boot, so I ran first
>
> git log --pretty=format:'%an <%ae>' git-submodule.sh | sort -u >
> mail-list-submodule
>
> then configure git to use that custom output and ignore the patch
> since it was trying to convert every line of it into an email address:
>
> git config sendemail.ccCmd 'cat mail-list-submodule # '
>
> Then "git send-email 0001..." did the rest.
>
>
>>
>> > Signed-off-by: Eddy Petrișor 
>> > ---
>> >
>> > There are projects such as llvm/clang which use several repositories, and 
>> > they
>> > might be forked for providing support for various features such as adding 
>> > Redox
>> > awareness to the toolchain. This typically means the superproject will use
>> > another branch than master, occasionally even use an old commit from that
>> > non-master branch.
>> >
>> > Combined with the fact that when incorporating such a hierarchy of 
>> > repositories
>> > usually the user is interested in just the exact commit specified in the
>> > submodule info, it follows that a desirable usecase is to be also able to
>> > provide '--depth 1' to avoid waiting for ages for the clone operation to
>> > finish.
>>
>> Some previous discussion is at
>> https://public-inbox.org/git/CAGZ79ka6UXKyVLmdLg_M5-sB1x96g8FRzRZy=eny5ajbqf9...@mail.gmail.com/.
>>
>> In theory this should be straightforward: Git protocol allows fetching
>> an arbitrary commit, so "git submodule update" and similar commands
>> could fetch the submodule commit by SHA-1 instead of by refname.  Poof!
>> Problem gone.
>>
>> In practice, some complications:
>>
>>  - some servers do not permit fetch-by-sha1.  For example, github does
>>not permit it.  This is governed by the
>>uploadpack.allowReachableSHA1InWant / uploadpack.allowAnySHA1InWant
>>configuration items.
>
>
> That is the problem I have faced since I tried to clone this repo
> which has at least 2 levels of submodules:
> https://github.com/redox-os/redox
>
> The problematic modules are:
> rust @ 
> https://github.com/redox-os/rust/tree/81c2bf4e51647295d3d92952dbb0464b460df0c3
> - first level
>
> and
>
> rust/src/llvm @
> https://github.com/rust-lang/llvm/tree/ba2edd794c7def715007931fcd1b4ce62aa711c8
>
>
>>
>>That should be surmountable by making the behavior conditional, but
>>it's a complication.
>
>
> Which forced me to try to fetch a branch on which that commit exists,
> but also consider providing --depth for the submodule checkout,
> effectively minimizing the amount of brought in history.
>
>>
>>
>>  - When the user passes --depth=<depth>, do they mean that to apply to
>>the superproject, to the submodules, or both?  Documentation should
>>make the behavior clear.
>
>
> The supermodule clone already exists and that works OK; I was after
> providing something like 'git submodule update --depth 20 --recursive'
> or even providing that 'depth' info via the .gitmodules parameters
> since 'shallow' is already used somehow, yet that is a bool value, not
> an integer, like depth expects:
>
>
> eddy@feodora:~/usr/src/redox/rust-eddy$ git config -f .gitmodules
> --list | egrep '(depth|shallow)'
> submodule.src/llvm.shallow=true
> submodule.src/rt/hoedown.shallow=true
> submodule.src/jemalloc.shallow=true
> submodule.src/liblibc.shallow=true
> submodule.src/doc/nomicon.shallow=true
> submodule.src/tools/cargo.shallow=true
> submodule.src/doc/reference.shallow=true
> submodule.src/doc/book.shallow=true
> submodule.src/tools/rls.shallow=true
> submodule.src/libcompiler_builtins.shallow=true
> submodule.src/tools/clippy.shallow=true
> submodule.src/tools/rustfmt.shallow=true
> submodule.src/tools/miri.shallow=true
> submodule.src/dlmalloc.shallow=true
> submodule.src/binaryen.shallow=true
> submodule.src/doc/rust-by-example.shallow=true
> submodule.src/llvm-emscripten.shallow=true
> submodule.src/tools/rust-installer.shallow=true
>
>
>> > Git submodule seems to be very stubborn and cloning master, although the
>> > wrapper script and the gitmodules-helper could work together to clone 
>> > directly
>> > the branch specified in the .gitmodules file, if specified.
>>
>> This could make sense.  For the same reason as --depth in the
>> superproject gives ambiguous signals about what should happen in
>> submodules, --single-branch in the superproject gives ambiguous
>> signals about what branch to fetch in submodules.
>
>
> I never thought of providing 

Re: [RFC PATCH] git-submodule.sh:cmd_update: if submodule branch exists, fetch that instead of default

2018-03-08 Thread Eddy Petrișor
2018-03-06 22:21 GMT+02:00 Jonathan Nieder :
>
> (cc list snipped)
> Hi,
>
> Eddy Petrișor wrote:
>
> > Cc: [a lot of people]
>
> Can you say a little about how this cc list was created?  E.g. should
> "git send-email" get a feature to warn about long cc lists?


I did it as advised by the documentation, using git blame:

https://github.com/git/git/blob/master/Documentation/SubmittingPatches#L264

I was surprised that there is no specialized script that does this, as
it is for the kernel or u-boot, so I ran first

git log --pretty=format:'%an <%ae>' git-submodule.sh | sort -u >
mail-list-submodule

then configure git to use that custom output and ignore the patch
since it was trying to convert every line of it into an email address:

git config sendemail.ccCmd 'cat mail-list-submodule # '

Then "git send-email 0001..." did the rest.


>
> > Signed-off-by: Eddy Petrișor 
> > ---
> >
> > There are projects such as llvm/clang which use several repositories, and 
> > they
> > might be forked for providing support for various features such as adding 
> > Redox
> > awareness to the toolchain. This typically means the superproject will use
> > another branch than master, occasionally even use an old commit from that
> > non-master branch.
> >
> > Combined with the fact that when incorporating such a hierarchy of 
> > repositories
> > usually the user is interested in just the exact commit specified in the
> > submodule info, it follows that a desirable usecase is to be also able to
> > provide '--depth 1' to avoid waiting for ages for the clone operation to
> > finish.
>
> Some previous discussion is at
> https://public-inbox.org/git/CAGZ79ka6UXKyVLmdLg_M5-sB1x96g8FRzRZy=eny5ajbqf9...@mail.gmail.com/.
>
> In theory this should be straightforward: Git protocol allows fetching
> an arbitrary commit, so "git submodule update" and similar commands
> could fetch the submodule commit by SHA-1 instead of by refname.  Poof!
> Problem gone.
>
> In practice, some complications:
>
>  - some servers do not permit fetch-by-sha1.  For example, github does
>not permit it.  This is governed by the
>uploadpack.allowReachableSHA1InWant / uploadpack.allowAnySHA1InWant
>configuration items.


That is the problem I have faced since I tried to clone this repo
which has at least 2 levels of submodules:
https://github.com/redox-os/redox

The problematic modules are:
rust @ 
https://github.com/redox-os/rust/tree/81c2bf4e51647295d3d92952dbb0464b460df0c3
- first level

and

rust/src/llvm @
https://github.com/rust-lang/llvm/tree/ba2edd794c7def715007931fcd1b4ce62aa711c8


>
>That should be surmountable by making the behavior conditional, but
>it's a complication.


Which forced me to try to fetch a branch on which that commit exists,
but also consider providing --depth for the submodule checkout,
effectively minimizing the amount of brought in history.

>
>
>  - When the user passes --depth=<depth>, do they mean that to apply to
>the superproject, to the submodules, or both?  Documentation should
>make the behavior clear.


The supermodule clone already exists and that works OK; I was after
providing something like 'git submodule update --depth 20 --recursive'
or even providing that 'depth' info via the .gitmodules parameters
since 'shallow' is already used somehow, yet that is a bool value, not
an integer, like depth expects:


eddy@feodora:~/usr/src/redox/rust-eddy$ git config -f .gitmodules
--list | egrep '(depth|shallow)'
submodule.src/llvm.shallow=true
submodule.src/rt/hoedown.shallow=true
submodule.src/jemalloc.shallow=true
submodule.src/liblibc.shallow=true
submodule.src/doc/nomicon.shallow=true
submodule.src/tools/cargo.shallow=true
submodule.src/doc/reference.shallow=true
submodule.src/doc/book.shallow=true
submodule.src/tools/rls.shallow=true
submodule.src/libcompiler_builtins.shallow=true
submodule.src/tools/clippy.shallow=true
submodule.src/tools/rustfmt.shallow=true
submodule.src/tools/miri.shallow=true
submodule.src/dlmalloc.shallow=true
submodule.src/binaryen.shallow=true
submodule.src/doc/rust-by-example.shallow=true
submodule.src/llvm-emscripten.shallow=true
submodule.src/tools/rust-installer.shallow=true


> > Git submodule seems to be very stubborn and cloning master, although the
> > wrapper script and the gitmodules-helper could work together to clone 
> > directly
> > the branch specified in the .gitmodules file, if specified.
>
> This could make sense.  For the same reason as --depth in the
> superproject gives ambiguous signals about what should happen in
> submodules, --single-branch in the superproject gives ambiguous
> signals about what branch to fetch in submodules.


I never thought of providing the branch in the command line, since
that's not versionable info, but to provide git-submodule a hint in
the .gitmodules config about on which branch the hash in question
should be found, like this:

eddy@feodora:~/usr/src/redox/rust-eddy$ git config -f 

Re: [PATCH] git manpage: note git-secur...@googlegroups.com

2018-03-08 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> Add a mention of the security mailing list to the "Reporting Bugs"
> section. There's a mention of this list at
> https://git-scm.com/community but none in git.git itself.

This is quite a sensible thing to do.

>
> The copy is pasted from the git-scm.com website. Let's use the same
> wording in both places.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
> Someone at Git Merge mentioned that our own docs have no mention of
> how to report security issues. Perhaps this should be in
> SubmittingPatches too, but I couldn't figure out how that magical
> footnote format works.

The "Notes from the maintainer" posted periodically here for
developers does mention it, and I do agree with you that
SubmittingPatches is a good place to add it, as it is a document
that is targetted more towards developers.  But this is a good first
step.

Will queue.

>
>  Documentation/git.txt | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/Documentation/git.txt b/Documentation/git.txt
> index 8163b5796b..4767860e72 100644
> --- a/Documentation/git.txt
> +++ b/Documentation/git.txt
> @@ -849,6 +849,9 @@ Report bugs to the Git mailing list <git@vger.kernel.org> where the
>  development and maintenance is primarily done.  You do not have to be
>  subscribed to the list to send a message there.
>  
> +Issues which are security relevant should be disclosed privately to
> +the Git Security mailing list <git-security@googlegroups.com>.
> +
>  SEE ALSO
>  
>  linkgit:gittutorial[7], linkgit:gittutorial-2[7],


Re: [GSoC][PATCH v4] Make options that expect object ids less chatty if id is invalid

2018-03-08 Thread Martin Ågren
Hi Paul-Sebastian

On 6 March 2018 at 20:31, Paul-Sebastian Ungureanu wrote:
> Usually, the usage should be shown only if the user does not know what
> options are available. If the user specifies an invalid value, the user
> is already aware of the available options. In this case, there is no
> point in displaying the usage anymore.
>
> This patch applies to "git tag --contains", "git branch --contains",
> "git branch --points-at", "git for-each-ref --contains" and many more.

Thanks for scratching this itch.

>  8 files changed, 110 insertions(+), 14 deletions(-)
>  create mode 100755 t/tcontains.sh
>
> diff --git a/builtin/blame.c b/builtin/blame.c
> index 9dcb367b9..a5fbec8fb 100644
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -728,6 +728,7 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
> PARSE_OPT_KEEP_DASHDASH | PARSE_OPT_KEEP_ARGV0);
> for (;;) {
> switch (parse_options_step(&ctx, options, blame_opt_usage)) {
> +   case PARSE_OPT_ERROR:
> case PARSE_OPT_HELP:
> exit(129);
> case PARSE_OPT_DONE:
> diff --git a/builtin/shortlog.c b/builtin/shortlog.c
> index e29875b84..0789e2eea 100644
> --- a/builtin/shortlog.c
> +++ b/builtin/shortlog.c
> @@ -282,6 +282,7 @@ int cmd_shortlog(int argc, const char **argv, const char *prefix)
>
> for (;;) {
> switch (parse_options_step(&ctx, options, shortlog_usage)) {
> +   case PARSE_OPT_ERROR:
> case PARSE_OPT_HELP:
> exit(129);
> case PARSE_OPT_DONE:
> diff --git a/builtin/update-index.c b/builtin/update-index.c
> index 58d1c2d28..34adf55a7 100644
> --- a/builtin/update-index.c
> +++ b/builtin/update-index.c
> @@ -1059,6 +1059,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
> break;
> switch (parseopt_state) {
> case PARSE_OPT_HELP:
> +   case PARSE_OPT_ERROR:
> exit(129);
> case PARSE_OPT_NON_OPTION:
> case PARSE_OPT_DONE:

> diff --git a/parse-options.c b/parse-options.c
> index d02eb8b01..47c09a82b 100644
> --- a/parse-options.c
> +++ b/parse-options.c
[...]
>  int parse_options_end(struct parse_opt_ctx_t *ctx)
> @@ -539,6 +540,7 @@ int parse_options(int argc, const char **argv, const char *prefix,
> parse_options_start(&ctx, argc, argv, prefix, options, flags);
> switch (parse_options_step(&ctx, options, usagestr)) {
> case PARSE_OPT_HELP:
> +   case PARSE_OPT_ERROR:
> exit(129);
> case PARSE_OPT_NON_OPTION:
> case PARSE_OPT_DONE:

These are very slightly inconsistent with the ordering of
PARSE_OPT_ERROR and PARSE_OPT_HELP. I don't think it matters much. ;-)

> diff --git a/t/tcontains.sh b/t/tcontains.sh
> new file mode 100755
> index 0..4856111ff
> --- /dev/null
> +++ b/t/tcontains.sh

This filename is not in the usual form, t1234-foo. As a result, it is
not picked up by `make test`, so isn't actually exercised. I believe you
selected this name based on the feedback in [1], but I do not think that
this ("tcontains.sh") was what Junio had in mind. Running the script
manually succeeds though. :-)

> @@ -0,0 +1,92 @@
> +#!/bin/sh
> +
> +test_description='Test "contains" argument behavior'

Ok, that matches the file name.

> +test_expect_success 'setup ' '
> +   git init . &&
> +   echo "this is a test" >file &&
> +   git add -A &&
> +   git commit -am "tag test" &&
> +   git tag "v1.0" &&
> +   git tag "v1.1"
> +'

You might find `test_commit` helpful. All in all, this could be a
two-liner (this example is white-space mangled though):

test_expect_success 'setup ' '
test_commit "v1.0" &&
git tag "v1.1"
'

> +test_expect_success 'tag usage error' '
> +   test_must_fail git tag --noopt 2>actual &&
> +   test_i18ngrep "usage" actual
> +'

Hmm, this one and a couple like it do not match the filename or test
description. I wonder if these are needed at all, or if some checks like
these are already done elsewhere (maybe not for "git tag", but for some
other user of the option-parsing machinery).

I hope you find this useful.

Martin

[1] https://public-inbox.org/git/xmqqinar1bq2@gitster-ct.c.googlers.com/


Re: [PATCH 1/2] t9400-git-cvsserver-server: don't rely on the output of 'test_cmp'

2018-03-08 Thread Eric Sunshine
On Thu, Mar 8, 2018 at 7:38 AM, SZEDER Gábor  wrote:
> The test 'cvs update (-p)' redirects and checks 'test_cmp's stdout and
> even its stderr.  The commit introducing this test in 6e8937a084
> (cvsserver: Add test for update -p, 2008-03-27) doesn't discuss why,
> in fact its log message only consists of that subject line.  Anyway,
> weird as it is, it kind of made sense due to the way that test was
> structured:

[excellent explanation snipped]

> Unroll that for loop, so we can check the files' contents the usual
> way and rely on 'test_cmp's exit code failing the && chain.  Extract
> updating a file via CVS and verifying its contents using 'test_cmp'
> into a helper function requiring the file's name as parameter to avoid
> much of the repetition resulting from unrolling the loop.

An alternative approach used elsewhere in the test suite[1] would be
simply to 'exit' if test_cmp fails:

    for i in merge no-lf empty really-empty; do
            GIT_CONFIG="$git_config" cvs update -p "$i" >$i.out
            test_cmp $i.out ../$i || exit 1
    done &&

(And, like the existing patch, this removes the need for capturing
test_cmp's output into a "failures" file.)

[1]: For example, the "setup" test of t2204-add-ignored.sh.

> Signed-off-by: SZEDER Gábor 
> ---
> diff --git a/t/t9400-git-cvsserver-server.sh b/t/t9400-git-cvsserver-server.sh
> @@ -447,12 +452,10 @@ test_expect_success 'cvs update (-p)' '
>  git push gitcvs.git >/dev/null &&
>  cd cvswork &&
>  GIT_CONFIG="$git_config" cvs update &&
> -rm -f failures &&
> -for i in merge no-lf empty really-empty; do
> -GIT_CONFIG="$git_config" cvs update -p "$i" >$i.out
> -   test_cmp $i.out ../$i >>failures 2>&1
> -done &&
> -test -z "$(cat failures)"
> +check_cvs_update_p merge &&
> +check_cvs_update_p no-lf &&
> +check_cvs_update_p empty &&
> +check_cvs_update_p really-empty
>  '


Re: [PATCH] fetch-pack.c: use oidset to check existence of loose object

2018-03-08 Thread Junio C Hamano
Takuto Ikuta  writes:

> In repository having large number of refs, lstat for non-existing loose
> objects makes `git fetch` slow.

It is not immediately clear how "large number of refs" and "lstat
for loose objects" interact with each other to create a problem.
"In repository having large number of refs, because such and such
processing needs to do this and that, 'git fetch' ends up doing a
lot of lstat(2) calls to see if many objects exist in the loose
form, which makes it slow".  Please fill in the blanks.

> This patch stores existing loose objects in hashmap beforehand and use
> it to check existence instead of using lstat.
>
> With this patch, the number of lstat calls in `git fetch` is reduced
> from 411412 to 13794 for chromium repository.
>
> I took time stat of `git fetch` disabling quickfetch for chromium
> repository 3 time on linux with SSD.

Now you drop a clue that would help to fill in the blanks above, but
I am not sure what the significance is of your having to disable
quickfetch in order to take measurements---it makes it sound as if
it is an artificial problem that does not exist in real life
(i.e. when quickfetch is not disabled), but I am getting the feeling
that it is not what you wanted to say here.

In any case, do_fetch_pack() tries to see if all of the tip commits
we are going to fetch exist locally, so when you are trying a fetch
that grabs huge number of refs (by the way, it means that the first
sentence of the proposed log message is not quite true---it is "When
fetching a large number of refs", as it does not matter how many
refs _we_ have, no?), everything_local() ends up making repeated
calls to has_object_file_with_flags() to all of the refs.

I like the idea---this turns "for each of these many things, check
if it exists with lstat(2)" into "enumerate what exists with
lstat(2), and then use that for the existence test"; if you need to
try N objects for existence, and you only have M objects loose where
N is vastly larger than M, it will be a huge win.  If you have very
many loose objects and checking only a handful of objects for
existence check, you would lose big, though, no?
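
(For a rough sense of the two quantities on a given repository, one
could count them along these lines; a back-of-the-envelope sketch,
not part of the patch:)

    find .git/objects/?? -type f | wc -l   # ~M: loose objects enumerated up front
    git ls-remote origin | wc -l           # ~N: refs a fetch would consider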

> diff --git a/fetch-pack.c b/fetch-pack.c
> index d97461296..1658487f7 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -711,6 +711,15 @@ static void mark_alternate_complete(struct object *obj)
>   mark_complete(&obj->oid);
>  }
>  
> +static int add_loose_objects_to_set(const struct object_id *oid,
> + const char *path,
> + void *data)
> +{
> + struct oidset* set = (struct oidset*)(data);

Style: in our codebase, asterisk does not stick to the type.

struct oidset *set = (struct oidset *)(data);

> @@ -719,16 +728,21 @@ static int everything_local(struct fetch_pack_args *args,
>   int retval;
>   int old_save_commit_buffer = save_commit_buffer;
>   timestamp_t cutoff = 0;
> + struct oidset loose_oid_set = OIDSET_INIT;
> +
> + for_each_loose_object(add_loose_objects_to_set, &loose_oid_set, 0);

OK, so this is the "enumerate all loose objects" phase.

>   save_commit_buffer = 0;
>  
>   for (ref = *refs; ref; ref = ref->next) {
>   struct object *o;
> + unsigned int flag = OBJECT_INFO_QUICK;

Hmm, OBJECT_INFO_QUICK optimization was added in dfdd4afc
("sha1_file: teach sha1_object_info_extended more flags",
2017-06-21), but since 8b4c0103 ("sha1_file: support lazily fetching
missing objects", 2017-12-08) it appears that passing
OBJECT_INFO_QUICK down the codepath does not do anything
interesting.  Jonathan (cc'ed), are all remaining hits from "git
grep OBJECT_INFO_QUICK" all dead no-ops these days?

>  
> - if (!has_object_file_with_flags(&ref->old_oid,
> - OBJECT_INFO_QUICK))
> - continue;
> + if (!oidset_contains(&loose_oid_set, &ref->old_oid))
> + flag |= OBJECT_INFO_SKIP_LOOSE;
>  
> + if (!has_object_file_with_flags(&ref->old_oid, flag))
> + continue;

Here, you want a way to say "I know this does not exist in the loose
form, so check if it exists in a non-loose form", and that is why
you invented the new flag.

> diff --git a/sha1_file.c b/sha1_file.c
> index 1b94f39c4..c903cbcec 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -1262,6 +1262,9 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
>   if (find_pack_entry(real, &e))
>   break;
>  
> + if (flags & OBJECT_INFO_SKIP_LOOSE)
> + return -1;
> +

I cannot quite convince myself that this is done at the right layer;
it smells to be at a bit too low a layer.  This change makes sense
only to a caller that is interested in the existence test.  If the
flag is named after what it does, i.e. "ignore loose object", then
it does sort-of make sense, though.

>   /* Most 

Re: [PATCH] git{,-blame}.el: remove old bitrotting Emacs code

2018-03-08 Thread Kyle Meyer
Ævar Arnfjörð Bjarmason  writes:

[...]

> These days these modes have very few if users, and users of git aren't

s/if// or s/if/if any/ ?

-- 
Kyle


Re: [PATCH v2 2/3] add -p: allow line selection to be inverted

2018-03-08 Thread Junio C Hamano
Phillip Wood  writes:

> and use a leading '-' for inversion. I'm tempted to keep supporting 'n-'
> to mean everything from 'n' to the last line though.

Thanks for double checking.  It would be a better endgame to follow
up with an update to existing "range selection" code to also support
"n-", if you go that route.



Re: [RFC PATCH 1/5] Add testcases for improved file collision conflict handling

2018-03-08 Thread Elijah Newren
Sweet!  Thanks for taking a look, and for spotting lots of
improvements (and some really embarrassing errors).  I'll keep the
fixes queued up while waiting for other feedback.  A few comments...

On Thu, Mar 8, 2018 at 4:25 AM, SZEDER Gábor  wrote:

> This setup test is enormous, and the conditions for the combination of
> the two sides and the add/rename conflicts are somewhat distracting.
> I don't know how it could be structured better/shorter/clearer...  I
> couldn't come up with anything useful during lunch.

Yeah.  :-(  It's part attempt to test these conflict types much more
thoroughly than other tests in the testsuite do, and part attempt to
keep the test setup consistent between the types to reflect the fact
that I'm consolidating the conflict resolution into a single function
as well.

Two possible ideas:

  * Split the tests for "*_unrelated" files and "*_related" files into
separate tests (doubling the number of tests, but making each only
deal with half the files).  That would make each setup function about
half the size, though both check functions would be about as big as
the original.

  * Instead of programatically generated tests, just manually write
out the tests for each of the four combinations (rename/rename,
rename/add, add/rename, add/add).  That means four "copies" of fairly
similar functions (and possible greater difficulty keeping things
consistent if changes are requested), but does allow removal of the
three different if-statements and thus makes each one easier to
understand in isolation.

Doing both would be possible as well.

Personally, I'm much more in favor of the first idea than the second.
I'm still kind of borderline about whether to make the change
mentioned in the first idea, but if others feel that splitting would
help a lot, I'm happy to look into either or both ideas.

>> +	cat <<EOF >>expected &&
>> +<<<<<<< HEAD
>> +modification
>> +=======
>> +more stuff
>> +yet more stuff
>> +>>>>>>> R^0
>> +EOF
>
> You could use 'cat <<-EOF ' and indent the here doc with tabs, so
> it won't look so out-of-place.  Or even '<<-\EOF' to indicate that
> there is nothing to be expanded in the here doc.

I just learned two new things about heredocs; I've wanted both of
those things in the past, but didn't even think to check if they were
possible.  Thanks for enlightening me.


Re: [PATCH] git{,-blame}.el: remove old bitrotting Emacs code

2018-03-08 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> On Tue, Mar 06 2018, Alexandre Julliard jotted:
> ...
>> I'd recommend that anybody still using it switch to Magit, which is
>> being actively maintained, and IMO superior to git.el in all respects.
>
> I think at this point it's best to remove both of these modes from
> being distributed with Git, per this patch.
>
>  contrib/emacs/.gitignore   |1 -
>  contrib/emacs/Makefile |   21 -
>  contrib/emacs/README   |   39 -
>  contrib/emacs/git-blame.el |  483 -
>  contrib/emacs/git.el   | 1704 
> 
>  5 files changed, 2248 deletions(-)
>  delete mode 100644 contrib/emacs/.gitignore
>  delete mode 100644 contrib/emacs/Makefile
>  delete mode 100644 contrib/emacs/README
>  delete mode 100644 contrib/emacs/git-blame.el
>  delete mode 100644 contrib/emacs/git.el

I agree with the overall direction.  The only difference between
this and what I had in mind was if we want to leave README that says
what you said in the log message.  That way, those who just got a
new version of tarball and then wonder what happened to these
scripts would save one trip to the Internet.


Re: [PATCH] fetch-pack.c: use oidset to check existence of loose object

2018-03-08 Thread René Scharfe
On 08.03.2018 at 13:06, Takuto Ikuta wrote:
> In repositories having a large number of refs, lstat for non-existing loose
> objects makes `git fetch` slow.
> 
> This patch stores existing loose objects in a hashmap beforehand and uses
> it to check existence instead of using lstat.
> 
> With this patch, the number of lstat calls in `git fetch` is reduced
> from 411412 to 13794 for chromium repository.
> 
> I took timings of `git fetch` with quickfetch disabled for the chromium
> repository, 3 times on Linux with an SSD.
> * with this patch
> 8.105s
> 8.309s
> 7.640s
> avg: 8.018s
> 
> * master
> 12.287s
> 11.175s
> 12.227s
> avg: 11.896s
> 
> On my MacBook Air which has slower lstat.
> * with this patch
> 14.501s
> 
> * master
> 1m16.027s

Nice improvement!

> 
> `git fetch` on a slow disk will be improved greatly.
> 
> Signed-off-by: Takuto Ikuta 
> ---
>   cache.h  |  2 ++
>   fetch-pack.c | 22 +++---
>   sha1_file.c  |  3 +++
>   3 files changed, 24 insertions(+), 3 deletions(-)
> 
> diff --git a/cache.h b/cache.h
> index d06932ed0..db38db40e 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1773,6 +1773,8 @@ struct object_info {
>   #define OBJECT_INFO_SKIP_CACHED 4
>   /* Do not retry packed storage after checking packed and loose storage */
>   #define OBJECT_INFO_QUICK 8
> +/* Do not check loose object */
> +#define OBJECT_INFO_SKIP_LOOSE 16
>   extern int sha1_object_info_extended(const unsigned char *, struct object_info *, unsigned flags);
>   
>   /*
> diff --git a/fetch-pack.c b/fetch-pack.c
> index d97461296..1658487f7 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -711,6 +711,15 @@ static void mark_alternate_complete(struct object *obj)
>  	mark_complete(&obj->oid);
>   }
>   
> +static int add_loose_objects_to_set(const struct object_id *oid,
> +				     const char *path,
> +				     void *data)
> +{
> +	struct oidset* set = (struct oidset*)(data);

This cast is not needed (unlike in C++).  And the asterisk should be stuck
to the variable, not the type (see Documentation/CodingGuidelines).

> + oidset_insert(set, oid);

In fact, you could just put "data" in here instead of "set" (without a
cast), with no loss in readability or safety.
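
Put together, a version of the callback with both suggestions applied
might look like this (just a sketch of the review comments, not
necessarily what will be committed):

	static int add_loose_objects_to_set(const struct object_id *oid,
					    const char *path,
					    void *data)
	{
		/* no cast needed: void * converts implicitly in C */
		oidset_insert(data, oid);
		return 0;
	}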

> + return 0;
> +}
> +
>   static int everything_local(struct fetch_pack_args *args,
>   struct ref **refs,
>   struct ref **sought, int nr_sought)
> @@ -719,16 +728,21 @@ static int everything_local(struct fetch_pack_args *args,
>  	int retval;
>  	int old_save_commit_buffer = save_commit_buffer;
>  	timestamp_t cutoff = 0;
> +	struct oidset loose_oid_set = OIDSET_INIT;
> +
> +	for_each_loose_object(add_loose_objects_to_set, &loose_oid_set, 0);
>  
>  	save_commit_buffer = 0;
>  
>  	for (ref = *refs; ref; ref = ref->next) {
>  		struct object *o;
> +		unsigned int flag = OBJECT_INFO_QUICK;
>  
> -		if (!has_object_file_with_flags(&ref->old_oid,
> -						OBJECT_INFO_QUICK))
> -			continue;
> +		if (!oidset_contains(&loose_oid_set, &ref->old_oid))
> +			flag |= OBJECT_INFO_SKIP_LOOSE;
>  
> +		if (!has_object_file_with_flags(&ref->old_oid, flag))
> +			continue;
>  		o = parse_object(&ref->old_oid);
>  		if (!o)
>  			continue;
> @@ -744,6 +758,8 @@ static int everything_local(struct fetch_pack_args *args,
>  		}
>  	}
>  
> +	oidset_clear(&loose_oid_set);
> +

This part looks fine to me.  (Except perhaps call the variable "flags"
because you sometimes have two?)

Why not include packed objects as well?  Probably because packs have
indexes which can be queried quickly to determine object existence, and
because there are only a few loose objects in typical repositories,
right?
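
If one did want to seed the same set from packs, the existing
for_each_packed_object() iterator would make it a near-identical
callback.  A hypothetical sketch, and probably pointless for the reason
above, since find_pack_entry() already consults the .idx files cheaply:

	static int add_packed_objects_to_set(const struct object_id *oid,
					     struct packed_git *pack,
					     uint32_t pos, void *data)
	{
		oidset_insert(data, oid);
		return 0;
	}

	/* ... for_each_packed_object(add_packed_objects_to_set, &set, 0); */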

A similar cache was introduced by cc817ca3ef (sha1_name: cache
readdir(3) results in find_short_object_filename()) to speed up
finding unambiguous short object hashes.  I wonder if it could be
used here as well, but I don't see an easy way.
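
The shared idea in both is easy to demonstrate outside of git: do one
readdir(3) pass up front and answer existence queries from memory
instead of hitting the filesystem each time.  A self-contained toy
(assumed names, not git code):

	#include <dirent.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>

	static char **names;
	static size_t nr, alloc;

	static int cmp(const void *a, const void *b)
	{
		return strcmp(*(const char *const *)a, *(const char *const *)b);
	}

	/* one directory scan, cached and sorted for binary search */
	static void scan(const char *dir)
	{
		DIR *d = opendir(dir);
		struct dirent *de;

		if (!d)
			return;
		while ((de = readdir(d)) != NULL) {
			if (nr == alloc) {
				char **tmp;
				alloc = alloc ? 2 * alloc : 64;
				tmp = realloc(names, alloc * sizeof(*names));
				if (!tmp)
					exit(1);
				names = tmp;
			}
			names[nr++] = strdup(de->d_name);
		}
		closedir(d);
		qsort(names, nr, sizeof(*names), cmp);
	}

	/* replaces a per-query lstat(2)/access(2) call */
	static int cached_exists(const char *name)
	{
		return bsearch(&name, names, nr, sizeof(*names), cmp) != NULL;
	}

	int main(void)
	{
		scan(".");
		printf("Makefile: %d\n", cached_exists("Makefile"));
		return 0;
	}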

>   if (!args->no_dependents) {
>   if (!args->deepen) {
>   for_each_ref(mark_complete_oid, NULL);
> diff --git a/sha1_file.c b/sha1_file.c
> index 1b94f39c4..c903cbcec 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -1262,6 +1262,9 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
>  		if (find_pack_entry(real, &e))
>  			break;
>  
> +		if (flags & OBJECT_INFO_SKIP_LOOSE)
> +			return -1;
> +
>  		/* Most likely it's a loose object. */
>  		if (!sha1_loose_object_info(real, oi, flags))
>  			return 0;
> 

This early return doesn't just skip checking loose objects.  It
also skips reloading packs and fetching missing objects for
partial clones.  
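
One way to keep those behaviors would be to guard only the loose lookup
itself, so that pack reloading (and any later fallback) still runs.  A
rough sketch of the idea, untested, with the surrounding code
paraphrased:

	/* inside sha1_object_info_extended()'s lookup loop */

	/* Most likely it's a loose object. */
	if (!(flags & OBJECT_INFO_SKIP_LOOSE) &&
	    !sha1_loose_object_info(real, oi, flags))
		return 0;

	/* Not a loose object; someone else may have just packed it. */
	if (flags & OBJECT_INFO_QUICK)
		return -1;
	reprepare_packed_git();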

Re: [RFC] Rebasing merges: a journey to the ultimate solution (Road Clear)

2018-03-08 Thread Igor Djordjevic
On 06/03/2018 12:45, Sergey Organov wrote:
> 
> > > The only thing I wonder of here is how would we check if the 
> > > "rebased" merge M' was "clean", or should we stop for user amendment? 
> > > With that other approach Sergey described, we have U1'==U2' to test with.
> >
> > I think (though I haven't rigorously proved to myself) that in the
> > absence of conflicts this scheme has well defined semantics (the merges
> > can be commuted), so the result should be predicable from the users
> > point of view so maybe it could just offer an option to stop.
> 
> Yes, hopefully it's predictable, but is it the intended one? We don't
> know, so there is still some level of uncertainty.
> 
> When in doubt, I try to find similar cases. There are two I'm aware of:
> 
> 1. "git merge" just commits the result when there are no conflicts.
> However, it supposedly has been run by the user just now, and thus user
> can amend what he gets. That's effectively a stop for amendment from our
> POV.
> 
> 2. When rebasing, "rerere", when fires, stages the changes, and rebasing
> stops for amendment. For me "rerere" behavior is rather annoying (I've
> never in fact amended what it prepared), but I always assumed there are
> good reasons it behaves this way.
> 
> Overall, to be consistent, it seems we do need to stop at U1' != U2', at
> least by default. Additional options could be supported then to specify
> user intentions, both on the command level and in the todo list,
> provided it proves to be useful.

Just to say I agree with this, `if U1' == U2' then proceed else stop` 
seems as a good sanity check.


Re: Error compiling Git in current master branch

2018-03-08 Thread Gaston Gonzalez

On 08/03/18 13:17, SZEDER Gábor wrote:

> > Just to let you know, I get the following error compiling Git in master
> > branch:
> > 
> > $ make prefix=/usr NO_GETTEXT=YesPlease all doc info
> > ...
> >       GEN git-remote-testgit
> >       GEN perl/build/man/man3/Git.3pm
> > Can't write-open perl/build/man/man3/Git.3pm: Permission denied at
> > /usr/bin/pod2man line 70.
> > Makefile:2317: recipe for target 'perl/build/man/man3/Git.3pm' failed
> > make: *** [perl/build/man/man3/Git.3pm] Error 13
> > 
> > This didn't happen in v2.16.2. Doing a git-bisect I've got the following
> > culprit:
> > 
> > $ git bisect start HEAD v2.16.2 --
> > 2530afd3519a34b66e72cc29e7751d650cedc6dd is the first bad commit
> 
> That's not the culprit, that's the fix :)
> 
> > make clean fails too:
> > 
> > rm -f -r perl/build/
> > rm: cannot remove 'perl/build/man/man3/Git.3pm': Permission denied
> > Makefile:2734: recipe for target 'clean' failed
> > make: *** [clean] Error 1
> 
> Have a look at the permissions of 'perl/build' and its contents, they
> are likely owned by root for reasons described in 2530afd's log
> message.



Right you are:

$ ll perl/build/
total 8
drwxr-xr-x 3 ggonzalez ggonzalez 4096 mar  8 12:52 lib
drwxr-xr-x 3 root  root  4096 feb 14 13:56 man

So the fix would be to just remove perl/build/ manually?

Thanks,

gaston


Re: [RFC] Rebasing merges: a journey to the ultimate solution (Road Clear)

2018-03-08 Thread Igor Djordjevic
On 08/03/2018 16:16, Igor Djordjevic wrote:
> 
> > Unless we reimplement the octopus merge (which works quite a bit
> > differently from the "rebase merge commit" strategy, even if it is
> > incremental, too), which has its own challenges: if there are merge
> > conflicts before merging the last MERGE_HEAD, the octopus merge will exit
> > with status 2, telling you "Should not be doing an octopus.". While we
> > will want to keep merge conflict markers and continue with the "rebase the
> > original merge commit" strategy.
> >
> > [...]
> 
> The thing is, in my opinion, as long as we are _rebasing_, you can`t 
> pick any merge strategy, as it doesn`t really make much sense. If you 
> > do want a specific strategy, then that`s _recreating_ a merge, and it 
> goes fine with what you already have for `--recreate-merges`.
> 
> On merge rebasing, the underlying strategy we decide to use is just an 
> implementation detail, picking the one that works best (or the only 
> one that works, even), user should have nothing to do with it.

Just to add, if not already assumable, that I think we should stop 
and let user react on conflicts on each of the "rebase the original 
commit" strategy steps (rebase first parent, rebase second parent... 
merge parents).

I guess this stresses not using real "octopus merge" strategy even in 
case where we`re rebasing octopus merge commit even more (and aligns 
nicely with what you seem to expect already).


Re: Error compiling Git in current master branch

2018-03-08 Thread SZEDER Gábor
> Just to let you know, I get the following error compiling Git in master
> branch:
> 
> $ make prefix=/usr NO_GETTEXT=YesPlease all doc info
> ...
>      GEN git-remote-testgit
>      GEN perl/build/man/man3/Git.3pm
> Can't write-open perl/build/man/man3/Git.3pm: Permission denied at 
> /usr/bin/pod2man line 70.
> Makefile:2317: recipe for target 'perl/build/man/man3/Git.3pm' failed
> make: *** [perl/build/man/man3/Git.3pm] Error 13
> 
> This didn't happen in v2.16.2. Doing a git-bisect I've got the following
> culprit:
> 
> $ git bisect start HEAD v2.16.2 --
> 2530afd3519a34b66e72cc29e7751d650cedc6dd is the first bad commit

That's not the culprit, that's the fix :)

> make clean fails too:
> 
> rm -f -r perl/build/
> rm: cannot remove 'perl/build/man/man3/Git.3pm': Permission denied
> Makefile:2734: recipe for target 'clean' failed
> make: *** [clean] Error 1

Have a look at the permissions of 'perl/build' and its contents, they
are likely owned by root for reasons described in 2530afd's log
message.



Re: [RFC] Rebasing merges: a journey to the ultimate solution (Road Clear)

2018-03-08 Thread Jacob Keller
On Thu, Mar 8, 2018 at 3:20 AM, Phillip Wood  wrote:
> On 07/03/18 07:26, Johannes Schindelin wrote:
>> Hi Buga,
>>
>> On Tue, 6 Mar 2018, Igor Djordjevic wrote:
>>
>>> On 06/03/2018 19:12, Johannes Schindelin wrote:

>> And I guess being consistent is pretty important, too - if you add new
>> content during merge rebase, it should always show up in the merge,
>> period.
>
> Yes, that should make it easy for the user to know what to expect from
> rebase.

 [...]

 It will be slightly inconsistent. But in a defendable way, I think.
>>>
>>> I like where this discussion is heading, and here`s what I thought
>>> about it :)
>>>
>>> [...]
>>>
>>> Here`s a twist - not letting `merge` trying to be too smart by
>>> figuring out whether passed arguments correspond to rewritten
>>> versions of the original merge parents (which would be too
>>> restrictive, too, I`m afraid), but just be explicit about it, instead!
>>
>> That's the missing piece, I think.
>>
>>> So, it could be something like:
>>>
>>>  merge -C deadbee 123abc:cafecafe 234bcd:bedbedbed
>>
>> I like where this is heading, too, but I do not think that we can do this
>> on a per-MERGE_HEAD basis. The vast majority of merge commits, in
>> practice, have two parents. So the `merge` command would actually only
>> have one revision to merge (because HEAD is the implicit first parent). So
>> that is easy.
>>
>> But as soon as you go octopus, you can either perform an octopus merge, or
>> rebase the original merge commit. You cannot really mix and match here.
>>
>> Unless we reimplement the octopus merge (which works quite a bit
>> differently from the "rebase merge commit" strategy, even if it is
>> incremental, too), which has its own challenges: if there are merge
>> conflicts before merging the last MERGE_HEAD, the octopus merge will exit
>> with status 2, telling you "Should not be doing an octopus.". While we
>> will want to keep merge conflict markers and continue with the "rebase the
>> original merge commit" strategy.
>>
>> And it would slam the door shut for adding support for *other* merge
>> strategies to perform a more-than-two-parents merge.
>>
>> Also, I do not think that it makes a whole lot of sense in practice to let
>> users edit what will be used for "original parent". If the user wants to
>> do complicated stuff, they can already do that, via `exec`. The `merge`
>> command really should be about facilitating common workflows, guiding the
>> user to what is sane.
>>
>> Currently my favorite idea is to introduce a new flag: -R (for "rebase the
>> original merge commit"). It would look like this:
>>
>>   merge -R -C <original-merge-commit> <merge-head> # <oneline>
>>
>> This flag would of course trigger the consistency check (does the number
>> of parents of the original merge commit agree with the parameter list? Was
>> an original merge commit specified to begin with?), and it would not fall
>> back to the recursive merge, but error out if that check failed.
>>
>> Side note: I wonder whether we really need to perform the additional check
>> that ensures that the <merge-head> refers to the rewritten version of the
>> original merge commit's parent.
>>
>> Second side note: if we can fast-forward, currently we prefer that, and I
>> think we should keep that behavior with -R, too.
>
> I think that would be a good idea to avoid unpleasant surprises.
>
>> If the user wants to force a new merge, they simply remove that -R flag.
>>
>> What do you think?
>
> I did wonder about using 'pick <merge-commit>' for rebasing merges and
> keeping 'merge ...' for recreating them but I'm not sure if that is a
> good idea. It has the advantage that the user cannot specify the wrong
> parents for the merge to be rebased as 'git rebase' would work out if
> the parents have been rebased, but maybe it's a bit magical to use pick
> for merge commits. Also there isn't such a simple way for the user to go
> from 'rebase this merge' to 'recreate this merge' as they'd have to
> write the whole merge line themselves (though I guess something like
> emacs' git-rebase.el would be able to help with that)
>
> Best Wishes
>
> Phillip
>

Since the ultimate commit hashes of newly rebased commits would be
unknown at the time of writing the todo file, I'm not sure how this
would work to specify the parents?

>
>> Ciao,
>> Dscho
>>
>


Re: [RFC] Rebasing merges: a journey to the ultimate solution (Road Clear)

2018-03-08 Thread Igor Djordjevic
On 08/03/2018 13:16, Phillip Wood wrote:
> 
> > Side note: I wonder whether we really need to perform the additional check
> > that ensures that the <merge-head> refers to the rewritten version of the
> > original merge commit's parent.
> > 
> > [...]
> 
> Oops that was referring to the first side note. I think fast forwarding
is a good idea. I'm not so sure about checking that <merge-head> refers
> to the rewritten version of the original merge commit's parent any more
> though. Having thought some more, I think we would want to allow the
> user to rearrange a topic branch that is the parent of a merge and that
> would require allowing a different parent as the old parent could be
> dropped or swapped with another commit in the branch. I can't think of a
> way to mechanically check that the new parent is 'somehow derived from'
> the old one.

Exactly, we must not depend on exact parent commits, but on parent 
"branches" (so to say).

And that is why I think explicit mapping would be pretty helpful (if 
not the only approach).

> > I did wonder about using 'pick <merge-commit>' for rebasing merges and
> > keeping 'merge ...' for recreating them but I'm not sure if that is a
> > good idea. It has the advantage that the user cannot specify the wrong
> > parents for the merge to be rebased as 'git rebase' would work out if
> > the parents have been rebased, but maybe it's a bit magical to use pick
> > for merge commits. Also there isn't such a simple way for the user to go
> > from 'rebase this merge' to 'recreate this merge' as they'd have to
> > write the whole merge line themselves (though I guess something like
> > emacs' git-rebase.el would be able to help with that)
> 
> Scrub that, it is too magical and I don't think it would work with
> rearranged commits - it's making the --preserve-merges mistake all over
> again. It's a shame to have 'merge' mean 'recreate the merge' and
> 'rebase the merge' but I don't think there is an easy way round that.

I actually like `pick` for _rebasing_ merge commits, as `pick` is 
already used for rebasing non-merge commits, too, so it feels natural.

Then `merge` is left to do what it is meant for - merging (or 
"recreate the merge", in the given context).

I tried to outline a possible user interface in that other reply[1], 
elaborating it a bit, too.

Regards, Buga

[1] https://public-inbox.org/git/f3872fb9-01bc-b2f1-aee9-cfc0e4db7...@gmail.com/


Rename of file is causing changes to be lost

2018-03-08 Thread Robert Dailey
I'm on Windows and core.ignorecase is set to 'true' when I clone/init
a repository. I've got a branch where I started making changes to a
file AND renamed it only to change its case. The changes I've made
were significant enough that git no longer detects a rename, instead
the files show up as "D" and "A" in git status (deleted then added).
To correct this, I do an interactive rebase to add an additional
commit before the first one to rename the file without changing it,
and *then* allow the second commit to change the file. The goal is
that rebase should detect the rename and automatically move the
changes in the (now) second commit to the newly named file. Here's a
MCVE (treat this as a script):

#!/bin/bash
git init testgitrepo
cd testgitrepo/
git config core.ignorecase true # This is set by Windows for me, but
hopefully will allow this to repro on linux. Didn't test linux though.
echo "first change" > foo.txt
git add . && git commit -m 'first change'
git checkout -b topic
echo "second change" > foo.txt
git mv foo.txt FOO.txt
git add . && git commit -m 'second change'
git rebase -i master # Move line 1 to line 2, and put "x false" in line 1
git mv foo.txt FOO.txt && git commit -m 'rename foo'
git rebase --continue
git mergetool

After the rebase continue, you will get a conflict like so:

error: could not apply 527d208... second change

When you have resolved this problem, run "git rebase --continue".
If you prefer to skip this patch, run "git rebase --skip" instead.
To check out the original branch and stop rebasing, run "git rebase --abort".

Could not apply 527d208... second change
CONFLICT (rename/delete): foo.txt deleted in 527d208... second change
and renamed to FOO.txt in HEAD. Version HEAD of FOO.txt left in tree.

The last command, `git mergetool` runs, giving you the option to pick
the Created (left) or Deleted (right) version of the file:

Left: The file is created, but selecting this erases the changes from
the "added" version on the remote (which is topic). Basically the
rename of only case confused git, and we lost the changes on the
remote version of the file
Right: File is deleted. Changes are still lost.

The ideal outcome is that the changes from the "added" version of the
file in the 2nd commit get carried over to the "renamed" version of
the file, which when you compare the two are named exactly the same
after the 1st commit is introduced. How can I solve this issue?


Re: [RFC] Rebasing merges: a journey to the ultimate solution (Road Clear)

2018-03-08 Thread Igor Djordjevic
Hi Phillip and Johannes, 

On 08/03/2018 12:20, Phillip Wood wrote:
> 
> I did wonder about using 'pick ' for rebasing merges and
> keeping 'merge ...' for recreating them but I'm not sure if that is a
> good idea. It has the advantage that the user cannot specify the wrong
> parents for the merge to be rebased as 'git rebase' would work out if
> the parents have been rebased, but maybe it's a bit magical to use pick
> for merge commits. Also there isn't such a simple way for the user to go
> from 'rabase this merge' to 'recreate this merge' as they'd have to
> write the whole merge line themselves (though I guess something like
> emacs' git-rebase.el would be able to help with that)

Hmm, funny enough, `pick <merge-commit>` was something I thought 
about originally, too, feeling that it might make more sense in terms 
of what`s really going on, but I guess I wanted it incorporated into 
`--recreate-merges` so much that I tried really hard to fit it in, 
without changing it much :/

And now that I said this in a previous reply:

> The thing is, in my opinion, as long as we are _rebasing_, you can`t 
> pick any merge strategy, as it doesn`t really make much sense. If you 
> do want a specific strategy, then that`s _recreating_ a merge, and it 
> goes fine with what you already have for `--recreate-merges`.
> 
> On merge rebasing, the underlying strategy we decide to use is just an 
> implementation detail, picking the one that works best (or the only 
> one that works, even), user should have nothing to do with it.

The difference between "rebase merge commit" and "recreate merge 
commit" might starting to be more evident.

So... I might actually go for this one now. And (trying to stick with 
explicit mappings, still :P), now that we`re not married to `merge` 
expectations a user may already have, maybe a format like this:

  pick <original-merge> <original-parent>:HEAD <original-parent>:<new-parent>


Here, original-merge is a _commit_, where original-parent and 
new-parent are _labels_ (in terms of `--recreate-merges`).

Everything else I previously said still holds - one is allowed to 
change or drop mappings, and add or drop new merge parents. Yes, in 
case user does something "stupid", he`ll get a lot of conflicts, but 
hey, we shouldn`t judge.

p.s. Are we moving towards `--rebase-merges` I mentioned in that 
other topic[1], as an add-on series after `--recreate-merges` hits 
the mainstream (as-is)...? :P

Regards, Buga

[1] https://public-inbox.org/git/bc9f82fb-fd18-ee45-36a4-921a1381b...@gmail.com/


Error compiling Git in current master branch

2018-03-08 Thread Gaston Gonzalez

Hi,

Just to let you know, I get the following error compiling Git in master
branch:

$ make prefix=/usr NO_GETTEXT=YesPlease all doc info
...
    GEN git-remote-testgit
    GEN perl/build/man/man3/Git.3pm
Can't write-open perl/build/man/man3/Git.3pm: Permission denied at 
/usr/bin/pod2man line 70.
Makefile:2317: recipe for target 'perl/build/man/man3/Git.3pm' failed
make: *** [perl/build/man/man3/Git.3pm] Error 13

This didn't happen in v2.16.2. Doing a git-bisect I've got the following
culprit:

$ git bisect start HEAD v2.16.2 --
2530afd3519a34b66e72cc29e7751d650cedc6dd is the first bad commit

make clean fails too:

rm -f -r perl/build/
rm: cannot remove 'perl/build/man/man3/Git.3pm': Permission denied
Makefile:2734: recipe for target 'clean' failed
make: *** [clean] Error 1

Regards,

Gaston


Re: [RFC] Rebasing merges: a journey to the ultimate solution (Road Clear)

2018-03-08 Thread Igor Djordjevic
Hi Dscho,

On 07/03/2018 08:26, Johannes Schindelin wrote:
> 
> > So, it could be something like:
> >
> > merge -C deadbee 123abc:cafecafe 234bcd:bedbedbed
> 
> I like where this is heading, too, but I do not think that we can do this
> on a per-MERGE_HEAD basis. The vast majority of merge commits, in
> practice, have two parents. So the `merge` command would actually only
> have one revision to merge (because HEAD is the implicit first parent). So
> that is easy.
> 
> But as soon as you go octopus, you can either perform an octopus merge, or
> rebase the original merge commit. You cannot really mix and match here.
> 
> Unless we reimplement the octopus merge (which works quite a bit
> differently from the "rebase merge commit" strategy, even if it is
> incremental, too), which has its own challenges: if there are merge
> conflicts before merging the last MERGE_HEAD, the octopus merge will exit
> with status 2, telling you "Should not be doing an octopus.". While we
> will want to keep merge conflict markers and continue with the "rebase the
> original merge commit" strategy.
> 
> And it would slam the door shut for adding support for *other* merge
> strategies to perform a more-than-two-parents merge.

The thing is, in my opinion, as long as we are _rebasing_, you can`t 
pick any merge strategy, as it doesn`t really make much sense. If you 
do want a specific strategy, then that`s _recreating_ a merge, and it 
goes fine with what you already have for `--recreate-merges`.

On merge rebasing, the underlying strategy we decide to use is just an 
implementation detail, picking the one that works best (or the only 
one that works, even), user should have nothing to do with it.

> Also, I do not think that it makes a whole lot of sense in practice to let
> users edit what will be used for "original parent". If the user wants to
> do complicated stuff, they can already do that, via `exec`. The `merge`
> command really should be about facilitating common workflows, guiding the
> user to what is sane.

I thought of a situation like this:

(1) ---o---o---o---M--- (master)
        \         /
         X1--X2--X3 (topic)


Merge M was done with `-s ours`, obsoleting "topic" branch. But, I 
later realized that I actually do want that X2 commit in master.

Now, I guess the most obvious approach is just cherry-picking it, but 
what if I would like to do something like this instead, with an 
interactive rebase (and rebasing the merge, not recreating it):

(2) ---o---o---o---M'--- (master)
       |\         /|
       | X1'--X3'/ | (topic)
       |           |
       \--X2'------/ (new)


This way, and having "topic" inherit original merge behavior due to 
merge rebasing, X1' and X3' would still be missing from M' as they 
did originally from M, but X2' would now be included, as it`s coming 
from a new branch, forged during interactive rebase.

(note - implementation wise, this still wouldn`t be an octopus merge ;)

> The vast majority of merge commits, in practice, have two parents. So
> the `merge` command would actually only have one revision to merge
> (because HEAD is the implicit first parent).

Now, this is something I actually overlooked :( 

I guess order of parent commits could then be used to map to old 
commit parents, being a limitation in comparison to direct 
old-parent:new-parent mapping, but might be a more straightforward 
user experience...

Though in the case of an octopus merge where one would like to drop a 
branch from the middle, merged with `-s ours`, that would be 
impossible, as the next branch would then take over the dropped 
branch`s merge parent, yielding an incorrect result.

So in this case:

(3) ---o---o---o---M--- (master)
       |\         /|
       | X1--X3--/ | (topic)
       |           |
       \--X2-------/ (new)

... where "topic" was merged using `-s ours`, we wouldn`t be able to 
just remove whole "topic" branch from the rebased merge without 
influencing it incorrectly.

With (any kind of) explicit old-parent:new-parent mapping, this is 
possible (and shouldn`t be any harder, implementation wise).

Now, it`s a different story if we`re interested in such exotic 
scenarios in the first place, but if possible, I would be all for 
it... :)

> Currently my favorite idea is to introduce a new flag: -R (for "rebase the
> original merge commit"). It would look like this:
> 
>   merge -R -C <original-merge-commit> <merge-head> # <oneline>
> 
> This flag would of course trigger the consistency check (does the number
> of parents of the original merge commit agree with the parameter list? Was
> an original merge commit specified to begin with?), and it would not fall
> back to the recursive merge, but error out if that check failed.
> 
> Side note: I wonder whether we really need to perform the additional check
> that ensures that the <merge-head> refers to the rewritten version of the
> original merge commit's parent.

No, and even worse - I think we must not do that, as that merge 
parent might be moved elsewhere, which should be 

[PATCH] git manpage: note git-security@googlegroups.com

2018-03-08 Thread Ævar Arnfjörð Bjarmason
Add a mention of the security mailing list to the "Reporting Bugs"
section. There's a mention of this list at
https://git-scm.com/community but none in git.git itself.

The copy is pasted from the git-scm.com website. Let's use the same
wording in both places.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
Someone at Git Merge mentioned that our own docs have no mention of
how to report security issues. Perhaps this should be in
SubmittingPatches too, but I couldn't figure out how that magical
footnote format works.

 Documentation/git.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index 8163b5796b..4767860e72 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -849,6 +849,9 @@ Report bugs to the Git mailing list <git@vger.kernel.org> where the
 development and maintenance is primarily done.  You do not have to be
 subscribed to the list to send a message there.
 
+Issues which are security relevant should be disclosed privately to
+the Git Security mailing list <git-security@googlegroups.com>.
+
 SEE ALSO
 
 linkgit:gittutorial[7], linkgit:gittutorial-2[7],
-- 
2.15.1.424.g9478a66081



RE: Bug report: git-stash doesn't return correct status code

2018-03-08 Thread Vromen, Tomer
> But stepping back a bit, why do you even need stash save/pop around
> "checkout -b new-feature-branch" (that implicitly branches at the
> tip) in the first place?  

Sorry about that, I meant something like

git stash && git checkout develop && git pull && git checkout -b new-feature-branch && git stash pop

My point is that it is the user's expectation that "git stash" will push to the 
stash.
Not pushing anything should be considered a failure.

Tomer.

-Original Message-
From: Junio C Hamano [mailto:jch2...@gmail.com] On Behalf Of Junio C Hamano
Sent: Wednesday, March 07, 2018 23:03
To: Vromen, Tomer 
Cc: git@vger.kernel.org
Subject: Re: Bug report: git-stash doesn't return correct status code

Junio C Hamano  writes:

> "Vromen, Tomer"  writes:
>
>>> git stash && git checkout -b new-feature-branch && git stash pop
>>
>> This is useful when I realize that I want to open a new branch for my 
>> changes (that I haven't committed yet).
>> However, I might have forgotten to save my changes in the editor, so 
>> git-stash will give this error:
>>
>> No local changes to save
>
> This is given with "say" and not with "die", as this is merely an
> informational diagnosis.  The command did not find any erroneous
> condition, the command did not fail to do anything it was supposed
> to do, so the command exited with 0 status.

I guess that is only half an answer.  If you really want to avoid
creating the new branch when the working tree and the index are
clean, you'd need to check that yourself before that three-command
sequence.  In our shell script, we use these as such a check:

git update-index --refresh -q --ignore-submodules
git diff-files --quiet --ignore-submodules &&
git diff-index --cached --quiet --ignore-submodules HEAD --

But stepping back a bit, why do you even need stash save/pop around
"checkout -b new-feature-branch" (that implicitly branches at the
tip) in the first place?  "checkout -b" that begins at the current
HEAD does not touch the index nor the working tree and take the
local changes with you to the new branch, so save/pop around it
seems totally pointless.
-
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.



Re: What's cooking in git.git (Mar 2018, #02; Tue, 6)

2018-03-08 Thread Daniel Jacques
> It would be great to have this rebooted now that my perl cleanup efforts
> have un-blocked this. Will be happy to help review & test the next
> iteration.

Yes, I was just thinking the same thing. I wanted to make sure the Perl
changes had landed, and I'm pleased to see that they have. I should have
time in the next few days to rebase and put up a new version of the patch
series. I'll keep you in the loop, and thanks for pinging!


[PATCH 1/2] t9400-git-cvsserver-server: don't rely on the output of 'test_cmp'

2018-03-08 Thread SZEDER Gábor
The test 'cvs update (-p)' redirects and checks 'test_cmp's stdout and
even its stderr.  The commit introducing this test in 6e8937a084
(cvsserver: Add test for update -p, 2008-03-27) doesn't discuss why,
in fact its log message only consists of that subject line.  Anyway,
weird as it is, it kind of made sense due to the way that test was
structured:

After a bit of preparation, this test updates four files via CVS and
checks their contents using 'test_cmp', but it does so in a for loop
iterating over the names of those four files.  Now, the exit status of
a for loop is the exit status of the last command executed in the
loop, meaning that the test can't simply rely on the exit code of
'test_cmp' in the loop's body.  Instead, the test works it around by
relying on the stdout of 'test_cmp' being silent on success and
showing the diff on failure, as it appends the stdout of all four
'test_cmp' invocations to a single file and checks that file's
emptiness after the loop (with 'test -z "$(cat ...)"', no less; there
was no 'test_must_be_empty' back then).  Furthermore, the test
redirects the stderr of those 'test_cmp' invocations to this file,
too: while 'test_cmp' itself doesn't output anything to stderr, the
invoked 'diff' or 'cmp' commands do send their error messages there,
e.g. if they can't open a file because its name was misspelled.

This also makes this test fail when the test script is run with '-x'
tracing (and using a shell other than a Bash version supporting
BASH_XTRACEFD), because 'test_cmp's stderr contains the trace of the
'diff' command executed inside the helper function, throwing off the
subsequent emptiness check.

Unroll that for loop, so we can check the files' contents the usual
way and rely on 'test_cmp's exit code failing the && chain.  Extract
updating a file via CVS and verifying its contents using 'test_cmp'
into a helper function requiring the file's name as parameter to avoid
much of the repetition resulting from unrolling the loop.

After this change t9400 passes with '-x', even when running with
/bin/sh.

Signed-off-by: SZEDER Gábor 
---
 t/t9400-git-cvsserver-server.sh | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/t/t9400-git-cvsserver-server.sh b/t/t9400-git-cvsserver-server.sh
index c30660d606..1f67d2822f 100755
--- a/t/t9400-git-cvsserver-server.sh
+++ b/t/t9400-git-cvsserver-server.sh
@@ -437,6 +437,11 @@ test_expect_success 'cvs update (merge no-op)' \
 	GIT_CONFIG="$git_config" cvs -Q update &&
 	test_cmp merge ../merge'
 
+check_cvs_update_p () {
+	GIT_CONFIG="$git_config" cvs update -p "$1" >"$1".out &&
+	test_cmp "$1".out ../"$1"
+}
+
 cd "$WORKDIR"
 test_expect_success 'cvs update (-p)' '
 touch really-empty &&
@@ -447,12 +452,10 @@ test_expect_success 'cvs update (-p)' '
 	git push gitcvs.git >/dev/null &&
 	cd cvswork &&
 	GIT_CONFIG="$git_config" cvs update &&
-	rm -f failures &&
-	for i in merge no-lf empty really-empty; do
-		GIT_CONFIG="$git_config" cvs update -p "$i" >$i.out
-		test_cmp $i.out ../$i >>failures 2>&1
-	done &&
-	test -z "$(cat failures)"
+	check_cvs_update_p merge &&
+	check_cvs_update_p no-lf &&
+	check_cvs_update_p empty &&
+	check_cvs_update_p really-empty
 '
 
 cd "$WORKDIR"
-- 
2.16.2.603.g180c1428f0



[PATCH 2/2] t9402-git-cvsserver-refs: don't check the stderr of a subshell

2018-03-08 Thread SZEDER Gábor
Four 'cvs diff' related tests in 't9402-git-cvsserver-refs.sh' fail
when the test script is run with '-x' tracing (and using a shell other
than a Bash version supporting BASH_XTRACEFD).  The reason for those
failures is that the tests check the emptiness of a subshell's stderr,
which includes the trace of commands executed in that subshell as
well, throwing off the emptiness check.

Save the stdout and stderr of the invoked 'cvs' command instead of the
whole subshell, so the latter remains free from tracing output.  (Note
that changing how stdout is saved is only done for the sake of
consistency, it's not necessary for correctness.)

After this change t9402 passes with '-x', even when running with
/bin/sh.

Signed-off-by: SZEDER Gábor 
---
 t/t9402-git-cvsserver-refs.sh | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/t/t9402-git-cvsserver-refs.sh b/t/t9402-git-cvsserver-refs.sh
index 6d2d3c8739..cf31ace667 100755
--- a/t/t9402-git-cvsserver-refs.sh
+++ b/t/t9402-git-cvsserver-refs.sh
@@ -455,20 +455,20 @@ test_expect_success 'cvs up -r $(git rev-parse v1)' '
 '
 
 test_expect_success 'cvs diff -r v1 -u' '
-   ( cd cvswork && cvs -f diff -r v1 -u ) >cvsDiff.out 2>cvs.log &&
+   ( cd cvswork && cvs -f diff -r v1 -u >../cvsDiff.out 2>../cvs.log ) &&
test_must_be_empty cvsDiff.out &&
test_must_be_empty cvs.log
 '
 
 test_expect_success 'cvs diff -N -r v2 -u' '
-   ( cd cvswork && ! cvs -f diff -N -r v2 -u ) >cvsDiff.out 2>cvs.log &&
+	( cd cvswork && ! cvs -f diff -N -r v2 -u >../cvsDiff.out 2>../cvs.log ) &&
test_must_be_empty cvs.log &&
test -s cvsDiff.out &&
check_diff cvsDiff.out v2 v1 >check_diff.out 2>&1
 '
 
 test_expect_success 'cvs diff -N -r v2 -r v1.2' '
-	( cd cvswork && ! cvs -f diff -N -r v2 -r v1.2 -u ) >cvsDiff.out 2>cvs.log &&
+	( cd cvswork && ! cvs -f diff -N -r v2 -r v1.2 -u >../cvsDiff.out 2>../cvs.log ) &&
test_must_be_empty cvs.log &&
test -s cvsDiff.out &&
check_diff cvsDiff.out v2 v1.2 >check_diff.out 2>&1
@@ -487,7 +487,7 @@ test_expect_success 'apply early [cvswork3] diff to b3' '
 '
 
 test_expect_success 'check [cvswork3] diff' '
-	( cd cvswork3 && ! cvs -f diff -N -u ) >"$WORKDIR/cvsDiff.out" 2>cvs.log &&
+	( cd cvswork3 && ! cvs -f diff -N -u >"$WORKDIR/cvsDiff.out" 2>../cvs.log ) &&
test_must_be_empty cvs.log &&
test -s cvsDiff.out &&
test $(grep Index: cvsDiff.out | wc -l) = 3 &&
-- 
2.16.2.603.g180c1428f0



[PATCH 0/2] Make cvs tests pass with '-x' tracing and /bin/sh

2018-03-08 Thread SZEDER Gábor
> On Sat, Feb 24, 2018 at 12:39 AM, SZEDER Gábor  wrote:
> > This patch series makes '-x' tracing of tests work reliably even when
> > running them with /bin/sh, and setting TEST_SHELL_PATH=bash will be
> > unnecessary.
> >
> >   make GIT_TEST_OPTS='-x --verbose-log' test
> >
> > passes on my setup and on all Travis CI build jobs (though neither me
> > nor Travis CI run all tests, e.g. CVS).
> 
> I installed 'cvs' and whatnot to run t94* and t96* tests, and sure
> enough, 5 tests in 2 test scripts fail with '-x' tracing and /bin/sh.
> I think I will be able to get around to send v2 during the weekend.

Well, I wasn't able to get around to it, and in the meantime
'sg/test-x' got merged into 'next'.  So here are the updates for those
two test scripts.

The commit message of the first patch is perhaps unnecessarily long, but
it's mostly about explaining why the affected test was working
reasonably well despite the rather weird 'test_cmp this that >>out
2>&1' thing.

SZEDER Gábor (2):
  t9400-git-cvsserver-server: don't rely on the output of 'test_cmp'
  t9402-git-cvsserver-refs: don't check the stderr of a subshell

 t/t9400-git-cvsserver-server.sh | 15 +--
 t/t9402-git-cvsserver-refs.sh   |  8 
 2 files changed, 13 insertions(+), 10 deletions(-)

-- 
2.16.2.603.g180c1428f0



Re: [RFC PATCH 1/5] Add testcases for improved file collision conflict handling

2018-03-08 Thread SZEDER Gábor
> Adds testcases dealing with file collisions for the following types of
> conflicts:
>   * add/add
>   * rename/add
>   * rename/rename(2to1)
> These tests include expectations for proposed smarter behavior which has
> not yet been implemented and thus are currently expected to fail.
> Subsequent commits will correct that and explain the new behavior.
> 
> Signed-off-by: Elijah Newren 
> ---
>  t/t6042-merge-rename-corner-cases.sh | 220 
> +++
>  1 file changed, 220 insertions(+)
> 
> diff --git a/t/t6042-merge-rename-corner-cases.sh 
> b/t/t6042-merge-rename-corner-cases.sh
> index 411550d2b6..a6c151ef95 100755
> --- a/t/t6042-merge-rename-corner-cases.sh
> +++ b/t/t6042-merge-rename-corner-cases.sh
> @@ -575,4 +575,224 @@ test_expect_success 'rename/rename/add-dest merge still 
> knows about conflicting
>   test ! -f c
>  '
>
> +test_conflicts_with_adds_and_renames() {
> + test $1 != 0 && side1=rename || side1=add
> + test $2 != 0 && side2=rename || side2=add

For additional context I'm going to quote the callsites of this
function from the end of the test script:

> +test_conflicts_with_adds_and_renames 1 1
> +test_conflicts_with_adds_and_renames 1 0
> +test_conflicts_with_adds_and_renames 0 1
> +test_conflicts_with_adds_and_renames 0 0

Instead of the two conditions at the beginning of the function and the
1 and 0 sort-of magic numbers at the callsites, you could just pass
the words "rename" and "add" as parameters to the function.  The
callsites would be clearer and the function could start with two
simple variable assignments side1=$1 ; side2=$2.

Please feel free to dismiss this as bikeshedding: since the branches
are called 'L' and 'R', maybe calling the variables $sideL and $sideR
would match better the rest of the test?  Dunno.

> + # Setup:
> + #  L
> + # / \
> + #   master   ?
> + # \ /
> + #  R
> + #
> + # Where:
> + #   Both L and R have files named 'three-unrelated' and
> + #   'three-related' which collide (i.e. 4 files colliding at two
> + #   pathnames).  Each of the colliding files could have been
> + #   involved in a rename, in which case there was a file named
> + #   'one-[un]related' or 'two-[un]related' that was modified on the
> + #   opposite side of history and renamed into the collision on this
> + #   side of history.
> + #
> + # Questions for both sets of collisions:
> + #   1) The index should contain both a stage 2 and stage 3 entry
> + #  for the colliding file.  Does it?
> + #   2) When renames are involved, the content merges are clean, so
> + #  the index should reflect the content merges, not merely the
> + #  version of the colliding file from the prior commit.  Does
> + #  it?
> + #
> + # Questions for three-unrelated:
> + #   3) There should be files in the worktree named
> + #  'three-unrelated~HEAD' and 'three-unrelated~R^0' with the
> + #  (content-merged) version of 'three-unrelated' from the
> + #  appropriate side of the merge.  Are they present?
> + #   4) There should be no file named 'three-unrelated' in the
> + #  working tree.  That'd make it too likely that users would
> + #  use it instead of carefully looking at both
> + #  three-unrelated~HEAD and three-unrelated~R^0.  Is it
> + #  correctly missing?
> + #
> + # Questions for three-related:
> + #   3) There should be a file in the worktree named three-related
> + #  containing the two-way merged contents of the content-merged
> + #  versions of three-related from each of the two colliding
> + #  files.  Is it present?
> + #   4) There should not be any three-related~* files in the working
> + #  tree.
> + test_expect_success "setup simple $side1/$side2 conflict" '
> + test_create_repo simple_${side1}_${side2} &&
> + (
> + cd simple_${side1}_${side2} &&
> +
> + # Create a simple file with 10 lines
> + ten="0 1 2 3 4 5 6 7 8 9" &&
> + for i in $ten
> + do
> + echo line $i in a sample file
> + done >unrelated1_v1 &&
> + # Create a 2nd version of same file with one more line
> + cat unrelated1_v1 >unrelated1_v2 &&

'cp unrelated1_v1 unrelated1_v2', perhaps?

> + echo another line >>unrelated1_v2 &&
> +
> + # Create an unrelated simple file with 10 lines
> + for i in $ten
> + do
> + echo line $i in another sample file
> + done >unrelated2_v1 &&
> + # Create a 2nd version of same file with one more line
> + cat 

Re: [RFC] Rebasing merges: a journey to the ultimate solution (Road Clear)

2018-03-08 Thread Phillip Wood
On 08/03/18 11:20, Phillip Wood wrote:
> On 07/03/18 07:26, Johannes Schindelin wrote:
>> Hi Buga,
>>
>> On Tue, 6 Mar 2018, Igor Djordjevic wrote:
>>
>>> On 06/03/2018 19:12, Johannes Schindelin wrote:

>> And I guess being consistent is pretty important, too - if you add new
>> content during merge rebase, it should always show up in the merge,
>> period. 
>
> Yes, that should make it easy for the user to know what to expect from
> rebase.

 [...]

 It will be slightly inconsistent. But in a defendable way, I think.
>>>
>>> I like where this discussion is heading, and here`s what I thought 
>>> about it :)
>>>
>>> [...]
>>>
>>> Here`s a twist - not letting `merge` trying to be too smart by 
>>> figuring out whether passed arguments correspond to rewritten 
>>> versions of the original merge parents (which would be too 
>>> restrictive, too, I`m afraid), but just be explicit about it, instead!
>>
>> That's the missing piece, I think.
>>
>>> So, it could be something like:
>>>
>>> merge -C deadbee 123abc:cafecafe 234bcd:bedbedbed
>>
>> I like where this is heading, too, but I do not think that we can do this
>> on a per-MERGE_HEAD basis. The vast majority of merge commits, in
>> practice, have two parents. So the `merge` command would actually only
>> have one revision to merge (because HEAD is the implicit first parent). So
>> that is easy.
>>
>> But as soon as you go octopus, you can either perform an octopus merge, or
>> rebase the original merge commit. You cannot really mix and match here.
>>
>> Unless we reimplement the octopus merge (which works quite a bit
>> differently from the "rebase merge commit" strategy, even if it is
>> incremental, too), which has its own challenges: if there are merge
>> conflicts before merging the last MERGE_HEAD, the octopus merge will exit
>> with status 2, telling you "Should not be doing an octopus.". While we
>> will want to keep merge conflict markers and continue with the "rebase the
>> original merge commit" strategy.
>>
>> And it would slam the door shut for adding support for *other* merge
>> strategies to perform a more-than-two-parents merge.
>>
>> Also, I do not think that it makes a whole lot of sense in practice to let
>> users edit what will be used for "original parent". If the user wants to
>> do complicated stuff, they can already do that, via `exec`. The `merge`
>> command really should be about facilitating common workflows, guiding the
>> user to what is sane.
>>
>> Currently my favorite idea is to introduce a new flag: -R (for "rebase the
>> original merge commit"). It would look like this:
>>
>>  merge -R -C <original-merge-commit> <merge-head> # <oneline>
>>
>> This flag would of course trigger the consistency check (does the number
>> of parents of the original merge commit agree with the parameter list? Was
>> an original merge commit specified to begin with?), and it would not fall
>> back to the recursive merge, but error out if that check failed.
>>
>> Side note: I wonder whether we really need to perform the additional check
>> that ensures that the <merge-head> refers to the rewritten version of the
>> original merge commit's parent.
>>
>> Second side note: if we can fast-forward, currently we prefer that, and I
>> think we should keep that behavior with -R, too.
> 
> I think that would be a good idea to avoid unpleasant surprises.

Oops that was referring to the first side note. I think fast forwarding
is a good idea. I'm not so sure about checking that <merge-head> refers
to the rewritten version of the original merge commit's parent any more
though. Having thought some more, I think we would want to allow the
user to rearrange a topic branch that is the parent of a merge and that
would require allowing a different parent as the old parent could be
dropped or swapped with another commit in the branch. I can't think of a
way to mechanically check that the new parent is 'somehow derived from'
the old one.

>> If the user wants to force a new merge, they simply remove that -R flag.
>>
>> What do you think?
> 
> I did wonder about using 'pick <merge-commit>' for rebasing merges and
> keeping 'merge ...' for recreating them but I'm not sure if that is a
> good idea. It has the advantage that the user cannot specify the wrong
> parents for the merge to be rebased as 'git rebase' would work out if
> the parents have been rebased, but maybe it's a bit magical to use pick
> for merge commits. Also there isn't such a simple way for the user to go
> from 'rebase this merge' to 'recreate this merge' as they'd have to
> write the whole merge line themselves (though I guess something like
> emacs' git-rebase.el would be able to help with that)

Scrub that, it is too magical and I don't think it would work with
rearranged commits - it's making the --preserve-merges mistake all over
again. It's a shame to have 'merge' mean 'recreate the merge' and
'rebase the merge' but I don't think there is an easy way round that.

> Best Wishes
> 
> Phillip
> 
> 
>> Ciao,
>> Dscho
>>
> 



[PATCH] fetch-pack.c: use oidset to check existence of loose object

2018-03-08 Thread Takuto Ikuta
In repositories having a large number of refs, lstat for non-existing loose
objects makes `git fetch` slow.

This patch stores existing loose objects in a hashmap beforehand and uses
it to check existence instead of using lstat.

With this patch, the number of lstat calls in `git fetch` is reduced
from 411412 to 13794 for chromium repository.

I took timings of `git fetch` with quickfetch disabled for the chromium
repository, 3 times on Linux with an SSD.
* with this patch
8.105s
8.309s
7.640s
avg: 8.018s

* master
12.287s
11.175s
12.227s
avg: 11.896s

On my MacBook Air which has slower lstat.
* with this patch
14.501s

* master
1m16.027s

`git fetch` on a slow disk will be improved greatly.

Signed-off-by: Takuto Ikuta 
---
 cache.h  |  2 ++
 fetch-pack.c | 22 +++---
 sha1_file.c  |  3 +++
 3 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/cache.h b/cache.h
index d06932ed0..db38db40e 100644
--- a/cache.h
+++ b/cache.h
@@ -1773,6 +1773,8 @@ struct object_info {
 #define OBJECT_INFO_SKIP_CACHED 4
 /* Do not retry packed storage after checking packed and loose storage */
 #define OBJECT_INFO_QUICK 8
+/* Do not check loose object */
+#define OBJECT_INFO_SKIP_LOOSE 16
 extern int sha1_object_info_extended(const unsigned char *, struct object_info *, unsigned flags);
 
 /*
diff --git a/fetch-pack.c b/fetch-pack.c
index d97461296..1658487f7 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -711,6 +711,15 @@ static void mark_alternate_complete(struct object *obj)
 	mark_complete(&obj->oid);
 }
 
+static int add_loose_objects_to_set(const struct object_id *oid,
+				    const char *path,
+				    void *data)
+{
+	struct oidset* set = (struct oidset*)(data);
+	oidset_insert(set, oid);
+	return 0;
+}
+
 static int everything_local(struct fetch_pack_args *args,
struct ref **refs,
struct ref **sought, int nr_sought)
@@ -719,16 +728,21 @@ static int everything_local(struct fetch_pack_args *args,
 	int retval;
 	int old_save_commit_buffer = save_commit_buffer;
 	timestamp_t cutoff = 0;
+	struct oidset loose_oid_set = OIDSET_INIT;
+
+	for_each_loose_object(add_loose_objects_to_set, &loose_oid_set, 0);
 
 	save_commit_buffer = 0;
 
 	for (ref = *refs; ref; ref = ref->next) {
 		struct object *o;
+		unsigned int flag = OBJECT_INFO_QUICK;
 
-		if (!has_object_file_with_flags(&ref->old_oid,
-						OBJECT_INFO_QUICK))
-			continue;
+		if (!oidset_contains(&loose_oid_set, &ref->old_oid))
+			flag |= OBJECT_INFO_SKIP_LOOSE;
 
+		if (!has_object_file_with_flags(&ref->old_oid, flag))
+			continue;
 		o = parse_object(&ref->old_oid);
 		if (!o)
 			continue;
@@ -744,6 +758,8 @@ static int everything_local(struct fetch_pack_args *args,
 		}
 	}
 
+	oidset_clear(&loose_oid_set);
+
 	if (!args->no_dependents) {
 		if (!args->deepen) {
 			for_each_ref(mark_complete_oid, NULL);
diff --git a/sha1_file.c b/sha1_file.c
index 1b94f39c4..c903cbcec 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1262,6 +1262,9 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
 		if (find_pack_entry(real, &e))
 			break;
 
+		if (flags & OBJECT_INFO_SKIP_LOOSE)
+			return -1;
+
 		/* Most likely it's a loose object. */
 		if (!sha1_loose_object_info(real, oi, flags))
 			return 0;
-- 
2.16.2



[PATCH/RFC v3 03/12] pack-objects: use bitfield for object_entry::dfs_state

2018-03-08 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/pack-objects.c |  3 +++
 pack-objects.h | 33 -
 2 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index fd217cb51f..a4dbb40824 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3049,6 +3049,9 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
OPT_END(),
};
 
+   if (DFS_NUM_STATES > (1 << OE_DFS_STATE_BITS))
+   die("BUG: too many dfs states, increase OE_DFS_STATE_BITS");
+
check_replace_refs = 0;
 
reset_pack_idx_option(_idx_opts);
diff --git a/pack-objects.h b/pack-objects.h
index 85b01b66da..628c45871c 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -1,6 +1,21 @@
 #ifndef PACK_OBJECTS_H
 #define PACK_OBJECTS_H
 
+#define OE_DFS_STATE_BITS  2
+
+/*
+ * State flags for depth-first search used for analyzing delta cycles.
+ *
+ * The depth is measured in delta-links to the base (so if A is a delta
+ * against B, then A has a depth of 1, and B a depth of 0).
+ */
+enum dfs_state {
+   DFS_NONE = 0,
+   DFS_ACTIVE,
+   DFS_DONE,
+   DFS_NUM_STATES
+};
+
 /*
  * basic object info
  * -----------------
@@ -73,21 +88,13 @@ struct object_entry {
unsigned no_try_delta:1;
unsigned tagged:1; /* near the very tip of refs */
unsigned filled:1; /* assigned write-order */
+   unsigned dfs_state:OE_DFS_STATE_BITS;
+
+   /* XXX 20 bits hole, try to pack */
 
-   /* XXX 22 bits hole, try to pack */
-   /*
-* State flags for depth-first search used for analyzing delta cycles.
-*
-* The depth is measured in delta-links to the base (so if A is a delta
-* against B, then A has a depth of 1, and B a depth of 0).
-*/
-   enum {
-   DFS_NONE = 0,
-   DFS_ACTIVE,
-   DFS_DONE
-   } dfs_state;
int depth;
-   /* size: 128, padding: 4 */
+
+   /* size: 120 */
 };
 
 struct packing_data {
-- 
2.16.2.873.g32ff258c87
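
The check added in cmd_pack_objects() above is a small, reusable
pattern: a sentinel enum member counts the states, and a run-time test
guards against the bitfield being too narrow.  A standalone toy (not
git code) showing the same idea:

	#include <stdio.h>

	#define STATE_BITS 2

	enum state {
		STATE_NONE = 0,
		STATE_ACTIVE,
		STATE_DONE,
		STATE_NUM_STATES	/* sentinel: counts the values above */
	};

	struct item {
		unsigned state : STATE_BITS;	/* packs into spare bits */
	};

	int main(void)
	{
		/* fails loudly if a state is added without widening the field */
		if (STATE_NUM_STATES > (1 << STATE_BITS)) {
			fprintf(stderr, "BUG: too many states, increase STATE_BITS\n");
			return 1;
		}
		printf("sizeof(struct item) = %zu\n", sizeof(struct item));
		return 0;
	}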



[PATCH/RFC v3 09/12] pack-objects: reorder 'hash' to pack struct object_entry

2018-03-08 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 pack-objects.h | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/pack-objects.h b/pack-objects.h
index 1c0ad4c9ef..3c15cf7b23 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -77,12 +77,10 @@ struct object_entry {
uint32_t delta_sibling_idx; /* other deltified objects who
 * uses the same base as me
 */
-   /* XXX 4 bytes hole, try to pack */
-
+   uint32_t hash;  /* name hint hash */
void *delta_data;   /* cached delta (uncompressed) */
unsigned long delta_size;   /* delta data size (uncompressed) */
unsigned long z_delta_size; /* delta data size (compressed) */
-   uint32_t hash;  /* name hint hash */
unsigned char in_pack_header_size; /* note: spare bits available! */
unsigned in_pack_idx:OE_IN_PACK_BITS;   /* already in pack */
unsigned type:TYPE_BITS;
@@ -101,7 +99,7 @@ struct object_entry {
 
unsigned depth:OE_DEPTH_BITS;
 
-   /* size: 104, padding: 4, bit_padding: 18 bits */
+   /* size: 96, bit_padding: 18 bits */
 };
 
 struct packing_data {
-- 
2.16.2.873.g32ff258c87



[PATCH/RFC v3 01/12] pack-objects: a bit of document about struct object_entry

2018-03-08 Thread Nguyễn Thái Ngọc Duy
The role of this comment block becomes more important after we
shuffle fields around to shrink this struct: it will be much harder to
see which field is related to what. This also documents the holes in
this struct according to pahole.

A couple of notes on shrinking the struct:

1) The reader may notice one thing from this document and the
shrinking business: if "delta" is NULL, all the other delta-related
fields should be irrelevant. We could group all of these in a separate
struct and replace them all with a pointer to that struct (allocated
separately); a rough sketch follows below.

This does not help much though, since ~85% of objects are
deltified (source: linux-2.6.git). The gain is only from non-delta
objects, which is not that significant.
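
For illustration only, the grouping in note 1 could look roughly like
this (hypothetical names, not from this series):

struct object_entry;                     /* stands in for the real struct */

struct delta_info {                      /* hypothetical container */
	struct object_entry *delta;          /* delta base object */
	struct object_entry *delta_child;    /* deltified objects using me as base */
	struct object_entry *delta_sibling;  /* next object using the same base */
	void *delta_data;                    /* cached delta (uncompressed) */
	unsigned long delta_size;            /* uncompressed delta size */
	unsigned long z_delta_size;          /* compressed delta size */
};

struct slim_object_entry {
	/* ... all the non-delta fields stay inline ... */
	struct delta_info *delta;            /* NULL for the ~15% non-delta objects */
};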

2) The fields in_pack_offset and idx.offset could be merged. But we
would need to be very careful. Up until the very last phase (object
writing), idx.offset is not used and could hold in_pack_offset. Then
idx.offset would be updated with the _destination_ pack's offset, not
the source's. But since we always write a delta's bases first, and we
only use in_pack_offset in the writing phase when we reuse objects, we
should be OK?

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 pack-objects.h | 48 
 1 file changed, 48 insertions(+)

diff --git a/pack-objects.h b/pack-objects.h
index 03f1191659..f834ead541 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -1,6 +1,52 @@
 #ifndef PACK_OBJECTS_H
 #define PACK_OBJECTS_H
 
+/*
+ * basic object info
+ * -
+ * idx.oid is filled up before delta searching starts. idx.crc32 is
+ * only valid after the object is written down and will be used for
+ * generating the index. idx.offset will be both gradually set and
+ * used in the writing phase (base objects get their offset first,
+ * then deltas refer to them)
+ *
+ * "size" is the uncompressed object size. Compressed size is not
+ * cached (ie. raw data in a pack) but available via revindex.
+ *
+ * "hash" contains a path name hash which is used for sorting the
+ * delta list and also during delta searching. Once prepare_pack()
+ * returns, it's no longer needed.
+ *
+ * source pack info
+ * 
+ * The (in_pack, in_pack_offset, in_pack_header_size) tuple contains
+ * the location of the object in the source pack, with or without
+ * header.
+ *
+ * "type" and "in_pack_type" both describe object type. in_pack_type
+ * may contain a delta type, while type is always the canonical type.
+ *
+ * deltas
+ * --
+ * Delta links (delta, delta_child and delta_sibling) are created to
+ * reflect the delta graph from the source pack, then updated or
+ * added during the delta searching phase when we find better deltas.
+ *
+ * delta_child and delta_sibling are last needed in
+ * compute_write_order(). "delta" and "delta_size" must remain valid
+ * at object writing phase in case the delta is not cached.
+ *
+ * If a delta is cached in memory and is compressed, "delta" points to
+ * the data and z_delta_size contains the compressed size. If it's
+ * uncompressed [1], z_delta_size must be zero. delta_size is always
+ * the uncompressed size and must be valid even if the delta is not
+ * cached. Delta recreation technically only depends on "delta"
+ * pointer, but delta_size is still used to verify it's the same as
+ * before.
+ *
+ * [1] during try_delta phase we don't bother with compressing because
+ * the delta could be quickly replaced with a better one.
+ */
 struct object_entry {
struct pack_idx_entry idx;
unsigned long size; /* uncompressed size */
@@ -28,6 +74,7 @@ struct object_entry {
unsigned tagged:1; /* near the very tip of refs */
unsigned filled:1; /* assigned write-order */
 
+   /* XXX 28 bits hole, try to pack */
/*
 * State flags for depth-first search used for analyzing delta cycles.
 *
@@ -40,6 +87,7 @@ struct object_entry {
DFS_DONE
} dfs_state;
int depth;
+   /* size: 136, padding: 4 */
 };
 
 struct packing_data {
-- 
2.16.2.873.g32ff258c87



[PATCH/RFC v3 04/12] pack-objects: use bitfield for object_entry::depth

2018-03-08 Thread Nguyễn Thái Ngọc Duy
This does not give us any saving due to padding. But we will be able
to save once we cut 4 bytes out of this struct in a subsequent patch.

Because of struct packing, from now on we can only handle a maximum
depth of 4095 (or even lower when new booleans are added to this
struct). This should be OK since long delta chains cause significant
slowdown anyway.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/config.txt   | 1 +
 Documentation/git-pack-objects.txt | 4 +++-
 Documentation/git-repack.txt   | 4 +++-
 builtin/pack-objects.c | 4 
 pack-objects.h | 8 +++-
 5 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index f57e9cf10c..9bd3f5a789 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2412,6 +2412,7 @@ pack.window::
 pack.depth::
The maximum delta depth used by linkgit:git-pack-objects[1] when no
maximum depth is given on the command line. Defaults to 50.
+   Maximum value is 4095.
 
 pack.windowMemory::
The maximum size of memory that is consumed by each thread
diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index 81bc490ac5..3503c9e3e6 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -96,7 +96,9 @@ base-name::
it too deep affects the performance on the unpacker
side, because delta data needs to be applied that many
times to get to the necessary object.
-   The default value for --window is 10 and --depth is 50.
++
+The default value for --window is 10 and --depth is 50. The maximum
+depth is 4095.
 
--window-memory=<n>::
This option provides an additional limit on top of `--window`;
diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index ae750e9e11..25c83c4927 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -90,7 +90,9 @@ other objects in that pack they already have locally.
space. `--depth` limits the maximum delta depth; making it too deep
affects the performance on the unpacker side, because delta data needs
to be applied that many times to get to the necessary object.
-   The default value for --window is 10 and --depth is 50.
++
+The default value for --window is 10 and --depth is 50. The maximum
+depth is 4095.
 
--threads=<n>::
This option is passed through to `git pack-objects`.
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index a4dbb40824..cfd97da7db 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3068,6 +3068,10 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
if (pack_to_stdout != !base_name || argc)
usage_with_options(pack_usage, pack_objects_options);
 
+   if (depth >= (1 << OE_DEPTH_BITS))
+   die(_("delta chain depth %d is greater than maximum limit %d"),
+   depth, (1 << OE_DEPTH_BITS) - 1);
+
argv_array_push(, "pack-objects");
if (thin) {
use_internal_rev_list = 1;
diff --git a/pack-objects.h b/pack-objects.h
index 628c45871c..4b17402953 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -2,6 +2,7 @@
 #define PACK_OBJECTS_H
 
 #define OE_DFS_STATE_BITS  2
+#define OE_DEPTH_BITS  12
 
 /*
  * State flags for depth-first search used for analyzing delta cycles.
@@ -89,12 +90,9 @@ struct object_entry {
unsigned tagged:1; /* near the very tip of refs */
unsigned filled:1; /* assigned write-order */
unsigned dfs_state:OE_DFS_STATE_BITS;
+   unsigned depth:OE_DEPTH_BITS;
 
-   /* XXX 20 bits hole, try to pack */
-
-   int depth;
-
-   /* size: 120 */
+   /* size: 120, bit_padding: 8 bits */
 };
 
 struct packing_data {
-- 
2.16.2.873.g32ff258c87



[PATCH/RFC v3 06/12] pack-objects: move in_pack_pos out of struct object_entry

2018-03-08 Thread Nguyễn Thái Ngọc Duy
This field is only needed for pack-bitmap, which is an optional
feature. Move it to a separate array that is only allocated when
pack-bitmap is used (it is not freed, in the same way that objects[]
is not freed). This saves us 8 bytes in struct object_entry.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/pack-objects.c |  3 ++-
 pack-bitmap-write.c|  8 +---
 pack-bitmap.c  |  2 +-
 pack-bitmap.h  |  4 +++-
 pack-objects.h | 18 --
 5 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index cfd97da7db..7bb5544883 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -878,7 +878,8 @@ static void write_pack_file(void)
 
if (write_bitmap_index) {
bitmap_writer_set_checksum(oid.hash);
-   bitmap_writer_build_type_index(written_list, nr_written);
+   bitmap_writer_build_type_index(
+   &to_pack, written_list, nr_written);
}
 
finish_tmp_packfile(&tmpname, pack_tmp_name,
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index e01f992884..256a63f892 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -48,7 +48,8 @@ void bitmap_writer_show_progress(int show)
 /**
  * Build the initial type index for the packfile
  */
-void bitmap_writer_build_type_index(struct pack_idx_entry **index,
+void bitmap_writer_build_type_index(struct packing_data *to_pack,
+   struct pack_idx_entry **index,
uint32_t index_nr)
 {
uint32_t i;
@@ -57,12 +58,13 @@ void bitmap_writer_build_type_index(struct pack_idx_entry **index,
writer.trees = ewah_new();
writer.blobs = ewah_new();
writer.tags = ewah_new();
+   ALLOC_ARRAY(to_pack->in_pack_pos, to_pack->nr_objects);
 
for (i = 0; i < index_nr; ++i) {
struct object_entry *entry = (struct object_entry *)index[i];
enum object_type real_type;
 
-   entry->in_pack_pos = i;
+   oe_set_in_pack_pos(to_pack, entry, i);
 
switch (entry->type) {
case OBJ_COMMIT:
@@ -147,7 +149,7 @@ static uint32_t find_object_pos(const unsigned char *sha1)
"(object %s is missing)", sha1_to_hex(sha1));
}
 
-   return entry->in_pack_pos;
+   return oe_in_pack_pos(writer.to_pack, entry);
 }
 
 static void show_object(struct object *object, const char *name, void *data)
diff --git a/pack-bitmap.c b/pack-bitmap.c
index 9270983e5f..865d9ecc4e 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -1032,7 +1032,7 @@ int rebuild_existing_bitmaps(struct packing_data *mapping,
oe = packlist_find(mapping, sha1, NULL);
 
if (oe)
-   reposition[i] = oe->in_pack_pos + 1;
+   reposition[i] = oe_in_pack_pos(mapping, oe) + 1;
}
 
rebuild = bitmap_new();
diff --git a/pack-bitmap.h b/pack-bitmap.h
index 3742a00e14..5ded2f139a 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -44,7 +44,9 @@ int rebuild_existing_bitmaps(struct packing_data *mapping, khash_sha1 *reused_bi
 
 void bitmap_writer_show_progress(int show);
 void bitmap_writer_set_checksum(unsigned char *sha1);
-void bitmap_writer_build_type_index(struct pack_idx_entry **index, uint32_t index_nr);
+void bitmap_writer_build_type_index(struct packing_data *to_pack,
+   struct pack_idx_entry **index,
+   uint32_t index_nr);
 void bitmap_writer_reuse_bitmaps(struct packing_data *to_pack);
 void bitmap_writer_select_commits(struct commit **indexed_commits,
unsigned int indexed_commits_nr, int max_bitmaps);
diff --git a/pack-objects.h b/pack-objects.h
index 2ccd6359d2..9ab0ce300d 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -77,7 +77,6 @@ struct object_entry {
unsigned long delta_size;   /* delta data size (uncompressed) */
unsigned long z_delta_size; /* delta data size (compressed) */
uint32_t hash;  /* name hint hash */
-   unsigned int in_pack_pos;
unsigned char in_pack_header_size; /* note: spare bits available! */
unsigned type:TYPE_BITS;
unsigned in_pack_type:TYPE_BITS; /* could be delta */
@@ -92,7 +91,7 @@ struct object_entry {
unsigned dfs_state:OE_DFS_STATE_BITS;
unsigned depth:OE_DEPTH_BITS;
 
-   /* size: 120, bit_padding: 8 bits */
+   /* size: 112, bit_padding: 8 bits */
 };
 
 struct packing_data {
@@ -101,6 +100,8 @@ struct packing_data {
 
int32_t *index;
uint32_t index_size;
+
+   unsigned int *in_pack_pos;
 };
 
 struct object_entry *packlist_alloc(struct packing_data *pdata,
@@ -131,4 +132,17 @@ static inline 
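
(The rest of this message is cut off in the archive. Judging from the
call sites above, the accessors added to pack-objects.h presumably
look something like the sketch below; this is a reconstruction, not a
quote from the patch.)

struct object_entry { unsigned dummy; };   /* stub; the real struct is above */

struct packing_data_sketch {
	struct object_entry *objects;
	unsigned int *in_pack_pos;         /* allocated only when bitmaps are used */
};

static inline void oe_set_in_pack_pos(const struct packing_data_sketch *pack,
				      const struct object_entry *e,
				      unsigned int pos)
{
	pack->in_pack_pos[e - pack->objects] = pos;
}

static inline unsigned int oe_in_pack_pos(const struct packing_data_sketch *pack,
					  const struct object_entry *e)
{
	return pack->in_pack_pos[e - pack->objects];
}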

[PATCH/RFC v3 10/12] pack-objects: shrink z_delta_size field in struct object_entry

2018-03-08 Thread Nguyễn Thái Ngọc Duy
We only cache deltas when they are smaller than a certain limit. This
limit defaults to 1000 bytes, but we were saving the compressed length
in a 64-bit field. Shrink that field down to 16 bits, so we can only
cache deltas smaller than 64 KB. Larger deltas must be recomputed when
the pack is written out.

This saves us 8 bytes (some from previous bit padding).
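
The hunk below relies on the well-defined wrap-around of unsigned
bit-field assignment: store the value, then compare it back to detect
truncation. In isolation the idiom looks like this (a standalone
sketch, not part of the patch):

#include <stdio.h>

struct entry { unsigned z_delta_size:16; };

int main(void)
{
	struct entry e;
	unsigned long size = 70000;  /* a compressed delta bigger than 2^16 - 1 */

	e.z_delta_size = size;       /* silently truncated to 16 bits */
	if (e.z_delta_size == size)
		printf("cached %lu bytes\n", size);
	else
		printf("too big to cache; recompute at write time\n");
	return 0;
}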

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/config.txt |  3 ++-
 builtin/pack-objects.c   | 22 --
 pack-objects.h   | 11 ---
 3 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 9bd3f5a789..00fa824448 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2449,7 +2449,8 @@ pack.deltaCacheLimit::
The maximum size of a delta, that is cached in
linkgit:git-pack-objects[1]. This cache is used to speed up the
writing object phase by not having to recompute the final delta
-   result once the best match for all objects is found. Defaults to 1000.
+   result once the best match for all objects is found.
+   Defaults to 1000. Maximum value is 65535.
 
 pack.threads::
Specifies the number of threads to spawn when searching for best
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 82a4a95888..39920061e9 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2105,12 +2105,19 @@ static void find_deltas(struct object_entry **list, unsigned *list_size,
 * between writes at that moment.
 */
if (entry->delta_data && !pack_to_stdout) {
-   entry->z_delta_size = do_compress(&entry->delta_data,
- entry->delta_size);
-   cache_lock();
-   delta_cache_size -= entry->delta_size;
-   delta_cache_size += entry->z_delta_size;
-   cache_unlock();
+   unsigned long size;
+
+   size = do_compress(&entry->delta_data, entry->delta_size);
+   entry->z_delta_size = size;
+   if (entry->z_delta_size == size) {
+   cache_lock();
+   delta_cache_size -= entry->delta_size;
+   delta_cache_size += entry->z_delta_size;
+   cache_unlock();
+   } else {
+   FREE_AND_NULL(entry->delta_data);
+   entry->z_delta_size = 0;
+   }
}
 
/* if we made n a delta, and if n is already at max
@@ -3089,6 +3096,9 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
if (depth >= (1 << OE_DEPTH_BITS))
die(_("delta chain depth %d is greater than maximum limit %d"),
depth, (1 << OE_DEPTH_BITS) - 1);
+   if (cache_max_small_delta_size >= (1 << OE_Z_DELTA_BITS))
+   die(_("pack.deltaCacheLimit is greater than maximum limit %d"),
+   1 << OE_Z_DELTA_BITS);
 
argv_array_push(, "pack-objects");
if (thin) {
diff --git a/pack-objects.h b/pack-objects.h
index 3c15cf7b23..cbb39ab568 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -4,6 +4,7 @@
 #define OE_DFS_STATE_BITS  2
 #define OE_DEPTH_BITS  12
 #define OE_IN_PACK_BITS14
+#define OE_Z_DELTA_BITS16
 
 /*
  * State flags for depth-first search used for analyzing delta cycles.
@@ -80,7 +81,6 @@ struct object_entry {
uint32_t hash;  /* name hint hash */
void *delta_data;   /* cached delta (uncompressed) */
unsigned long delta_size;   /* delta data size (uncompressed) */
-   unsigned long z_delta_size; /* delta data size (compressed) */
unsigned char in_pack_header_size; /* note: spare bits available! */
unsigned in_pack_idx:OE_IN_PACK_BITS;   /* already in pack */
unsigned type:TYPE_BITS;
@@ -93,13 +93,18 @@ struct object_entry {
unsigned no_try_delta:1;
unsigned tagged:1; /* near the very tip of refs */
unsigned filled:1; /* assigned write-order */
-   unsigned dfs_state:OE_DFS_STATE_BITS;
 
/* XXX 8 bits hole, try to pack */
 
+   unsigned dfs_state:OE_DFS_STATE_BITS;
unsigned depth:OE_DEPTH_BITS;
+   /*
+* if delta_data contains a compressed delta, this contains
+* the compressed length
+   */
+   unsigned z_delta_size:OE_Z_DELTA_BITS;
 
-   /* size: 96, bit_padding: 18 bits */
+   /* size: 88, bit_padding: 2 bits */
 };
 
 struct packing_data {
-- 
2.16.2.873.g32ff258c87



[PATCH/RFC v3 00/12] Reduce pack-objects memory footprint

2018-03-08 Thread Nguyễn Thái Ngọc Duy
v3 cleans up a bit to avoid the horrible macros that v2 adds, and
adds some more documentation for this struct since it's getting harder
to just look at the struct and see what field is related to what.

v3 also adds three more patches and reduces another 16 bytes (total
struct reduction now is 41%). After this there's hardly anything else I
could do. Two 64-bit fields left, but even if I shrink them, I'd lose
it to padding. There's still one possibility to share in_pack_offset
with idx.offset, but it's risky.

These three patches optimize for the common case. The uncommon cases
will suffer some performance loss (a standalone sketch of the shared
size read-back idea follows the list):

- 10/12 limits the cached compressed delta size to 64k (default 1000
  bytes). If you normally have lots of huge deltas, you're going to
  take a hit because more deltas must be recreated at writing phase.
  Note that it does not stop pack-objects from creating deltas larger
  than 64k.

- 11/12 reduces uncompressed object size to 4GB. Whenever we need to
  read the size of an object larger than that, we read the pack again
  to retrieve the information, which is much slower than accessing a
  piece of memory. Again, I'm assuming these giant blobs are so rare
  that this performance hit won't matter.

- 12/12 is similar to 11/12 and reduces uncompressed delta size to
  4GB. Frankly a 4GB delta is still ridiculous, but I don't think we
  gain more by shrinking it further. If your packs have one of those
  giant deltas, it still works, delta_size will be read back from the
  pack again.
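
The size-shrinking patches share one mechanism: keep the value in a
narrow field when it fits, and set an overflow bit that redirects
readers to a slow path otherwise. A standalone sketch of that idea
(illustrative names, stubbed slow path):

#include <stdint.h>

struct entry {
	uint32_t size_lo;          /* uncompressed size, if it fits */
	unsigned size_overflow:1;  /* set when the real size is >= 2^32 */
};

/* stub for the rare slow path: re-read the object header from the pack */
static unsigned long size_from_pack(const struct entry *e)
{
	(void)e;
	return 0;
}

static unsigned long oe_size(const struct entry *e)
{
	return e->size_overflow ? size_from_pack(e) : e->size_lo;
}

static void oe_set_size(struct entry *e, unsigned long size)
{
	if (size == (uint32_t)size) {
		e->size_lo = (uint32_t)size;
		e->size_overflow = 0;
	} else {
		e->size_lo = 0;        /* too big; force the slow path */
		e->size_overflow = 1;
	}
}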

The following interdiff does _NOT_ cover the new patches, just the
first nine that v2 has.

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 55f19a1f18..82a4a95888 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -29,19 +29,13 @@
 #include "list.h"
 #include "packfile.h"
 
-#define DELTA(obj) \
-   ((obj)->delta_idx ? &to_pack.objects[(obj)->delta_idx - 1] : NULL)
-#define DELTA_CHILD(obj) \
-   ((obj)->delta_child_idx ? &to_pack.objects[(obj)->delta_child_idx - 1] : NULL)
-#define DELTA_SIBLING(obj) \
-   ((obj)->delta_sibling_idx ? &to_pack.objects[(obj)->delta_sibling_idx - 1] : NULL)
-
-#define CLEAR_DELTA(obj) (obj)->delta_idx = 0
-#define CLEAR_DELTA_CHILD(obj) (obj)->delta_child_idx = 0
-#define CLEAR_DELTA_SIBLING(obj) (obj)->delta_sibling_idx = 0
-
-#define SET_DELTA(obj, val) (obj)->delta_idx = ((val) - to_pack.objects) + 1
-#define SET_DELTA_CHILD(obj, val) (obj)->delta_child_idx = ((val) - to_pack.objects) + 1
+#define IN_PACK(obj) oe_in_pack(&to_pack, obj)
+#define DELTA(obj) oe_delta(&to_pack, obj)
+#define DELTA_CHILD(obj) oe_delta_child(&to_pack, obj)
+#define DELTA_SIBLING(obj) oe_delta_sibling(&to_pack, obj)
+#define SET_DELTA(obj, val) oe_set_delta(&to_pack, obj, val)
+#define SET_DELTA_CHILD(obj, val) oe_set_delta_child(&to_pack, obj, val)
+#define SET_DELTA_SIBLING(obj, val) oe_set_delta_sibling(&to_pack, obj, val)
 
static const char *pack_usage[] = {
N_("git pack-objects --stdout [<options>...] [< <ref-list> | < <object-list>]"),
@@ -381,7 +375,7 @@ static unsigned long write_no_reuse_object(struct hashfile *f, struct object_ent
 static off_t write_reuse_object(struct hashfile *f, struct object_entry *entry,
unsigned long limit, int usable_delta)
 {
-   struct packed_git *p = IN_PACK(&to_pack, entry);
+   struct packed_git *p = IN_PACK(entry);
struct pack_window *w_curs = NULL;
struct revindex_entry *revidx;
off_t offset;
@@ -492,7 +486,7 @@ static off_t write_object(struct hashfile *f,
 
if (!reuse_object)
to_reuse = 0;   /* explicit */
-   else if (!IN_PACK(&to_pack, entry))
+   else if (!IN_PACK(entry))
to_reuse = 0;   /* can't reuse what we don't have */
else if (entry->type == OBJ_REF_DELTA || entry->type == OBJ_OFS_DELTA)
/* check_object() decided it for us ... */
@@ -557,7 +551,7 @@ static enum write_one_status write_one(struct hashfile *f,
switch (write_one(f, DELTA(e), offset)) {
case WRITE_ONE_RECURSIVE:
/* we cannot depend on this one */
-   CLEAR_DELTA(e);
+   SET_DELTA(e, NULL);
break;
default:
break;
@@ -672,8 +666,8 @@ static struct object_entry **compute_write_order(void)
for (i = 0; i < to_pack.nr_objects; i++) {
objects[i].tagged = 0;
objects[i].filled = 0;
-   CLEAR_DELTA_CHILD(&objects[i]);
-   CLEAR_DELTA_SIBLING(&objects[i]);
+   SET_DELTA_CHILD(&objects[i], NULL);
+   SET_DELTA_SIBLING(&objects[i], NULL);
}
 
/*
@@ -1067,19 +1061,8 @@ static int want_object_in_pack(const struct object_id *oid,
 
want = 1;
 done:
-   if (want && *found_pack && !(*found_pack)->index) {
-   struct packed_git *p = *found_pack;
-
-   if (to_pack.in_pack_count >= 

[PATCH/RFC v3 12/12] pack-objects: shrink delta_size field in struct object_entry

2018-03-08 Thread Nguyễn Thái Ngọc Duy
Allowing a delta size of 64 bits is crazy. Shrink this field down to
31 bits with one overflow bit.

If we encounter an existing delta larger than 2GB, we do not cache
delta_size at all and will get the value from oe_size(), potentially
from disk if it's larger than 4GB.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/pack-objects.c | 24 ++--
 pack-objects.h | 30 +-
 2 files changed, 39 insertions(+), 15 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index db040e95db..0f65e0f243 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -30,10 +30,12 @@
 #include "packfile.h"
 
 #define IN_PACK(obj) oe_in_pack(_pack, obj)
+#define DELTA_SIZE(obj) oe_delta_size(_pack, obj)
 #define DELTA(obj) oe_delta(_pack, obj)
 #define DELTA_CHILD(obj) oe_delta_child(_pack, obj)
 #define DELTA_SIBLING(obj) oe_delta_sibling(_pack, obj)
 #define SET_DELTA(obj, val) oe_set_delta(_pack, obj, val)
+#define SET_DELTA_SIZE(obj, val) oe_set_delta_size(_pack, obj, val)
 #define SET_DELTA_CHILD(obj, val) oe_set_delta_child(_pack, obj, val)
 #define SET_DELTA_SIBLING(obj, val) oe_set_delta_sibling(_pack, obj, val)
 
@@ -140,7 +142,7 @@ static void *get_delta(struct object_entry *entry)
oid_to_hex(&DELTA(entry)->idx.oid));
delta_buf = diff_delta(base_buf, base_size,
   buf, size, &delta_size, 0);
-   if (!delta_buf || delta_size != entry->delta_size)
+   if (!delta_buf || delta_size != DELTA_SIZE(entry))
die("delta size changed");
free(buf);
free(base_buf);
@@ -291,14 +293,14 @@ static unsigned long write_no_reuse_object(struct hashfile *f, struct object_ent
FREE_AND_NULL(entry->delta_data);
entry->z_delta_size = 0;
} else if (entry->delta_data) {
-   size = entry->delta_size;
+   size = DELTA_SIZE(entry);
buf = entry->delta_data;
entry->delta_data = NULL;
type = (allow_ofs_delta && DELTA(entry)->idx.offset) ?
OBJ_OFS_DELTA : OBJ_REF_DELTA;
} else {
buf = get_delta(entry);
-   size = entry->delta_size;
+   size = DELTA_SIZE(entry);
type = (allow_ofs_delta && DELTA(entry)->idx.offset) ?
OBJ_OFS_DELTA : OBJ_REF_DELTA;
}
@@ -1509,7 +1511,7 @@ static void check_object(struct object_entry *entry)
 */
entry->type = entry->in_pack_type;
SET_DELTA(entry, base_entry);
-   entry->delta_size = oe_size(entry);
+   SET_DELTA_SIZE(entry, oe_size(entry));
entry->delta_sibling_idx = base_entry->delta_child_idx;
SET_DELTA_CHILD(base_entry, entry);
unuse_pack(_curs);
@@ -1895,7 +1897,7 @@ static int try_delta(struct unpacked *trg, struct unpacked *src,
max_size = trg_size/2 - 20;
ref_depth = 1;
} else {
-   max_size = trg_entry->delta_size;
+   max_size = DELTA_SIZE(trg_entry);
ref_depth = trg->depth;
}
max_size = (uint64_t)max_size * (max_depth - src->depth) /
@@ -1966,10 +1968,12 @@ static int try_delta(struct unpacked *trg, struct unpacked *src,
delta_buf = create_delta(src->index, trg->data, trg_size, &delta_size, max_size);
if (!delta_buf)
return 0;
+   if (delta_size >= maximum_unsigned_value_of_type(uint32_t))
+   return 0;
 
if (DELTA(trg_entry)) {
/* Prefer only shallower same-sized deltas. */
-   if (delta_size == trg_entry->delta_size &&
+   if (delta_size == DELTA_SIZE(trg_entry) &&
src->depth + 1 >= trg->depth) {
free(delta_buf);
return 0;
@@ -1984,7 +1988,7 @@ static int try_delta(struct unpacked *trg, struct unpacked *src,
free(trg_entry->delta_data);
cache_lock();
if (trg_entry->delta_data) {
-   delta_cache_size -= trg_entry->delta_size;
+   delta_cache_size -= DELTA_SIZE(trg_entry);
trg_entry->delta_data = NULL;
}
if (delta_cacheable(src_size, trg_size, delta_size)) {
@@ -1997,7 +2001,7 @@ static int try_delta(struct unpacked *trg, struct unpacked *src,
}
 
SET_DELTA(trg_entry, src_entry);
-   trg_entry->delta_size = delta_size;
+   SET_DELTA_SIZE(trg_entry, delta_size);
trg->depth = src->depth + 1;
 
return 1;
@@ -2120,11 +2124,11 @@ static void find_deltas(struct object_entry **list, unsigned *list_size,
if (entry->delta_data && !pack_to_stdout) {
unsigned long size;
 
-   size = 

[PATCH/RFC v3 11/12] pack-objects: shrink size field in struct object_entry

2018-03-08 Thread Nguyễn Thái Ngọc Duy
It's very rare that an uncompressed object is larger than 4GB (partly
because Git does not handle such large files very well to begin
with). Let's optimize for the common case where the object size is
smaller than this limit.

Shrink the size field down to 32 bits [1] plus one overflow bit. If
the size is too large, we read it back from disk.

Add two compare helpers that can take advantage of the overflow
bit (e.g. if the file is 4GB+, chances are it's already larger than
core.bigFileThreshold and there's no point in comparing the actual
value).

There's no actual saving from this yet due to holes, which should be
gone in the next patch.

[1] it's actually already 32 bits on 64-bit Windows
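
A compare helper can often decide from the overflow bit alone, without
ever touching the pack. A sketch of that shortcut (illustrative, using
the same narrow-field layout as the cover letter's sketch; only the
rare huge-vs-huge comparison needs the stubbed slow path):

#include <stdint.h>

struct entry {
	uint32_t size_lo;
	unsigned size_overflow:1;
};

/* stub: re-read the real size from the pack (rare, slow) */
static unsigned long size_from_pack(const struct entry *e)
{
	(void)e;
	return 0;
}

static int oe_size_greater_than(const struct entry *e, unsigned long limit)
{
	if (!e->size_overflow)
		return e->size_lo > limit;  /* fast, common case */
	if (limit <= UINT32_MAX)
		return 1;                   /* real size >= 2^32 > limit */
	return size_from_pack(e) > limit;   /* both sides huge: slow path */
}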

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/pack-objects.c | 49 ++
 pack-objects.h | 48 +++--
 2 files changed, 77 insertions(+), 20 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 39920061e9..db040e95db 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -274,7 +274,7 @@ static unsigned long write_no_reuse_object(struct hashfile *f, struct object_ent
 
if (!usable_delta) {
if (entry->type == OBJ_BLOB &&
-   entry->size > big_file_threshold &&
+   oe_size_greater_than(entry, big_file_threshold) &&
(st = open_istream(entry->idx.oid.hash, &type, &size, NULL)) != NULL)
buf = NULL;
else {
@@ -384,12 +384,13 @@ static off_t write_reuse_object(struct hashfile *f, struct object_entry *entry,
unsigned char header[MAX_PACK_OBJECT_HEADER],
  dheader[MAX_PACK_OBJECT_HEADER];
unsigned hdrlen;
+   unsigned long entry_size = oe_size(entry);
 
if (DELTA(entry))
type = (allow_ofs_delta && DELTA(entry)->idx.offset) ?
OBJ_OFS_DELTA : OBJ_REF_DELTA;
hdrlen = encode_in_pack_object_header(header, sizeof(header),
- type, entry->size);
+ type, entry_size);
 
offset = entry->in_pack_offset;
revidx = find_pack_revindex(p, offset);
@@ -406,7 +407,7 @@ static off_t write_reuse_object(struct hashfile *f, struct object_entry *entry,
datalen -= entry->in_pack_header_size;
 
if (!pack_to_stdout && p->index_version == 1 &&
-   check_pack_inflate(p, &w_curs, offset, datalen, entry->size)) {
+   check_pack_inflate(p, &w_curs, offset, datalen, entry_size)) {
error("corrupt packed object for %s",
  oid_to_hex(&entry->idx.oid));
unuse_pack(&w_curs);
@@ -1412,6 +1413,8 @@ static void cleanup_preferred_base(void)
 
 static void check_object(struct object_entry *entry)
 {
+   unsigned long size;
+
if (IN_PACK(entry)) {
struct packed_git *p = IN_PACK(entry);
struct pack_window *w_curs = NULL;
@@ -1431,13 +1434,14 @@ static void check_object(struct object_entry *entry)
 */
used = unpack_object_header_buffer(buf, avail,
   &type,
-  &entry->size);
+  &size);
if (used == 0)
goto give_up;
 
if (type < 0)
die("BUG: invalid type %d", type);
entry->in_pack_type = type;
+   oe_set_size(entry, size);
 
/*
 * Determine if this is a delta and if so whether we can
@@ -1505,7 +1509,7 @@ static void check_object(struct object_entry *entry)
 */
entry->type = entry->in_pack_type;
SET_DELTA(entry, base_entry);
-   entry->delta_size = entry->size;
+   entry->delta_size = oe_size(entry);
entry->delta_sibling_idx = base_entry->delta_child_idx;
SET_DELTA_CHILD(base_entry, entry);
unuse_pack(&w_curs);
@@ -1513,14 +1517,17 @@ static void check_object(struct object_entry *entry)
}
 
if (entry->type) {
+   unsigned long size;
+
+   size = get_size_from_delta(p, &w_curs,
+   entry->in_pack_offset + entry->in_pack_header_size);
/*
 * This must be a delta and we already know what the
 * final object type is.  Let's extract the actual
 * object size from the delta header.
 */
-   entry->size = get_size_from_delta(p, &w_curs,
-   entry->in_pack_offset + 

[PATCH/RFC v3 05/12] pack-objects: note about in_pack_header_size

2018-03-08 Thread Nguyễn Thái Ngọc Duy
The object header in a pack is packed really tight (see
pack-format.txt). Even with an 8-byte length, we need 9-10 bytes at
most, plus a hash (20 bytes), which means this field only needs to
store a number as big as 32 (5 bits).

This is trickier to pack tight, though: a new hash algorithm is
coming and the number of bits needed may quickly increase. So leave
it for now.
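
To see where the 9-10 bytes come from: the header stores 4 size bits
in the first byte (next to 3 type bits and a continuation bit) and 7
size bits in each continuation byte, so even a full 64-bit length
needs at most 10 bytes, and a 20-byte base hash on top of that keeps
the field's maximum around 30, hence 5 bits. A quick sketch of the
bound:

#include <stdio.h>

/* bytes needed for a pack object header carrying `bits` size bits:
 * 4 bits fit in the first byte, 7 in each continuation byte */
static int header_bytes(int bits)
{
	int bytes = 1;

	for (bits -= 4; bits > 0; bits -= 7)
		bytes++;
	return bytes;
}

int main(void)
{
	printf("32-bit size: %d bytes\n", header_bytes(32)); /* prints 5 */
	printf("64-bit size: %d bytes\n", header_bytes(64)); /* prints 10 */
	return 0;
}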

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 pack-objects.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pack-objects.h b/pack-objects.h
index 4b17402953..2ccd6359d2 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -78,7 +78,7 @@ struct object_entry {
unsigned long z_delta_size; /* delta data size (compressed) */
uint32_t hash;  /* name hint hash */
unsigned int in_pack_pos;
-   unsigned char in_pack_header_size;
+   unsigned char in_pack_header_size; /* note: spare bits available! */
unsigned type:TYPE_BITS;
unsigned in_pack_type:TYPE_BITS; /* could be delta */
unsigned preferred_base:1; /*
-- 
2.16.2.873.g32ff258c87



[PATCH/RFC v3 02/12] pack-objects: turn type and in_pack_type to bitfields

2018-03-08 Thread Nguyễn Thái Ngọc Duy
This saves 8 bytes in sizeof(struct object_entry). On a large
repository like linux-2.6.git (6.5M objects), this saves us 52MB
memory.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/pack-objects.c | 14 --
 cache.h|  2 ++
 object.h   |  1 -
 pack-objects.h |  8 
 4 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 5c674b2843..fd217cb51f 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1407,6 +1407,7 @@ static void check_object(struct object_entry *entry)
unsigned long avail;
off_t ofs;
unsigned char *buf, c;
+   enum object_type type;
 
buf = use_pack(p, &w_curs, entry->in_pack_offset, &avail);
 
@@ -1415,11 +1416,15 @@ static void check_object(struct object_entry *entry)
 * since non-delta representations could still be reused.
 */
used = unpack_object_header_buffer(buf, avail,
-  &entry->in_pack_type,
+  &type,
   &entry->size);
if (used == 0)
goto give_up;
 
+   if (type < 0)
+   die("BUG: invalid type %d", type);
+   entry->in_pack_type = type;
+
/*
 * Determine if this is a delta and if so whether we can
 * reuse it or not.  Otherwise let's find out as cheaply as
@@ -1559,6 +1564,7 @@ static void drop_reused_delta(struct object_entry *entry)
 {
struct object_entry **p = &entry->delta->delta_child;
struct object_info oi = OBJECT_INFO_INIT;
+   enum object_type type;
 
while (*p) {
if (*p == entry)
@@ -1570,7 +1576,7 @@ static void drop_reused_delta(struct object_entry *entry)
entry->depth = 0;
 
oi.sizep = &entry->size;
-   oi.typep = &entry->type;
+   oi.typep = &type;
if (packed_object_info(entry->in_pack, entry->in_pack_offset, &oi) < 0) {
/*
 * We failed to get the info from this pack for some reason;
@@ -1580,6 +1586,10 @@ static void drop_reused_delta(struct object_entry *entry)
 */
entry->type = sha1_object_info(entry->idx.oid.hash,
   &entry->size);
+   } else {
+   if (type < 0)
+   die("BUG: invalid type %d", type);
+   entry->type = type;
}
 }
 
diff --git a/cache.h b/cache.h
index 21fbcc2414..862bdff83a 100644
--- a/cache.h
+++ b/cache.h
@@ -373,6 +373,8 @@ extern void free_name_hash(struct index_state *istate);
#define read_blob_data_from_cache(path, sz) read_blob_data_from_index(&the_index, (path), (sz))
 #endif
 
+#define TYPE_BITS 3
+
 enum object_type {
OBJ_BAD = -1,
OBJ_NONE = 0,
diff --git a/object.h b/object.h
index 87563d9056..8ce294d6ec 100644
--- a/object.h
+++ b/object.h
@@ -25,7 +25,6 @@ struct object_array {
 
 #define OBJECT_ARRAY_INIT { 0, 0, NULL }
 
-#define TYPE_BITS   3
 /*
  * object flag allocation:
  * revision.h:  0-1026
diff --git a/pack-objects.h b/pack-objects.h
index f834ead541..85b01b66da 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -60,11 +60,11 @@ struct object_entry {
void *delta_data;   /* cached delta (uncompressed) */
unsigned long delta_size;   /* delta data size (uncompressed) */
unsigned long z_delta_size; /* delta data size (compressed) */
-   enum object_type type;
-   enum object_type in_pack_type;  /* could be delta */
uint32_t hash;  /* name hint hash */
unsigned int in_pack_pos;
unsigned char in_pack_header_size;
+   unsigned type:TYPE_BITS;
+   unsigned in_pack_type:TYPE_BITS; /* could be delta */
unsigned preferred_base:1; /*
* we do not pack this, but is available
* to be used as the base object to delta
@@ -74,7 +74,7 @@ struct object_entry {
unsigned tagged:1; /* near the very tip of refs */
unsigned filled:1; /* assigned write-order */
 
-   /* XXX 28 bits hole, try to pack */
+   /* XXX 22 bits hole, try to pack */
/*
 * State flags for depth-first search used for analyzing delta cycles.
 *
@@ -87,7 +87,7 @@ struct object_entry {
DFS_DONE
} dfs_state;
int depth;
-   /* size: 136, padding: 4 */
+   /* size: 128, padding: 4 */
 };
 
 struct packing_data {
-- 
2.16.2.873.g32ff258c87



[PATCH/RFC v3 08/12] pack-objects: refer to delta objects by index instead of pointer

2018-03-08 Thread Nguyễn Thái Ngọc Duy
Notice that packing_data::nr_objects is uint32_t, so we can only
handle at most 4G objects and can address all of them with a
uint32_t. If we use a pointer here, we waste 4 bytes on 64-bit
architectures.

Convert these delta pointers to indexes. Since we need to handle NULL
pointers as well, the index is shifted by one [1].

There are holes in this struct, but this patch is already big. Struct
packing can be done separately. Even with holes, we save 8 bytes per
object_entry.

[1] This means we can only index 2^32-2 objects even though nr_objects
    could contain 2^32-1 objects. It should not be a problem in
    practice because when we grow objects[], nr_alloc would probably
    blow up long before nr_objects hits the wall.
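
The shift-by-one encoding in isolation (a simplified sketch of the
oe_delta()/oe_set_delta() helpers this patch introduces):

#include <stddef.h>
#include <stdint.h>

struct object_entry { uint32_t delta_idx; /* 0 means "no delta base" */ };

struct packing_data { struct object_entry *objects; /* ... */ };

static struct object_entry *oe_delta(const struct packing_data *pack,
				     const struct object_entry *e)
{
	/* 0 encodes NULL; anything else is the array index plus one */
	return e->delta_idx ? &pack->objects[e->delta_idx - 1] : NULL;
}

static void oe_set_delta(struct packing_data *pack, struct object_entry *e,
			 const struct object_entry *delta)
{
	e->delta_idx = delta ? (uint32_t)(delta - pack->objects) + 1 : 0;
}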

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/pack-objects.c | 116 ++---
 pack-objects.h |  71 ++---
 2 files changed, 127 insertions(+), 60 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 7df525e201..82a4a95888 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -30,6 +30,12 @@
 #include "packfile.h"
 
 #define IN_PACK(obj) oe_in_pack(_pack, obj)
+#define DELTA(obj) oe_delta(_pack, obj)
+#define DELTA_CHILD(obj) oe_delta_child(_pack, obj)
+#define DELTA_SIBLING(obj) oe_delta_sibling(_pack, obj)
+#define SET_DELTA(obj, val) oe_set_delta(_pack, obj, val)
+#define SET_DELTA_CHILD(obj, val) oe_set_delta_child(_pack, obj, val)
+#define SET_DELTA_SIBLING(obj, val) oe_set_delta_sibling(_pack, obj, val)
 
static const char *pack_usage[] = {
N_("git pack-objects --stdout [<options>...] [< <ref-list> | < <object-list>]"),
@@ -127,11 +133,11 @@ static void *get_delta(struct object_entry *entry)
buf = read_sha1_file(entry->idx.oid.hash, &type, &size);
if (!buf)
die("unable to read %s", oid_to_hex(&entry->idx.oid));
-   base_buf = read_sha1_file(entry->delta->idx.oid.hash, &type,
+   base_buf = read_sha1_file(DELTA(entry)->idx.oid.hash, &type,
   &base_size);
if (!base_buf)
die("unable to read %s",
-   oid_to_hex(&entry->delta->idx.oid));
+   oid_to_hex(&DELTA(entry)->idx.oid));
delta_buf = diff_delta(base_buf, base_size,
   buf, size, _size, 0);
if (!delta_buf || delta_size != entry->delta_size)
@@ -288,12 +294,12 @@ static unsigned long write_no_reuse_object(struct hashfile *f, struct object_ent
size = entry->delta_size;
buf = entry->delta_data;
entry->delta_data = NULL;
-   type = (allow_ofs_delta && entry->delta->idx.offset) ?
+   type = (allow_ofs_delta && DELTA(entry)->idx.offset) ?
OBJ_OFS_DELTA : OBJ_REF_DELTA;
} else {
buf = get_delta(entry);
size = entry->delta_size;
-   type = (allow_ofs_delta && entry->delta->idx.offset) ?
+   type = (allow_ofs_delta && DELTA(entry)->idx.offset) ?
OBJ_OFS_DELTA : OBJ_REF_DELTA;
}
 
@@ -317,7 +323,7 @@ static unsigned long write_no_reuse_object(struct hashfile *f, struct object_ent
 * encoding of the relative offset for the delta
 * base from this object's position in the pack.
 */
-   off_t ofs = entry->idx.offset - entry->delta->idx.offset;
+   off_t ofs = entry->idx.offset - DELTA(entry)->idx.offset;
unsigned pos = sizeof(dheader) - 1;
dheader[pos] = ofs & 127;
while (ofs >>= 7)
@@ -343,7 +349,7 @@ static unsigned long write_no_reuse_object(struct hashfile *f, struct object_ent
return 0;
}
hashwrite(f, header, hdrlen);
-   hashwrite(f, entry->delta->idx.oid.hash, 20);
+   hashwrite(f, DELTA(entry)->idx.oid.hash, 20);
hdrlen += 20;
} else {
if (limit && hdrlen + datalen + 20 >= limit) {
@@ -379,8 +385,8 @@ static off_t write_reuse_object(struct hashfile *f, struct object_entry *entry,
  dheader[MAX_PACK_OBJECT_HEADER];
unsigned hdrlen;
 
-   if (entry->delta)
-   type = (allow_ofs_delta && entry->delta->idx.offset) ?
+   if (DELTA(entry))
+   type = (allow_ofs_delta && DELTA(entry)->idx.offset) ?
OBJ_OFS_DELTA : OBJ_REF_DELTA;
hdrlen = encode_in_pack_object_header(header, sizeof(header),
  type, entry->size);
@@ -408,7 +414,7 @@ static off_t write_reuse_object(struct hashfile *f, struct object_entry *entry,
}
 
if (type == OBJ_OFS_DELTA) {
-   off_t ofs = entry->idx.offset - entry->delta->idx.offset;
+   off_t ofs = entry->idx.offset - DELTA(entry)->idx.offset;
unsigned pos = sizeof(dheader) - 1;
   

[PATCH/RFC v3 07/12] pack-objects: move in_pack out of struct object_entry

2018-03-08 Thread Nguyễn Thái Ngọc Duy
Instead of using 8 bytes (on 64-bit architectures) to store a pointer
to a pack, use an index, since the number of packs should be
relatively small.

This limits the number of packs we can handle to 16k. For now, if you
hit the 16k pack files limit, pack-objects will simply fail [1].

This technically saves 7 bytes. But we don't see any of that in
practice due to padding. The saving becomes real when we pack this
struct tighter later.

[1] The escape hatch is the .keep file: use it to keep the number of
    non-kept pack files below the 16k limit. Then you can go for
    another pack-objects run to combine another 16k pack files.
    Repeat until you're satisfied.
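
The mechanism, reduced to its core (a sketch with illustrative names;
in the quoted diff, oe_in_pack(), oe_set_in_pack() and oe_add_pack()
play these roles):

#define OE_IN_PACK_BITS 14                    /* 2^14 = 16384 packs */

struct packed_git;                            /* opaque here */

struct object_entry_sketch {
	unsigned in_pack_idx:OE_IN_PACK_BITS; /* 0 means "not in any pack" */
};

struct packing_data_sketch {
	struct packed_git *in_pack_by_idx[1 << OE_IN_PACK_BITS];
	unsigned int nr_in_pack;              /* slot 0 stays NULL */
};

/* a real oe_add_pack() would hand out the next free slot, failing
 * once all 16384 slots are taken */
static struct packed_git *oe_in_pack_sketch(const struct packing_data_sketch *pd,
					    const struct object_entry_sketch *e)
{
	return pd->in_pack_by_idx[e->in_pack_idx];
}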

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/git-pack-objects.txt |  9 ++
 builtin/pack-objects.c | 40 +++-
 cache.h|  1 +
 pack-objects.h | 49 --
 4 files changed, 83 insertions(+), 16 deletions(-)

diff --git a/Documentation/git-pack-objects.txt 
b/Documentation/git-pack-objects.txt
index 3503c9e3e6..b8d936ccf5 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -269,6 +269,15 @@ Unexpected missing object will raise an error.
locally created objects [without .promisor] and objects from the
promisor remote [with .promisor].)  This is used with partial clone.
 
+LIMITATIONS
+---
+
+This command can only handle 16384 existing pack files at a time.
+If you have more than this, you need to exclude some pack files with
+".keep" files and the --honor-pack-keep option, combine up to 16k
+pack files into one, then remove these .keep files and run
+pack-objects one more time.
+
 SEE ALSO
 
 linkgit:git-rev-list[1]
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 7bb5544883..7df525e201 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -29,6 +29,8 @@
 #include "list.h"
 #include "packfile.h"
 
+#define IN_PACK(obj) oe_in_pack(_pack, obj)
+
static const char *pack_usage[] = {
N_("git pack-objects --stdout [<options>...] [< <ref-list> | < <object-list>]"),
N_("git pack-objects [<options>...] <base-name> [< <ref-list> | < <object-list>]"),
@@ -367,7 +369,7 @@ static unsigned long write_no_reuse_object(struct hashfile 
*f, struct object_ent
 static off_t write_reuse_object(struct hashfile *f, struct object_entry *entry,
unsigned long limit, int usable_delta)
 {
-   struct packed_git *p = entry->in_pack;
+   struct packed_git *p = IN_PACK(entry);
struct pack_window *w_curs = NULL;
struct revindex_entry *revidx;
off_t offset;
@@ -478,7 +480,7 @@ static off_t write_object(struct hashfile *f,
 
if (!reuse_object)
to_reuse = 0;   /* explicit */
-   else if (!entry->in_pack)
+   else if (!IN_PACK(entry))
to_reuse = 0;   /* can't reuse what we don't have */
else if (entry->type == OBJ_REF_DELTA || entry->type == OBJ_OFS_DELTA)
/* check_object() decided it for us ... */
@@ -1024,7 +1026,7 @@ static int want_object_in_pack(const struct object_id *oid,
if (*found_pack) {
want = want_found_object(exclude, *found_pack);
if (want != -1)
-   return want;
+   goto done;
}
 
list_for_each(pos, &packed_git_mru) {
@@ -1047,11 +1049,16 @@ static int want_object_in_pack(const struct object_id *oid,
if (!exclude && want > 0)
list_move(&p->mru, &packed_git_mru);
if (want != -1)
-   return want;
+   goto done;
}
}
 
-   return 1;
+   want = 1;
+done:
+   if (want && *found_pack && !(*found_pack)->index)
+   oe_add_pack(&to_pack, *found_pack);
+
+   return want;
 }
 
 static void create_object_entry(const struct object_id *oid,
@@ -1074,7 +1081,7 @@ static void create_object_entry(const struct object_id *oid,
else
nr_result++;
if (found_pack) {
-   entry->in_pack = found_pack;
+   oe_set_in_pack(entry, found_pack);
entry->in_pack_offset = found_offset;
}
 
@@ -1399,8 +1406,8 @@ static void cleanup_preferred_base(void)
 
 static void check_object(struct object_entry *entry)
 {
-   if (entry->in_pack) {
-   struct packed_git *p = entry->in_pack;
+   if (IN_PACK(entry)) {
+   struct packed_git *p = IN_PACK(entry);
struct pack_window *w_curs = NULL;
const unsigned char *base_ref = NULL;
struct object_entry *base_entry;
@@ -1535,14 +1542,16 @@ static int pack_offset_sort(const void *_a, const void *_b)
 {
const struct object_entry *a = *(struct object_entry **)_a;
const struct object_entry *b = *(struct object_entry **)_b;
+   const struct packed_git 

Re: [RFC] Rebasing merges: a journey to the ultimate solution (Road Clear)

2018-03-08 Thread Phillip Wood
On 07/03/18 07:26, Johannes Schindelin wrote:
> Hi Buga,
> 
> On Tue, 6 Mar 2018, Igor Djordjevic wrote:
> 
>> On 06/03/2018 19:12, Johannes Schindelin wrote:
>>>
> And I guess being consistent is pretty important, too - if you add new
> content during merge rebase, it should always show up in the merge,
> period. 

 Yes, that should make it easy for the user to know what to expect from
 rebase.
>>>
>>> [...]
>>>
>>> It will be slightly inconsistent. But in a defendable way, I think.
>>
>> I like where this discussion is heading, and here`s what I thought 
>> about it :)
>>
>> [...]
>>
>> Here`s a twist - not letting `merge` trying to be too smart by 
>> figuring out whether passed arguments correspond to rewritten 
>> versions of the original merge parents (which would be too 
>> restrictive, too, I`m afraid), but just be explicit about it, instead!
> 
> That's the missing piece, I think.
> 
>> So, it could be something like:
>>
>>  merge -C deadbee 123abc:cafecafe 234bcd:bedbedbed
> 
> I like where this is heading, too, but I do not think that we can do this
> on a per-MERGE_HEAD basis. The vast majority of merge commits, in
> practice, have two parents. So the `merge` command would actually only
> have one revision to merge (because HEAD is the implicit first parent). So
> that is easy.
> 
> But as soon as you go octopus, you can either perform an octopus merge, or
> rebase the original merge commit. You cannot really mix and match here.
> 
> Unless we reimplement the octopus merge (which works quite a bit
> differently from the "rebase merge commit" strategy, even if it is
> incremental, too), which has its own challenges: if there are merge
> conflicts before merging the last MERGE_HEAD, the octopus merge will exit
> with status 2, telling you "Should not be doing an octopus.". While we
> will want to keep merge conflict markers and continue with the "rebase the
> original merge commit" strategy.
> 
> And it would slam the door shut for adding support for *other* merge
> strategies to perform a more-than-two-parents merge.
> 
> Also, I do not think that it makes a whole lot of sense in practice to let
> users edit what will be used for "original parent". If the user wants to
> do complicated stuff, they can already do that, via `exec`. The `merge`
> command really should be about facilitating common workflows, guiding the
> user to what is sane.
> 
> Currently my favorite idea is to introduce a new flag: -R (for "rebase the
> original merge commit"). It would look like this:
> 
>   merge -R -C <original-merge-commit> <merge-head>  # <oneline>
>
> This flag would of course trigger the consistency check (does the number
> of parents of the original merge commit agree with the parameter list? Was
> an original merge commit specified to begin with?), and it would not fall
> back to the recursive merge, but error out if that check failed.
> 
> Side note: I wonder whether we really need to perform the additional check
> that ensures that the  refers to the rewritten version of the
> original merge commit's parent.
> 
> Second side note: if we can fast-forward, currently we prefer that, and I
> think we should keep that behavior with -R, too.

I think that would be a good idea to avoid unpleasant surprises.

> If the user wants to force a new merge, they simply remove that -R flag.
> 
> What do you think?

I did wonder about using 'pick ' for rebasing merges and
keeping 'merge ...' for recreating them but I'm not sure if that is a
good idea. It has the advantage that the user cannot specify the wrong
parents for the merge to be rebased as 'git rebase' would work out if
the parents have been rebased, but maybe it's a bit magical to use pick
for merge commits. Also there isn't such a simple way for the user to go
from 'rebase this merge' to 'recreate this merge' as they'd have to
write the whole merge line themselves (though I guess something like
emacs' git-rebase.el would be able to help with that)

Best Wishes

Phillip


> Ciao,
> Dscho
> 



Re: [RFC] Rebasing merges: a journey to the ultimate solution (Road Clear)

2018-03-08 Thread Phillip Wood
On 06/03/18 18:12, Johannes Schindelin wrote:
> Hi Phillip,
> 
> On Tue, 6 Mar 2018, Phillip Wood wrote:
> 
>> On 03/03/18 00:29, Igor Djordjevic wrote:
>>>
>>> On 02/03/2018 12:31, Phillip Wood wrote:

> Thinking about it overnight, I now suspect that original proposal
> had a mistake in the final merge step. I think that what you did is
> a way to fix it, and I want to try to figure what exactly was wrong
> in the original proposal and to find simpler way of doing it right.
>
> The likely solution is to use original UM as a merge-base for final
> 3-way merge of U1' and U2', but I'm not sure yet. Sounds pretty
> natural though, as that's exactly UM from which both U1' and U2'
> have diverged due to rebasing and other history editing.

 Hi Sergey, I've been following this discussion from the sidelines,
 though I haven't had time to study all the posts in this thread in
 detail. I wonder if it would be helpful to think of rebasing a merge
 as merging the changes in the parents due to the rebase back into the
 original merge. So for a merge M with parents A B C that are rebased
 to A' B' C' the rebased merge M' would be constructed by (ignoring
 shell quoting issues)

 git checkout --detach M
 git merge-recursive A -- M A'
 tree=$(git write-tree)
 git merge-recursive B -- $tree B'
 tree=$(git write-tree)
 git merge-recursive C -- $tree C'
 tree=$(git write-tree)
 M'=$(git log --pretty=%B -1 M | git commit-tree -pA' -pB' -pC')

 This should pull in all the changes from the parents while preserving
 any evil conflict resolution in the original merge. It superficially
 reminds me of incremental merging [1] but it's so long since I looked at
 that I'm not sure if there are any significant similarities.

 [1] https://github.com/mhagger/git-imerge
>>>
>>> Interesting, from quick test[3], this seems to produce the same 
>>> result as that other test I previously provided[2], where temporary 
>>> commits U1' and U2' are finally merged with original M as a base :)
>>>
>>> Just that this looks like even more straight-forward approach...?
>>>
>>> The only thing I wonder of here is how would we check if the 
>>> "rebased" merge M' was "clean", or should we stop for user amendment? 
>>> With that other approach Sergey described, we have U1'==U2' to test with.
>>
>> I think (though I haven't rigorously proved to myself) that in the
>> absence of conflicts this scheme has well defined semantics (the merges
>> can be commuted), so the result should be predicable from the users
>> point of view so maybe it could just offer an option to stop.
> 
> I am not so sure that the result is independent of the order of the
> merges. In other words, I am not necessarily certain that it is impossible
> to concoct A,A',B,B' commits where merging B'/B before A'/A has a
> different result than merging A'/A before B'/B.
> 
> Remember, when constructing counter-examples to hypotheses, those
> counter-examples do not really *have* to make sense on their own. For
> example, A' could introduce *completely different* changes from A, and the
> same is true for B' and B.
> 
> I could imagine, for example, that using a ton of consecutive empty lines,
> and using patches that insert something into these empty lines (and are
> thusly inherently ambiguous when said set of empty lines has changed),
> could even introduce a merge conflict in one order, but no conflict in the
> other.

Yes I should have thought of that given I've just been working on making
'add -p' more robust when there are lots of identical lines.

> Even so, I think that merging in the order of the parents makes the most
> sense, and that using that strategy makes sense, too, because you really
> have to try hard to make it fail.
> 
> Ciao,
> Dscho
> 



Re: [PATCH v2 2/3] add -p: allow line selection to be inverted

2018-03-08 Thread Phillip Wood
On 06/03/18 19:57, Junio C Hamano wrote:
> Phillip Wood  writes:
> 
>> From: Phillip Wood 
>>
>> If the list of lines to be selected begins with '^' select all the
>> lines except the ones listed.
> 
> There is "# Input that begins with '-'; unchoose" in list_and_choose
> helper.  Does it make things inconsistent to use '^' for negation
> like this patch does with it?
> 
Hmm yes, I think it probably does (I've just checked and git clean also
uses '-' for de-selection). I think I'll remove support for open-ended
ranges on the left side (it's not so hard to type '1-n' instead of '-n')
and use a leading '-' for inversion. I'm tempted to keep supporting 'n-'
to mean everything from 'n' to the last line though.

Best Wishes

Phillip


[PATCH] userdiff.c: Add C# async keyword in diff pattern

2018-03-08 Thread Thomas Levesque
Currently C# async methods are not shown in diff hunk headers. I just
added the async keyword to the csharp method pattern so that they are
properly detected.

Signed-off-by: Thomas Levesque 
---
 userdiff.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/userdiff.c b/userdiff.c
index dbfb4e13cddce..b92caf42b27be 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -138,7 +138,7 @@ PATTERNS("csharp",
 /* Keywords */
 "!^[ \t]*(do|while|for|if|else|instanceof|new|return|switch|case|throw|catch|using)\n"
 /* Methods and constructors */
-"^[ \t]*(((static|public|internal|private|protected|new|virtual|sealed|override|unsafe)[ \t]+)*[][<>@.~_[:alnum:]]+[ \t]+[<>@._[:alnum:]]+[ \t]*\\(.*\\))[ \t]*$\n"
+"^[ \t]*(((static|public|internal|private|protected|new|virtual|sealed|override|unsafe|async)[ \t]+)*[][<>@.~_[:alnum:]]+[ \t]+[<>@._[:alnum:]]+[ \t]*\\(.*\\))[ \t]*$\n"
 /* Properties */
 "^[ \t]*(((static|public|internal|private|protected|new|virtual|sealed|override|unsafe)[ \t]+)*[][<>@.~_[:alnum:]]+[ \t]+[@._[:alnum:]]+)[ \t]*$\n"
 /* Type definitions */

--
https://github.com/git/git/pull/464


Re: [PATCH] userdiff.c: Add C# async keyword in diff pattern

2018-03-08 Thread Eric Sunshine
On Thu, Mar 8, 2018 at 5:04 AM, Thomas Levesque
 wrote:
> Currently C# async methods are not shown in diff hunk headers. I just
> added the async keyword to the csharp method pattern so that they are
> properly detected.

Thanks for the contribution. Please sign-off your patch (see
SubmittingPatches[1]).

[1]: https://github.com/git/git/blob/master/Documentation/SubmittingPatches#L278

> ---
> diff --git a/userdiff.c b/userdiff.c
> index dbfb4e13cddce..b92caf42b27be 100644
> --- a/userdiff.c
> +++ b/userdiff.c
> @@ -138,7 +138,7 @@ PATTERNS("csharp",
>  /* Keywords */
>  "!^[ \t]*(do|while|for|if|else|instanceof|new|return|switch|case|throw|catch|using)\n"
>  /* Methods and constructors */
> -"^[ \t]*(((static|public|internal|private|protected|new|virtual|sealed|override|unsafe)[ \t]+)*[][<>@.~_[:alnum:]]+[ \t]+[<>@._[:alnum:]]+[ \t]*\\(.*\\))[ \t]*$\n"
> +"^[ \t]*(((static|public|internal|private|protected|new|virtual|sealed|override|unsafe|async)[ \t]+)*[][<>@.~_[:alnum:]]+[ \t]+[<>@._[:alnum:]]+[ \t]*\\(.*\\))[ \t]*$\n"
>  /* Properties */
>  "^[ \t]*(((static|public|internal|private|protected|new|virtual|sealed|override|unsafe)[ \t]+)*[][<>@.~_[:alnum:]]+[ \t]+[@._[:alnum:]]+)[ \t]*$\n"
>  /* Type definitions */


Re: How to use filter-branch with --state-branch?

2018-03-08 Thread Ian Campbell
On Thu, 2018-03-08 at 10:25 +0100, Ævar Arnfjörð Bjarmason wrote:

> > The first filter-branch call required 7168 steps, so did the second call...
> > I also tried without the --prune option of remote update (I had to add
> > --force to the second filter-branch), but nothing changed.

You can see an example of the usage in:

https://git.kernel.org/pub/scm/linux/kernel/git/devicetree/devicetree-rebasing.git/

in the `scripts/` sub dir (the flow is `cronjob` → `filter.sh` → `git
filter-branch...`).

I think the big difference is rather than `--all` you need to give it
the `previous..now` range since that is the update you wish to do
(first time around you just give it `now`).

The devicetree-rebasing scripting arranges that by keeping the previous
in a separate branch.

Ian.


[PATCH] userdiff.c: Add C# async keyword in diff pattern

2018-03-08 Thread Thomas Levesque
Currently C# async methods are not shown in diff hunk headers. I just
added the async keyword to the csharp method pattern so that they are
properly detected.
---
 userdiff.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/userdiff.c b/userdiff.c
index dbfb4e13cddce..b92caf42b27be 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -138,7 +138,7 @@ PATTERNS("csharp",
 /* Keywords */
 "!^[ \t]*(do|while|for|if|else|instanceof|new|return|switch|case|throw|catch|using)\n"
 /* Methods and constructors */
-"^[ \t]*(((static|public|internal|private|protected|new|virtual|sealed|override|unsafe)[ \t]+)*[][<>@.~_[:alnum:]]+[ \t]+[<>@._[:alnum:]]+[ \t]*\\(.*\\))[ \t]*$\n"
+"^[ \t]*(((static|public|internal|private|protected|new|virtual|sealed|override|unsafe|async)[ \t]+)*[][<>@.~_[:alnum:]]+[ \t]+[<>@._[:alnum:]]+[ \t]*\\(.*\\))[ \t]*$\n"
 /* Properties */
 "^[ \t]*(((static|public|internal|private|protected|new|virtual|sealed|override|unsafe)[ \t]+)*[][<>@.~_[:alnum:]]+[ \t]+[@._[:alnum:]]+)[ \t]*$\n"
 /* Type definitions */

--
https://github.com/git/git/pull/464


[PATCH] git{,-blame}.el: remove old bitrotting Emacs code

2018-03-08 Thread Ævar Arnfjörð Bjarmason
The git-blame.el mode has been superseded by Emacs's own
vc-annotate (invoked by C-x v g). Users of the git.el mode are now
much better off using either Magit or the Git backend for Emacs's own
VC mode.

These modes were added over 10 years ago when Emacs's own Git support
was much less mature, and there weren't other mature modes in the wild
or shipped with Emacs itself.

These days these modes have very few if any users, and users of git aren't
well served by us shipping these (some OSes install them alongside git
by default, which is confusing and leads users astray).

So let's remove these per Alexandre Julliard's message to the
ML[1]. If someone still wants these for some reason they're better
served by hosting these elsewhere (e.g. on ELPA), instead of us
distributing them with git.

1. "Re: [PATCH] git.el: handle default excludesfile
   properly" (87muzlwhb0@winehq.org) --
   https://public-inbox.org/git/87muzlwhb0@winehq.org/

Signed-off-by: Ævar Arnfjörð Bjarmason 
---

On Tue, Mar 06 2018, Alexandre Julliard jotted:

> Junio C Hamano  writes:
>
>> Having said that, I am sorry to say that I am not sure if the copy
>> we have is the one to be patched, so I would appreciate if Alexandre
>> (cc'ed) can clarify the situation.  The last change done to our copy
>> of the script is from 2012, and I do not know if Alexandre is still
>> taking care of it here but the script is so perfect that there was
>> no need to update it for the past 5 years and we haven't seen an
>> update, or the canonical copy is now being maintained elsewhere and
>> we only have a stale copy, or what.
>
> This is the canonical version, and I guess in theory I'm still taking
> care of it. However, the need that git.el was originally addressing is
> now fulfilled by better tools. As such, I feel that it has outlived its
> usefulness, and I'm no longer actively developing it.
>
> I'd recommend that anybody still using it switch to Magit, which is
> being actively maintained, and IMO superior to git.el in all respects.

I think at this point it's best to remove both of these modes from
being distributed with Git, per this patch.

 contrib/emacs/.gitignore   |    1 -
 contrib/emacs/Makefile     |   21 -
 contrib/emacs/README       |   39 -
 contrib/emacs/git-blame.el |  483 -
 contrib/emacs/git.el       | 1704 -
 5 files changed, 2248 deletions(-)
 delete mode 100644 contrib/emacs/.gitignore
 delete mode 100644 contrib/emacs/Makefile
 delete mode 100644 contrib/emacs/README
 delete mode 100644 contrib/emacs/git-blame.el
 delete mode 100644 contrib/emacs/git.el

diff --git a/contrib/emacs/.gitignore b/contrib/emacs/.gitignore
deleted file mode 100644
index c531d9867f..00
--- a/contrib/emacs/.gitignore
+++ /dev/null
@@ -1 +0,0 @@
-*.elc
diff --git a/contrib/emacs/Makefile b/contrib/emacs/Makefile
deleted file mode 100644
index 24d9312941..00
--- a/contrib/emacs/Makefile
+++ /dev/null
@@ -1,21 +0,0 @@
-## Build and install stuff
-
-EMACS = emacs
-
-ELC = git.elc git-blame.elc
-INSTALL ?= install
-INSTALL_ELC = $(INSTALL) -m 644
-prefix ?= $(HOME)
-emacsdir = $(prefix)/share/emacs/site-lisp
-RM ?= rm -f
-
-all: $(ELC)
-
-install: all
-   $(INSTALL) -d $(DESTDIR)$(emacsdir)
-   $(INSTALL_ELC) $(ELC:.elc=.el) $(ELC) $(DESTDIR)$(emacsdir)
-
-%.elc: %.el
-   $(EMACS) -batch -f batch-byte-compile $<
-
-clean:; $(RM) $(ELC)
diff --git a/contrib/emacs/README b/contrib/emacs/README
deleted file mode 100644
index 82368bdbff..00
--- a/contrib/emacs/README
+++ /dev/null
@@ -1,39 +0,0 @@
-This directory contains various modules for Emacs support.
-
-To make the modules available to Emacs, you should add this directory
-to your load-path, and then require the modules you want. This can be
-done by adding to your .emacs something like this:
-
-  (add-to-list 'load-path ".../git/contrib/emacs")
-  (require 'git)
-  (require 'git-blame)
-
-
-The following modules are available:
-
-* git.el:
-
-  Status manager that displays the state of all the files of the
-  project, and provides easy access to the most frequently used git
-  commands. The user interface is as far as possible compatible with
-  the pcl-cvs mode. It can be started with `M-x git-status'.
-
-* git-blame.el:
-
-  Emacs implementation of incremental git-blame.  When you turn it on
-  while viewing a file, the editor buffer will be updated by setting
-  the background of individual lines to a color that reflects which
-  commit it comes from.  And when you move around the buffer, a
-  one-line summary will be shown in the echo area.
-
-* vc-git.el:
-
-  This file used to contain the VC-mode backend for git, but it is no
-  longer distributed with git. It is now maintained as part of Emacs
-  and included in standard Emacs distributions starting from version
-  22.2.
-
-  If you have an earlier Emacs version, upgrading to Emacs 22 is
-  recommended, since the VC 

Re: [PATCH] git.el: handle default excludesfile properly

2018-03-08 Thread Ævar Arnfjörð Bjarmason

On Wed, Mar 07 2018, Dorab Patel jotted:

> Thanks for updating us with the status of git.el.
>
> The last time I looked at magit, it was too heavyweight for my needs.
> I like the simplicity of git.el. But perhaps it's time for me to take
> another look at magit.

You can also check out the VC mode that ships with Emacs itself. I
prefer Magit, but the VC mode is more lightweight.

> On Tue, Mar 6, 2018 at 3:54 AM, Alexandre Julliard  
> wrote:
>> Junio C Hamano  writes:
>>
>>> Having said that, I am sorry to say that I am not sure if the copy
>>> we have is the one to be patched, so I would appreciate if Alexandre
>>> (cc'ed) can clarify the situation.  The last change done to our copy
>>> of the script is from 2012, and I do not know if Alexandre is still
>>> taking care of it here but the script is so perfect that there was
>>> no need to update it for the past 5 years and we haven't seen an
>> update, or the canonical copy is now being maintained elsewhere and
>>> we only have a stale copy, or what.
>>
>> This is the canonical version, and I guess in theory I'm still taking
>> care of it. However, the need that git.el was originally addressing is
>> now fulfilled by better tools. As such, I feel that it has outlived its
>> usefulness, and I'm no longer actively developing it.
>>
>> I'd recommend that anybody still using it switch to Magit, which is
>> being actively maintained, and IMO superior to git.el in all respects.
>>
>> --
>> Alexandre Julliard
>> julli...@winehq.org


Re: How to use filter-branch with --state-branch?

2018-03-08 Thread Ævar Arnfjörð Bjarmason

On Tue, Mar 06 2018, Michele Locati jotted:

> Recent versions of the git filter-branch command introduced the
> --state-branch option, but I can't find any info about how it can
> actually be used.
>
> We have this repository on github:
> https://github.com/concrete5/concrete5
>
> When someone pushes to that repo, we clone it and execute
> `git filter-branch --subdirectory-filter concrete`
> to extract the concrete directory, and we push the result to
> https://github.com/concrete5/concrete5-core
> (including all the branches and tags)
>
> The script at the moment is this one:
> https://github.com/concrete5/core_splitter/blob/70879e676b95160f7fc5d0ffc22b8f7420b0580b/bin/splitcore
>
> I tried to use the --state-branch option on a local mirror, so that we could
> do an incremental filtering. Here's the script:
>
> # Executed just one time
> git clone --no-checkout --mirror \
>https://github.com/concrete5/concrete5.git work
> cd work
> git filter-branch \
>--subdirectory-filter concrete \
>--tag-name-filter cat \
>--prune-empty \
>--state-branch FILTERBRANCH_STATE \
>-- --all
> # Executed every time the repo is updated
> git remote update --prune
> git filter-branch \
>--subdirectory-filter concrete \
>--tag-name-filter cat \
>--prune-empty \
>--state-branch FILTERBRANCH_STATE \
>-- --all
>
> The first filter-branch call required 7168 steps, so did the second call...
> I also tried without the --prune option of remote update (I had to add
> --force to the second filter-branch), but nothing changed.

CC-ing the author of that feature. Usually I'd just look at how the
tests for it work to answer your question, but I see this new feature
made it in recently with no tests for it, which doesn't make me very
happy :(
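
Even a minimal test would help here; a sketch only, not a polished
patch (the state branch name is arbitrary):

    test_expect_success 'filter-branch --state-branch creates state' '
        git filter-branch --state-branch filter-state --msg-filter cat HEAD &&
        git rev-parse --verify filter-state
    '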


Re: What's cooking in git.git (Mar 2018, #02; Tue, 6)

2018-03-08 Thread Ævar Arnfjörð Bjarmason

On Wed, Mar 07 2018, Johannes Schindelin jotted:

> Hi Dan,
>
> On Tue, 6 Mar 2018, Junio C Hamano wrote:
>
>> * dj/runtime-prefix (2017-12-05) 4 commits
>>  . exec_cmd: RUNTIME_PREFIX on some POSIX systems
>>  . Makefile: add Perl runtime prefix support
>>  . Makefile: add support for "perllibdir"
>>  . Makefile: generate Perl header from template file
>>
>>  A build-time option has been added to allow Git to be told to refer
>>  to its associated files relative to the main binary, in the same
>>  way that has been possible on Windows for quite some time, for
>>  Linux, BSDs and Darwin.
>>
>>  Perhaps it is about time to reboot the effort?
>
> You probably missed this in the huge "What's cooking" mail. Are you game?

It would be great to have this rebooted now that my perl cleanup efforts
have unblocked it. I'll be happy to help review & test the next
iteration.


Shaye

2018-03-08 Thread Shaye Lynne Haver
I am Shaye Lynne Haver,
from the United States.
I wish to communicate
with you. Sincerely,
Shaye


Re: [RFC] Rebasing merges: a journey to the ultimate solution (Road Clear)

2018-03-08 Thread Junio C Hamano
Johannes Schindelin  writes:

>> Non-textual semantic conflicts are made (in the best case just once)
>> as a separate commit on top of mechanical auto-merge whose focus is
>> predictability (rather than cleverness) done by Git, and then that
>> separate commit is kept outside the history.  When replaying these
>> merges to rebuild the 'pu' branch, after resetting the tip to
>> 'master', each topic is merged mechanically, and if such a fix-up
>> commit is present, "cherry-pick --no-commit" applies it and then
>> "commit --amend --no-edit" to adjust the merge.  I find it quite
>> valuable to have a separate record of what "evil" non-mechanical
>> adjustment was done, which I know won't be lost in the noise when
>> these merges need to be redone daily or more often.
>
> So essentially, you have something that `git rerere` would have learned,
> but as a commit?

You probably wouldn't be asking that if you read what you cut out
when you quoted above ;-) 

There is a collection of cherry-pickable commits in a hierarchy under
refs/merge-fix.  They are indexed by the branch that will cause
semantic conflicts that do not involve textual conflicts at all (the
important implication of which is that 'rerere' fundamentally will
not trigger to help resolve them) [*1*], and they are used to create
the evil merge when a corresponding branch is merged to 'pu' (and down).
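
Concretely, when rebuilding 'pu', a merge with a recorded fix-up gets
redone more or less like this (a sketch; the topic name is
illustrative):

    git checkout pu
    git merge --no-edit xx/topic
    # a fix-up commit was recorded for this topic earlier, so
    # apply it and fold it into the merge commit itself
    git cherry-pick --no-commit refs/merge-fix/xx/topic
    git commit --amend --no-edit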

[Footnote]

*1* One topic adds an extra parameter to read_index_from(), which has
been and still is defined in a single file, and that topic is merged
to 'pu' first, while another topic adds a new callsite for the same
function in a file that the former topic does not touch.  Hence a
merge of the latter topic has no textual conflict in the file with
the new callsite, but it still needs adjusting.  That sort of thing.


Re: [RFC] Rebasing merges: a journey to the ultimate solution (Road Clear)

2018-03-08 Thread Junio C Hamano
Johannes Schindelin  writes:

>> If we are talking about a drastic change, a few more days may not be
>> sufficient, but we are not in a hurry, as this already sounds like a
>> 2.18 material anyway.
>
> It is not at all a drastic change. It will actually make the current patch
> series better (simplifying the "can we fast-forward?" check).
>
> I just want to make sure that I already have Phillip's strategy working,
> but it will be yet another topic branch on top of the topic branch that
> will add support for octopus merges *after* the current --recreate-merges
> topic branch ;-)

Oh, if the "not redoing the merge afresh, but attempt to reuse the
previous merge" that was discussed is going to be done as an
update/addition to the "redo the merge afresh" you already had in
production forever (and I had in 'pu' for quite a while in various
polished-ness during iteration), then I do prefer merging down what
has already proven to be 'next' worthy without waiting for the
discussion and your local verification of Phillip's new thing,
especially given that you'll be giving an explicit control to the
users which variant of "merge" insn will be used and the addition
of the Phillip's thing won't be a backward-compatibility issue when
it comes later.

>> As you made it clear that it is OK not to merge the current one for now,
>> my objective of asking the question is already satisfied ;-)
>
> Depending how much GitMerge will occupy my time, I hope to have something
> polished by tomorrow.

My "for now" above was just for the coming few days.  Don't rush
things, and use valuable in-person time wisely, and have fun.

Thanks.