date:20180724

[no subject]

2018-07-24 Thread Sumitomo Rubber Industries





--
We have been trying to reach you.

Re: [RFC PATCH v2] Add 'human' date format

2018-07-24 Thread Linus Torvalds

On Tue, Jul 24, 2018 at 2:49 PM Junio C Hamano  wrote:
>
> But lack of TZ does not give me enough hint about which content it
> happened.  The fact that this was done late at night on weekend is
> indeed interesting, and I may not care what time it locally was for
> me, so perhaps this is an intended behaviour.

I'm not sure you could call it "intended". The TZ hiding did change as
I was playing with this.

The first version of the patch only hid the timezone if it matched the
current one.

Because, as you say, the time zone can be interesting not so much
because you care about *when* the commit happened, but you care about
*where* the commit happened. And that can be true even if the commit
is very old.

So the timezone data in some sense isn't necessarily about the date at all.

When I used it a bit more (I still have the "--date=auto" as my
default for the kernel), I decided I really don't much care about the
timezone. In any _individual_ case, the timezone looks fine, but when
you look at many different dates, it looks really odd how it sometimes
shows, and sometimes does not, particularly for old dates when it
really doesn't matter for the *time* reason.

So I decided that it's better to not show the timezone at all when you
show a real date.

But honestly, I don't claim to have a really strong argument. It's
just a choice. Nothing says it's the absolute right choice.

I pointed out that you can use "--date=human-local" to get an even
denser representation that gives you the human date without ever
having a TZ. But we don't have the reverse of "-local", which would
explicitly show the timezones.

Again, I think this is really because the timezone is about something
other than just the time. I think the "do we care *where* it was done
or not?" question in many ways is entirely independent of the time
question.

So right now the patch says

hide.tz |= !hide.date;

which ends up being good for the "times are roughly the same size"
(which I decided was a good thing - again, I don't really have a
hugely strong argument for it, it was a matter of me playing with
options).

But it would make equally much sense to say

hide.tz |= hide.time;

and just say that the timezone is hidden if it matches the current
one, or if the commit is just so old that we don't show the time at
all.

OR you could just say "timezone is always interesting, because you
want to know _where_ it was done regardless of _when_ it was done",
and just not hide the timezone at all.

I think all are "technically valid" choices to make. The one I made
was just a random personal preference, not necessarily the right one.

Could we extend on the "decorations" (like the "-local" thing)?
Absolutely.  I'm not sure it's worth doing, but it would certainly
solve the "different people have different preferences" issue.

I think that *if* we want to extend on the decorations, that would
probably still be a separate patch from the basic patch.

   Linus

Re: [GSoC] GSoC with git, week 12

2018-07-24 Thread Paul-Sebastian Ungureanu


Hello,

On 24.07.2018 21:55, Alban Gruin wrote:

Hi,

I published a new blog post here:

 https://blog.pa1ch.fr/posts/2018/07/24/en/gsoc2018-week12.html

Cheers,
Alban



Great!

Mine is also up:

https://ungps.github.io/2018/07/22/week-12/

Best,
Paul

Re: [PATCH] diff: --color-moved: rename "dimmed_zebra" to "dimmed-zebra"

2018-07-24 Thread Stefan Beller

On Tue, Jul 24, 2018 at 3:09 PM Jonathan Nieder  wrote:
> Hm.  Looks like dimmed_zebra was introduced in v2.15.0-rc0~16^2~2
> (2017-06-30), so it has been around for a while (about a year).  But I
> would like to be able to simplify by getting rid of it.
>
> https://codesearch.debian.net/search?q=dimmed_zebra doesn't find any
> instances of it being used outside Git itself.
>
> https://twitter.com/kornelski/status/982247477020508162 (found with a
> web search) shows that some people may have it in configuration.

Thanks for the searches.

>
> I don't have any good ideas about deprecating more aggressively, and
> the patch looks good, so
>
> Reviewed-by: Jonathan Nieder 

Same here. I wonder if we actually ever want to deprecate it further than
this patch, as the additional code to support it is only 2 lines.
So the effort to deprecate it (more than this patch) is more than
keeping the misnamed version around forever.

Thanks!
Stefan

Re: [PATCH] pack-protocol: mention and point to docs for protocol v2

2018-07-24 Thread Jonathan Nieder

Brandon Williams wrote:

> If so I suggest we move away from the term "pack" protocol.  Mostly
> because maybe at some future date we don't only want to communicate to
> transfer packs.  So at the risk of bikeshedding (and because naming is
> hard) I think we should begin talking about the over the wire protocol
> as just that, the "wire protocol" or if we need to be more explicit the
> "git wire protocol". Thoughts?

Sounds fine to me.

You can call it Documentation/technical/git-protocol.txt,
since from the context it's clear that this is going over the
wire.

The main point of what Junio said is that it means the docs should
treat "git upload-archive" instead same way as "git upload-pack" and
"git receive-pack", instead of artifically separating the archive
file-oriented and the pack-oriented parts of the protocol.

Thanks,
Jonathan

Re: [PATCH 01/10] t: add tool to translate hash-related values

2018-07-24 Thread brian m. carlson

On Mon, Jun 11, 2018 at 03:47:43AM -0400, Eric Sunshine wrote:
> Here's what I had envisioned when reading your emails about OID lookup
> table functionality:
> 
> --- >8 ---
> test_detect_hash () {
> test_hash_algo=...
> }
> 
> test_oid_cache () {
> while read tag rest
> do
> case $tag in \#*) continue ;; esac
> 
> for x in $rest
> do
> k=${x%:*}
> v=${x#*:}
> if test "$k" = $test_hash_algo
> then
> eval "test_oid_$tag=$v"
> break
> fi
> done
> done
> }
> 
> test_oid () {
> if test $# -gt 1
> then
> test_oid_cache <<-EOF
> $*
> EOF
> fi
> eval "echo \$test_oid_$1"
> }
> --- >8 ---

I'd like to use this as a basis for my v2, but I would need your
sign-off in order to do that.  Would that be okay?
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature

Re: Hash algorithm analysis

2018-07-24 Thread brian m. carlson

On Tue, Jul 24, 2018 at 02:13:07PM -0700, Junio C Hamano wrote:
> Yup.  I actually was leaning toward saying "all of them are OK in
> practice, so the person who is actually spear-heading the work gets
> to choose", but if we picked SHA-256 now, that would not be a choice
> that Brian has to later justify for choosing against everybody
> else's wishes, which makes it the best choice ;-)

This looks like a rough consensus.  And fortunately, I was going to pick
SHA-256 and implemented it over the weekend.

Things I thought about in this regard:
* When you compare against SHA1DC, most vectorized SHA-256
  implementations are indeed faster, even without acceleration.
* If we're doing signatures with OpenPGP (or even, I suppose, CMS),
  we're going to be using SHA-2, so it doesn't make sense to have our
  security depend on two separate algorithms when either one of them
  alone could break the security when we could just depend on one.

I'll be sending out some patches, probably in a few days, with SHA-256
and some test fixes.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

signature.asc
Description: PGP signature

Re: [PATCH] diff: --color-moved: rename "dimmed_zebra" to "dimmed-zebra"

2018-07-24 Thread Jonathan Nieder

Eric Sunshine wrote:

> Subject: diff: --color-moved: rename "dimmed_zebra" to "dimmed-zebra"

It would be clearer to say something like

diff --color-moved: add "dimmed-zebra" synonym for "dimmed_zebra"

> The --color-moved "dimmed_zebra" mode (with an underscore) is an
> anachronism. Most options and modes are hyphenated. It is more difficult
> to type and somewhat more difficult to read than those which are
> hyphenated. Therefore, rename it to "dimmed-zebra", and nominally
> deprecate "dimmed_zebra".

Hm.  Looks like dimmed_zebra was introduced in v2.15.0-rc0~16^2~2
(2017-06-30), so it has been around for a while (about a year).  But I
would like to be able to simplify by getting rid of it.

https://codesearch.debian.net/search?q=dimmed_zebra doesn't find any
instances of it being used outside Git itself.

https://twitter.com/kornelski/status/982247477020508162 (found with a
web search) shows that some people may have it in configuration.

Is there anything we can do to make it possible to remove eventually?
For example, should we (eventually, after dimmed-zebra has existed
for some time) start warning when people use dimmed_zebra to encourage
them to use dimmed-zebra instead?

> Signed-off-by: Eric Sunshine 
> ---
>  Documentation/diff-options.txt | 3 ++-
>  diff.c | 4 +++-
>  t/t4015-diff-whitespace.sh | 4 ++--
>  3 files changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> index 41064909ee..9195bcd398 100644
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -282,10 +282,11 @@ zebra::
>   painted using either the 'color.diff.{old,new}Moved' color or
>   'color.diff.{old,new}MovedAlternative'. The change between
>   the two colors indicates that a new block was detected.
> -dimmed_zebra::
> +dimmed-zebra::
>   Similar to 'zebra', but additional dimming of uninteresting parts
>   of moved code is performed. The bordering lines of two adjacent
>   blocks are considered interesting, the rest is uninteresting.
> + `dimmed_zebra` is a deprecated synonym.

Thanks for including that note.  It means that when people see
dimmed_zebra in scripts or configuration they won't have to be
mystified about why it works.

I don't have any good ideas about deprecating more aggressively, and
the patch looks good, so

Reviewed-by: Jonathan Nieder 

Thanks.

Re: [PATCH 2/2] remote-odb: un-inline function remote_odb_reinit

2018-07-24 Thread Beat Bolli

Hi Jonathan

On 24.07.18 23:59, Jonathan Nieder wrote:
> Hi,
> 
> Beat Bolli wrote:
> 
>> When compiling under Apple LLVM version 9.1.0 (clang-902.0.39.2) with
>> "make DEVELOPER=1 DEVOPTS=pedantic", the compiler says
>>
>> remote-odb.c:87:2: error: static function 'remote_odb_do_init' is
>> used in an inline function with external linkage
>> [-Werror,-Wstatic-in-inline]
>>
>> Remove the inline specifier that would only make sense if
>> remote_odb_reinit were defined in the header file. Moving it into the
>> header is not possible because the called function remote_odb_do_init is
>> static and thus not visible from the includers of the header.
>>
>> Signed-off-by: Beat Bolli 
>> ---
>>  remote-odb.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> What branch does this apply to?

To pu as of today.

> [...]
>> --- a/remote-odb.c
>> +++ b/remote-odb.c
>> @@ -82,7 +82,7 @@ static inline void remote_odb_init(void)
>>  remote_odb_do_init(0);
>>  }
>>  
>> -inline void remote_odb_reinit(void)
>> +void remote_odb_reinit(void)
> 
> This looks like an oversight in
> https://public-inbox.org/git/20180713174959.16748-6-chrisc...@tuxfamily.org/:
> there isn't any reason for this function to be inline.
> 
> Christian, can you squash it in on your next reroll?

That would probably make sense. I didn't check how mature the topics
were that caused errors.

Beat

Re: [PATCH 2/2] remote-odb: un-inline function remote_odb_reinit

2018-07-24 Thread Jonathan Nieder

Hi,

Beat Bolli wrote:

> When compiling under Apple LLVM version 9.1.0 (clang-902.0.39.2) with
> "make DEVELOPER=1 DEVOPTS=pedantic", the compiler says
>
> remote-odb.c:87:2: error: static function 'remote_odb_do_init' is
> used in an inline function with external linkage
> [-Werror,-Wstatic-in-inline]
>
> Remove the inline specifier that would only make sense if
> remote_odb_reinit were defined in the header file. Moving it into the
> header is not possible because the called function remote_odb_do_init is
> static and thus not visible from the includers of the header.
>
> Signed-off-by: Beat Bolli 
> ---
>  remote-odb.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

What branch does this apply to?

[...]
> --- a/remote-odb.c
> +++ b/remote-odb.c
> @@ -82,7 +82,7 @@ static inline void remote_odb_init(void)
>   remote_odb_do_init(0);
>  }
>  
> -inline void remote_odb_reinit(void)
> +void remote_odb_reinit(void)

This looks like an oversight in
https://public-inbox.org/git/20180713174959.16748-6-chrisc...@tuxfamily.org/:
there isn't any reason for this function to be inline.

Christian, can you squash it in on your next reroll?

Thanks,
Jonathan

[PATCH] diff: --color-moved: rename "dimmed_zebra" to "dimmed-zebra"

2018-07-24 Thread Eric Sunshine

The --color-moved "dimmed_zebra" mode (with an underscore) is an
anachronism. Most options and modes are hyphenated. It is more difficult
to type and somewhat more difficult to read than those which are
hyphenated. Therefore, rename it to "dimmed-zebra", and nominally
deprecate "dimmed_zebra".

Signed-off-by: Eric Sunshine 
---
 Documentation/diff-options.txt | 3 ++-
 diff.c | 4 +++-
 t/t4015-diff-whitespace.sh | 4 ++--
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 41064909ee..9195bcd398 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -282,10 +282,11 @@ zebra::
painted using either the 'color.diff.{old,new}Moved' color or
'color.diff.{old,new}MovedAlternative'. The change between
the two colors indicates that a new block was detected.
-dimmed_zebra::
+dimmed-zebra::
Similar to 'zebra', but additional dimming of uninteresting parts
of moved code is performed. The bordering lines of two adjacent
blocks are considered interesting, the rest is uninteresting.
+   `dimmed_zebra` is a deprecated synonym.
 --
 
 --word-diff[=]::
diff --git a/diff.c b/diff.c
index dc53a19bab..4ffabb4965 100644
--- a/diff.c
+++ b/diff.c
@@ -268,10 +268,12 @@ static int parse_color_moved(const char *arg)
return COLOR_MOVED_ZEBRA;
else if (!strcmp(arg, "default"))
return COLOR_MOVED_DEFAULT;
+   else if (!strcmp(arg, "dimmed-zebra"))
+   return COLOR_MOVED_ZEBRA_DIM;
else if (!strcmp(arg, "dimmed_zebra"))
return COLOR_MOVED_ZEBRA_DIM;
else
-   return error(_("color moved setting must be one of 'no', 
'default', 'zebra', 'dimmed_zebra', 'plain'"));
+   return error(_("color moved setting must be one of 'no', 
'default', 'zebra', 'dimmed-zebra', 'plain'"));
 }
 
 int git_diff_ui_config(const char *var, const char *value, void *cb)
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 17df491a3a..8cdfa225ef 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -1223,7 +1223,7 @@ test_expect_success 'plain moved code, inside file' '
test_cmp expected actual
 '
 
-test_expect_success 'detect permutations inside moved code -- dimmed_zebra' '
+test_expect_success 'detect permutations inside moved code -- dimmed-zebra' '
git reset --hard &&
cat <<-\EOF >lines.txt &&
long line 1
@@ -1271,7 +1271,7 @@ test_expect_success 'detect permutations inside moved 
code -- dimmed_zebra' '
test_config color.diff.newMovedDimmed "normal cyan" &&
test_config color.diff.oldMovedAlternativeDimmed "normal blue" &&
test_config color.diff.newMovedAlternativeDimmed "normal yellow" &&
-   git diff HEAD --no-renames --color-moved=dimmed_zebra --color |
+   git diff HEAD --no-renames --color-moved=dimmed-zebra --color |
grep -v "index" |
test_decode_color >actual &&
cat <<-\EOF >expected &&
-- 
2.18.0.345.g5c9ce644c3

Re: [PATCH 1/2] packfile: drop a repeated enum declaration

2018-07-24 Thread Jonathan Nieder

Hi,

Beat Bolli wrote:

> --- a/packfile.h
> +++ b/packfile.h
> @@ -6,7 +6,6 @@
>  /* in object-store.h */
>  struct packed_git;
>  struct object_info;
> -enum object_type;
>  
>  /*
>   * Generate the filename to be used for a pack file with checksum "sha1" and

Good catch.  This means packfile.h should #include "cache.h" ---
otherwise, it is not usable in a file that does

#include "git-compat-util.h"
#include "packfile.h"

Thanks and hope that helps,
Jonathan

[PATCH 0/2] Pedantic fixes for Apple clang

2018-07-24 Thread Beat Bolli

Following up on my previous series bb/pedantic for gcc, here are two
fixes for pedantic compilation under MacOS 10.13.6 (High Sierra) with
the command line tools of Xcode Version 9.4.1 (9F2000).

Beat Bolli (2):
  packfile: drop a repeated enum declaration
  remote-odb: un-inline function remote_odb_reinit

 packfile.h   | 1 -
 remote-odb.c | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

-- 
2.18.0

[PATCH 2/2] remote-odb: un-inline function remote_odb_reinit

2018-07-24 Thread Beat Bolli

When compiling under Apple LLVM version 9.1.0 (clang-902.0.39.2) with
"make DEVELOPER=1 DEVOPTS=pedantic", the compiler says

remote-odb.c:87:2: error: static function 'remote_odb_do_init' is
used in an inline function with external linkage
[-Werror,-Wstatic-in-inline]

Remove the inline specifier that would only make sense if
remote_odb_reinit were defined in the header file. Moving it into the
header is not possible because the called function remote_odb_do_init is
static and thus not visible from the includers of the header.

Signed-off-by: Beat Bolli 
---
 remote-odb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/remote-odb.c b/remote-odb.c
index 847a86505778..49cf8e30aa92 100644
--- a/remote-odb.c
+++ b/remote-odb.c
@@ -82,7 +82,7 @@ static inline void remote_odb_init(void)
remote_odb_do_init(0);
 }
 
-inline void remote_odb_reinit(void)
+void remote_odb_reinit(void)
 {
remote_odb_do_init(1);
 }
-- 
2.18.0

[PATCH 1/2] packfile: drop a repeated enum declaration

2018-07-24 Thread Beat Bolli

When compiling under Apple LLVM version 9.1.0 (clang-902.0.39.2) with
"make DEVELOPER=1 DEVOPTS=pedantic", the compiler says

error: redeclaration of already-defined enum 'object_type' is a GNU
extension [-Werror,-Wgnu-redeclared-enum]

According to https://en.cppreference.com/w/c/language/declarations
(section "Redeclaration"), a repeated declaration after the definition
is only legal for structs and unions, but not for enums.

Drop the belated declaration of enum object_type. It seems that each
includer of packfile.h includes the definition of the enum before
including packfile.h.

Signed-off-by: Beat Bolli 
---
 packfile.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/packfile.h b/packfile.h
index 51383774ec72..9b6198c4c7e0 100644
--- a/packfile.h
+++ b/packfile.h
@@ -6,7 +6,6 @@
 /* in object-store.h */
 struct packed_git;
 struct object_info;
-enum object_type;
 
 /*
  * Generate the filename to be used for a pack file with checksum "sha1" and
-- 
2.18.0

Re: [RFC PATCH v2] Add 'human' date format

2018-07-24 Thread Junio C Hamano

Linus Torvalds  writes:

> From: Linus Torvalds 
>
> This adds --date=human, which skips the timezone if it matches the
> current time-zone, and doesn't print the whole date if that matches (ie
> skip printing year for dates that are "this year", but also skip the
> whole date itself if it's in the last few days and we can just say what
> weekday it was).

The behavior of the code does not quite match my intuition, though.

$ date ;# to show that I am in -0700 zone
Tue Jul 24 14:42:09 PDT 2018
$ git show -s pk/rebase-in-c | head -n 3
commit d18b5221ba98fe8254c3f9922ba31b21d7c954af
Author: Pratik Karki 
Date:   Sun Jul 8 23:46:04 2018 +0545
$ git show --date=human -s pk/rebase-in-c | head -n 3
commit d18b5221ba98fe8254c3f9922ba31b21d7c954af
Author: Pratik Karki 
Date:   Sun Jul 8 23:46

It is sensible to omit the seconds; I do not really care about that
level of precision for an event that happened two weeks ago in a
different continent.

But lack of TZ does not give me enough hint about which content it
happened.  The fact that this was done late at night on weekend is
indeed interesting, and I may not care what time it locally was for
me, so perhaps this is an intended behaviour.

My Transaction,

2018-07-24 Thread MRS

Please did you recieve the email i sent you ?

Re: [PATCH v1 0/3] [RFC] Speeding up checkout (and merge, rebase, etc)

2018-07-24 Thread Jeff King

On Tue, Jul 24, 2018 at 05:13:36PM +0200, Duy Nguyen wrote:

> I guess you have the starting points you need after Jeff's and Junio's
> explanation (and it would be great if cache-tree could actually be for
> for this two-way merge). But to make it easier for new people in
> future, maybe we should add this?
> 
> This is basically a ripoff of Junio's explanation with starting points
> (write-tree and index-format.txt). I wanted to incorporate some pieces
> from Jeff's too but I think Junio's already covered it well.
> 
> -- 8< --
> Subject: [PATCH] cache-tree.h: more description of what it is and what's it 
> used for

There is some discussion of this extension in
Documentation/technical/index-format.txt. But it's mostly the mechanical
bits, not how or why you would use it.

I like the idea of putting this explanation into the repo, though a lot
of it is pretty specific to "diff-index --cached", which could
potentially grow stale.

-Peff

Re: [RFC PATCH 0/5] Add delta islands support

2018-07-24 Thread Jeff King

On Tue, Jul 24, 2018 at 10:18:03AM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> >I think this is inherent in the scheme (we're losing some delta
> >opportunities). But I think it's also made worse because the delta
> >window gets clogged with candidates that are forbidden by the island
> >config.
> 
> Hmph, and the reason why objects that do not even belong to the same
> island to be usable as a base are in the object list in the first
> place is...?

Because we are doing a "git repack -ad" here. So we are considering
_all_ of the objects, but avoiding making deltas between some of them.

During an actual fetch, the islands are not used at all (but the delta
relationships left on disk are important for letting us reuse those
deltas as-is).

> >Repacking with a big --window helps (and doesn't take as long
> >as it otherwise might because we can reject some object pairs based
> >on islands before doing any computation on the content).
> 
> Ah, then yes, a large window with early culling based on the delta
> island criteria definitely sounds like the right solution to that
> problem.

I try to account for this somewhat by looking at islands in the sort we
do of the delta candidates.  But to be honest, I am not sure how much
sense that makes (but I did verify experimentally that it helps). Again,
this is one of the reasons I had not sent this previously: I am not at
all confident that it is doing the right thing in all places.

-Peff

Re: Hash algorithm analysis

2018-07-24 Thread Junio C Hamano

Linus Torvalds  writes:

> On Tue, Jul 24, 2018 at 12:01 PM Edward Thomson
>  wrote:
>>
>> Switching gears, if I look at this from the perspective of the libgit2
>> project, I would also prefer SHA-256 or SHA3 over blake2b.  To support
>> blake2b, we'd have to include - and support - that code ourselves.  But
>> to support SHA-256, we would simply use the system's crypto libraries
>> that we already take a dependecy on (OpenSSL, mbedTLS, CryptoNG, or
>> SecureTransport).
>
> I think this is probably the single strongest argument for sha256.
> "It's just there".

Yup.  I actually was leaning toward saying "all of them are OK in
practice, so the person who is actually spear-heading the work gets
to choose", but if we picked SHA-256 now, that would not be a choice
that Brian has to later justify for choosing against everybody
else's wishes, which makes it the best choice ;-)

Re: de-alphabetizing the documentation

2018-07-24 Thread Jonathan Nieder

Hi,

frede...@ofb.net wrote:

> Next week I should have time to send you a patch with the manual page
> reordered...

Yay!

>   although, unless you have a special 'diff' which can
> detect when text has been moved from one place to another, I'm
> guessing it would take you even longer to check the validity of the
> patch than it would for me to create it.

Fortunately we have "git diff --color-moved".

> However, I'm happy to do this or whatever other small projects you
> would like to delegate as far as improving readability of your
> documentation. I just need to know what is likely to be accepted.

Starting with this one seems fine.  Maybe people on list will have
ideas for followups on top, or maybe you'll have ideas for ways others
can help you, or both. ;-)

Thanks again,
Jonathan

Re: Hash algorithm analysis

2018-07-24 Thread Jonathan Nieder

Hi,

Linus Torvalds wrote:
> On Tue, Jul 24, 2018 at 12:01 PM Edward Thomson
>  wrote:

>> Switching gears, if I look at this from the perspective of the libgit2
>> project, I would also prefer SHA-256 or SHA3 over blake2b.  To support
>> blake2b, we'd have to include - and support - that code ourselves.  But
>> to support SHA-256, we would simply use the system's crypto libraries
>> that we already take a dependecy on (OpenSSL, mbedTLS, CryptoNG, or
>> SecureTransport).

Just to be clear, OpenSSL has built-in blake2b support.

[...]
> So I'm not a huge fan of sha256, partly because of my disappointment
> in lack of hw acceleration in releant markets (sure, it's fairly
> common in ARM, but nobody sane uses ARM for development because of
> _other_ reasons). And partly because I don't like how the internal
> data size is the same as the final hash. But that second issue is an
> annoyance with it, not a real issue - in the absence of weaknesses
> it's a non-issue, and any future weaknesses might affect any other
> choice too.

Thanks for saying this.  With this in mind, I think we have a clear
way forward: we should use SHA-256.

My main complaint about it is that it is not a tree hash, but the
common availability in libraries trumps that (versus SHA-256x16, say).
I also was excited about K12, both because I like a world where Keccak
gets wide hardware accelaration (improving PRNGs and other
applications) and because of Keccak team's helpfulness throughout the
process of helping us evaluate this, and it's possible that some day
in the future we may want to switch to something like it.  But today,
as mentioned in [1] and [2], there is value in settling on one
standard and SHA2-256 is the obvious standard today.

Thanks,
Jonathan

[1] 
https://public-inbox.org/git/cal9pxlynvlccqv1ftra3r4kuoamdzof29hjehv2jxrbhj1n...@mail.gmail.com/
[2] 
https://public-inbox.org/git/20180723183523.gb9...@aiede.svl.corp.google.com/

Re: [PATCH v1] checkout: optionally speed up "git checkout -b foo"

2018-07-24 Thread Ben Peart





On 7/24/2018 3:21 PM, Junio C Hamano wrote:

Ben Peart  writes:


From: Ben Peart 

If the new core.optimizecheckout config setting is set to true, speed up
"git checkout -b foo" by avoiding the work to merge the working tree.  This
is valid because no merge needs to occur - only creating the new branch/
updating the refs. Any other options force it through the old code path.

This change in behavior is off by default and behind the config setting so
that users have to opt-in to the optimized behavior.






We've been running with this patch internally for a long time but it was
rejected when I submitted it to the mailing list before because it
implicitly changes the behavior of checkout -b. Trying it again configured
behind a config setting as a potential solution for other optimizations to
checkout that could change the behavior as well.

https://public-inbox.org/git/20180724042740.gb13...@sigill.intra.peff.net/T/#m75afe3ab318d23f36334cf3a6e3d058839592469


An incorrect link?  It does not look like a thread that explains
what was previously submitted but failed.  The last paragraph looks
like a fine material below the three-dash line.



See my earlier reply about this section in:

https://public-inbox.org/git/xmqqh8koxwwi@gitster-ct.c.googlers.com/T/#mb31136a09dbc1a963a5a62e840b118ac33043edf




Signed-off-by: Ben Peart 
---

Notes:
 Base Ref: master
 Web-Diff: https://github.com/benpeart/git/commit/f43d934ce7
 Checkout: git fetch https://github.com/benpeart/git checkout-b-v1 && git 
checkout f43d934ce7

  Documentation/config.txt |  6 +++
  builtin/checkout.c   | 94 
  cache.h  |  1 +
  config.c |  5 +++
  environment.c|  1 +
  5 files changed, 107 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index a32172a43c..2c4f513bf1 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -911,6 +911,12 @@ core.commitGraph::
Enable git commit graph feature. Allows reading from the
commit-graph file.
  
+core.optimizedCheckout

+   Speed up "git checkout -b foo" by skipping much of the work of a
+   full checkout command.  This changs the behavior as it will skip
+   merging the trees and updating the index and instead only create
+   and switch to the new ref.


By the way, why is it a core.* thing, not checkout.* thing?



I followed the naming convention used by core.sparsecheckout but I'm 
happy to call it whatever people want.



If a new feature is not necessarily recommendable for normal users
and it needs to be hidden behind an opt-in knob (I do not have a
strong opinion if that is or is not the case for this particular
feature at this point), the documentation for the knob should give a
bit more than "This chang(e)s the behavior" to the readers, I would
think, to be intellectually honest ;-).  Let's tell them what bad
things happen if we pretend that we switched the branch without
twoway merge and the index update to help them make an informed
decision.



I attempted to explain what the change in behavior was in the same 
sentence by saying what it skips and what it still does.  If that isn't 
intellectually honest, help me know how to better explain it so that it is.


Is this better?

Speed up "git checkout -b " by skipping the twoway merge and 
index update.  Instead, just create a new branch named  and 
switch to it.  The working directory and index are left unchanged.



+static int needs_working_tree_merge(const struct checkout_opts *opts,
+   const struct branch_info *old_branch_info,
+   const struct branch_info *new_branch_info)
+{
+   /*
+* We must do the merge if we are actually moving to a new
+* commit tree.


What's a "commit tree"?  Shouldn't it be just a "commit"?



Sure


+*/
+   if (!old_branch_info->commit || !new_branch_info->commit ||
+   oidcmp(_branch_info->commit->object.oid, 
_branch_info->commit->object.oid))
+   return 1;
+
+   /*
+* opts->patch_mode cannot be used with switching branches so is
+* not tested here
+*/
+
+   /*
+* opts->quiet only impacts output so doesn't require a merge
+*/
+
+   /*
+* Honor the explicit request for a three-way merge or to throw away
+* local changes
+*/
+   if (opts->merge || opts->force)
+   return 1;
+
+   /*
+* --detach is documented as "updating the index and the files in the
+* working tree" but this optimization skips those steps so fall through
+* to the regular code path.
+*/
+   if (opts->force_detach)
+   return 1;
+
+   /*
+* opts->writeout_stage cannot be used with switching branches so is
+* not tested here
+*/
+
+   /*
+* Honor the explicit ignore requests
+*/
+   if

Re: [PATCH] pack-protocol: mention and point to docs for protocol v2

2018-07-24 Thread Brandon Williams

On 07/24, Junio C Hamano wrote:
> Brandon Williams  writes:
> 
> >> Not about this patch, but I wonder if an organization along the
> >> following lines would make sense?
> >> 
> >>  1. Rename pack-protocol.txt to protocol-v1.txt.  Rename
> >> protocol-v2.txt to pack-protocol.txt.
> >> 
> >>  2. Make pack-protocol.txt self-contained, and remove any redundant
> >> sections from protocol-v1.txt.
> >> 
> >>  3. Add a new protocol-v2.txt that briefly describes the benefits and
> >> highlights of protocol v2, referring to pack-protocol.txt for
> >> details.
> >> 
> >> That way, newcomers of the future could read pack-protocol.txt and
> >> quickly glean the main protocol in (then) current use.
> >> 
> >> What do you think?
> >
> > I dislike the idea of renaming protocol-v2.txt to pack-protocol.txt.  I
> > agree that we should probably have protocol-v1 broken out into its own
> > file, taking the parts from pack-protocol.txt, but what really should
> > happen is that pack-protocol.txt could describe the basics of the wire
> > protocol (pkt-lines, the format of the various transports, etc) and then
> > refer to the protocol-v{1,2}.txt documents themselves.
> 
> WRT the naming, are we happy with the idea of (1) pretending that
> when we say 'protocol', there is nothing but the on-the-wire
> pkt-line protocol (i.e. that is why we call "protocol-v2" without
> giving any other adjective---are we sure we won't have need for any
> other kind of protocol?) and (2) tying the "pack" ness to the name of
> on-the-wire pkt-line protocol (i.e. that is where the name of the
> original pack-protocol.txt came from, as it started only for the
> packfile transfer---are we happy to keep newer protocols tied to
> "pack" the same way)?

If so I suggest we move away from the term "pack" protocol.  Mostly
because maybe at some future date we don't only want to communicate to
transfer packs.  So at the risk of bikeshedding (and because naming is
hard) I think we should begin talking about the over the wire protocol
as just that, the "wire protocol" or if we need to be more explicit the
"git wire protocol". Thoughts?

-- 
Brandon Williams

Re: Hash algorithm analysis

2018-07-24 Thread Linus Torvalds

On Tue, Jul 24, 2018 at 12:01 PM Edward Thomson
 wrote:
>
> Switching gears, if I look at this from the perspective of the libgit2
> project, I would also prefer SHA-256 or SHA3 over blake2b.  To support
> blake2b, we'd have to include - and support - that code ourselves.  But
> to support SHA-256, we would simply use the system's crypto libraries
> that we already take a dependecy on (OpenSSL, mbedTLS, CryptoNG, or
> SecureTransport).

I think this is probably the single strongest argument for sha256.
"It's just there".

The hardware acceleration hasn't become nearly as ubiquitous as I
would have hoped, and honestly, sha256 _needs_ hw acceleration more
than some of the alternatives in the first place.

But sha256 does have the big advantage of just having been around and
existing in pretty much every single crypto library.

So I'm not a huge fan of sha256, partly because of my disappointment
in lack of hw acceleration in releant markets (sure, it's fairly
common in ARM, but nobody sane uses ARM for development because of
_other_ reasons). And partly because I don't like how the internal
data size is the same as the final hash. But that second issue is an
annoyance with it, not a real issue - in the absence of weaknesses
it's a non-issue, and any future weaknesses might affect any other
choice too.

So hey, if people are actually at the point where the lack of choice
holds up development, we should just pick one. And despite what I've
said in this discussion, sha256 would have been my first choice, just
because it's the "obvious" choice. The exact same way that SHA1 was
the obvious choice (for pretty much the same infrastructure reasons)
back in 2005.

And maybe the hw acceleration landscape will actually improve. I think
AMD actually does do the SHA extensions in Zen/TR.

So I think Junio should just pick one. And I'll stand up and say

 "Let's just pick one.

  And sha256 is certainly the safe choice in that it won't strike
  anybody as being the _wrong_ choice per se, even if not everybody will
  necessarily agree it's the _bext_ choice".

but in the end I think Junio should be the final arbiter. I think all
of the discussed choices are perfectly fine in practice.

Btw, the one thing I *would* suggest is that the git community just
also says that the current hash is not SHA1, but SHA1DC.

Support for "plain" SHA1 should be removed entirely. If we add a lot
of new infrastructure to support a new more secure hash, we should not
have the old fallback for the known-weak one. Just make SHA1DC the
only one git can be built with.

   Linus

Re: [PATCH] Documentation/diff-options: explain different diff algorithms

2018-07-24 Thread Junio C Hamano

Stefan Beller  writes:

> On Mon, Jul 23, 2018 at 9:41 PM Jonathan Nieder  wrote:
>>
>> Hi,
>>
>> Stefan Beller wrote:
>>
>> > As a user I wondered what the diff algorithms are about. Offer at least
>> > a basic explanation on the differences of the diff algorithms.
>>
>> Interesting.  Let's see.
>>
>> [...]
>> > --- a/Documentation/diff-options.txt
>> > +++ b/Documentation/diff-options.txt
>> > @@ -94,16 +94,34 @@ diff" algorithm internally.
>> >   Choose a diff algorithm. The variants are as follows:
>> >  +
>> >  --
>> > -`default`, `myers`;;
>> > - The basic greedy diff algorithm. Currently, this is the default.
>> >  `minimal`;;
>> > + A diff as produced by the basic greedy algorithm described in
>>
>> Why this reordering?
>
> because Myers (below) is constructed from minimal + heuristic.
> ...
> So the "minimal" algorithm is the basic myers algorithm,
> and the "myers" algorithm is the extended version (extended
> with a heuristic to be fast in corner cases, still producing good
> enough diffs).

I am not sure that is a good reason for the target readers of this
document to move the default from the beginning to somewhere in the
middle.  For a textbook that explains different algorithms and gives
overview of how they work, that order may be appropriate, though.

>> > - Spend extra time to make sure the smallest possible diff is
>> > - produced.
>> > +`default`, `myers`;;
>> > + The same algorithm as `minimal`, extended with a heuristic to
>> > + reduce extensive searches. Currently, this is the default.
>>
>> Amusing --- this means that the Myers diff is spelled as "minimal"
>> instead of "myers".

Yeah, I had the same thought---for an end-user documentation update,
this change and the reordering does not feel like an improvement.

>> I wonder if these details (about all the algorithms) should go in a
>> separate Diff Algorithms section and this section could focus on
>> what use cases warrant using each, referring to that section for
>> details.  What do you think?

Good suggestion.

Re: [PATCH] pack-protocol: mention and point to docs for protocol v2

2018-07-24 Thread Junio C Hamano

Brandon Williams  writes:

>> Not about this patch, but I wonder if an organization along the
>> following lines would make sense?
>> 
>>  1. Rename pack-protocol.txt to protocol-v1.txt.  Rename
>> protocol-v2.txt to pack-protocol.txt.
>> 
>>  2. Make pack-protocol.txt self-contained, and remove any redundant
>> sections from protocol-v1.txt.
>> 
>>  3. Add a new protocol-v2.txt that briefly describes the benefits and
>> highlights of protocol v2, referring to pack-protocol.txt for
>> details.
>> 
>> That way, newcomers of the future could read pack-protocol.txt and
>> quickly glean the main protocol in (then) current use.
>> 
>> What do you think?
>
> I dislike the idea of renaming protocol-v2.txt to pack-protocol.txt.  I
> agree that we should probably have protocol-v1 broken out into its own
> file, taking the parts from pack-protocol.txt, but what really should
> happen is that pack-protocol.txt could describe the basics of the wire
> protocol (pkt-lines, the format of the various transports, etc) and then
> refer to the protocol-v{1,2}.txt documents themselves.

WRT the naming, are we happy with the idea of (1) pretending that
when we say 'protocol', there is nothing but the on-the-wire
pkt-line protocol (i.e. that is why we call "protocol-v2" without
giving any other adjective---are we sure we won't have need for any
other kind of protocol?) and (2) tying the "pack" ness to the name of
on-the-wire pkt-line protocol (i.e. that is where the name of the
original pack-protocol.txt came from, as it started only for the
packfile transfer---are we happy to keep newer protocols tied to
"pack" the same way)?

Re: [PATCH v1] config.c: fix msvc compile error

2018-07-24 Thread Taylor Blau

On Tue, Jul 24, 2018 at 03:30:10PM +, g...@jeffhostetler.com wrote:
> From: Jeff Hostetler 
>
> In commit fb0dc3bac135e9f6243bd6d293e8c9293c73b9cd code was added
> to builtin/config.c to define a new function and a forward declaration
> for an array of unknown size.  This causes a compile error under MSVC.

Thanks for spending time fixing that. fb0dc3bac1 (builtin/config.c:
support `--type=` as preferred alias for `--`, 2018-04-18)
is from me, so I owe you an extra thanks for patching up mistakes :-).

As others have noted in this thread, another patch was sent into similar
effect, which has already been queued, and I agree that we should prefer
that, since it's further along.

Thanks,
Taylor

Re: de-alphabetizing the documentation

2018-07-24 Thread frederik

Hello Jonathan,

Thank you for replying to me earlier. I just wanted to follow up on
this thread. Did you decide not to go with my proposal?

Next week I should have time to send you a patch with the manual page
reordered... although, unless you have a special 'diff' which can
detect when text has been moved from one place to another, I'm
guessing it would take you even longer to check the validity of the
patch than it would for me to create it.

However, I'm happy to do this or whatever other small projects you
would like to delegate as far as improving readability of your
documentation. I just need to know what is likely to be accepted.

Thanks,

Frederick

On Sat, Jul 07, 2018 at 06:09:26PM -0700, frede...@ofb.net wrote:
> Hi Jonathan,
> 
> If it's really just a matter of needing someone with a newcomer's
> perspective, then I'd be happy to look over the ordering of the git
> subcommands. You can run the command I provided to glean the frequency
> of each subcommand from your shell history, I'll look over the output
> and see if the ordering makes sense to me, and then you or someone
> else can rearrange the manual page to list the subcommands in this
> order. Is that a suitable plan?
> 
> Thanks,
> 
> Frederick
> 
> On Fri, Jul 06, 2018 at 04:47:15PM -0700, Jonathan Nieder wrote:
> > Hi,
> > 
> > Frederick Eaton wrote:
> > 
> > >   Unfortunately my contribution will have to be limited for the
> > > moment to making this suggestion, as I am extraordinarily busy. I hope
> > > it will not be too burdensome to add this item to your TODO list and
> > > keep it there until a willing volunteer comes along.
> > 
> > No problem.  If you have time to contribute later, we can wait. :)
> > 
> > > For what it's worth, I made extensive changes to the Arch Wiki Git
> > > article back in 2015, following an initial attempt of mine to
> > > understand various tutorials. It was the most prominent wiki-based Git
> > > documentation I could find at the time. The article has of course seen
> > > numerous improvements since then.
> > 
> > For reference: https://wiki.archlinux.org/index.php/git
> > 
> > > I don't think that it's really important to find a "best" ordering for
> > > commands or glossary terms; it's more a matter of finding someone who
> > > is willing to take responsibility for choosing a reasonable ordering.
> > > Presumably the head maintainer of this project could delegate the task
> > > to a qualified volunteer, not a newbie like myself but not necessarily
> > > someone with expert knowledge either.
> > 
> > I'd have to say, when I compare the troubles a new user and a
> > long-timer would run into, I conclude that the long-timers would be
> > more likely to produce worse documentation.  It is very difficult to
> > remember how new users see things.  The ideal skill set in fact has
> > nothing to do with level of Git expertise: to produce a good result, a
> > good technical writer would ask the right questions to gather
> > information from the experts and then test their documentation on
> > newcomers until it works well.
> > 
> > Based on the work you've described already having done, it sounds like
> > you'd be an ideal person to get this going, if you find yourself with
> > time for it.
> > 
> > >   It's too bad that a policy of
> > > not listing things alphabetically wasn't adopted from the beginning of
> > > this project, but I guess that's life.
> > 
> > From this thread I've been convinced that for this kind of reference
> > document, alphabetical organization within each section is a good
> > organization, provided each section is small enough (as in "git help"
> > output).
> > 
> > I'm also a fan of non reference documentation that can use a narrative
> > ordering instead (like "git help core-tutorial", except with more
> > modern commands).
> > 
> > > Thanks Eric for the pointer to "git help". This does indeed provide a
> > > finer and better grouping than the man-page (but it also looks like
> > > another candidate for de-alphabetization...!).
> > 
> > Indeed, copying that organization over from "git help" to the git(1)
> > manpage may be a good step for any interested people overhearing this
> > conversation.
> > 
> > As a first step, how about making git(1) recommend "git help", like
> > this?  It already recommends giteveryday(7) but the more interactive
> > first command might be useful for some people.
> > 
> > Thoughts?  Improvements?
> > 
> > -- >8 --
> > Subject: git doc: recommend "git help" as a starting point
> > 
> > The list of subcommands described in git(1) can be overwhelming.
> > Encourage newcomers to run "git help" to get a shorter list of
> > commands as a starting point.
> > 
> > Based on a suggestion by Frederick Eaton.
> > 
> > Signed-off-by: Jonathan Nieder 
> > ---
> >  Documentation/git.txt | 9 ++---
> >  1 file changed, 6 insertions(+), 3 deletions(-)
> > 
> > diff --git a/Documentation/git.txt b/Documentation/git.txt
> >

Re: [PATCH v1] checkout: optionally speed up "git checkout -b foo"

2018-07-24 Thread Ben Peart





On 7/24/2018 2:42 PM, Eric Sunshine wrote:

On Tue, Jul 24, 2018 at 2:01 PM Ben Peart  wrote:

If the new core.optimizecheckout config setting is set to true, speed up


Maybe:

 Add core.optimizeCheckout config setting which, when true, speeds up



Sure


"git checkout -b foo" by avoiding the work to merge the working tree.  This
is valid because no merge needs to occur - only creating the new branch/
updating the refs. Any other options force it through the old code path.

This change in behavior is off by default and behind the config setting so
that users have to opt-in to the optimized behavior.

We've been running with this patch internally for a long time but it was
rejected when I submitted it to the mailing list before because it
implicitly changes the behavior of checkout -b. Trying it again configured
behind a config setting as a potential solution for other optimizations to
checkout that could change the behavior as well.


This paragraph is mere commentary which probably belongs below the
"---" line following your sign-off.



Hopefully this commentary (I'll move it below the --- line) is clearer:

We've been running with this patch internally for a long time but it was
rejected when I submitted it to the mailing list before [1] because it
implicitly changes the behavior of checkout -b as it no longer updates
the working directory.

I'm submitting it again behind a config setting so that it doesn't cause
any back compat issues unless the user explicitly opts in to the new
behavior. My hope is this same setting and model can be used if/when we
make other performance optimizations to checkout like using the cache
tree to avoid having to traverse the entire tree being discussed [2].

[1] 
https://public-inbox.org/git/20160909192520.4812-1-benpe...@microsoft.com/
[2] 
https://public-inbox.org/git/20180724042740.gb13...@sigill.intra.peff.net/T/#m75afe3ab318d23f36334cf3a6e3d058839592469



https://public-inbox.org/git/20180724042740.gb13...@sigill.intra.peff.net/T/#m75afe3ab318d23f36334cf3a6e3d058839592469


Is this link meant to reference the previous attempt of optimizing
"checkout -b"? Although there's a single mention of "checkout -b" in
that discussion, it doesn't seem to be the previous attempt or explain
why it was rejected.

It would be quite nice to see a discussion in both the commit message
and the documentation about the pros and cons of enabling this
optimization. That it was previously rejected suggests that there may
be serious or unexpected consequences. How will a typical user know
whether its use is desirable or not?


Signed-off-by: Ben Peart 
---
diff --git a/Documentation/config.txt b/Documentation/config.txt
@@ -911,6 +911,12 @@ core.commitGraph::
+core.optimizedCheckout
+   Speed up "git checkout -b foo" by skipping much of the work of a
+   full checkout command.  This changs the behavior as it will skip


s/changs/changes/


+   merging the trees and updating the index and instead only create
+   and switch to the new ref.
diff --git a/builtin/checkout.c b/builtin/checkout.c
@@ -471,6 +475,88 @@ static void setup_branch_path(struct branch_info *branch)
+static int needs_working_tree_merge(const struct checkout_opts *opts,
+   const struct branch_info *old_branch_info,
+   const struct branch_info *new_branch_info)
+{
+   /*
+* We must do the merge if we are actually moving to a new
+* commit tree.
+*/
+   if (!old_branch_info->commit || !new_branch_info->commit ||
+   oidcmp(_branch_info->commit->object.oid, 
_branch_info->commit->object.oid))
+   return 1;
+   [...]
+   return 0;
+}


This long list of special-case checks doesn't leave me too enthused,
however, that aside, this approach seems backward. Rather than erring
on the side of safety by falling back to the merging behavior, it errs
in the other direction, which may be a problem if this list of
special-case checks ever gets out of sync with 'checkout_opts'. That
is, if someone adds a new option which ought to employ the merging
behavior, but forgets to update this function, then this function will
incorrectly default to using the optimization.



I'm not thrilled with the long list either (the plethora of comments 
probably makes it appear worse than it is) but I don't see how flipping 
the logic around makes it fail if someone adds a new option.  The "meets 
all criteria for optimization" code can only test existing options.



A safer approach would be the inverse, namely:

 static int skip_worktree_merge(...)
 {
 if (...meets all criteria for optimization...)
 return 1;
 return 0;
 }


  static int merge_working_tree(const struct checkout_opts *opts,
   struct branch_info *old_branch_info,
   struct branch_info *new_branch_info,
{
+   /*
+* Skip merging the trees, updating the index, and work tree only if we
+

Re: What's cooking in git.git (Jul 2018, #02; Wed, 18)

2018-07-24 Thread Brandon Williams

On 07/24, Junio C Hamano wrote:
> Jonathan Tan  writes:
> 
> >> Here are the topics that have been cooking.  Commits prefixed with
> >> '-' are only in 'pu' (proposed updates) while commits prefixed with
> >> '+' are in 'next'.  The ones marked with '.' do not appear in any of
> >> the integration branches, but I am still holding onto them.
> >
> > What do you think about my fixes to protocol v2 tag following [1]? There
> > was some discussion about correctness vs the drop in performance, but it
> > seems to me that there is some consensus that the drop in performance is
> > OK.
> >
> > [1] 
> > https://public-inbox.org/git/cover.1528234587.git.jonathanta...@google.com/
> 
> Thanks for reminding.  I think I was waiting for Brandon or somebody
> else to say something after [2] as the final confirmation before
> queuing it, and then the thread was forgotten ;-)
> 
> Will pick it up; it seems to have some interaction with Brandon's
> 6d1700d5 ("fetch: refactor to make function args narrower",
> 2018-06-27), and I think the correct resolution is to move your
> removal of "&& !rs->nr" to do_fetch() function where that commit
> moved to.
> 
> Thanks.
> 
> [2] https://public-inbox.org/git/xmqqd0vwcfkr@gitster-ct.c.googlers.com/ 

Yeah I still don't like it from a performance perspective, but given
people rely on this functionality I've been convinced its necessary for
correctness until we make other changes.

-- 
Brandon Williams

Re: What's cooking in git.git (Jul 2018, #02; Wed, 18)

2018-07-24 Thread Junio C Hamano

Jonathan Tan  writes:

>> Here are the topics that have been cooking.  Commits prefixed with
>> '-' are only in 'pu' (proposed updates) while commits prefixed with
>> '+' are in 'next'.  The ones marked with '.' do not appear in any of
>> the integration branches, but I am still holding onto them.
>
> What do you think about my fixes to protocol v2 tag following [1]? There
> was some discussion about correctness vs the drop in performance, but it
> seems to me that there is some consensus that the drop in performance is
> OK.
>
> [1] 
> https://public-inbox.org/git/cover.1528234587.git.jonathanta...@google.com/

Thanks for reminding.  I think I was waiting for Brandon or somebody
else to say something after [2] as the final confirmation before
queuing it, and then the thread was forgotten ;-)

Will pick it up; it seems to have some interaction with Brandon's
6d1700d5 ("fetch: refactor to make function args narrower",
2018-06-27), and I think the correct resolution is to move your
removal of "&& !rs->nr" to do_fetch() function where that commit
moved to.

Thanks.

[2] https://public-inbox.org/git/xmqqd0vwcfkr@gitster-ct.c.googlers.com/

Re: [RFC] push: add documentation on push v2

2018-07-24 Thread Brandon Williams

On 07/17, Brandon Williams wrote:
> Signed-off-by: Brandon Williams 
> ---
> 
> Since introducing protocol v2 and enabling fetch I've been thinking
> about what its inverse 'push' would look like.  After talking with a
> number of people I have a longish list of things that could be done to
> improve push and I think I've been able to distill the core features we
> want in push v2.  Thankfully (due to the capability system) most of the
> other features/improvements can be added later with ease.
> 
> What I've got now is a rough design for a more flexible push, more
> flexible because it allows for the server to do what it wants with the
> refs that are pushed and has the ability to communicate back what was
> done to the client.  The main motivation for this is to work around
> issues when working with Gerrit and other code-review systems where you
> need to have Change-Ids in the commit messages (now the server can just
> insert them for you and send back new commits) and you need to push to
> magic refs to get around various limitations (now a Gerrit server should
> be able to communicate that pushing to 'master' doesn't update master
> but instead creates a refs/changes/ ref).
> 
> Before actually moving to write any code I'm hoping to get some feedback
> on if we think this is an acceptable base design for push (other
> features like atomic-push, signed-push, etc can be added as
> capabilities), so any comments are appreciated.
> 
>  Documentation/technical/protocol-v2.txt | 76 +
>  1 file changed, 76 insertions(+)

Pinging this thread again to hopefully reach some more people for
commentary.  Looking back through the comments so far there are concerns
that a server shouldn't be trusted rewriting my local changes, so to
address that we could have the be a config option which is defaulted to
not take changes from a server.

Apart from that I didn't see any other major concerns.  I'm hoping to
get a bit more discussion going before actually beginning work on this.

-- 
Brandon Williams

[PATCH v4] Makefile: add a DEVOPTS flag to get pedantic compilation

2018-07-24 Thread Beat Bolli

In the interest of code hygiene, make it easier to compile Git with the
flag -pedantic.

Pure pedantic compilation with GCC 7.3 results in one warning per use of
the translation macro `N_`:

warning: array initialized from parenthesized string constant [-Wpedantic]

Therefore also disable the parenthesising of i18n strings with
-DUSE_PARENS_AROUND_GETTEXT_N=0.

Signed-off-by: Beat Bolli 
---

Now with -DUSE_PARENS_AROUND_GETTEXT_N=0 instead of =No.

This is the convenience knob for all developers that led to the series
bb/pedantic[1]. It does not depend on this series, though.

[1] https://public-inbox.org/git/20180708144342.11922-1-dev+...@drbeat.li/T/#u

 Makefile   | 6 ++
 config.mak.dev | 5 +
 2 files changed, 11 insertions(+)

diff --git a/Makefile b/Makefile
index 0cb6590f24..2bfc051652 100644
--- a/Makefile
+++ b/Makefile
@@ -484,6 +484,12 @@ all::
 #The DEVELOPER mode enables -Wextra with a few exceptions. By
 #setting this flag the exceptions are removed, and all of
 #-Wextra is used.
+#
+#pedantic:
+#
+#Enable -pedantic compilation. This also disables
+#USE_PARENS_AROUND_GETTEXT_N to produce only relevant warnings.
 
 GIT-VERSION-FILE: FORCE
@$(SHELL_PATH) ./GIT-VERSION-GEN
diff --git a/config.mak.dev b/config.mak.dev
index 2d244ca470..e11dd94741 100644
--- a/config.mak.dev
+++ b/config.mak.dev
@@ -1,6 +1,11 @@
 ifeq ($(filter no-error,$(DEVOPTS)),)
 CFLAGS += -Werror
 endif
+ifneq ($(filter pedantic,$(DEVOPTS)),)
+CFLAGS += -pedantic
+# don't warn for each N_ use
+CFLAGS += -DUSE_PARENS_AROUND_GETTEXT_N=0
+endif
 CFLAGS += -Wdeclaration-after-statement
 CFLAGS += -Wno-format-zero-length
 CFLAGS += -Wold-style-definition
-- 
2.18.0.203.gfac676dfb9

Re: [PATCH v1] checkout: optionally speed up "git checkout -b foo"

2018-07-24 Thread Junio C Hamano

Ben Peart  writes:

> From: Ben Peart 
>
> If the new core.optimizecheckout config setting is set to true, speed up
> "git checkout -b foo" by avoiding the work to merge the working tree.  This
> is valid because no merge needs to occur - only creating the new branch/
> updating the refs. Any other options force it through the old code path.
>
> This change in behavior is off by default and behind the config setting so
> that users have to opt-in to the optimized behavior.




> We've been running with this patch internally for a long time but it was
> rejected when I submitted it to the mailing list before because it
> implicitly changes the behavior of checkout -b. Trying it again configured
> behind a config setting as a potential solution for other optimizations to
> checkout that could change the behavior as well.
>
> https://public-inbox.org/git/20180724042740.gb13...@sigill.intra.peff.net/T/#m75afe3ab318d23f36334cf3a6e3d058839592469

An incorrect link?  It does not look like a thread that explains
what was previously submitted but failed.  The last paragraph looks
like a fine material below the three-dash line.


> Signed-off-by: Ben Peart 
> ---
>
> Notes:
> Base Ref: master
> Web-Diff: https://github.com/benpeart/git/commit/f43d934ce7
> Checkout: git fetch https://github.com/benpeart/git checkout-b-v1 && git 
> checkout f43d934ce7
>
>  Documentation/config.txt |  6 +++
>  builtin/checkout.c   | 94 
>  cache.h  |  1 +
>  config.c |  5 +++
>  environment.c|  1 +
>  5 files changed, 107 insertions(+)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index a32172a43c..2c4f513bf1 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -911,6 +911,12 @@ core.commitGraph::
>   Enable git commit graph feature. Allows reading from the
>   commit-graph file.
>  
> +core.optimizedCheckout
> + Speed up "git checkout -b foo" by skipping much of the work of a
> + full checkout command.  This changs the behavior as it will skip
> + merging the trees and updating the index and instead only create
> + and switch to the new ref.

By the way, why is it a core.* thing, not checkout.* thing?

If a new feature is not necessarily recommendable for normal users
and it needs to be hidden behind an opt-in knob (I do not have a
strong opinion if that is or is not the case for this particular
feature at this point), the documentation for the knob should give a
bit more than "This chang(e)s the behavior" to the readers, I would
think, to be intellectually honest ;-).  Let's tell them what bad
things happen if we pretend that we switched the branch without
twoway merge and the index update to help them make an informed
decision.

> +static int needs_working_tree_merge(const struct checkout_opts *opts,
> + const struct branch_info *old_branch_info,
> + const struct branch_info *new_branch_info)
> +{
> + /*
> +  * We must do the merge if we are actually moving to a new
> +  * commit tree.

What's a "commit tree"?  Shouldn't it be just a "commit"?

> +  */
> + if (!old_branch_info->commit || !new_branch_info->commit ||
> + oidcmp(_branch_info->commit->object.oid, 
> _branch_info->commit->object.oid))
> + return 1;
> +
> + /*
> +  * opts->patch_mode cannot be used with switching branches so is
> +  * not tested here
> +  */
> +
> + /*
> +  * opts->quiet only impacts output so doesn't require a merge
> +  */
> +
> + /*
> +  * Honor the explicit request for a three-way merge or to throw away
> +  * local changes
> +  */
> + if (opts->merge || opts->force)
> + return 1;
> +
> + /*
> +  * --detach is documented as "updating the index and the files in the
> +  * working tree" but this optimization skips those steps so fall through
> +  * to the regular code path.
> +  */
> + if (opts->force_detach)
> + return 1;
> +
> + /*
> +  * opts->writeout_stage cannot be used with switching branches so is
> +  * not tested here
> +  */
> +
> + /*
> +  * Honor the explicit ignore requests
> +  */
> + if (!opts->overwrite_ignore || opts->ignore_skipworktree ||
> + opts->ignore_other_worktrees)
> + return 1;
> +
> + /*
> +  * opts->show_progress only impacts output so doesn't require a merge
> +  */
> +
> + /*
> +  * If we aren't creating a new branch any changes or updates will
> +  * happen in the existing branch.  Since that could only be updating
> +  * the index and working directory, we don't want to skip those steps
> +  * or we've defeated any purpose in running the command.
> +  */
> + if (!opts->new_branch)
> + return 1;
> +
> + /*
> +  * new_branch_force is defined to "create/reset and checkout a branch"
> +  *

Re: [PATCH v1] config.c: fix msvc compile error

2018-07-24 Thread Beat Bolli

On 24.07.18 20:50, Junio C Hamano wrote:
> Beat Bolli  writes:
> 
>> On 24.07.18 20:22, Junio C Hamano wrote:
>>
 This was already fixed (differently) in
 <20180705183445.30901-1-dev+...@drbeat.li>.
>>>
>>> Thanks for saving me from having to dig the list archive myself.
>>> Yes, it is already applied to the tip of the topic that originally
>>> caused the breakage.
>>>
>> Just a general question:
>>
>> Is it OK to refer to patches on pu with the Message-ID, or would you
>> prefer the commit hash? The hash changes whenever you recreate pu,
>> doesn't it?
> 
> Either is fine in practice.  The commits themselves on a topic
> branch that is not yet in 'next' usually stay the same once the tip
> of 'pu' that contains them gets published.  Even though I often use
> "git rebase -i", "git commit --amend", etc. to fix up posted patches
> while turning them into commits on topic branches, I usually stop
> doing so once I push out day's integration result.
> 
> Until a new version of the series is posted to replace them on the
> topic branch, that is.  But at that point we are talking about new
> patches with different message-ids that got turned into different
> commit objects, so either commit object name or message id that
> refer to older iteration would still name the same old version, and
> new names would refer to the same new version.
> 

Ok, thanks!

Re: Hash algorithm analysis

2018-07-24 Thread Edward Thomson

On Fri, Jul 20, 2018 at 09:52:20PM +, brian m. carlson wrote:
> 
> To summarize the discussion that's been had in addition to the above,
> Ævar has also stated a preference for SHA-256 and I would prefer BLAKE2b
> over SHA-256 over SHA3-256, although any of them would be fine.
> 
> Are there other contributors who have a strong opinion?  Are there
> things I can do to help us coalesce around an option?

Overall, I prefer SHA-256.

I mentioned this at the contributor summit - so this may have been
captured in the notes.  But if not, when I look at this from the
perspective of my day job at Notorious Big Software Company, we would
prefer SHA-256 due to its performance characteristics and the
availability of hardware acceleration.  We think about git object ids
in a few different ways:

Obviously we use git as a version control system - we have a significant
investment in hosting repositories (for both internal Microsoft teams
and our external customers).  What may be less obvious is that often,
git blob ids are used as fingerprints: on a typical Windows machine,
you don't have the command-line hash functions (md5sum and friends),
but every developer has git installed.  So we end up calculating git
object ids in places within the development pipeline that are beyond the
scope of just version control.  Not to dwell too much on implementation
details, but this is especially advantageous for us in (say) labs where
we can ensure that particular hardware is available to speed this up as
necessary.

Switching gears, if I look at this from the perspective of the libgit2
project, I would also prefer SHA-256 or SHA3 over blake2b.  To support
blake2b, we'd have to include - and support - that code ourselves.  But
to support SHA-256, we would simply use the system's crypto libraries
that we already take a dependecy on (OpenSSL, mbedTLS, CryptoNG, or
SecureTransport).  All of those support SHA-256 and none of them include
support for blake2b.  That means if there's a problem with (say)
OpenSSL's SHA-256 implementation, then it will be fixed by their vendor.
If there's a problem with libb2, then that's now my responsibility.

This is not to suggest that one library is of higher or lower quality
than another.  And surely we would try to use the same blake2b library
that git itself is using to minimize some of this risk (so that at least
we're all in the same boat and can leverage each other's communications
to users) but even then, there will be inevitable drift between our
vendored dependencies and the upstream code.  You can see this in action
in xdiff: git's xdiff has deviated from upstream, and libgit2 has taken
git's and ours has deviated from that.

Cheers-
-ed

Re: [RFC] push: add documentation on push v2

2018-07-24 Thread Brandon Williams

On 07/20, Jeff Hostetler wrote:
> 
> 
> On 7/18/2018 1:15 PM, Brandon Williams wrote:
> > On 07/18, Stefan Beller wrote:
> > > On Wed, Jul 18, 2018 at 6:31 AM Derrick Stolee  wrote:
> > > > 
> > > > On 7/17/2018 7:25 PM, Stefan Beller wrote:
> > > > > On Tue, Jul 17, 2018 at 2:09 PM Brandon Williams  
> > > > > wrote:
> > > > > > Signed-off-by: Brandon Williams 
> > > > > > ---
> > > > > > 
> > > > > > Since introducing protocol v2 and enabling fetch I've been thinking
> > > > > > about what its inverse 'push' would look like.  After talking with a
> > > > > > number of people I have a longish list of things that could be done 
> > > > > > to
> > > > > > improve push and I think I've been able to distill the core 
> > > > > > features we
> > > > > > want in push v2.
> > > > > It would be nice to know which things you want to improve.
> > > > 
> > > > Hopefully we can also get others to chime in with things they don't like
> > > > about the existing protocol. What pain points exist, and what can we do
> > > > to improve at the transport layer before considering new functionality?
> > > 
> > > Another thing that I realized last night was the possibility to chunk 
> > > requests.
> > > The web of today is driven by lots of small http(s) requests. I know our 
> > > server
> > > team fights with the internal tools all the time because the communication
> > > involved in git-fetch is usually a large http request (large packfile).
> > > So it would be nice to have the possibility of chunking the request.
> > > But I think that can be added as a capability? (Not sure how)
> > 
> > Fetch and push requests/responses are already "chunked" when using the
> > http transport.  So I'm not sure what you mean by adding a capability
> > because the protocol doesn't care about which transport you're using.
> > This is of course unless you're talking about a different "chunking"
> > from what it means to chunk an http request/response.
> > 
> 
> Internally, we've talked about wanting to have resumable pushes and
> fetches.  I realize this is difficult to do when the server is
> replicated and the repeated request might be talking to a different
> server instance.  And there's a problem with temp files littering the
> server as it waits for the repeated attempt.  But still, the packfile
> sent/received can be large and connections do get dropped.
> 
> That is, if we think about sending 1 large packfile and just using a
> byte-range-like approach to resuming the transfer.
> 
> If we allowed the request to send a series of packfiles, with each
> "chunk" being self-contained and usable.  So if a push connection was
> dropped the server could apply the successfully received packfile(s)
> (add the received objects and update the refs to the commits received so
> far).  And ignore the interrupted and unreceived packfile(s) and let the
> client retry later.  When/if the client retried the push, it would
> renegotiate haves/wants and send a new series of packfile(s).  With the
> assumption being that the server would have updated refs from the
> earlier aborted push, so the packfile(s) computed for the second attempt
> would not repeat the content successfully transmitted in the first
> attempt.
> 
> This would require that the client build an ordered set of packfiles
> from oldest to newest so that the server can apply them in-order and
> the graph remain connected.  That may be outside your scope here.
> 
> Also, we might have to add a few messages to the protocol after the
> negotiation, for the client to say that it is going to send the push
> content in 'n' packfiles and send 'n' messages with the intermediate
> ref values being updated in each packfile.
> 
> Just thinking out loud here.
> Jeff

We've talked about working on resumable fetch/push (both of which are
out of the scope of this work), but we haven't started working on
anything just yet.

There's a couple different ways to do this like you've pointed out, we
can either have the server redirect the client to fetch from a CDN
(where its put the packfile) and then the client can use ranged requests
to fetch until the server decides to remove it from the CDN.  This can
be tricky because every fetch can produce a unique packfile so maybe you
don't want to put a freshly constructed, unique packfile for each client
request up on a CDN somewhere.

Breaking up a response into multiple packfiles and small ref-updates
could work, that way as long as some of the smaller packs/updates are
applied then the client is making headway towards being up to date with
the server.

-- 
Brandon Williams

[GSoC] GSoC with git, week 12

2018-07-24 Thread Alban Gruin

Hi,

I published a new blog post here:

https://blog.pa1ch.fr/posts/2018/07/24/en/gsoc2018-week12.html

Cheers,
Alban

Re: [PATCH v1] config.c: fix msvc compile error

2018-07-24 Thread Junio C Hamano

Beat Bolli  writes:

> On 24.07.18 20:22, Junio C Hamano wrote:
>
>>> This was already fixed (differently) in
>>> <20180705183445.30901-1-dev+...@drbeat.li>.
>> 
>> Thanks for saving me from having to dig the list archive myself.
>> Yes, it is already applied to the tip of the topic that originally
>> caused the breakage.
>> 
> Just a general question:
>
> Is it OK to refer to patches on pu with the Message-ID, or would you
> prefer the commit hash? The hash changes whenever you recreate pu,
> doesn't it?

Either is fine in practice.  The commits themselves on a topic
branch that is not yet in 'next' usually stay the same once the tip
of 'pu' that contains them gets published.  Even though I often use
"git rebase -i", "git commit --amend", etc. to fix up posted patches
while turning them into commits on topic branches, I usually stop
doing so once I push out day's integration result.

Until a new version of the series is posted to replace them on the
topic branch, that is.  But at that point we are talking about new
patches with different message-ids that got turned into different
commit objects, so either commit object name or message id that
refer to older iteration would still name the same old version, and
new names would refer to the same new version.

Re: [PATCH v1] checkout: optionally speed up "git checkout -b foo"

2018-07-24 Thread Eric Sunshine

On Tue, Jul 24, 2018 at 2:01 PM Ben Peart  wrote:
> If the new core.optimizecheckout config setting is set to true, speed up

Maybe:

Add core.optimizeCheckout config setting which, when true, speeds up

> "git checkout -b foo" by avoiding the work to merge the working tree.  This
> is valid because no merge needs to occur - only creating the new branch/
> updating the refs. Any other options force it through the old code path.
>
> This change in behavior is off by default and behind the config setting so
> that users have to opt-in to the optimized behavior.
>
> We've been running with this patch internally for a long time but it was
> rejected when I submitted it to the mailing list before because it
> implicitly changes the behavior of checkout -b. Trying it again configured
> behind a config setting as a potential solution for other optimizations to
> checkout that could change the behavior as well.

This paragraph is mere commentary which probably belongs below the
"---" line following your sign-off.

> https://public-inbox.org/git/20180724042740.gb13...@sigill.intra.peff.net/T/#m75afe3ab318d23f36334cf3a6e3d058839592469

Is this link meant to reference the previous attempt of optimizing
"checkout -b"? Although there's a single mention of "checkout -b" in
that discussion, it doesn't seem to be the previous attempt or explain
why it was rejected.

It would be quite nice to see a discussion in both the commit message
and the documentation about the pros and cons of enabling this
optimization. That it was previously rejected suggests that there may
be serious or unexpected consequences. How will a typical user know
whether its use is desirable or not?

> Signed-off-by: Ben Peart 
> ---
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> @@ -911,6 +911,12 @@ core.commitGraph::
> +core.optimizedCheckout
> +   Speed up "git checkout -b foo" by skipping much of the work of a
> +   full checkout command.  This changs the behavior as it will skip

s/changs/changes/

> +   merging the trees and updating the index and instead only create
> +   and switch to the new ref.
> diff --git a/builtin/checkout.c b/builtin/checkout.c
> @@ -471,6 +475,88 @@ static void setup_branch_path(struct branch_info *branch)
> +static int needs_working_tree_merge(const struct checkout_opts *opts,
> +   const struct branch_info *old_branch_info,
> +   const struct branch_info *new_branch_info)
> +{
> +   /*
> +* We must do the merge if we are actually moving to a new
> +* commit tree.
> +*/
> +   if (!old_branch_info->commit || !new_branch_info->commit ||
> +   oidcmp(_branch_info->commit->object.oid, 
> _branch_info->commit->object.oid))
> +   return 1;
> +   [...]
> +   return 0;
> +}

This long list of special-case checks doesn't leave me too enthused,
however, that aside, this approach seems backward. Rather than erring
on the side of safety by falling back to the merging behavior, it errs
in the other direction, which may be a problem if this list of
special-case checks ever gets out of sync with 'checkout_opts'. That
is, if someone adds a new option which ought to employ the merging
behavior, but forgets to update this function, then this function will
incorrectly default to using the optimization.

A safer approach would be the inverse, namely:

static int skip_worktree_merge(...)
{
if (...meets all criteria for optimization...)
return 1;
return 0;
}

>  static int merge_working_tree(const struct checkout_opts *opts,
>   struct branch_info *old_branch_info,
>   struct branch_info *new_branch_info,
> {
> +   /*
> +* Skip merging the trees, updating the index, and work tree only if 
> we
> +* are simply creating a new branch via "git checkout -b foo."  Any
> +* other options or usage will continue to do all these steps.
> +*/
> +   if (core_optimize_checkout && !needs_working_tree_merge(opts, 
> old_branch_info, new_branch_info))
> +   return 0;

This seems a somewhat odd place to hook in this optimization,
especially as there is only a single caller of this function. Instead,
one might expect the caller itself to make this judgment and avoid
trying the merge in the first place if not needed. That is, in
switch_branches:

if (!skip_worktree_merge(...))
ret = merge_working_tree(...);

Re: [PATCH v1] config.c: fix msvc compile error

2018-07-24 Thread Beat Bolli

On 24.07.18 20:22, Junio C Hamano wrote:
> Beat Bolli  writes:
> 
>> Hi Jeff
>>
>> On 24.07.18 17:30, g...@jeffhostetler.com wrote:
>>> From: Jeff Hostetler 
>>>
>>> In commit fb0dc3bac135e9f6243bd6d293e8c9293c73b9cd code was added
>>> to builtin/config.c to define a new function and a forward declaration
>>> for an array of unknown size.  This causes a compile error under MSVC.
>>>
>>> Reorder the code to forward declare the function instead of the array.
>>
>> This was already fixed (differently) in
>> <20180705183445.30901-1-dev+...@drbeat.li>.
> 
> Thanks for saving me from having to dig the list archive myself.
> Yes, it is already applied to the tip of the topic that originally
> caused the breakage.
> 
Just a general question:

Is it OK to refer to patches on pu with the Message-ID, or would you
prefer the commit hash? The hash changes whenever you recreate pu,
doesn't it?

Beat

Re: [PATCH v1] config.c: fix msvc compile error

2018-07-24 Thread Junio C Hamano

Beat Bolli  writes:

> Hi Jeff
>
> On 24.07.18 17:30, g...@jeffhostetler.com wrote:
>> From: Jeff Hostetler 
>> 
>> In commit fb0dc3bac135e9f6243bd6d293e8c9293c73b9cd code was added
>> to builtin/config.c to define a new function and a forward declaration
>> for an array of unknown size.  This causes a compile error under MSVC.
>> 
>> Reorder the code to forward declare the function instead of the array.
>
> This was already fixed (differently) in
> <20180705183445.30901-1-dev+...@drbeat.li>.

Thanks for saving me from having to dig the list archive myself.
Yes, it is already applied to the tip of the topic that originally
caused the breakage.

Re: [PATCH] pack-protocol: mention and point to docs for protocol v2

2018-07-24 Thread Brandon Williams

On 07/23, Jonathan Nieder wrote:
> Hi,
> 
> Brandon Williams wrote:
> 
> > --- a/Documentation/technical/pack-protocol.txt
> > +++ b/Documentation/technical/pack-protocol.txt
> > @@ -50,7 +50,8 @@ Each Extra Parameter takes the form of `=` or 
> > ``.
> >  
> >  Servers that receive any such Extra Parameters MUST ignore all
> >  unrecognized keys. Currently, the only Extra Parameter recognized is
> > -"version=1".
> > +"version" with a vlue of '1' or '2'.  See protocol-v2.txt for more
> 
> value?

yep, missed a letter.

> 
> > +information on protocol version 2.
> 
> Thanks.  Some thoughts on other parts of this document that may need
> updating:
> 
> - the whole document assumes that 0 and 1 are the only protocol
>   versions.  E.g. the discussion of the version number line in the
>   response when "version=1" is sent as an Extra Paramter should probably
>   apply to version 2, too.
> 
> - because the document was written before protocol v2, it describes the
>   more complicated v1 that many readers shouldn't have to care about
> 
> - there is no one document that describes v2 in a self contained way,
>   since protocol-v2.txt makes reference to protocol v1.
> 
> - the description of pkt-line format in protocol-common.txt is missing
>   a discussion of delim-pkt.
> 
> Not about this patch, but I wonder if an organization along the
> following lines would make sense?
> 
>  1. Rename pack-protocol.txt to protocol-v1.txt.  Rename
> protocol-v2.txt to pack-protocol.txt.
> 
>  2. Make pack-protocol.txt self-contained, and remove any redundant
> sections from protocol-v1.txt.
> 
>  3. Add a new protocol-v2.txt that briefly describes the benefits and
> highlights of protocol v2, referring to pack-protocol.txt for
> details.
> 
> That way, newcomers of the future could read pack-protocol.txt and
> quickly glean the main protocol in (then) current use.
> 
> What do you think?

I dislike the idea of renaming protocol-v2.txt to pack-protocol.txt.  I
agree that we should probably have protocol-v1 broken out into its own
file, taking the parts from pack-protocol.txt, but what really should
happen is that pack-protocol.txt could describe the basics of the wire
protocol (pkt-lines, the format of the various transports, etc) and then
refer to the protocol-v{1,2}.txt documents themselves.

-- 
Brandon Williams

Re: [PATCH v1] config.c: fix msvc compile error

2018-07-24 Thread Beat Bolli

Hi Jeff

On 24.07.18 17:30, g...@jeffhostetler.com wrote:
> From: Jeff Hostetler 
> 
> In commit fb0dc3bac135e9f6243bd6d293e8c9293c73b9cd code was added
> to builtin/config.c to define a new function and a forward declaration
> for an array of unknown size.  This causes a compile error under MSVC.
> 
> Reorder the code to forward declare the function instead of the array.

This was already fixed (differently) in
<20180705183445.30901-1-dev+...@drbeat.li>.

Cheers,
Beat


> Signed-off-by: Jeff Hostetler 
> ---
>  builtin/config.c | 79 
> 
>  1 file changed, 40 insertions(+), 39 deletions(-)
> 
> diff --git a/builtin/config.c b/builtin/config.c
> index b29d26d..564f18f 100644
> --- a/builtin/config.c
> +++ b/builtin/config.c
> @@ -67,7 +67,46 @@ static int show_origin;
>   { OPTION_CALLBACK, (s), (l), (v), NULL, (h), PARSE_OPT_NOARG | \
>   PARSE_OPT_NONEG, option_parse_type, (i) }
>  
> -static struct option builtin_config_options[];
> +static int option_parse_type(const struct option *opt, const char *arg,
> +  int unset);
> +
> +static struct option builtin_config_options[] = {
> + OPT_GROUP(N_("Config file location")),
> + OPT_BOOL(0, "global", _global_config, N_("use global config file")),
> + OPT_BOOL(0, "system", _system_config, N_("use system config file")),
> + OPT_BOOL(0, "local", _local_config, N_("use repository config 
> file")),
> + OPT_STRING('f', "file", _config_source.file, N_("file"), N_("use 
> given config file")),
> + OPT_STRING(0, "blob", _config_source.blob, N_("blob-id"), 
> N_("read config from given blob object")),
> + OPT_GROUP(N_("Action")),
> + OPT_BIT(0, "get", , N_("get value: name [value-regex]"), 
> ACTION_GET),
> + OPT_BIT(0, "get-all", , N_("get all values: key 
> [value-regex]"), ACTION_GET_ALL),
> + OPT_BIT(0, "get-regexp", , N_("get values for regexp: 
> name-regex [value-regex]"), ACTION_GET_REGEXP),
> + OPT_BIT(0, "get-urlmatch", , N_("get value specific for the 
> URL: section[.var] URL"), ACTION_GET_URLMATCH),
> + OPT_BIT(0, "replace-all", , N_("replace all matching variables: 
> name value [value_regex]"), ACTION_REPLACE_ALL),
> + OPT_BIT(0, "add", , N_("add a new variable: name value"), 
> ACTION_ADD),
> + OPT_BIT(0, "unset", , N_("remove a variable: name 
> [value-regex]"), ACTION_UNSET),
> + OPT_BIT(0, "unset-all", , N_("remove all matches: name 
> [value-regex]"), ACTION_UNSET_ALL),
> + OPT_BIT(0, "rename-section", , N_("rename section: old-name 
> new-name"), ACTION_RENAME_SECTION),
> + OPT_BIT(0, "remove-section", , N_("remove a section: name"), 
> ACTION_REMOVE_SECTION),
> + OPT_BIT('l', "list", , N_("list all"), ACTION_LIST),
> + OPT_BIT('e', "edit", , N_("open an editor"), ACTION_EDIT),
> + OPT_BIT(0, "get-color", , N_("find the color configured: slot 
> [default]"), ACTION_GET_COLOR),
> + OPT_BIT(0, "get-colorbool", , N_("find the color setting: slot 
> [stdout-is-tty]"), ACTION_GET_COLORBOOL),
> + OPT_GROUP(N_("Type")),
> + OPT_CALLBACK('t', "type", , "", N_("value is given this type"), 
> option_parse_type),
> + OPT_CALLBACK_VALUE(0, "bool", , N_("value is \"true\" or 
> \"false\""), TYPE_BOOL),
> + OPT_CALLBACK_VALUE(0, "int", , N_("value is decimal number"), 
> TYPE_INT),
> + OPT_CALLBACK_VALUE(0, "bool-or-int", , N_("value is --bool or 
> --int"), TYPE_BOOL_OR_INT),
> + OPT_CALLBACK_VALUE(0, "path", , N_("value is a path (file or 
> directory name)"), TYPE_PATH),
> + OPT_CALLBACK_VALUE(0, "expiry-date", , N_("value is an expiry 
> date"), TYPE_EXPIRY_DATE),
> + OPT_GROUP(N_("Other")),
> + OPT_BOOL('z', "null", _null, N_("terminate values with NUL byte")),
> + OPT_BOOL(0, "name-only", _values, N_("show variable names only")),
> + OPT_BOOL(0, "includes", _includes_opt, N_("respect include 
> directives on lookup")),
> + OPT_BOOL(0, "show-origin", _origin, N_("show origin of config 
> (file, standard input, blob, command line)")),
> + OPT_STRING(0, "default", _value, N_("value"), N_("with --get, 
> use default value when missing entry")),
> + OPT_END(),
> +};
>  
>  static int option_parse_type(const struct option *opt, const char *arg,
>int unset)
> @@ -119,44 +158,6 @@ static int option_parse_type(const struct option *opt, 
> const char *arg,
>   return 0;
>  }
>  
> -static struct option builtin_config_options[] = {
> - OPT_GROUP(N_("Config file location")),
> - OPT_BOOL(0, "global", _global_config, N_("use global config file")),
> - OPT_BOOL(0, "system", _system_config, N_("use system config file")),
> - OPT_BOOL(0, "local", _local_config, N_("use repository config 
> file")),
> - OPT_STRING('f', "file", _config_source.file, N_("file"), N_("use 
> given config file")),
> - OPT_STRING(0, "blob", _config_source.blob, N_("blob-id"), 
> N_("read config from given blob

Re: [PATCH v1] msvc: fix non-standard escape sequence in source

2018-07-24 Thread Beat Bolli

Hi Jeff

On 24.07.18 16:42, g...@jeffhostetler.com wrote:
> From: Jeff Hostetler 
> 
> Replace non-standard "\e" escape sequence with "\x1B".

This was already fixed in <20180708144342.11922-4-dev+...@drbeat.li>.

Cheers,
Beat


> 
> In commit 7a17918c34f4e83982456ffe22d880c3cda5384f a trace message with
> several "\e" escape sequences was added.  This causes a compiler warning
> under MSVC.
> 
> According to [1], the "\e" sequence is an extension supported by GCC,
> clang, and tcc.
> 
> [1] https://en.wikipedia.org/wiki/Escape_sequences_in_C
> 
> Signed-off-by: Jeff Hostetler 
> ---
>  convert.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/convert.c b/convert.c
> index 56cfe31..52092be 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -335,7 +335,7 @@ static void trace_encoding(const char *context, const 
> char *path,
>   strbuf_addf(, "%s (%s, considered %s):\n", context, path, 
> encoding);
>   for (i = 0; i < len && buf; ++i) {
>   strbuf_addf(
> - ,"| \e[2m%2i:\e[0m %2x \e[2m%c\e[0m%c",
> + ,"| \x1B[2m%2i:\x1B[0m %2x \x1B[2m%c\x1B[0m%c",
>   i,
>   (unsigned char) buf[i],
>   (buf[i] > 32 && buf[i] < 127 ? buf[i] : ' '),
>

Re: [PATCH v1] config.c: fix msvc compile error

2018-07-24 Thread Junio C Hamano

g...@jeffhostetler.com writes:

> From: Jeff Hostetler 
>
> In commit fb0dc3bac135e9f6243bd6d293e8c9293c73b9cd code was added
> to builtin/config.c to define a new function and a forward declaration
> for an array of unknown size.  This causes a compile error under MSVC.
>
> Reorder the code to forward declare the function instead of the array.
>
> Signed-off-by: Jeff Hostetler 
> ---

Somebody please tell me why I have a feeling that I've seen this
patch, even though it is labeled as v1...  The problem and the
solution seem pretty obvious and I recall somebody saying that a
single pass compiler wouldn't be able to grok the forward decl.



>  builtin/config.c | 79 
> 
>  1 file changed, 40 insertions(+), 39 deletions(-)
>
> diff --git a/builtin/config.c b/builtin/config.c
> index b29d26d..564f18f 100644
> --- a/builtin/config.c
> +++ b/builtin/config.c
> @@ -67,7 +67,46 @@ static int show_origin;
>   { OPTION_CALLBACK, (s), (l), (v), NULL, (h), PARSE_OPT_NOARG | \
>   PARSE_OPT_NONEG, option_parse_type, (i) }
>  
> -static struct option builtin_config_options[];
> +static int option_parse_type(const struct option *opt, const char *arg,
> +  int unset);
> +
> +static struct option builtin_config_options[] = {
> + OPT_GROUP(N_("Config file location")),
> + OPT_BOOL(0, "global", _global_config, N_("use global config file")),
> + OPT_BOOL(0, "system", _system_config, N_("use system config file")),
> + OPT_BOOL(0, "local", _local_config, N_("use repository config 
> file")),
> + OPT_STRING('f', "file", _config_source.file, N_("file"), N_("use 
> given config file")),
> + OPT_STRING(0, "blob", _config_source.blob, N_("blob-id"), 
> N_("read config from given blob object")),
> + OPT_GROUP(N_("Action")),
> + OPT_BIT(0, "get", , N_("get value: name [value-regex]"), 
> ACTION_GET),
> + OPT_BIT(0, "get-all", , N_("get all values: key 
> [value-regex]"), ACTION_GET_ALL),
> + OPT_BIT(0, "get-regexp", , N_("get values for regexp: 
> name-regex [value-regex]"), ACTION_GET_REGEXP),
> + OPT_BIT(0, "get-urlmatch", , N_("get value specific for the 
> URL: section[.var] URL"), ACTION_GET_URLMATCH),
> + OPT_BIT(0, "replace-all", , N_("replace all matching variables: 
> name value [value_regex]"), ACTION_REPLACE_ALL),
> + OPT_BIT(0, "add", , N_("add a new variable: name value"), 
> ACTION_ADD),
> + OPT_BIT(0, "unset", , N_("remove a variable: name 
> [value-regex]"), ACTION_UNSET),
> + OPT_BIT(0, "unset-all", , N_("remove all matches: name 
> [value-regex]"), ACTION_UNSET_ALL),
> + OPT_BIT(0, "rename-section", , N_("rename section: old-name 
> new-name"), ACTION_RENAME_SECTION),
> + OPT_BIT(0, "remove-section", , N_("remove a section: name"), 
> ACTION_REMOVE_SECTION),
> + OPT_BIT('l', "list", , N_("list all"), ACTION_LIST),
> + OPT_BIT('e', "edit", , N_("open an editor"), ACTION_EDIT),
> + OPT_BIT(0, "get-color", , N_("find the color configured: slot 
> [default]"), ACTION_GET_COLOR),
> + OPT_BIT(0, "get-colorbool", , N_("find the color setting: slot 
> [stdout-is-tty]"), ACTION_GET_COLORBOOL),
> + OPT_GROUP(N_("Type")),
> + OPT_CALLBACK('t', "type", , "", N_("value is given this type"), 
> option_parse_type),
> + OPT_CALLBACK_VALUE(0, "bool", , N_("value is \"true\" or 
> \"false\""), TYPE_BOOL),
> + OPT_CALLBACK_VALUE(0, "int", , N_("value is decimal number"), 
> TYPE_INT),
> + OPT_CALLBACK_VALUE(0, "bool-or-int", , N_("value is --bool or 
> --int"), TYPE_BOOL_OR_INT),
> + OPT_CALLBACK_VALUE(0, "path", , N_("value is a path (file or 
> directory name)"), TYPE_PATH),
> + OPT_CALLBACK_VALUE(0, "expiry-date", , N_("value is an expiry 
> date"), TYPE_EXPIRY_DATE),
> + OPT_GROUP(N_("Other")),
> + OPT_BOOL('z', "null", _null, N_("terminate values with NUL byte")),
> + OPT_BOOL(0, "name-only", _values, N_("show variable names only")),
> + OPT_BOOL(0, "includes", _includes_opt, N_("respect include 
> directives on lookup")),
> + OPT_BOOL(0, "show-origin", _origin, N_("show origin of config 
> (file, standard input, blob, command line)")),
> + OPT_STRING(0, "default", _value, N_("value"), N_("with --get, 
> use default value when missing entry")),
> + OPT_END(),
> +};
>  
>  static int option_parse_type(const struct option *opt, const char *arg,
>int unset)
> @@ -119,44 +158,6 @@ static int option_parse_type(const struct option *opt, 
> const char *arg,
>   return 0;
>  }
>  
> -static struct option builtin_config_options[] = {
> - OPT_GROUP(N_("Config file location")),
> - OPT_BOOL(0, "global", _global_config, N_("use global config file")),
> - OPT_BOOL(0, "system", _system_config, N_("use system config file")),
> - OPT_BOOL(0, "local", _local_config, N_("use repository config 
> file")),
> - OPT_STRING('f', "file", _config_source.file, N_("file"), N_("use

Re: [PATCH v1] msvc: fix non-standard escape sequence in source

2018-07-24 Thread Junio C Hamano

g...@jeffhostetler.com writes:

> From: Jeff Hostetler 
>
> Replace non-standard "\e" escape sequence with "\x1B".
>
> In commit 7a17918c34f4e83982456ffe22d880c3cda5384f a trace message with
> several "\e" escape sequences was added.  This causes a compiler warning
> under MSVC.
>
> According to [1], the "\e" sequence is an extension supported by GCC,
> clang, and tcc.

Good spotting.  Please spell it \x1b (or \033 if you are a
traditionalist), as it seems nobody in the existing code uses
uppercase, according to

$ git grep -e '\\x[A-F][0-9A-F]' -e '\\x[0-9A-F][A-F]' \*.c

and "\033"  is already used in many places

$ git grep -e '\\0[0-7][0-7]' \*.c



> [1] https://en.wikipedia.org/wiki/Escape_sequences_in_C
>
> Signed-off-by: Jeff Hostetler 
> ---
>  convert.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/convert.c b/convert.c
> index 56cfe31..52092be 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -335,7 +335,7 @@ static void trace_encoding(const char *context, const 
> char *path,
>   strbuf_addf(, "%s (%s, considered %s):\n", context, path, 
> encoding);
>   for (i = 0; i < len && buf; ++i) {
>   strbuf_addf(
> - ,"| \e[2m%2i:\e[0m %2x \e[2m%c\e[0m%c",
> + ,"| \x1B[2m%2i:\x1B[0m %2x \x1B[2m%c\x1B[0m%c",
>   i,
>   (unsigned char) buf[i],
>   (buf[i] > 32 && buf[i] < 127 ? buf[i] : ' '),

[PATCH v1] checkout: optionally speed up "git checkout -b foo"

2018-07-24 Thread Ben Peart

From: Ben Peart 

If the new core.optimizecheckout config setting is set to true, speed up
"git checkout -b foo" by avoiding the work to merge the working tree.  This
is valid because no merge needs to occur - only creating the new branch/
updating the refs. Any other options force it through the old code path.

This change in behavior is off by default and behind the config setting so
that users have to opt-in to the optimized behavior.

We've been running with this patch internally for a long time but it was
rejected when I submitted it to the mailing list before because it
implicitly changes the behavior of checkout -b. Trying it again configured
behind a config setting as a potential solution for other optimizations to
checkout that could change the behavior as well.

https://public-inbox.org/git/20180724042740.gb13...@sigill.intra.peff.net/T/#m75afe3ab318d23f36334cf3a6e3d058839592469

Signed-off-by: Ben Peart 
---

Notes:
Base Ref: master
Web-Diff: https://github.com/benpeart/git/commit/f43d934ce7
Checkout: git fetch https://github.com/benpeart/git checkout-b-v1 && git 
checkout f43d934ce7

 Documentation/config.txt |  6 +++
 builtin/checkout.c   | 94 
 cache.h  |  1 +
 config.c |  5 +++
 environment.c|  1 +
 5 files changed, 107 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index a32172a43c..2c4f513bf1 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -911,6 +911,12 @@ core.commitGraph::
Enable git commit graph feature. Allows reading from the
commit-graph file.
 
+core.optimizedCheckout
+   Speed up "git checkout -b foo" by skipping much of the work of a
+   full checkout command.  This changs the behavior as it will skip
+   merging the trees and updating the index and instead only create
+   and switch to the new ref.
+
 core.sparseCheckout::
Enable "sparse checkout" feature. See section "Sparse checkout" in
linkgit:git-read-tree[1] for more information.
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 28627650cd..b186a3201e 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -41,6 +41,10 @@ struct checkout_opts {
int ignore_skipworktree;
int ignore_other_worktrees;
int show_progress;
+   /*
+* If new checkout options are added, needs_working_tree_merge
+* should be updated accordingly.
+*/
 
const char *new_branch;
const char *new_branch_force;
@@ -471,6 +475,88 @@ static void setup_branch_path(struct branch_info *branch)
branch->path = strbuf_detach(, NULL);
 }
 
+static int needs_working_tree_merge(const struct checkout_opts *opts,
+   const struct branch_info *old_branch_info,
+   const struct branch_info *new_branch_info)
+{
+   /*
+* We must do the merge if we are actually moving to a new
+* commit tree.
+*/
+   if (!old_branch_info->commit || !new_branch_info->commit ||
+   oidcmp(_branch_info->commit->object.oid, 
_branch_info->commit->object.oid))
+   return 1;
+
+   /*
+* opts->patch_mode cannot be used with switching branches so is
+* not tested here
+*/
+
+   /*
+* opts->quiet only impacts output so doesn't require a merge
+*/
+
+   /*
+* Honor the explicit request for a three-way merge or to throw away
+* local changes
+*/
+   if (opts->merge || opts->force)
+   return 1;
+
+   /*
+* --detach is documented as "updating the index and the files in the
+* working tree" but this optimization skips those steps so fall through
+* to the regular code path.
+*/
+   if (opts->force_detach)
+   return 1;
+
+   /*
+* opts->writeout_stage cannot be used with switching branches so is
+* not tested here
+*/
+
+   /*
+* Honor the explicit ignore requests
+*/
+   if (!opts->overwrite_ignore || opts->ignore_skipworktree ||
+   opts->ignore_other_worktrees)
+   return 1;
+
+   /*
+* opts->show_progress only impacts output so doesn't require a merge
+*/
+
+   /*
+* If we aren't creating a new branch any changes or updates will
+* happen in the existing branch.  Since that could only be updating
+* the index and working directory, we don't want to skip those steps
+* or we've defeated any purpose in running the command.
+*/
+   if (!opts->new_branch)
+   return 1;
+
+   /*
+* new_branch_force is defined to "create/reset and checkout a branch"
+* so needs to go through the merge to do the reset
+*/
+   if (opts->new_branch_force)
+   return 1;
+
+   /*
+* A new orphaned branch requrires the index and

Re: [PATCH] Documentation/diff-options: explain different diff algorithms

2018-07-24 Thread Stefan Beller

On Mon, Jul 23, 2018 at 9:41 PM Jonathan Nieder  wrote:
>
> Hi,
>
> Stefan Beller wrote:
>
> > As a user I wondered what the diff algorithms are about. Offer at least
> > a basic explanation on the differences of the diff algorithms.
>
> Interesting.  Let's see.
>
> [...]
> > --- a/Documentation/diff-options.txt
> > +++ b/Documentation/diff-options.txt
> > @@ -94,16 +94,34 @@ diff" algorithm internally.
> >   Choose a diff algorithm. The variants are as follows:
> >  +
> >  --
> > -`default`, `myers`;;
> > - The basic greedy diff algorithm. Currently, this is the default.
> >  `minimal`;;
> > + A diff as produced by the basic greedy algorithm described in
>
> Why this reordering?

because Myers (below) is constructed from minimal + heuristic.
Before this patch we say Myers is myers and minimal "spends
extra cycles", which is true if you read the code, as it just is

for (...) {
test_path
if (need_min)
continue;
if (heuristic_good_enough())
break;
}

https://github.com/git/git/blob/master/xdiff/xdiffi.c#L152

So the "minimal" algorithm is the basic myers algorithm,
and the "myers" algorithm is the extended version (extended
with a heuristic to be fast in corner cases, still producing good
enough diffs).

>
> > + link:http://www.xmailserver.org/diff2.pdf[An O(ND) Difference 
> > Algorithm and its Variations]
>
> This URL doesn't seem likely to stay stable.  Are appearances
> deceiving?  (Maybe so, since that's the same host as libxdiff's
> homepage .)  It might be
> worth mentioning the author and date just in case.

I though about it and did not mention it, as the name
"An O(ND) Difference Algorithm and its Variations" is already
enough to find that paper quickly.

> > - Spend extra time to make sure the smallest possible diff is
> > - produced.
> > +`default`, `myers`;;
> > + The same algorithm as `minimal`, extended with a heuristic to
> > + reduce extensive searches. Currently, this is the default.
>
> Amusing --- this means that the Myers diff is spelled as "minimal"
> instead of "myers".
>
> >  `patience`;;
> > - Use "patience diff" algorithm when generating patches.
> > + Use "patience diff" algorithm when generating patches. This
> > + matches the longest common subsequence of unique lines on
> > + both sides, recursively. It obtained its name by the way the
> > + longest subsequence is found, as that is a byproduct of the
> > + patience sorting algorithm. If there are no unique lines left
> > + it falls back to `myers`. Empirically this algorithm produces
> > + a more readable output for code, but it does not garantuee
> > + the shortest output.
>
> Probably also worth mentioning that this was invented by Bram Cohen
> for the bazaar vcs.

I tried to balance the part that reads like a history lesson and the
immediately useful part (why is this better, when should I use it?)

Mentioning bazaar probably makes sense. Not sure about the author,
but I'll include him in a reroll of this patch.

> >  `histogram`;;
> > - This algorithm extends the patience algorithm to "support
> > - low-occurrence common elements".
> > + This algorithm re-implements the `patience` algorithm with
>
> What does reimplements mean here?  Do you mean that it is a variation
> on patience?

It is supposed to be a variation of patience, but as it comes with its
own implementation which I would not fully trust (as it was ported
from JGit, and reading the comments over there, a misunderstanding
of LCS made it possible to have only one midpoint before recursing)

So it is not just taking the patience algorithm and adding support for
"low-occurrence common elements"  (what does that even mean?
patience already distinguishes uniques and non-uniques), but
its own implementation, hence it is not bug-for-bug compatible.

> > + "support of low-occurrence common elements" and only picks
> > + one element of the LCS for the recursion. It is often the
>
> Does LCS mean longest common subsequence or longest common substring
> here?  Probably best to spell it out to avoid confusion.

When writing the patch I meant to refer to longest common subsequence
from above, but by picking just one element, this algorithm understands it
as longest common string.


>
> > + fastest, but in cornercases (when there are many longest
>
> s/cornercases/corner cases/
>
> > + common subsequences of the same length) it produces bad
> > + results as seen in:
> > +
> > + seq 1 100 >one
> > + echo 99 > two
> > + seq 1 2 98 >>two
> > + git diff --no-index --histogram one two
>
> I wonder if these details (about all the algorithms) should go in a
> separate Diff Algorithms section and this section could focus on
> what use cases warrant using each, referring to that section for
> details.  What do you think?

Yeah I thought

Re: [PATCH v1] msvc: fix non-standard escape sequence in source

2018-07-24 Thread Eric Sunshine

On Tue, Jul 24, 2018 at 10:43 AM  wrote:
> Replace non-standard "\e" escape sequence with "\x1B".
>
> In commit 7a17918c34f4e83982456ffe22d880c3cda5384f a trace message with
> several "\e" escape sequences was added.  This causes a compiler warning
> under MSVC.

Wrong commit. That code was actually introduced by 541d059cd9
(convert: add tracing for 'working-tree-encoding' attribute,
2018-04-15).

> diff --git a/convert.c b/convert.c
> @@ -335,7 +335,7 @@ static void trace_encoding(const char *context, const 
> char *path,
> strbuf_addf(, "%s (%s, considered %s):\n", context, path, 
> encoding);
> for (i = 0; i < len && buf; ++i) {
> strbuf_addf(
> -   ,"| \e[2m%2i:\e[0m %2x \e[2m%c\e[0m%c",
> +   ,"| \x1B[2m%2i:\x1B[0m %2x \x1B[2m%c\x1B[0m%c",

In this codebase, ESC is always spelled as octal \033 rather than hex \x1b.

Re: [PATCH v2 1/4] automatically ban strcpy()

2018-07-24 Thread Eric Sunshine

On Tue, Jul 24, 2018 at 5:26 AM Jeff King  wrote:
>   1. We'll only trigger with -Wimplicit-function-declaration
>  (and only stop compilation with -Werror). These are
>  generally enabled by DEVELOPER=1. If you _don't_ have
>  that set, we'll still catch the problem, but only at
>  link-time, with a slightly less useful message:
>
>  If instead we convert this to a reference to an
>  undefined variable, that always dies immediately. But
>  gcc seems to print the set of errors twice, which
>  clutters things up.

The above does a pretty good job of convincing me that this ought to
be implemented via an undefined variable rather than undefined
function, exactly because it is the newcomer or casual contributor who
is most likely to trip over a banned function, and almost certainly
won't have DEVELOPER=1 set. The gcc clutter seems a minor point
against the benefit this provides to that audience.

Re: [RFC PATCH 3/5] pack-objects: add delta-islands support

2018-07-24 Thread Stefan Beller

On Tue, Jul 24, 2018 at 2:58 AM Jeff King  wrote:
>
> On Mon, Jul 23, 2018 at 11:52:49AM -0700, Stefan Beller wrote:
>
> > > +DELTA ISLANDS
> > > +-
> > [...]
> >
> > I had to read all of this [background information] to understand the
> > concept and I think it is misnamed, as my gut instinct first told me
> > to have deltas only "within an island and no island hopping is allowed".
> > (This message reads a bit like a commit message, not as documentation
> > as it is long winded, too).
>
> I'm not sure if I'm parsing your sentence correctly, but the reason I
> called them "islands" is exactly that you'd have deltas within an island
> and want to forbid island hopping. So I wasn't sure if you were saying
> "that's what I think, but not how the feature works".

Yeah, you want to avoid island hopping, but still want the sea bed to be
useful for all islands (i.e. to have the largest possible base pack, that all
islands can share without being overexposed with objects they should
not see). And that is the main feature IMHO. It is not so much about
island separation (as you could use its own physically separated repo
for these separation tasks), but the main selling point of this feature
is that it enables a base pack that can be shared across all islands
without violating ACLs for example or making one island "too big"
(having unneeded objects on the island).

So metaphorically speaking I assumed the sea bed to support
all islands, which themselves are independent from each other.

> There _is_ a tricky thing, which is that a given object is going to
> appear in many islands. So the rule is really "you cannot hop to a base
> that is not in all of your islands".

Yes, if islands were numbers, we'd be looking for the Greatest common
common divisor for all of them, as all islands have to have access to
all objects in the base pack.

>
> > What about renaming this feature to
> >
> > [pack]
> > excludePartialReach = refs/virtual/[0-9]]+/tags/
> >
> >   "By setting `pack.excludePartialReach`, object deltafication is
> >   prohibited for objects that are not reachable from all
> >   manifestations of the given regex"
> >
> > Cryptic, but it explains it in my mind in a shorter, more concise way. ;-)
>
> So I'm hopelessly biased at this point, having developed and worked with
> the feature under the "island" name for several years. But I find your
> name and explanation pretty confusing. :)

Ok.

>
> Worse, though, it does not have any noun to refer to the reachable set.
> The regex capture and the island names are an important part of the
> feature, because it lets you make a single island out of:
>
>   refs/virtual/([0-9]+)/heads
>   refs/virtual/([0-9]+)/tags
>
> but exclude:
>
>   refs/virtual/([0-9]+)/(foo)
>
> which goes into its own island ("123-foo" instead of "123"). So what's
> the equivalent nomenclature to "island name"?

So in my understanding we have a "common base pack" and specific
packs on top for each "island".

Regarding naming I find islands interesting and well fitting here, I don't
know of any better name.

I was just confused in the beginning as the name indicates that we care
about the islands, but we rather care about *not* hopping them.

Do you envision to have "groups of islands" (an atoll) for say all
open source clones of linux.git, such that you can layer the packs?
You would not just have the base pack + island pack, but have one
pack that is common to most islands?

Sorry if the initial email came off as a rant; I think the islands
metaphor is very good.

Thanks,
Stefan

Re: [RFC PATCH 0/5] Add delta islands support

2018-07-24 Thread Junio C Hamano

Jeff King  writes:

>I think this is inherent in the scheme (we're losing some delta
>opportunities). But I think it's also made worse because the delta
>window gets clogged with candidates that are forbidden by the island
>config.

Hmph, and the reason why objects that do not even belong to the same
island to be usable as a base are in the object list in the first
place is...?

>Repacking with a big --window helps (and doesn't take as long
>as it otherwise might because we can reject some object pairs based
>on islands before doing any computation on the content).

Ah, then yes, a large window with early culling based on the delta
island criteria definitely sounds like the right solution to that
problem.

>I have replacement code (which we have been running in production)
>that is more clever about the threshold, and also can handle gaps in
>the continuity (so we might realize we need to send objects 1-5000,
>then skip a few, then 5037-8000, and so on).

That vaguely sounds similar to what folks at $DAYJOB runs in their
Gerrit/jgit thing.

[PATCH v3] Makefile: add a DEVOPTS flag to get pedantic compilation

2018-07-24 Thread Beat Bolli

In the interest of code hygiene, make it easier to compile Git with the
flag -pedantic.

Pure pedantic compilation with GCC 7.3 results in one warning per use of
the translation macro `N_`:

warning: array initialized from parenthesized string constant [-Wpedantic]

Therefore also disable the parenthesising of i18n strings with
-DUSE_PARENS_AROUND_GETTEXT_N=no.

Signed-off-by: Beat Bolli 
---

Now with -DUSE_PARENS_AROUND_GETTEXT_N=0 instead of =No.

This is the convenience knob for all developers that led to the series
bb/pedantic[1]. It does not depend on this series, though.

[1] https://public-inbox.org/git/20180708144342.11922-1-dev+...@drbeat.li/T/#u

 Makefile   | 6 ++
 config.mak.dev | 5 +
 2 files changed, 11 insertions(+)

diff --git a/Makefile b/Makefile
index 0cb6590f24..2bfc051652 100644
--- a/Makefile
+++ b/Makefile
@@ -484,6 +484,12 @@ all::
 #The DEVELOPER mode enables -Wextra with a few exceptions. By
 #setting this flag the exceptions are removed, and all of
 #-Wextra is used.
+#
+#pedantic:
+#
+#Enable -pedantic compilation. This also disables
+#USE_PARENS_AROUND_GETTEXT_N to produce only relevant warnings.
 
 GIT-VERSION-FILE: FORCE
@$(SHELL_PATH) ./GIT-VERSION-GEN
diff --git a/config.mak.dev b/config.mak.dev
index 2d244ca470..e11dd94741 100644
--- a/config.mak.dev
+++ b/config.mak.dev
@@ -1,6 +1,11 @@
 ifeq ($(filter no-error,$(DEVOPTS)),)
 CFLAGS += -Werror
 endif
+ifneq ($(filter pedantic,$(DEVOPTS)),)
+CFLAGS += -pedantic
+# don't warn for each N_ use
+CFLAGS += -DUSE_PARENS_AROUND_GETTEXT_N=0
+endif
 CFLAGS += -Wdeclaration-after-statement
 CFLAGS += -Wno-format-zero-length
 CFLAGS += -Wold-style-definition
-- 
2.18.0.203.gfac676dfb9

Re: [RFC PATCH 3/5] pack-objects: add delta-islands support

2018-07-24 Thread Junio C Hamano

Junio C Hamano  writes:

>> +---
>> +[pack]
>> +island = refs/virtual/([0-9]+)/heads/
>> +island = refs/virtual/([0-9]+)/tags/
>> +island = refs/virtual/([0-9]+)/(pull)/
>> +---
>
> It becomes clear only from this example that what the feature calls
> (and documented in patch 2/5) "regexp" is not BRE but ERE.  Update
> 2/5 so that it is clear to readers of "git config --help" who looks
> for "pack.island" in the output.
>
>> +That puts the heads and tags for each fork in their own island (named
>> +"1234" or similar), and the pull refs for each go into their own
>> +"1234-pull".
>
> "by concatenating any capture groups" made me imagine that the last
> one would be "1234pull" without dash.  The actual rule should be
> mentioned in that paragraph (i.e.  "concatenating any capture groups
> from the regex, with a '-' dash in between" or something like that).

Another thing I noticed from 2/5 is that you can have up to 7 such
capture groups.  I do not have any opinion if 7 is too few or too
many, but we would want the number to be documented, and end-user
input diagnosed when it requires more captures than we support (if
we are not already checking, that is).

Re: [RFC PATCH 3/5] pack-objects: add delta-islands support

2018-07-24 Thread Junio C Hamano

Christian Couder  writes:

> +Islands are configured via the `pack.island` option, which can be
> +specified multiple times. Each value is a left-anchored regular
> +expressions matching refnames. For example:
> +
> +---
> +[pack]
> +island = refs/heads/
> +island = refs/tags/
> +---
> +
> +puts heads and tags into an island (whose name is the empty string; see
> +below for more on naming). Any refs which do not match those regular
> +expressions (e.g., `refs/pull/123`) is not in any island. Any object
> +which is reachable only from `refs/pull/` (but not heads or tags) is
> +therefore not a candidate to be used as a base for `refs/heads/`.
> +
> +Refs are grouped into islands based on their "names", and two regexes
> +that produce the same name are considered to be in the same island. The
> +names are computed from the regexes by concatenating any capture groups
> +from the regex (and if there are none, then the name is the empty
> +string, as in the above example). This allows you to create arbitrary
> +numbers of islands. For example, imagine you store the refs for each
> +fork in `refs/virtual/ID`, where `ID` is a numeric identifier. You might
> +then configure:
> +
> +---
> +[pack]
> +island = refs/virtual/([0-9]+)/heads/
> +island = refs/virtual/([0-9]+)/tags/
> +island = refs/virtual/([0-9]+)/(pull)/
> +---

It becomes clear only from this example that what the feature calls
(and documented in patch 2/5) "regexp" is not BRE but ERE.  Update
2/5 so that it is clear to readers of "git config --help" who looks
for "pack.island" in the output.

> +That puts the heads and tags for each fork in their own island (named
> +"1234" or similar), and the pull refs for each go into their own
> +"1234-pull".

"by concatenating any capture groups" made me imagine that the last
one would be "1234pull" without dash.  The actual rule should be
mentioned in that paragraph (i.e.  "concatenating any capture groups
from the regex, with a '-' dash in between" or something like that).

> +Note that we pick a single island for each regex to go into, using "last
> +one wins" ordering (which allows repo-specific config to take precedence
> +over user-wide config, and so forth).

Re: [RFC PATCH 2/5] Add delta-islands.{c,h}

2018-07-24 Thread Junio C Hamano

Christian Couder  writes:

> From: Jeff King 
>
> Hosting providers that allow users to "fork" existing
> repos want as much as possible those forks to use a
> small amount of disk space.

"... want those forks to consume as little amount of disk space as
possible?"  Or perhaps "... want those forks to share as much disk
space as possible"?

> Using alternates to keep all the objects from all the forks into a
> unique central repo is way to do that, but ...

s/is way to/is a way to/, I guess, but that makes it sound as if the
series invented a way to share objects without using alternates.

> it can have some
> drawbacks. Especially when packing the central repo, deltas will
> be created between objects from different forks.
>
> This can make cloning or fetching a fork much slower and much more
> CPU intensive as Git might have to compute new deltas for many
> objects to avoid sending objects from a different fork.
>
> Delta islands is a way to store objects from different forks in
> the same repo and packfile without having deltas between objects
> from different forks.

I think another paragraph between the above two is needed to briefly
outline the central idea (which would make it clear why that is
called "delta islands") is called for.  Perhaps

Because the inefficiency primarily arises when an object is
delitified against another object that does not exist in the
same fork, we partion objects into sets that appear in the
same fork, and define "delta islands".  When finding delta
base, we do not allow an object outside the same island to
be considered as its base.

or something along that line.  Perhaps that would even make the last
paragraph unnecessary.

> +struct island_bitmap {
> + uint32_t refcount;
> + uint32_t bits[];
> +};
> +
> +static uint32_t island_bitmap_size;
> +
> +/*
> + * Allocate a new bitmap; if "old" is not NULL, the new bitmap will be a copy
> + * of "old". Otherwise, the new bitmap is empty.
> + */
> +static struct island_bitmap *island_bitmap_new(const struct island_bitmap 
> *old)
> +{
> + size_t size = sizeof(struct island_bitmap) + (island_bitmap_size * 4);
> + struct island_bitmap *b = xcalloc(1, size);

Is one of the variants of flex array macros applicable here?

> + if (old)
> + memcpy(b, old, size);
> +
> + b->refcount = 1;
> + return b;
> +}
> +
> +static void island_bitmap_or(struct island_bitmap *a, const struct 
> island_bitmap *b)
> +{
> + uint32_t i;
> +
> + for (i = 0; i < island_bitmap_size; ++i)
> + a->bits[i] |= b->bits[i];
> +}
> +
> +static int island_bitmap_is_subset(struct island_bitmap *self,
> + struct island_bitmap *super)
> +{
> + uint32_t i;
> +
> + if (self == super)
> + return 1;
> +
> + for (i = 0; i < island_bitmap_size; ++i) {
> + if ((self->bits[i] & super->bits[i]) != self->bits[i])
> + return 0;
> + }
> +
> + return 1;
> +}
> +#define ISLAND_BITMAP_BLOCK(x) (x / 32)
> +#define ISLAND_BITMAP_MASK(x) (1 << (x % 32))
> +
> +static void island_bitmap_set(struct island_bitmap *self, uint32_t i)
> +{
> + self->bits[ISLAND_BITMAP_BLOCK(i)] |= ISLAND_BITMAP_MASK(i);
> +}
> +
> +static int island_bitmap_get(struct island_bitmap *self, uint32_t i)
> +{
> + return (self->bits[ISLAND_BITMAP_BLOCK(i)] & ISLAND_BITMAP_MASK(i)) != 
> 0;
> +}
> +

Not necessarily a complaint, but do we need another implementation
of bitmap here, or the compressed bitmap used for the pack bitmap is
unsuited for the purpose of this thing (e.g. perhaps it is overkill,
as we won't be shooting for saving disk footprint of a bitmap that
we are not going to save on disk anyway)?

> +static int cmp_tree_depth(const void *va, const void *vb)
> +{
> + struct object_entry *a = *(struct object_entry **)va;
> + struct object_entry *b = *(struct object_entry **)vb;
> + return a->tree_depth - b->tree_depth;
> +}
> +
> +void resolve_tree_islands(int progress, struct packing_data *to_pack)
> +{
> + struct progress *progress_state = NULL;
> + struct object_entry **todo;
> + int nr = 0;
> + int i;
> +
> + if (!island_marks)
> + return;
> +
> + /*
> +  * We process only trees, as commits and tags have already been handled
> +  * (and passed their marks on to root trees, as well. We must make sure
> +  * to process them in descending tree-depth order so that marks
> +  * propagate down the tree properly, even if a sub-tree is found in
> +  * multiple parent trees.
> +  */
> + todo = xmalloc(to_pack->nr_objects * sizeof(*todo));
> + for (i = 0; i < to_pack->nr_objects; i++) {
> + if (oe_type(_pack->objects[i]) == OBJ_TREE)
> + todo[nr++] = _pack->objects[i];
> + }
> + qsort(todo, nr, sizeof(*todo), cmp_tree_depth);

Hmph, at this stage nobody actually seems to set tree_depth; I am
wondering

[GSoC][PATCH v4 18/20] rebase--interactive2: rewrite the submodes of interactive rebase in C

2018-07-24 Thread Alban Gruin

This rewrites the submodes of interactive rebase (`--continue`,
`--skip`, `--edit-todo`, and `--show-current-patch`) in C.

git-rebase.sh is then modified to call directly git-rebase--interactive2
instead of git-rebase--interactive.sh.

Signed-off-by: Alban Gruin 
---
 builtin/rebase--interactive2.c | 47 +++---
 git-rebase.sh  | 45 +---
 2 files changed, 84 insertions(+), 8 deletions(-)

diff --git a/builtin/rebase--interactive2.c b/builtin/rebase--interactive2.c
index e89ef71499..53b4f7483d 100644
--- a/builtin/rebase--interactive2.c
+++ b/builtin/rebase--interactive2.c
@@ -134,11 +134,14 @@ int cmd_rebase__interactive(int argc, const char **argv, 
const char *prefix)
 {
struct replay_opts opts = REPLAY_OPTS_INIT;
unsigned flags = 0, keep_empty = 0, rebase_merges = 0, autosquash = 0;
-   int abbreviate_commands = 0, rebase_cousins = -1;
+   int abbreviate_commands = 0, rebase_cousins = -1, ret = 0;
const char *onto = NULL, *onto_name = NULL, *restrict_revision = NULL,
*squash_onto = NULL, *upstream = NULL, *head_name = NULL,
*switch_to = NULL, *cmd = NULL;
char *raw_strategies = NULL;
+   enum {
+   NONE = 0, CONTINUE, SKIP, EDIT_TODO, SHOW_CURRENT_PATCH
+   } command = 0;
struct option options[] = {
OPT_BOOL(0, "ff", _ff, N_("allow fast-forward")),
OPT_BOOL(0, "keep-empty", _empty, N_("keep empty 
commits")),
@@ -151,6 +154,13 @@ int cmd_rebase__interactive(int argc, const char **argv, 
const char *prefix)
 N_("move commits that begin with squash!/fixup!")),
OPT_BOOL(0, "signoff", , N_("sign commits")),
OPT__VERBOSE(, N_("be verbose")),
+   OPT_CMDMODE(0, "continue", , N_("continue rebase"),
+   CONTINUE),
+   OPT_CMDMODE(0, "skip", , N_("skip commit"), SKIP),
+   OPT_CMDMODE(0, "edit-todo", , N_("edit the todo list"),
+   EDIT_TODO),
+   OPT_CMDMODE(0, "show-current-patch", , N_("show the 
current patch"),
+   SHOW_CURRENT_PATCH),
OPT_STRING(0, "onto", , N_("onto"), N_("onto")),
OPT_STRING(0, "restrict-revision", _revision,
   N_("restrict-revision"), N_("restrict revision")),
@@ -194,7 +204,36 @@ int cmd_rebase__interactive(int argc, const char **argv, 
const char *prefix)
warning(_("--[no-]rebase-cousins has no effect without "
  "--rebase-merges"));
 
-   return !!do_interactive_rebase(, flags, switch_to, upstream, onto,
-  onto_name, squash_onto, head_name, 
restrict_revision,
-  raw_strategies, cmd, autosquash);
+   switch (command) {
+   case NONE:
+   ret = do_interactive_rebase(, flags, switch_to, upstream, 
onto,
+   onto_name, squash_onto, head_name, 
restrict_revision,
+   raw_strategies, cmd, autosquash);
+   break;
+   case SKIP: {
+   struct string_list merge_rr = STRING_LIST_INIT_DUP;
+
+   rerere_clear(_rr);
+   /* fallthrough */
+   case CONTINUE:
+   ret = sequencer_continue();
+   break;
+   }
+   case EDIT_TODO:
+   ret = edit_todo_list(flags);
+   break;
+   case SHOW_CURRENT_PATCH: {
+   struct child_process cmd = CHILD_PROCESS_INIT;
+
+   cmd.git_cmd = 1;
+   argv_array_pushl(, "show", "REBASE_HEAD", "--", NULL);
+   ret = run_command();
+
+   break;
+   }
+   default:
+   BUG("invalid command '%d'", command);
+   }
+
+   return !!ret;
 }
diff --git a/git-rebase.sh b/git-rebase.sh
index 86da3816be..65e909d7bd 100755
--- a/git-rebase.sh
+++ b/git-rebase.sh
@@ -200,19 +200,56 @@ finish_rebase () {
rm -rf "$state_dir"
 }
 
+run_interactive () {
+   GIT_CHERRY_PICK_HELP="$resolvemsg"
+   export GIT_CHERRY_PICK_HELP
+
+   test -n "$keep_empty" && keep_empty="--keep-empty"
+   test -n "$rebase_merges" && rebase_merges="--rebase-merges"
+   test -n "$rebase_cousins" && rebase_cousins="--rebase-cousins"
+   test -n "$autosquash" && autosquash="--autosquash"
+   test -n "$verbose" && verbose="--verbose"
+   test -n "$force_rebase" && force_rebase="--no-ff"
+   test -n "$restrict_revisions" && \
+   restrict_revisions="--restrict-revisions=^$restrict_revisions"
+   test -n "$upstream" && upstream="--upstream=$upstream"
+   test -n "$onto" && onto="--onto=$onto"
+   test -n "$squash_onto" && squash_onto="--squash-onto=$squash_onto"
+   test -n "$onto_name" &&

[GSoC][PATCH v4 20/20] rebase -i: move rebase--helper modes to rebase--interactive

2018-07-24 Thread Alban Gruin

This moves the rebase--helper modes still used by
git-rebase--preserve-merges.sh (`--shorten-ids`, `--expand-ids`,
`--check-todo-list`, `--rearrange-squash` and `--add-exec-commands`) to
rebase--interactive.c.

git-rebase--preserve-merges.sh is modified accordingly, and
rebase--helper.c is removed as it is useless now.

Signed-off-by: Alban Gruin 
---
 .gitignore |   1 -
 Makefile   |   1 -
 builtin/rebase--helper.c   | 226 -
 builtin/rebase--interactive.c  |  27 +++-
 git-rebase--preserve-merges.sh |  10 +-
 git.c  |   1 -
 6 files changed, 31 insertions(+), 235 deletions(-)
 delete mode 100644 builtin/rebase--helper.c

diff --git a/.gitignore b/.gitignore
index 3284a1e9b1..406f26d050 100644
--- a/.gitignore
+++ b/.gitignore
@@ -116,7 +116,6 @@
 /git-read-tree
 /git-rebase
 /git-rebase--am
-/git-rebase--helper
 /git-rebase--interactive
 /git-rebase--merge
 /git-rebase--preserve-merges
diff --git a/Makefile b/Makefile
index 584834726d..ca3a0888dd 100644
--- a/Makefile
+++ b/Makefile
@@ -1059,7 +1059,6 @@ BUILTIN_OBJS += builtin/prune.o
 BUILTIN_OBJS += builtin/pull.o
 BUILTIN_OBJS += builtin/push.o
 BUILTIN_OBJS += builtin/read-tree.o
-BUILTIN_OBJS += builtin/rebase--helper.o
 BUILTIN_OBJS += builtin/rebase--interactive.o
 BUILTIN_OBJS += builtin/receive-pack.o
 BUILTIN_OBJS += builtin/reflog.o
diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
deleted file mode 100644
index ac21e8e06e..00
--- a/builtin/rebase--helper.c
+++ /dev/null
@@ -1,226 +0,0 @@
-#include "builtin.h"
-#include "cache.h"
-#include "config.h"
-#include "parse-options.h"
-#include "sequencer.h"
-#include "rebase-interactive.h"
-#include "argv-array.h"
-#include "refs.h"
-#include "rerere.h"
-#include "alias.h"
-
-static GIT_PATH_FUNC(path_state_dir, "rebase-merge/")
-static GIT_PATH_FUNC(path_squash_onto, "rebase-merge/squash-onto")
-static GIT_PATH_FUNC(path_interactive, "rebase-merge/interactive")
-
-static int get_revision_ranges(const char *upstream, const char *onto,
-  const char **head_hash,
-  char **revisions, char **shortrevisions)
-{
-   const char *base_rev = upstream ? upstream : onto;
-   struct object_id orig_head;
-
-   if (get_oid("HEAD", _head))
-   return error(_("no HEAD?"));
-
-   *head_hash = find_unique_abbrev(_head, GIT_MAX_HEXSZ);
-
-   if (revisions)
-   *revisions = xstrfmt("%s...%s", base_rev, *head_hash);
-   if (shortrevisions) {
-   const char *shorthead;
-
-   shorthead = find_unique_abbrev(_head, DEFAULT_ABBREV);
-
-   if (upstream) {
-   const char *shortrev;
-   struct object_id rev_oid;
-
-   get_oid(base_rev, _oid);
-   shortrev = find_unique_abbrev(_oid, DEFAULT_ABBREV);
-
-   *shortrevisions = xstrfmt("%s..%s", shortrev, 
shorthead);
-   } else
-   *shortrevisions = xstrdup(shorthead);
-   }
-
-   return 0;
-}
-
-static int init_basic_state(struct replay_opts *opts, const char *head_name,
-   const char *onto, const char *orig_head)
-{
-   FILE *interactive;
-
-   if (!is_directory(path_state_dir()) && 
mkdir_in_gitdir(path_state_dir()))
-   return error_errno(_("could not create temporary %s"), 
path_state_dir());
-
-   delete_reflog("REBASE_HEAD");
-
-   interactive = fopen(path_interactive(), "w");
-   if (!interactive)
-   return error_errno(_("could not mark as interactive"));
-   fclose(interactive);
-
-   return write_basic_state(opts, head_name, onto, orig_head);
-}
-
-static const char * const builtin_rebase_helper_usage[] = {
-   N_("git rebase--helper []"),
-   NULL
-};
-
-int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
-{
-   struct replay_opts opts = REPLAY_OPTS_INIT;
-   unsigned flags = 0, keep_empty = 0, rebase_merges = 0, autosquash = 0;
-   int abbreviate_commands = 0, rebase_cousins = -1, ret;
-   const char *head_hash = NULL, *onto = NULL, *restrict_revision = NULL,
-   *squash_onto = NULL, *upstream = NULL, *head_name = NULL;
-   char *raw_strategies = NULL;
-   enum {
-   CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_OIDS, EXPAND_OIDS,
-   CHECK_TODO_LIST, REARRANGE_SQUASH, ADD_EXEC, EDIT_TODO, 
PREPARE_BRANCH,
-   COMPLETE_ACTION, INIT_BASIC_STATE
-   } command = 0;
-   struct option options[] = {
-   OPT_BOOL(0, "ff", _ff, N_("allow fast-forward")),
-   OPT_BOOL(0, "keep-empty", _empty, N_("keep empty 
commits")),
-   OPT_BOOL(0, "allow-empty-message", _empty_message,
-   N_("allow commits with empty messages")),
-   OPT_BOOL(0,

[GSoC][PATCH v4 07/20] rebase -i: rewrite checkout_onto() in C

2018-07-24 Thread Alban Gruin

This rewrites checkout_onto() from shell to C.

A new command (“checkout-onto”) is added to rebase--helper.c. The shell
version is then stripped.

Signed-off-by: Alban Gruin 
---
No changes since v3.

 builtin/rebase--helper.c   |  7 ++-
 git-rebase--interactive.sh | 25 -
 sequencer.c| 19 +++
 sequencer.h|  3 +++
 4 files changed, 32 insertions(+), 22 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 0e76dadba6..7d9426d23c 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -18,7 +18,8 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
enum {
CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_OIDS, EXPAND_OIDS,
CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH,
-   ADD_EXEC, APPEND_TODO_HELP, EDIT_TODO, PREPARE_BRANCH
+   ADD_EXEC, APPEND_TODO_HELP, EDIT_TODO, PREPARE_BRANCH,
+   CHECKOUT_ONTO
} command = 0;
struct option options[] = {
OPT_BOOL(0, "ff", _ff, N_("allow fast-forward")),
@@ -54,6 +55,8 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
EDIT_TODO),
OPT_CMDMODE(0, "prepare-branch", ,
N_("prepare the branch to be rebased"), 
PREPARE_BRANCH),
+   OPT_CMDMODE(0, "checkout-onto", ,
+   N_("checkout a commit"), CHECKOUT_ONTO),
OPT_END()
};
 
@@ -99,5 +102,7 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
return !!edit_todo_list(flags);
if (command == PREPARE_BRANCH && argc == 2)
return !!prepare_branch_to_be_rebased(, argv[1]);
+   if (command == CHECKOUT_ONTO && argc == 4)
+   return !!checkout_onto(, argv[1], argv[2], argv[3]);
usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 77e972bb6c..b68f108f28 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -28,17 +28,6 @@ case "$comment_char" in
;;
 esac
 
-orig_reflog_action="$GIT_REFLOG_ACTION"
-
-comment_for_reflog () {
-   case "$orig_reflog_action" in
-   ''|rebase*)
-   GIT_REFLOG_ACTION="rebase -i ($1)"
-   export GIT_REFLOG_ACTION
-   ;;
-   esac
-}
-
 die_abort () {
apply_autostash
rm -rf "$state_dir"
@@ -70,14 +59,6 @@ collapse_todo_ids() {
git rebase--helper --shorten-ids
 }
 
-# Switch to the branch in $into and notify it in the reflog
-checkout_onto () {
-   comment_for_reflog start
-   GIT_REFLOG_ACTION="$GIT_REFLOG_ACTION: checkout $onto_name"
-   output git checkout $onto || die_abort "$(gettext "could not detach 
HEAD")"
-   git update-ref ORIG_HEAD $orig_head
-}
-
 get_missing_commit_check_level () {
check_level=$(git config --get rebase.missingCommitsCheck)
check_level=${check_level:-ignore}
@@ -176,7 +157,8 @@ EOF
 
git rebase--helper --check-todo-list || {
ret=$?
-   checkout_onto
+   git rebase--helper --checkout-onto "$onto_name" "$onto" \
+   "$orig_head" ${verbose:+--verbose}
exit $ret
}
 
@@ -186,7 +168,8 @@ EOF
onto="$(git rebase--helper --skip-unnecessary-picks)" ||
die "Could not skip unnecessary pick commands"
 
-   checkout_onto
+   git rebase--helper --checkout-onto "$onto_name" "$onto" "$orig_head" \
+   ${verbose:+--verbose}
require_clean_work_tree "rebase"
exec git rebase--helper ${force_rebase:+--no-ff} $allow_empty_message \
 --continue
diff --git a/sequencer.c b/sequencer.c
index 0bc53e009e..c7d01d60b3 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -3169,6 +3169,25 @@ int prepare_branch_to_be_rebased(struct replay_opts 
*opts, const char *commit)
return 0;
 }
 
+int checkout_onto(struct replay_opts *opts,
+ const char *onto_name, const char *onto,
+ const char *orig_head)
+{
+   struct object_id oid;
+   const char *action = reflog_message(opts, "start", "checkout %s", 
onto_name);
+
+   if (get_oid(orig_head, ))
+   return error(_("%s: not a valid OID"), orig_head);
+
+   if (run_git_checkout(opts, onto, action)) {
+   apply_autostash(opts);
+   sequencer_remove_state(opts);
+   return error(_("could not detach HEAD"));
+   }
+
+   return update_ref(NULL, "ORIG_HEAD", , NULL, 0, 
UPDATE_REFS_MSG_ON_ERR);
+}
+
 static const char rescheduled_advice[] =
 N_("Could not execute the todo command\n"
 "\n"
diff --git a/sequencer.h b/sequencer.h
index 93da713fe2..11a5334612 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -107,6 +107,9 @@ void

[GSoC][PATCH v4 06/20] rebase -i: rewrite setup_reflog_action() in C

2018-07-24 Thread Alban Gruin

This rewrites (the misnamed) setup_reflog_action() from shell to C. The
new version is called prepare_branch_to_be_rebased().

A new command is added to rebase--helper.c, “checkout-base”, as well as
a new flag, “verbose”, to avoid silencing the output of the checkout
operation called by checkout_base_commit().

The function `run_git_checkout()` will also be used in the next commit,
therefore its code is not part of `checkout_base_commit()`.

The shell version is then stripped in favour of a call to the helper.

As $GIT_REFLOG_ACTION is no longer set at the first call of
checkout_onto(), a call to comment_for_reflog() is added at the
beginning of this function.

Signed-off-by: Alban Gruin 
---
Dropped the `verbose` variable.

 builtin/rebase--helper.c   |  7 ++-
 git-rebase--interactive.sh | 16 ++--
 sequencer.c| 30 ++
 sequencer.h|  2 ++
 4 files changed, 40 insertions(+), 15 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 731a64971d..0e76dadba6 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -18,7 +18,7 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
enum {
CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_OIDS, EXPAND_OIDS,
CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH,
-   ADD_EXEC, APPEND_TODO_HELP, EDIT_TODO
+   ADD_EXEC, APPEND_TODO_HELP, EDIT_TODO, PREPARE_BRANCH
} command = 0;
struct option options[] = {
OPT_BOOL(0, "ff", _ff, N_("allow fast-forward")),
@@ -28,6 +28,7 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
OPT_BOOL(0, "rebase-merges", _merges, N_("rebase merge 
commits")),
OPT_BOOL(0, "rebase-cousins", _cousins,
 N_("keep original branch points of cousins")),
+   OPT__VERBOSE(, N_("be verbose")),
OPT_CMDMODE(0, "continue", , N_("continue rebase"),
CONTINUE),
OPT_CMDMODE(0, "abort", , N_("abort rebase"),
@@ -51,6 +52,8 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
OPT_CMDMODE(0, "edit-todo", ,
N_("edit the todo list during an interactive 
rebase"),
EDIT_TODO),
+   OPT_CMDMODE(0, "prepare-branch", ,
+   N_("prepare the branch to be rebased"), 
PREPARE_BRANCH),
OPT_END()
};
 
@@ -94,5 +97,7 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
return !!append_todo_help(0, keep_empty);
if (command == EDIT_TODO && argc == 1)
return !!edit_todo_list(flags);
+   if (command == PREPARE_BRANCH && argc == 2)
+   return !!prepare_branch_to_be_rebased(, argv[1]);
usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 2defe607f4..77e972bb6c 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -72,6 +72,7 @@ collapse_todo_ids() {
 
 # Switch to the branch in $into and notify it in the reflog
 checkout_onto () {
+   comment_for_reflog start
GIT_REFLOG_ACTION="$GIT_REFLOG_ACTION: checkout $onto_name"
output git checkout $onto || die_abort "$(gettext "could not detach 
HEAD")"
git update-ref ORIG_HEAD $orig_head
@@ -119,19 +120,6 @@ initiate_action () {
esac
 }
 
-setup_reflog_action () {
-   comment_for_reflog start
-
-   if test ! -z "$switch_to"
-   then
-   GIT_REFLOG_ACTION="$GIT_REFLOG_ACTION: checkout $switch_to"
-   output git checkout "$switch_to" -- ||
-   die "$(eval_gettext "Could not checkout \$switch_to")"
-
-   comment_for_reflog start
-   fi
-}
-
 init_basic_state () {
orig_head=$(git rev-parse --verify HEAD) || die "$(gettext "No HEAD?")"
mkdir -p "$state_dir" || die "$(eval_gettext "Could not create 
temporary \$state_dir")"
@@ -211,7 +199,7 @@ git_rebase__interactive () {
return 0
fi
 
-   setup_reflog_action
+   git rebase--helper --prepare-branch "$switch_to" ${verbose:+--verbose}
init_basic_state
 
init_revisions_and_shortrevisions
diff --git a/sequencer.c b/sequencer.c
index 274cddd13f..0bc53e009e 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -3139,6 +3139,36 @@ static const char *reflog_message(struct replay_opts 
*opts,
return buf.buf;
 }
 
+static int run_git_checkout(struct replay_opts *opts, const char *commit,
+   const char *action)
+{
+   struct child_process cmd = CHILD_PROCESS_INIT;
+
+   cmd.git_cmd = 1;
+
+   argv_array_push(, "checkout");
+   argv_array_push(, commit);
+

[GSoC][PATCH v4 05/20] sequencer: add a new function to silence a command, except if it fails

2018-07-24 Thread Alban Gruin

This adds a new function, run_command_silent_on_success(), to
redirect the stdout and stderr of a command to a strbuf, and then to run
that command. This strbuf is printed only if the command fails. It is
functionnaly similar to output() from git-rebase.sh.

run_git_commit() is then refactored to use of
run_command_silent_on_success().

Signed-off-by: Alban Gruin 
---
No changes since v3.

 sequencer.c | 51 +--
 1 file changed, 25 insertions(+), 26 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 0fc03dfe48..274cddd13f 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -769,6 +769,23 @@ N_("you have staged changes in your working tree\n"
 #define VERIFY_MSG  (1<<4)
 #define CREATE_ROOT_COMMIT (1<<5)
 
+static int run_command_silent_on_success(struct child_process *cmd)
+{
+   struct strbuf buf = STRBUF_INIT;
+   int rc;
+
+   cmd->stdout_to_stderr = 1;
+   rc = pipe_command(cmd,
+ NULL, 0,
+ NULL, 0,
+ , 0);
+
+   if (rc)
+   fputs(buf.buf, stderr);
+   strbuf_release();
+   return rc;
+}
+
 /*
  * If we are cherry-pick, and if the merge did not result in
  * hand-editing, we will hit this commit and inherit the original
@@ -823,18 +840,11 @@ static int run_git_commit(const char *defmsg, struct 
replay_opts *opts,
 
cmd.git_cmd = 1;
 
-   if (is_rebase_i(opts)) {
-   if (!(flags & EDIT_MSG)) {
-   cmd.stdout_to_stderr = 1;
-   cmd.err = -1;
-   }
-
-   if (read_env_script(_array)) {
-   const char *gpg_opt = gpg_sign_opt_quoted(opts);
+   if (is_rebase_i(opts) && read_env_script(_array)) {
+   const char *gpg_opt = gpg_sign_opt_quoted(opts);
 
-   return error(_(staged_changes_advice),
-gpg_opt, gpg_opt);
-   }
+   return error(_(staged_changes_advice),
+gpg_opt, gpg_opt);
}
 
argv_array_push(, "commit");
@@ -864,21 +874,10 @@ static int run_git_commit(const char *defmsg, struct 
replay_opts *opts,
if (opts->allow_empty_message)
argv_array_push(, "--allow-empty-message");
 
-   if (cmd.err == -1) {
-   /* hide stderr on success */
-   struct strbuf buf = STRBUF_INIT;
-   int rc = pipe_command(,
- NULL, 0,
- /* stdout is already redirected */
- NULL, 0,
- , 0);
-   if (rc)
-   fputs(buf.buf, stderr);
-   strbuf_release();
-   return rc;
-   }
-
-   return run_command();
+   if (is_rebase_i(opts) && !(flags & EDIT_MSG))
+   return run_command_silent_on_success();
+   else
+   return run_command();
 }
 
 static int rest_is_empty(const struct strbuf *sb, int start)
-- 
2.18.0

[GSoC][PATCH v4 03/20] editor: add a function to launch the sequence editor

2018-07-24 Thread Alban Gruin

As part of the rewrite of interactive rebase, the sequencer will need to
open the sequence editor to allow the user to edit the todo list.
Instead of duplicating the existing launch_editor() function, this
refactors it to a new function, launch_specified_editor(), which takes
the editor as a parameter, in addition to the path, the buffer and the
environment variables.  launch_sequence_editor() is then added to launch
the sequence editor.

Signed-off-by: Alban Gruin 
---
No changes since v3.

 cache.h  |  1 +
 editor.c | 27 +--
 strbuf.h |  2 ++
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/cache.h b/cache.h
index 8b447652a7..d70ae49ca2 100644
--- a/cache.h
+++ b/cache.h
@@ -1409,6 +1409,7 @@ extern const char *fmt_name(const char *name, const char 
*email);
 extern const char *ident_default_name(void);
 extern const char *ident_default_email(void);
 extern const char *git_editor(void);
+extern const char *git_sequence_editor(void);
 extern const char *git_pager(int stdout_is_tty);
 extern int is_terminal_dumb(void);
 extern int git_ident_config(const char *, const char *, void *);
diff --git a/editor.c b/editor.c
index 9a9b4e12d1..c985eee1f9 100644
--- a/editor.c
+++ b/editor.c
@@ -1,4 +1,5 @@
 #include "cache.h"
+#include "config.h"
 #include "strbuf.h"
 #include "run-command.h"
 #include "sigchain.h"
@@ -34,10 +35,21 @@ const char *git_editor(void)
return editor;
 }
 
-int launch_editor(const char *path, struct strbuf *buffer, const char *const 
*env)
+const char *git_sequence_editor(void)
 {
-   const char *editor = git_editor();
+   const char *editor = getenv("GIT_SEQUENCE_EDITOR");
+
+   if (!editor)
+   git_config_get_string_const("sequence.editor", );
+   if (!editor)
+   editor = git_editor();
 
+   return editor;
+}
+
+static int launch_specified_editor(const char *editor, const char *path,
+  struct strbuf *buffer, const char *const 
*env)
+{
if (!editor)
return error("Terminal is dumb, but EDITOR unset");
 
@@ -95,3 +107,14 @@ int launch_editor(const char *path, struct strbuf *buffer, 
const char *const *en
return error_errno("could not read file '%s'", path);
return 0;
 }
+
+int launch_editor(const char *path, struct strbuf *buffer, const char *const 
*env)
+{
+   return launch_specified_editor(git_editor(), path, buffer, env);
+}
+
+int launch_sequence_editor(const char *path, struct strbuf *buffer,
+  const char *const *env)
+{
+   return launch_specified_editor(git_sequence_editor(), path, buffer, 
env);
+}
diff --git a/strbuf.h b/strbuf.h
index 60a35aef16..66da9822fd 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -575,6 +575,8 @@ extern void strbuf_add_unique_abbrev(struct strbuf *sb,
  * file's contents are not read into the buffer upon completion.
  */
 extern int launch_editor(const char *path, struct strbuf *buffer, const char 
*const *env);
+extern int launch_sequence_editor(const char *path, struct strbuf *buffer,
+ const char *const *env);
 
 extern void strbuf_add_lines(struct strbuf *sb, const char *prefix, const char 
*buf, size_t size);
 
-- 
2.18.0

[GSoC][PATCH v4 17/20] rebase -i: implement the main part of interactive rebase as a builtin

2018-07-24 Thread Alban Gruin

This rewrites the part of interactive rebase which initializes the
basic state, make the script and complete the action, as a buitin, named
git-rebase--interactive2 for now.  Others modes (`--continue`,
`--edit-todo`, etc.) will be rewritten in the next commit.

git-rebase--interactive.sh is modified to call git-rebase--interactive2
instead of git-rebase--helper.

Signed-off-by: Alban Gruin 
---
 .gitignore |   1 +
 Makefile   |   1 +
 builtin.h  |   1 +
 builtin/rebase--interactive2.c | 200 +
 git-rebase--interactive.sh |  41 +++
 git.c  |   1 +
 6 files changed, 226 insertions(+), 19 deletions(-)
 create mode 100644 builtin/rebase--interactive2.c

diff --git a/.gitignore b/.gitignore
index 3284a1e9b1..404c9a8472 100644
--- a/.gitignore
+++ b/.gitignore
@@ -118,6 +118,7 @@
 /git-rebase--am
 /git-rebase--helper
 /git-rebase--interactive
+/git-rebase--interactive2
 /git-rebase--merge
 /git-rebase--preserve-merges
 /git-receive-pack
diff --git a/Makefile b/Makefile
index 909a687857..71f8f45fe5 100644
--- a/Makefile
+++ b/Makefile
@@ -1060,6 +1060,7 @@ BUILTIN_OBJS += builtin/prune.o
 BUILTIN_OBJS += builtin/pull.o
 BUILTIN_OBJS += builtin/push.o
 BUILTIN_OBJS += builtin/read-tree.o
+BUILTIN_OBJS += builtin/rebase--interactive2.o
 BUILTIN_OBJS += builtin/rebase--helper.o
 BUILTIN_OBJS += builtin/receive-pack.o
 BUILTIN_OBJS += builtin/reflog.o
diff --git a/builtin.h b/builtin.h
index 0362f1ce25..ed89b4f495 100644
--- a/builtin.h
+++ b/builtin.h
@@ -202,6 +202,7 @@ extern int cmd_prune_packed(int argc, const char **argv, 
const char *prefix);
 extern int cmd_pull(int argc, const char **argv, const char *prefix);
 extern int cmd_push(int argc, const char **argv, const char *prefix);
 extern int cmd_read_tree(int argc, const char **argv, const char *prefix);
+extern int cmd_rebase__interactive(int argc, const char **argv, const char 
*prefix);
 extern int cmd_rebase__helper(int argc, const char **argv, const char *prefix);
 extern int cmd_receive_pack(int argc, const char **argv, const char *prefix);
 extern int cmd_reflog(int argc, const char **argv, const char *prefix);
diff --git a/builtin/rebase--interactive2.c b/builtin/rebase--interactive2.c
new file mode 100644
index 00..e89ef71499
--- /dev/null
+++ b/builtin/rebase--interactive2.c
@@ -0,0 +1,200 @@
+#include "builtin.h"
+#include "cache.h"
+#include "config.h"
+#include "parse-options.h"
+#include "sequencer.h"
+#include "rebase-interactive.h"
+#include "argv-array.h"
+#include "refs.h"
+#include "rerere.h"
+#include "run-command.h"
+
+static GIT_PATH_FUNC(path_state_dir, "rebase-merge/")
+static GIT_PATH_FUNC(path_squash_onto, "rebase-merge/squash-onto")
+static GIT_PATH_FUNC(path_interactive, "rebase-merge/interactive")
+
+static int get_revision_ranges(const char *upstream, const char *onto,
+  const char **head_hash,
+  char **revisions, char **shortrevisions)
+{
+   const char *base_rev = upstream ? upstream : onto, *shorthead;
+   struct object_id orig_head;
+
+   if (get_oid("HEAD", _head))
+   return error(_("no HEAD?"));
+
+   *head_hash = find_unique_abbrev(_head, GIT_MAX_HEXSZ);
+   *revisions = xstrfmt("%s...%s", base_rev, *head_hash);
+
+   shorthead = find_unique_abbrev(_head, DEFAULT_ABBREV);
+
+   if (upstream) {
+   const char *shortrev;
+   struct object_id rev_oid;
+
+   get_oid(base_rev, _oid);
+   shortrev = find_unique_abbrev(_oid, DEFAULT_ABBREV);
+
+   *shortrevisions = xstrfmt("%s..%s", shortrev, shorthead);
+   } else
+   *shortrevisions = xstrdup(shorthead);
+
+   return 0;
+}
+
+static int init_basic_state(struct replay_opts *opts, const char *head_name,
+   const char *onto, const char *orig_head)
+{
+   FILE *interactive;
+
+   if (!is_directory(path_state_dir()) && 
mkdir_in_gitdir(path_state_dir()))
+   return error_errno(_("could not create temporary %s"), 
path_state_dir());
+
+   delete_reflog("REBASE_HEAD");
+
+   interactive = fopen(path_interactive(), "w");
+   if (!interactive)
+   return error_errno(_("could not mark as interactive"));
+   fclose(interactive);
+
+   return write_basic_state(opts, head_name, onto, orig_head);
+}
+
+static int do_interactive_rebase(struct replay_opts *opts, unsigned flags,
+const char *switch_to, const char *upstream,
+const char *onto, const char *onto_name,
+const char *squash_onto, const char *head_name,
+const char *restrict_revision, char 
*raw_strategies,
+const char *cmd, unsigned autosquash)
+{
+   int ret;
+   const char *head_hash

[GSoC][PATCH v4 12/20] rebase -i: remove unused modes and functions

2018-07-24 Thread Alban Gruin

This removes the modes `--skip-unnecessary-picks`, `--append-todo-help`,
and `--checkout-onto` from rebase--helper.c, the functions of
git-rebase--interactive.sh that were rendered useless by the rewrite of
complete_action(), and append_todo_help_to_file() from
rebase-interactive.c.

skip_unnecessary_picks() and checkout_onto() becomes static, as they are
only used inside of the sequencer.

Signed-off-by: Alban Gruin 
---
 builtin/rebase--helper.c   | 23 ++
 git-rebase--interactive.sh | 50 --
 rebase-interactive.c   | 22 -
 rebase-interactive.h   |  1 -
 sequencer.c|  8 +++---
 sequencer.h|  4 ---
 6 files changed, 6 insertions(+), 102 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index d7fa5a5062..6085527b2b 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -17,9 +17,8 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
int abbreviate_commands = 0, rebase_cousins = -1;
enum {
CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_OIDS, EXPAND_OIDS,
-   CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH,
-   ADD_EXEC, APPEND_TODO_HELP, EDIT_TODO, PREPARE_BRANCH,
-   CHECKOUT_ONTO, COMPLETE_ACTION
+   CHECK_TODO_LIST, REARRANGE_SQUASH, ADD_EXEC, EDIT_TODO, 
PREPARE_BRANCH,
+   COMPLETE_ACTION
} command = 0;
struct option options[] = {
OPT_BOOL(0, "ff", _ff, N_("allow fast-forward")),
@@ -44,21 +43,15 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
N_("expand commit ids in the todo list"), EXPAND_OIDS),
OPT_CMDMODE(0, "check-todo-list", ,
N_("check the todo list"), CHECK_TODO_LIST),
-   OPT_CMDMODE(0, "skip-unnecessary-picks", ,
-   N_("skip unnecessary picks"), SKIP_UNNECESSARY_PICKS),
OPT_CMDMODE(0, "rearrange-squash", ,
N_("rearrange fixup/squash lines"), REARRANGE_SQUASH),
OPT_CMDMODE(0, "add-exec-commands", ,
N_("insert exec commands in todo list"), ADD_EXEC),
-   OPT_CMDMODE(0, "append-todo-help", ,
-   N_("insert the help in the todo list"), 
APPEND_TODO_HELP),
OPT_CMDMODE(0, "edit-todo", ,
N_("edit the todo list during an interactive 
rebase"),
EDIT_TODO),
OPT_CMDMODE(0, "prepare-branch", ,
N_("prepare the branch to be rebased"), 
PREPARE_BRANCH),
-   OPT_CMDMODE(0, "checkout-onto", ,
-   N_("checkout a commit"), CHECKOUT_ONTO),
OPT_CMDMODE(0, "complete-action", ,
N_("complete the action"), COMPLETE_ACTION),
OPT_END()
@@ -94,26 +87,14 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
return !!transform_todos(flags);
if (command == CHECK_TODO_LIST && argc == 1)
return !!check_todo_list();
-   if (command == SKIP_UNNECESSARY_PICKS && argc == 1) {
-   struct object_id oid;
-   int ret = skip_unnecessary_picks();
-
-   if (!ret)
-   puts(oid_to_hex());
-   return !!ret;
-   }
if (command == REARRANGE_SQUASH && argc == 1)
return !!rearrange_squash();
if (command == ADD_EXEC && argc == 2)
return !!sequencer_add_exec_commands(argv[1]);
-   if (command == APPEND_TODO_HELP && argc == 1)
-   return !!append_todo_help_to_file(0, keep_empty);
if (command == EDIT_TODO && argc == 1)
return !!edit_todo_list(flags);
if (command == PREPARE_BRANCH && argc == 2)
return !!prepare_branch_to_be_rebased(, argv[1]);
-   if (command == CHECKOUT_ONTO && argc == 4)
-   return !!checkout_onto(, argv[1], argv[2], argv[3]);
if (command == COMPLETE_ACTION && argc == 6)
return !!complete_action(, flags, argv[1], argv[2], 
argv[3],
 argv[4], argv[5], autosquash);
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 59dc4888a6..0d66c0f8b8 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -16,56 +16,6 @@ todo="$state_dir"/git-rebase-todo
 GIT_CHERRY_PICK_HELP="$resolvemsg"
 export GIT_CHERRY_PICK_HELP
 
-comment_char=$(git config --get core.commentchar 2>/dev/null)
-case "$comment_char" in
-'' | auto)
-   comment_char="#"
-   ;;
-?)
-   ;;
-*)
-   comment_char=$(echo "$comment_char" | cut -c1)
-   ;;
-esac
-
-die_abort () {
-   apply_autostash
-   rm -rf "$state_dir"
-   die "$1"
-}
-

[GSoC][PATCH v4 14/20] rebase -i: rewrite the rest of init_revisions_and_shortrevisions() in C

2018-07-24 Thread Alban Gruin

This rewrites the part of init_revisions_and_shortrevisions() needed by
`--complete-action` (which initialize $shortrevisions) from shell to C.

When `upstream` is empty, it means that the user launched a `rebase
--root`, and `onto` contains the ID of an empty commit.  As a range
between an empty commit and `head` is not really meaningful, `onto` is
not used to initialize `shortrevisions` in this case.

The corresponding arguments passed to `--complete-action` are then
dropped, and init_revisions_and_shortrevisions() is stripped from
git-rebase--interactive.sh

Signed-off-by: Alban Gruin 
---
 builtin/rebase--helper.c   | 40 --
 git-rebase--interactive.sh | 27 -
 2 files changed, 38 insertions(+), 29 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 15fa556f65..c4a4c5cfbb 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -10,7 +10,7 @@ static GIT_PATH_FUNC(path_squash_onto, 
"rebase-merge/squash-onto")
 
 static int get_revision_ranges(const char *upstream, const char *onto,
   const char **head_hash,
-  char **revisions)
+  char **revisions, char **shortrevisions)
 {
const char *base_rev = upstream ? upstream : onto;
struct object_id orig_head;
@@ -19,7 +19,25 @@ static int get_revision_ranges(const char *upstream, const 
char *onto,
return error(_("no HEAD?"));
 
*head_hash = find_unique_abbrev(_head, GIT_MAX_HEXSZ);
-   *revisions = xstrfmt("%s...%s", base_rev, *head_hash);
+
+   if (revisions)
+   *revisions = xstrfmt("%s...%s", base_rev, *head_hash);
+   if (shortrevisions) {
+   const char *shorthead;
+
+   shorthead = find_unique_abbrev(_head, DEFAULT_ABBREV);
+
+   if (upstream) {
+   const char *shortrev;
+   struct object_id rev_oid;
+
+   get_oid(base_rev, _oid);
+   shortrev = find_unique_abbrev(_oid, DEFAULT_ABBREV);
+
+   *shortrevisions = xstrfmt("%s..%s", shortrev, 
shorthead);
+   } else
+   *shortrevisions = xstrdup(shorthead);
+   }
 
return 0;
 }
@@ -116,7 +134,7 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
if (!upstream && squash_onto)
write_file(path_squash_onto(), "%s\n", squash_onto);
 
-   ret = get_revision_ranges(upstream, onto, _hash, 
);
+   ret = get_revision_ranges(upstream, onto, _hash, 
, NULL);
if (ret)
return ret;
 
@@ -145,9 +163,19 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
return !!edit_todo_list(flags);
if (command == PREPARE_BRANCH && argc == 2)
return !!prepare_branch_to_be_rebased(, argv[1]);
-   if (command == COMPLETE_ACTION && argc == 6)
-   return !!complete_action(, flags, argv[1], argv[2], 
argv[3],
-argv[4], argv[5], autosquash);
+   if (command == COMPLETE_ACTION && argc == 3) {
+   char *shortrevisions = NULL;
+
+   ret = get_revision_ranges(upstream, onto, _hash, NULL, 
);
+   if (ret)
+   return ret;
+
+   ret = complete_action(, flags, shortrevisions, argv[1], 
onto,
+ head_hash, argv[2], autosquash);
+
+   free(shortrevisions);
+   return !!ret;
+   }
 
usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 4ca47aed1e..08e9a21c2f 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -60,23 +60,6 @@ init_basic_state () {
write_basic_state
 }
 
-init_revisions_and_shortrevisions () {
-   shorthead=$(git rev-parse --short $orig_head)
-   shortonto=$(git rev-parse --short $onto)
-   if test -z "$rebase_root"
-   # this is now equivalent to ! -z "$upstream"
-   then
-   shortupstream=$(git rev-parse --short $upstream)
-   revisions=$upstream...$orig_head
-   shortrevisions=$shortupstream..$shorthead
-   else
-   revisions=$onto...$orig_head
-   shortrevisions=$shorthead
-   test -z "$squash_onto" ||
-   echo "$squash_onto" >"$state_dir"/squash-onto
-   fi
-}
-
 git_rebase__interactive () {
initiate_action "$action"
ret=$?
@@ -87,8 +70,6 @@ git_rebase__interactive () {
git rebase--helper --prepare-branch "$switch_to" ${verbose:+--verbose}
init_basic_state
 
-   init_revisions_and_shortrevisions
-
git rebase--helper --make-script

[GSoC][PATCH v4 13/20] rebase -i: implement the logic to initialize $revisions in C

2018-07-24 Thread Alban Gruin

This rewrites the part of init_revisions_and_shortrevisions() needed by
`--make-script` from shell to C.  The new version is called
get_revision_ranges(), and is a static function inside of
rebase--helper.c.  As this does not initialize $shortrevision, the
original shell version is not yet stripped.

Unlike init_revisions_and_shortrevisions(), get_revision_ranges()
doesn’t write $squash_onto to the state directory, it’s done by the
handler of `--make-script` instead.

Finally, this drops the $revision argument passed to `--make-script` in
git-rebase--interactive.sh, and rebase--helper is changed accordingly.

Signed-off-by: Alban Gruin 
---
 builtin/rebase--helper.c   | 56 --
 git-rebase--interactive.sh |  4 ++-
 2 files changed, 56 insertions(+), 4 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 6085527b2b..15fa556f65 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -4,6 +4,25 @@
 #include "parse-options.h"
 #include "sequencer.h"
 #include "rebase-interactive.h"
+#include "argv-array.h"
+
+static GIT_PATH_FUNC(path_squash_onto, "rebase-merge/squash-onto")
+
+static int get_revision_ranges(const char *upstream, const char *onto,
+  const char **head_hash,
+  char **revisions)
+{
+   const char *base_rev = upstream ? upstream : onto;
+   struct object_id orig_head;
+
+   if (get_oid("HEAD", _head))
+   return error(_("no HEAD?"));
+
+   *head_hash = find_unique_abbrev(_head, GIT_MAX_HEXSZ);
+   *revisions = xstrfmt("%s...%s", base_rev, *head_hash);
+
+   return 0;
+}
 
 static const char * const builtin_rebase_helper_usage[] = {
N_("git rebase--helper []"),
@@ -14,7 +33,9 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
 {
struct replay_opts opts = REPLAY_OPTS_INIT;
unsigned flags = 0, keep_empty = 0, rebase_merges = 0, autosquash = 0;
-   int abbreviate_commands = 0, rebase_cousins = -1;
+   int abbreviate_commands = 0, rebase_cousins = -1, ret;
+   const char *head_hash = NULL, *onto = NULL, *restrict_revision = NULL,
+   *squash_onto = NULL, *upstream = NULL;
enum {
CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_OIDS, EXPAND_OIDS,
CHECK_TODO_LIST, REARRANGE_SQUASH, ADD_EXEC, EDIT_TODO, 
PREPARE_BRANCH,
@@ -54,6 +75,13 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
N_("prepare the branch to be rebased"), 
PREPARE_BRANCH),
OPT_CMDMODE(0, "complete-action", ,
N_("complete the action"), COMPLETE_ACTION),
+   OPT_STRING(0, "onto", , N_("onto"), N_("onto")),
+   OPT_STRING(0, "restrict-revision", _revision,
+  N_("restrict-revision"), N_("restrict revision")),
+   OPT_STRING(0, "squash-onto", _onto, N_("squash-onto"),
+  N_("squash onto")),
+   OPT_STRING(0, "upstream", , N_("upstream"),
+  N_("the upstream commit")),
OPT_END()
};
 
@@ -81,8 +109,30 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
return !!sequencer_continue();
if (command == ABORT && argc == 1)
return !!sequencer_remove_state();
-   if (command == MAKE_SCRIPT && argc > 1)
-   return !!sequencer_make_script(stdout, argc, argv, flags);
+   if (command == MAKE_SCRIPT && argc == 1) {
+   char *revisions = NULL;
+   struct argv_array make_script_args = ARGV_ARRAY_INIT;
+
+   if (!upstream && squash_onto)
+   write_file(path_squash_onto(), "%s\n", squash_onto);
+
+   ret = get_revision_ranges(upstream, onto, _hash, 
);
+   if (ret)
+   return ret;
+
+   argv_array_pushl(_script_args, "", revisions, NULL);
+   if (restrict_revision)
+   argv_array_push(_script_args, restrict_revision);
+
+   ret = sequencer_make_script(stdout,
+   make_script_args.argc, 
make_script_args.argv,
+   flags);
+
+   free(revisions);
+   argv_array_clear(_script_args);
+
+   return !!ret;
+   }
if ((command == SHORTEN_OIDS || command == EXPAND_OIDS) && argc == 1)
return !!transform_todos(flags);
if (command == CHECK_TODO_LIST && argc == 1)
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 0d66c0f8b8..4ca47aed1e 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -92,7 +92,9 @@ git_rebase__interactive () {
git rebase--helper --make-script ${keep_empty:+--keep-empty} \

[GSoC][PATCH v4 19/20] rebase -i: remove git-rebase--interactive.sh

2018-07-24 Thread Alban Gruin

This removes git-rebase--interactive.sh, as its functionnality has been
replaced by git-rebase--interactive2.

git-rebase--interactive2.c is then renamed to git-rebase--interactive.c.

Signed-off-by: Alban Gruin 
---
 .gitignore|  1 -
 Makefile  |  4 +-
 ...--interactive2.c => rebase--interactive.c} |  0
 git-rebase--interactive.sh| 84 ---
 git-rebase.sh |  2 +-
 git.c |  2 +-
 6 files changed, 3 insertions(+), 90 deletions(-)
 rename builtin/{rebase--interactive2.c => rebase--interactive.c} (100%)
 delete mode 100644 git-rebase--interactive.sh

diff --git a/.gitignore b/.gitignore
index 404c9a8472..3284a1e9b1 100644
--- a/.gitignore
+++ b/.gitignore
@@ -118,7 +118,6 @@
 /git-rebase--am
 /git-rebase--helper
 /git-rebase--interactive
-/git-rebase--interactive2
 /git-rebase--merge
 /git-rebase--preserve-merges
 /git-receive-pack
diff --git a/Makefile b/Makefile
index 71f8f45fe5..584834726d 100644
--- a/Makefile
+++ b/Makefile
@@ -619,7 +619,6 @@ SCRIPT_SH += git-web--browse.sh
 SCRIPT_LIB += git-mergetool--lib
 SCRIPT_LIB += git-parse-remote
 SCRIPT_LIB += git-rebase--am
-SCRIPT_LIB += git-rebase--interactive
 SCRIPT_LIB += git-rebase--preserve-merges
 SCRIPT_LIB += git-rebase--merge
 SCRIPT_LIB += git-sh-setup
@@ -1060,8 +1059,8 @@ BUILTIN_OBJS += builtin/prune.o
 BUILTIN_OBJS += builtin/pull.o
 BUILTIN_OBJS += builtin/push.o
 BUILTIN_OBJS += builtin/read-tree.o
-BUILTIN_OBJS += builtin/rebase--interactive2.o
 BUILTIN_OBJS += builtin/rebase--helper.o
+BUILTIN_OBJS += builtin/rebase--interactive.o
 BUILTIN_OBJS += builtin/receive-pack.o
 BUILTIN_OBJS += builtin/reflog.o
 BUILTIN_OBJS += builtin/remote.o
@@ -2400,7 +2399,6 @@ XGETTEXT_FLAGS_PERL = $(XGETTEXT_FLAGS) --language=Perl \
 LOCALIZED_C = $(C_OBJ:o=c) $(LIB_H) $(GENERATED_H)
 LOCALIZED_SH = $(SCRIPT_SH)
 LOCALIZED_SH += git-parse-remote.sh
-LOCALIZED_SH += git-rebase--interactive.sh
 LOCALIZED_SH += git-rebase--preserve-merges.sh
 LOCALIZED_SH += git-sh-setup.sh
 LOCALIZED_PERL = $(SCRIPT_PERL)
diff --git a/builtin/rebase--interactive2.c b/builtin/rebase--interactive.c
similarity index 100%
rename from builtin/rebase--interactive2.c
rename to builtin/rebase--interactive.c
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
deleted file mode 100644
index e87d708a4d..00
--- a/git-rebase--interactive.sh
+++ /dev/null
@@ -1,84 +0,0 @@
-# This shell script fragment is sourced by git-rebase to implement
-# its interactive mode.  "git rebase --interactive" makes it easy
-# to fix up commits in the middle of a series and rearrange commits.
-#
-# Copyright (c) 2006 Johannes E. Schindelin
-#
-# The original idea comes from Eric W. Biederman, in
-# https://public-inbox.org/git/m1odwkyuf5.fsf...@ebiederm.dsl.xmission.com/
-#
-# The file containing rebase commands, comments, and empty lines.
-# This file is created by "git rebase -i" then edited by the user.  As
-# the lines are processed, they are removed from the front of this
-# file and written to the tail of $done.
-todo="$state_dir"/git-rebase-todo
-
-GIT_CHERRY_PICK_HELP="$resolvemsg"
-export GIT_CHERRY_PICK_HELP
-
-# Initiate an action. If the cannot be any
-# further action it  may exec a command
-# or exit and not return.
-#
-# TODO: Consider a cleaner return model so it
-# never exits and always return 0 if process
-# is complete.
-#
-# Parameter 1 is the action to initiate.
-#
-# Returns 0 if the action was able to complete
-# and if 1 if further processing is required.
-initiate_action () {
-   case "$1" in
-   continue)
-   exec git rebase--helper ${force_rebase:+--no-ff} 
$allow_empty_message \
---continue
-   ;;
-   skip)
-   git rerere clear
-   exec git rebase--helper ${force_rebase:+--no-ff} 
$allow_empty_message \
---continue
-   ;;
-   edit-todo)
-   exec git rebase--helper --edit-todo
-   ;;
-   show-current-patch)
-   exec git show REBASE_HEAD --
-   ;;
-   *)
-   return 1 # continue
-   ;;
-   esac
-}
-
-git_rebase__interactive () {
-   initiate_action "$action"
-   ret=$?
-   if test $ret = 0; then
-   return 0
-   fi
-
-   test -n "$keep_empty" && keep_empty="--keep-empty"
-   test -n "$rebase_merges" && rebase_merges="--rebase-merges"
-   test -n "$rebase_cousins" && rebase_cousins="--rebase-cousins"
-   test -n "$autosquash" && autosquash="--autosquash"
-   test -n "$verbose" && verbose="--verbose"
-   test -n "$force_rebase" && force_rebase="--no-ff"
-   test -n "$restrict_revisions" && 
restrict_revisions="--restrict-revisions=^$restrict_revisions"
-   test -n "$upstream" && upstream="--upstream=$upstream"
-   test -n

[GSoC][PATCH v4 15/20] rebase -i: rewrite write_basic_state() in C

2018-07-24 Thread Alban Gruin

This rewrites write_basic_state() from git-rebase.sh in C.  This is the
first step in the conversion of init_basic_state(), hence the mode in
rebase--helper.c is called INIT_BASIC_STATE.  init_basic_state() will be
converted in the next commit.

The part of read_strategy_opts() that parses the stategy options is
moved to a new function to allow its use in rebase--helper.c.

Finally, the call to write_basic_state() is removed from
git-rebase--interactive.sh, replaced by a call to `--init-basic-state`.

Signed-off-by: Alban Gruin 
---
All patches from this one are new.

 builtin/rebase--helper.c   | 28 +-
 git-rebase--interactive.sh |  7 +++-
 sequencer.c| 77 --
 sequencer.h|  4 ++
 4 files changed, 102 insertions(+), 14 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index c4a4c5cfbb..06fe3c018b 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -5,6 +5,8 @@
 #include "sequencer.h"
 #include "rebase-interactive.h"
 #include "argv-array.h"
+#include "rerere.h"
+#include "alias.h"
 
 static GIT_PATH_FUNC(path_squash_onto, "rebase-merge/squash-onto")
 
@@ -53,11 +55,12 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
unsigned flags = 0, keep_empty = 0, rebase_merges = 0, autosquash = 0;
int abbreviate_commands = 0, rebase_cousins = -1, ret;
const char *head_hash = NULL, *onto = NULL, *restrict_revision = NULL,
-   *squash_onto = NULL, *upstream = NULL;
+   *squash_onto = NULL, *upstream = NULL, *head_name = NULL;
+   char *raw_strategies = NULL;
enum {
CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_OIDS, EXPAND_OIDS,
CHECK_TODO_LIST, REARRANGE_SQUASH, ADD_EXEC, EDIT_TODO, 
PREPARE_BRANCH,
-   COMPLETE_ACTION
+   COMPLETE_ACTION, INIT_BASIC_STATE
} command = 0;
struct option options[] = {
OPT_BOOL(0, "ff", _ff, N_("allow fast-forward")),
@@ -69,6 +72,7 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
 N_("keep original branch points of cousins")),
OPT_BOOL(0, "autosquash", ,
 N_("move commits thas begin with squash!/fixup!")),
+   OPT_BOOL(0, "signoff", , N_("sign commits")),
OPT__VERBOSE(, N_("be verbose")),
OPT_CMDMODE(0, "continue", , N_("continue rebase"),
CONTINUE),
@@ -93,6 +97,8 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
N_("prepare the branch to be rebased"), 
PREPARE_BRANCH),
OPT_CMDMODE(0, "complete-action", ,
N_("complete the action"), COMPLETE_ACTION),
+   OPT_CMDMODE(0, "init-basic-state", ,
+   N_("initialise the rebase state"), 
INIT_BASIC_STATE),
OPT_STRING(0, "onto", , N_("onto"), N_("onto")),
OPT_STRING(0, "restrict-revision", _revision,
   N_("restrict-revision"), N_("restrict revision")),
@@ -100,6 +106,14 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
   N_("squash onto")),
OPT_STRING(0, "upstream", , N_("upstream"),
   N_("the upstream commit")),
+   OPT_STRING(0, "head-name", _name, N_("head-name"), 
N_("head name")),
+   OPT_STRING('S', "gpg-sign", _sign, N_("gpg-sign"),
+  N_("GPG-sign commits")),
+   OPT_STRING(0, "strategy", , N_("strategy"),
+  N_("rebase strategy")),
+   OPT_STRING(0, "strategy-opts", _strategies, 
N_("strategy-opts"),
+  N_("strategy options")),
+   OPT_RERERE_AUTOUPDATE(_rerere_auto),
OPT_END()
};
 
@@ -176,6 +190,16 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
free(shortrevisions);
return !!ret;
}
+   if (command == INIT_BASIC_STATE) {
+   if (raw_strategies)
+   parse_strategy_opts(, raw_strategies);
+
+   ret = get_revision_ranges(upstream, onto, _hash, NULL, 
NULL);
+   if (ret)
+   return ret;
+
+   return !!write_basic_state(, head_name, onto, head_hash);
+   }
 
usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 08e9a21c2f..6367da66e2 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -57,7 +57,6 @@ init_basic_state () {
rm -f "$(git rev-parse --git-path REBASE_HEAD)"
 
: > "$state_dir"/interactive || die "$(gettext "Could not mark as 
interactive")"
-

[GSoC][PATCH v4 16/20] rebase -i: rewrite init_basic_state() in C

2018-07-24 Thread Alban Gruin

This rewrites init_basic_state() from shell to C.  The call to
write_basic_state() in cmd_rebase__helper() is replaced by a call to the
new function.

The shell version is then stripped from git-rebase--interactive.sh.

Signed-off-by: Alban Gruin 
---
 builtin/rebase--helper.c   | 23 ++-
 git-rebase--interactive.sh |  9 -
 2 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 06fe3c018b..ac21e8e06e 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -5,10 +5,13 @@
 #include "sequencer.h"
 #include "rebase-interactive.h"
 #include "argv-array.h"
+#include "refs.h"
 #include "rerere.h"
 #include "alias.h"
 
+static GIT_PATH_FUNC(path_state_dir, "rebase-merge/")
 static GIT_PATH_FUNC(path_squash_onto, "rebase-merge/squash-onto")
+static GIT_PATH_FUNC(path_interactive, "rebase-merge/interactive")
 
 static int get_revision_ranges(const char *upstream, const char *onto,
   const char **head_hash,
@@ -44,6 +47,24 @@ static int get_revision_ranges(const char *upstream, const 
char *onto,
return 0;
 }
 
+static int init_basic_state(struct replay_opts *opts, const char *head_name,
+   const char *onto, const char *orig_head)
+{
+   FILE *interactive;
+
+   if (!is_directory(path_state_dir()) && 
mkdir_in_gitdir(path_state_dir()))
+   return error_errno(_("could not create temporary %s"), 
path_state_dir());
+
+   delete_reflog("REBASE_HEAD");
+
+   interactive = fopen(path_interactive(), "w");
+   if (!interactive)
+   return error_errno(_("could not mark as interactive"));
+   fclose(interactive);
+
+   return write_basic_state(opts, head_name, onto, orig_head);
+}
+
 static const char * const builtin_rebase_helper_usage[] = {
N_("git rebase--helper []"),
NULL
@@ -198,7 +219,7 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
if (ret)
return ret;
 
-   return !!write_basic_state(, head_name, onto, head_hash);
+   return !!init_basic_state(, head_name, onto, head_hash);
}
 
usage_with_options(builtin_rebase_helper_usage, options);
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 6367da66e2..761be95ed1 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -51,14 +51,6 @@ initiate_action () {
esac
 }
 
-init_basic_state () {
-   orig_head=$(git rev-parse --verify HEAD) || die "$(gettext "No HEAD?")"
-   mkdir -p "$state_dir" || die "$(eval_gettext "Could not create 
temporary \$state_dir")"
-   rm -f "$(git rev-parse --git-path REBASE_HEAD)"
-
-   : > "$state_dir"/interactive || die "$(gettext "Could not mark as 
interactive")"
-}
-
 git_rebase__interactive () {
initiate_action "$action"
ret=$?
@@ -67,7 +59,6 @@ git_rebase__interactive () {
fi
 
git rebase--helper --prepare-branch "$switch_to" ${verbose:+--verbose}
-   init_basic_state
 
git rebase--helper --init-basic-state ${upstream:+--upstream 
"$upstream"} \
${onto:+--onto "$onto"} ${head_name:+--head-name "$head_name"} \
-- 
2.18.0

[GSoC][PATCH v4 08/20] sequencer: refactor append_todo_help() to write its message to a buffer

2018-07-24 Thread Alban Gruin

This refactors append_todo_help() to write its message to a buffer
instead of the todo-list.  This is needed for the rewrite of
complete_action(), which will come after the next commit.

As rebase--helper still needs the file manipulation part of
append_todo_help(), it is extracted to a temporary function,
append_todo_help_to_file().  This function will go away after the
rewrite of complete_action().

Signed-off-by: Alban Gruin 
---
Dropped the `include` in rebase-interactive.h.

 builtin/rebase--helper.c |  2 +-
 rebase-interactive.c | 45 
 rebase-interactive.h |  4 +++-
 3 files changed, 36 insertions(+), 15 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 7d9426d23c..313092c465 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -97,7 +97,7 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
if (command == ADD_EXEC && argc == 2)
return !!sequencer_add_exec_commands(argv[1]);
if (command == APPEND_TODO_HELP && argc == 1)
-   return !!append_todo_help(0, keep_empty);
+   return !!append_todo_help_to_file(0, keep_empty);
if (command == EDIT_TODO && argc == 1)
return !!edit_todo_list(flags);
if (command == PREPARE_BRANCH && argc == 2)
diff --git a/rebase-interactive.c b/rebase-interactive.c
index 403ecbf3c9..d8b9465597 100644
--- a/rebase-interactive.c
+++ b/rebase-interactive.c
@@ -4,11 +4,9 @@
 #include "sequencer.h"
 #include "strbuf.h"
 
-int append_todo_help(unsigned edit_todo, unsigned keep_empty)
+void append_todo_help(unsigned edit_todo, unsigned keep_empty,
+ struct strbuf *buf)
 {
-   struct strbuf buf = STRBUF_INIT;
-   FILE *todo;
-   int ret;
const char *msg = _("\nCommands:\n"
 "p, pick  = use commit\n"
 "r, reword  = use commit, but edit the commit message\n"
@@ -26,11 +24,7 @@ int append_todo_help(unsigned edit_todo, unsigned keep_empty)
 "\n"
 "These lines can be re-ordered; they are executed from top to bottom.\n");
 
-   todo = fopen_or_warn(rebase_path_todo(), "a");
-   if (!todo)
-   return 1;
-
-   strbuf_add_commented_lines(, msg, strlen(msg));
+   strbuf_add_commented_lines(buf, msg, strlen(msg));
 
if (get_missing_commit_check_level() == MISSING_COMMIT_CHECK_ERROR)
msg = _("\nDo not remove any line. Use 'drop' "
@@ -39,7 +33,7 @@ int append_todo_help(unsigned edit_todo, unsigned keep_empty)
msg = _("\nIf you remove a line here "
 "THAT COMMIT WILL BE LOST.\n");
 
-   strbuf_add_commented_lines(, msg, strlen(msg));
+   strbuf_add_commented_lines(buf, msg, strlen(msg));
 
if (edit_todo)
msg = _("\nYou are editing the todo file "
@@ -50,12 +44,25 @@ int append_todo_help(unsigned edit_todo, unsigned 
keep_empty)
msg = _("\nHowever, if you remove everything, "
"the rebase will be aborted.\n\n");
 
-   strbuf_add_commented_lines(, msg, strlen(msg));
+   strbuf_add_commented_lines(buf, msg, strlen(msg));
 
if (!keep_empty) {
msg = _("Note that empty commits are commented out");
-   strbuf_add_commented_lines(, msg, strlen(msg));
+   strbuf_add_commented_lines(buf, msg, strlen(msg));
}
+}
+
+int append_todo_help_to_file(unsigned edit_todo, unsigned keep_empty)
+{
+   struct strbuf buf = STRBUF_INIT;
+   FILE *todo;
+   int ret;
+
+   todo = fopen_or_warn(rebase_path_todo(), "a");
+   if (!todo)
+   return 1;
+
+   append_todo_help(edit_todo, keep_empty, );
 
ret = fputs(buf.buf, todo);
if (ret < 0)
@@ -88,7 +95,19 @@ int edit_todo_list(unsigned flags)
strbuf_release();
 
transform_todos(flags | TODO_LIST_SHORTEN_IDS);
-   append_todo_help(1, 0);
+
+   strbuf_read_file(, todo_file, 0);
+   append_todo_help(1, 0, );
+
+   todo = fopen_or_warn(todo_file, "w");
+   if (!todo) {
+   strbuf_release();
+   return 1;
+   }
+
+   strbuf_write(, todo);
+   fclose(todo);
+   strbuf_release();
 
if (launch_sequence_editor(todo_file, NULL, NULL))
return 1;
diff --git a/rebase-interactive.h b/rebase-interactive.h
index 155219e742..d33f3176b7 100644
--- a/rebase-interactive.h
+++ b/rebase-interactive.h
@@ -1,7 +1,9 @@
 #ifndef REBASE_INTERACTIVE_H
 #define REBASE_INTERACTIVE_H
 
-int append_todo_help(unsigned edit_todo, unsigned keep_empty);
+void append_todo_help(unsigned edit_todo, unsigned keep_empty,
+ struct strbuf *buf);
+int append_todo_help_to_file(unsigned edit_todo, unsigned keep_empty);
 int edit_todo_list(unsigned flags);
 
 #endif
-- 
2.18.0

[GSoC][PATCH v4 10/20] t3404: todo list with commented-out commands only aborts

2018-07-24 Thread Alban Gruin

If the todo list generated by `--make-script` is empty,
complete_action() writes a noop, but if it has only commented-out
commands, it will abort with the message "Nothing to do", and does not
launch the editor.  This adds a new test to ensure that
complete_action() behaves this way.

Signed-off-by: Alban Gruin 
---
New patch.

 t/t3404-rebase-interactive.sh | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index d392160ba9..7755b68092 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -75,6 +75,16 @@ test_expect_success 'rebase --keep-empty' '
test_line_count = 6 actual
 '
 
+cat > expect &1 &&
+   test_i18ncmp expect actual
+'
+
 test_expect_success 'rebase -i with the exec command' '
git checkout master &&
(
-- 
2.18.0

[GSoC][PATCH v4 11/20] rebase -i: rewrite complete_action() in C

2018-07-24 Thread Alban Gruin

This rewrites complete_action() from shell to C.

A new mode is added to rebase--helper (`--complete-action`), as well as
a new flag (`--autosquash`).

Finally, complete_action() is stripped from git-rebase--interactive.sh.

The original complete_action() would return the code 2 when the todo
list contained no actions.  This was a special case for rebase -i and
-p; git-rebase.sh would then apply the autostash, delete the state
directory, and die with the message "Nothing to do".  This cleanup is
rewritten in C instead of returning 2.  As rebase -i no longer returns
2, the comment describing this behaviour in git-rebase.sh is updated to
reflect this change.

The first check might seem useless as we write "noop" to the todo list
if it is empty.  Actually, the todo list might contain commented
commands (ie. empty commits).  In this case, complete_action() won’t
write "noop", and will abort without starting the editor.

Signed-off-by: Alban Gruin 
---
 builtin/rebase--helper.c   | 12 -
 git-rebase--interactive.sh | 53 ++--
 git-rebase.sh  |  2 +-
 sequencer.c| 99 ++
 sequencer.h|  4 ++
 5 files changed, 118 insertions(+), 52 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index bed3dd2b95..d7fa5a5062 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -13,13 +13,13 @@ static const char * const builtin_rebase_helper_usage[] = {
 int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 {
struct replay_opts opts = REPLAY_OPTS_INIT;
-   unsigned flags = 0, keep_empty = 0, rebase_merges = 0;
+   unsigned flags = 0, keep_empty = 0, rebase_merges = 0, autosquash = 0;
int abbreviate_commands = 0, rebase_cousins = -1;
enum {
CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_OIDS, EXPAND_OIDS,
CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH,
ADD_EXEC, APPEND_TODO_HELP, EDIT_TODO, PREPARE_BRANCH,
-   CHECKOUT_ONTO
+   CHECKOUT_ONTO, COMPLETE_ACTION
} command = 0;
struct option options[] = {
OPT_BOOL(0, "ff", _ff, N_("allow fast-forward")),
@@ -29,6 +29,8 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
OPT_BOOL(0, "rebase-merges", _merges, N_("rebase merge 
commits")),
OPT_BOOL(0, "rebase-cousins", _cousins,
 N_("keep original branch points of cousins")),
+   OPT_BOOL(0, "autosquash", ,
+N_("move commits thas begin with squash!/fixup!")),
OPT__VERBOSE(, N_("be verbose")),
OPT_CMDMODE(0, "continue", , N_("continue rebase"),
CONTINUE),
@@ -57,6 +59,8 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
N_("prepare the branch to be rebased"), 
PREPARE_BRANCH),
OPT_CMDMODE(0, "checkout-onto", ,
N_("checkout a commit"), CHECKOUT_ONTO),
+   OPT_CMDMODE(0, "complete-action", ,
+   N_("complete the action"), COMPLETE_ACTION),
OPT_END()
};
 
@@ -110,5 +114,9 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
return !!prepare_branch_to_be_rebased(, argv[1]);
if (command == CHECKOUT_ONTO && argc == 4)
return !!checkout_onto(, argv[1], argv[2], argv[3]);
+   if (command == COMPLETE_ACTION && argc == 6)
+   return !!complete_action(, flags, argv[1], argv[2], 
argv[3],
+argv[4], argv[5], autosquash);
+
usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index b68f108f28..59dc4888a6 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -127,54 +127,6 @@ init_revisions_and_shortrevisions () {
fi
 }
 
-complete_action() {
-   test -s "$todo" || echo noop >> "$todo"
-   test -z "$autosquash" || git rebase--helper --rearrange-squash || exit
-   test -n "$cmd" && git rebase--helper --add-exec-commands "$cmd"
-
-   todocount=$(git stripspace --strip-comments <"$todo" | wc -l)
-   todocount=${todocount##* }
-
-cat >>"$todo" <"$todo" ||
die "$(gettext "Could not generate todo list")"
 
-   complete_action
+   exec git rebase--helper --complete-action "$shortrevisions" 
"$onto_name" \
+   "$shortonto" "$orig_head" "$cmd" $allow_empty_message \
+   ${autosquash:+--autosquash} ${keep_empty:+--keep-empty} \
+   ${verbose:+--verbose} ${force_rebase:+--no-ff}
 }
diff --git a/git-rebase.sh b/git-rebase.sh
index f3b10c7f62..86da3816be 100755
--- a/git-rebase.sh
+++ b/git-rebase.sh
@@ -219,7 +219,7 @@

[GSoC][PATCH v4 09/20] sequencer: change the way skip_unnecessary_picks() returns its result

2018-07-24 Thread Alban Gruin

Instead of skip_unnecessary_picks() printing its result to stdout, it
returns it into a struct object_id, as the rewrite of complete_action()
(to come in the next commit) will need it.

rebase--helper then is modified to fit this change.

Signed-off-by: Alban Gruin 
---
 builtin/rebase--helper.c | 10 --
 sequencer.c  | 13 ++---
 sequencer.h  |  2 +-
 3 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 313092c465..bed3dd2b95 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -90,8 +90,14 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
return !!transform_todos(flags);
if (command == CHECK_TODO_LIST && argc == 1)
return !!check_todo_list();
-   if (command == SKIP_UNNECESSARY_PICKS && argc == 1)
-   return !!skip_unnecessary_picks();
+   if (command == SKIP_UNNECESSARY_PICKS && argc == 1) {
+   struct object_id oid;
+   int ret = skip_unnecessary_picks();
+
+   if (!ret)
+   puts(oid_to_hex());
+   return !!ret;
+   }
if (command == REARRANGE_SQUASH && argc == 1)
return !!rearrange_squash();
if (command == ADD_EXEC && argc == 2)
diff --git a/sequencer.c b/sequencer.c
index c7d01d60b3..238c534049 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -4416,17 +4416,17 @@ static int rewrite_file(const char *path, const char 
*buf, size_t len)
 }
 
 /* skip picking commits whose parents are unchanged */
-int skip_unnecessary_picks(void)
+int skip_unnecessary_picks(struct object_id *output_oid)
 {
const char *todo_file = rebase_path_todo();
struct strbuf buf = STRBUF_INIT;
struct todo_list todo_list = TODO_LIST_INIT;
-   struct object_id onto_oid, *oid = _oid, *parent_oid;
+   struct object_id *parent_oid;
int fd, i;
 
if (!read_oneliner(, rebase_path_onto(), 0))
return error(_("could not read 'onto'"));
-   if (get_oid(buf.buf, _oid)) {
+   if (get_oid(buf.buf, output_oid)) {
strbuf_release();
return error(_("need a HEAD to fixup"));
}
@@ -4456,9 +4456,9 @@ int skip_unnecessary_picks(void)
if (item->commit->parents->next)
break; /* merge commit */
parent_oid = >commit->parents->item->object.oid;
-   if (hashcmp(parent_oid->hash, oid->hash))
+   if (hashcmp(parent_oid->hash, output_oid->hash))
break;
-   oid = >commit->object.oid;
+   oidcpy(output_oid, >commit->object.oid);
}
if (i > 0) {
int offset = get_item_line_offset(_list, i);
@@ -4487,11 +4487,10 @@ int skip_unnecessary_picks(void)
 
todo_list.current = i;
if (is_fixup(peek_command(_list, 0)))
-   record_in_rewritten(oid, peek_command(_list, 0));
+   record_in_rewritten(output_oid, 
peek_command(_list, 0));
}
 
todo_list_release(_list);
-   printf("%s\n", oid_to_hex(oid));
 
return 0;
 }
diff --git a/sequencer.h b/sequencer.h
index 11a5334612..f11dabfd65 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -88,7 +88,7 @@ int sequencer_add_exec_commands(const char *command);
 int transform_todos(unsigned flags);
 enum missing_commit_check_level get_missing_commit_check_level(void);
 int check_todo_list(void);
-int skip_unnecessary_picks(void);
+int skip_unnecessary_picks(struct object_id *output_oid);
 int rearrange_squash(void);
 
 extern const char sign_off_header[];
-- 
2.18.0

[GSoC][PATCH v4 04/20] rebase -i: rewrite the edit-todo functionality in C

2018-07-24 Thread Alban Gruin

This rewrites the edit-todo functionality from shell to C.

To achieve that, a new command mode, `edit-todo`, is added, and the
`write-edit-todo` flag is removed, as the shell script does not need to
write the edit todo help message to the todo list anymore.

The shell version is then stripped in favour of a call to the helper.

Signed-off-by: Alban Gruin 
---
No code changes since v3.

 builtin/rebase--helper.c   | 13 -
 git-rebase--interactive.sh | 11 +--
 rebase-interactive.c   | 31 +++
 rebase-interactive.h   |  1 +
 4 files changed, 41 insertions(+), 15 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index 05e73e71d4..731a64971d 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -13,12 +13,12 @@ static const char * const builtin_rebase_helper_usage[] = {
 int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 {
struct replay_opts opts = REPLAY_OPTS_INIT;
-   unsigned flags = 0, keep_empty = 0, rebase_merges = 0, write_edit_todo 
= 0;
+   unsigned flags = 0, keep_empty = 0, rebase_merges = 0;
int abbreviate_commands = 0, rebase_cousins = -1;
enum {
CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_OIDS, EXPAND_OIDS,
CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH,
-   ADD_EXEC, APPEND_TODO_HELP
+   ADD_EXEC, APPEND_TODO_HELP, EDIT_TODO
} command = 0;
struct option options[] = {
OPT_BOOL(0, "ff", _ff, N_("allow fast-forward")),
@@ -28,8 +28,6 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
OPT_BOOL(0, "rebase-merges", _merges, N_("rebase merge 
commits")),
OPT_BOOL(0, "rebase-cousins", _cousins,
 N_("keep original branch points of cousins")),
-   OPT_BOOL(0, "write-edit-todo", _edit_todo,
-N_("append the edit-todo message to the todo-list")),
OPT_CMDMODE(0, "continue", , N_("continue rebase"),
CONTINUE),
OPT_CMDMODE(0, "abort", , N_("abort rebase"),
@@ -50,6 +48,9 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
N_("insert exec commands in todo list"), ADD_EXEC),
OPT_CMDMODE(0, "append-todo-help", ,
N_("insert the help in the todo list"), 
APPEND_TODO_HELP),
+   OPT_CMDMODE(0, "edit-todo", ,
+   N_("edit the todo list during an interactive 
rebase"),
+   EDIT_TODO),
OPT_END()
};
 
@@ -90,6 +91,8 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
if (command == ADD_EXEC && argc == 2)
return !!sequencer_add_exec_commands(argv[1]);
if (command == APPEND_TODO_HELP && argc == 1)
-   return !!append_todo_help(write_edit_todo, keep_empty);
+   return !!append_todo_help(0, keep_empty);
+   if (command == EDIT_TODO && argc == 1)
+   return !!edit_todo_list(flags);
usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 94c23a7af2..2defe607f4 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -108,16 +108,7 @@ initiate_action () {
 --continue
;;
edit-todo)
-   git stripspace --strip-comments <"$todo" >"$todo".new
-   mv -f "$todo".new "$todo"
-   collapse_todo_ids
-   git rebase--helper --append-todo-help --write-edit-todo
-
-   git_sequence_editor "$todo" ||
-   die "$(gettext "Could not execute editor")"
-   expand_todo_ids
-
-   exit
+   exec git rebase--helper --edit-todo
;;
show-current-patch)
exec git show REBASE_HEAD --
diff --git a/rebase-interactive.c b/rebase-interactive.c
index d7996bc8d9..403ecbf3c9 100644
--- a/rebase-interactive.c
+++ b/rebase-interactive.c
@@ -66,3 +66,34 @@ int append_todo_help(unsigned edit_todo, unsigned keep_empty)
 
return ret;
 }
+
+int edit_todo_list(unsigned flags)
+{
+   struct strbuf buf = STRBUF_INIT;
+   const char *todo_file = rebase_path_todo();
+   FILE *todo;
+
+   if (strbuf_read_file(, todo_file, 0) < 0)
+   return error_errno(_("could not read '%s'."), todo_file);
+
+   strbuf_stripspace(, 1);
+   todo = fopen_or_warn(todo_file, "w");
+   if (!todo) {
+   strbuf_release();
+   return 1;
+   }
+
+   strbuf_write(, todo);
+   fclose(todo);
+   strbuf_release();
+
+   transform_todos(flags | TODO_LIST_SHORTEN_IDS);
+   append_todo_help(1, 0);
+
+   if

[GSoC][PATCH v4 01/20] sequencer: make two functions and an enum from sequencer.c public

2018-07-24 Thread Alban Gruin

This makes rebase_path_todo(), get_missing_commit_check_level() and the
enum check_level accessible outside sequencer.c, renames check_level to
missing_commit_check_level, and prefixes its value names by
MISSING_COMMIT_ to avoid namespace pollution.

This function and this enum will eventually be moved to
rebase-interactive.c and become static again, so no special attention
was given to the naming.

This will be needed for the rewrite of append_todo_help() from shell to
C, as it will be in a new library source file, rebase-interactive.c.

Signed-off-by: Alban Gruin 
---
No code changes since v3.

 sequencer.c | 22 +-
 sequencer.h |  8 
 2 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 03c47405fb..0fc03dfe48 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -52,7 +52,7 @@ static GIT_PATH_FUNC(rebase_path, "rebase-merge")
  * the lines are processed, they are removed from the front of this
  * file and written to the tail of 'done'.
  */
-static GIT_PATH_FUNC(rebase_path_todo, "rebase-merge/git-rebase-todo")
+GIT_PATH_FUNC(rebase_path_todo, "rebase-merge/git-rebase-todo")
 /*
  * The rebase command lines that have already been processed. A line
  * is moved here when it is first handled, before any associated user
@@ -4245,24 +4245,20 @@ int transform_todos(unsigned flags)
return i;
 }
 
-enum check_level {
-   CHECK_IGNORE = 0, CHECK_WARN, CHECK_ERROR
-};
-
-static enum check_level get_missing_commit_check_level(void)
+enum missing_commit_check_level get_missing_commit_check_level(void)
 {
const char *value;
 
if (git_config_get_value("rebase.missingcommitscheck", ) ||
!strcasecmp("ignore", value))
-   return CHECK_IGNORE;
+   return MISSING_COMMIT_CHECK_IGNORE;
if (!strcasecmp("warn", value))
-   return CHECK_WARN;
+   return MISSING_COMMIT_CHECK_WARN;
if (!strcasecmp("error", value))
-   return CHECK_ERROR;
+   return MISSING_COMMIT_CHECK_ERROR;
warning(_("unrecognized setting %s for option "
  "rebase.missingCommitsCheck. Ignoring."), value);
-   return CHECK_IGNORE;
+   return MISSING_COMMIT_CHECK_IGNORE;
 }
 
 define_commit_slab(commit_seen, unsigned char);
@@ -4274,7 +4270,7 @@ define_commit_slab(commit_seen, unsigned char);
  */
 int check_todo_list(void)
 {
-   enum check_level check_level = get_missing_commit_check_level();
+   enum missing_commit_check_level check_level = 
get_missing_commit_check_level();
struct strbuf todo_file = STRBUF_INIT;
struct todo_list todo_list = TODO_LIST_INIT;
struct strbuf missing = STRBUF_INIT;
@@ -4291,7 +4287,7 @@ int check_todo_list(void)
advise_to_edit_todo = res =
parse_insn_buffer(todo_list.buf.buf, _list);
 
-   if (res || check_level == CHECK_IGNORE)
+   if (res || check_level == MISSING_COMMIT_CHECK_IGNORE)
goto leave_check;
 
/* Mark the commits in git-rebase-todo as seen */
@@ -4326,7 +4322,7 @@ int check_todo_list(void)
if (!missing.len)
goto leave_check;
 
-   if (check_level == CHECK_ERROR)
+   if (check_level == MISSING_COMMIT_CHECK_ERROR)
advise_to_edit_todo = res = 1;
 
fprintf(stderr,
diff --git a/sequencer.h b/sequencer.h
index c5787c6b56..ffab798f1e 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -3,6 +3,7 @@
 
 const char *git_path_commit_editmsg(void);
 const char *git_path_seq_dir(void);
+const char *rebase_path_todo(void);
 
 #define APPEND_SIGNOFF_DEDUP (1u << 0)
 
@@ -57,6 +58,12 @@ struct replay_opts {
 };
 #define REPLAY_OPTS_INIT { .action = -1, .current_fixups = STRBUF_INIT }
 
+enum missing_commit_check_level {
+   MISSING_COMMIT_CHECK_IGNORE = 0,
+   MISSING_COMMIT_CHECK_WARN,
+   MISSING_COMMIT_CHECK_ERROR
+};
+
 /* Call this to setup defaults before parsing command line options */
 void sequencer_init_config(struct replay_opts *opts);
 int sequencer_pick_revisions(struct replay_opts *opts);
@@ -79,6 +86,7 @@ int sequencer_make_script(FILE *out, int argc, const char 
**argv,
 
 int sequencer_add_exec_commands(const char *command);
 int transform_todos(unsigned flags);
+enum missing_commit_check_level get_missing_commit_check_level(void);
 int check_todo_list(void);
 int skip_unnecessary_picks(void);
 int rearrange_squash(void);
-- 
2.18.0

[GSoC][PATCH v4 02/20] rebase -i: rewrite append_todo_help() in C

2018-07-24 Thread Alban Gruin

This rewrites append_todo_help() from shell to C. It also incorporates
some parts of initiate_action() and complete_action() that also write
help texts to the todo file.

This also introduces the source file rebase-interactive.c. This file
will contain functions necessary for interactive rebase that are too
specific for the sequencer, and is part of libgit.a.

Two flags are added to rebase--helper.c: one to call
append_todo_help() (`--append-todo-help`), and another one to tell
append_todo_help() to write the help text suited for the edit-todo
mode (`--write-edit-todo`).

Finally, append_todo_help() is removed from git-rebase--interactive.sh
to use `rebase--helper --append-todo-help` instead.

Signed-off-by: Alban Gruin 
---
No code changes since v3.

 Makefile   |  1 +
 builtin/rebase--helper.c   | 11 --
 git-rebase--interactive.sh | 52 ++---
 rebase-interactive.c   | 68 ++
 rebase-interactive.h   |  6 
 5 files changed, 86 insertions(+), 52 deletions(-)
 create mode 100644 rebase-interactive.c
 create mode 100644 rebase-interactive.h

diff --git a/Makefile b/Makefile
index 08e5c54549..909a687857 100644
--- a/Makefile
+++ b/Makefile
@@ -922,6 +922,7 @@ LIB_OBJS += protocol.o
 LIB_OBJS += quote.o
 LIB_OBJS += reachable.o
 LIB_OBJS += read-cache.o
+LIB_OBJS += rebase-interactive.o
 LIB_OBJS += reflog-walk.o
 LIB_OBJS += refs.o
 LIB_OBJS += refs/files-backend.o
diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index f7c2a5fdc8..05e73e71d4 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -3,6 +3,7 @@
 #include "config.h"
 #include "parse-options.h"
 #include "sequencer.h"
+#include "rebase-interactive.h"
 
 static const char * const builtin_rebase_helper_usage[] = {
N_("git rebase--helper []"),
@@ -12,12 +13,12 @@ static const char * const builtin_rebase_helper_usage[] = {
 int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 {
struct replay_opts opts = REPLAY_OPTS_INIT;
-   unsigned flags = 0, keep_empty = 0, rebase_merges = 0;
+   unsigned flags = 0, keep_empty = 0, rebase_merges = 0, write_edit_todo 
= 0;
int abbreviate_commands = 0, rebase_cousins = -1;
enum {
CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_OIDS, EXPAND_OIDS,
CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH,
-   ADD_EXEC
+   ADD_EXEC, APPEND_TODO_HELP
} command = 0;
struct option options[] = {
OPT_BOOL(0, "ff", _ff, N_("allow fast-forward")),
@@ -27,6 +28,8 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
OPT_BOOL(0, "rebase-merges", _merges, N_("rebase merge 
commits")),
OPT_BOOL(0, "rebase-cousins", _cousins,
 N_("keep original branch points of cousins")),
+   OPT_BOOL(0, "write-edit-todo", _edit_todo,
+N_("append the edit-todo message to the todo-list")),
OPT_CMDMODE(0, "continue", , N_("continue rebase"),
CONTINUE),
OPT_CMDMODE(0, "abort", , N_("abort rebase"),
@@ -45,6 +48,8 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
N_("rearrange fixup/squash lines"), REARRANGE_SQUASH),
OPT_CMDMODE(0, "add-exec-commands", ,
N_("insert exec commands in todo list"), ADD_EXEC),
+   OPT_CMDMODE(0, "append-todo-help", ,
+   N_("insert the help in the todo list"), 
APPEND_TODO_HELP),
OPT_END()
};
 
@@ -84,5 +89,7 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
return !!rearrange_squash();
if (command == ADD_EXEC && argc == 2)
return !!sequencer_add_exec_commands(argv[1]);
+   if (command == APPEND_TODO_HELP && argc == 1)
+   return !!append_todo_help(write_edit_todo, keep_empty);
usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 299ded2137..94c23a7af2 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -39,38 +39,6 @@ comment_for_reflog () {
esac
 }
 
-append_todo_help () {
-   gettext "
-Commands:
-p, pick  = use commit
-r, reword  = use commit, but edit the commit message
-e, edit  = use commit, but stop for amending
-s, squash  = use commit, but meld into previous commit
-f, fixup  = like \"squash\", but discard this commit's log message
-x, exec  = run command (the rest of the line) using shell
-d, drop  = remove commit
-l, label  = label current HEAD with a name
-t, reset  = reset HEAD to a label
-m, merge [-C  | -c ]  [# ]
-.   create a merge commit using the original merge commit's
-.   message (or the oneline, if no

[GSoC][PATCH v4 00/20] rebase -i: rewrite in C

2018-07-24 Thread Alban Gruin

This patch series rewrite the interactive rebase from shell to C.

It is based on master (as of 2018-07-24).

Changes since v3:

 - The `--verbose` option is stored directly into opts.verbose

 - Drop includes in rebase-interactive.h

 - skip_unnecessary_picks() now returns an object_id instead of a string

 - Add a test case to ensure interactive rebase aborts when the todo
   list only has commented-out commands

 - complete_action() aborts when the todo list only has commented-out
   commands

 - Drop the `keep_empty` parameter of complete_action()

 - Don’t remove the modes `--shorten-oids` and `--expand-oids` from
   git-rebase--helper

 - Replace `ret = !!x(); return ret` by `ret = x(); return !!ret`

 - Fail if `--make-script` is provided with two arguments instead of
   one

 - Rewrite write_basic_state() and init_basic_state() in C

 - Rewrite git-rebase--interactive.sh as a builtin

Alban Gruin (20):
  sequencer: make two functions and an enum from sequencer.c public
  rebase -i: rewrite append_todo_help() in C
  editor: add a function to launch the sequence editor
  rebase -i: rewrite the edit-todo functionality in C
  sequencer: add a new function to silence a command, except if it fails
  rebase -i: rewrite setup_reflog_action() in C
  rebase -i: rewrite checkout_onto() in C
  sequencer: refactor append_todo_help() to write its message to a
buffer
  sequencer: change the way skip_unnecessary_picks() returns its result
  t3404: todo list with commented-out commands only aborts
  rebase -i: rewrite complete_action() in C
  rebase -i: remove unused modes and functions
  rebase -i: implement the logic to initialize $revisions in C
  rebase -i: rewrite the rest of init_revisions_and_shortrevisions() in
C
  rebase -i: rewrite write_basic_state() in C
  rebase -i: rewrite init_basic_state() in C
  rebase -i: implement the main part of interactive rebase as a builtin
  rebase--interactive2: rewrite the submodes of interactive rebase in C
  rebase -i: remove git-rebase--interactive.sh
  rebase -i: move rebase--helper modes to rebase--interactive

 .gitignore |   1 -
 Makefile   |   5 +-
 builtin.h  |   1 +
 builtin/rebase--helper.c   |  88 --
 builtin/rebase--interactive.c  | 264 
 cache.h|   1 +
 editor.c   |  27 ++-
 git-rebase--interactive.sh | 283 --
 git-rebase--preserve-merges.sh |  10 +-
 git-rebase.sh  |  47 -
 git.c  |   2 +-
 rebase-interactive.c   |  96 ++
 rebase-interactive.h   |   8 +
 sequencer.c| 311 +++--
 sequencer.h|  19 +-
 strbuf.h   |   2 +
 t/t3404-rebase-interactive.sh  |  10 ++
 17 files changed, 729 insertions(+), 446 deletions(-)
 delete mode 100644 builtin/rebase--helper.c
 create mode 100644 builtin/rebase--interactive.c
 delete mode 100644 git-rebase--interactive.sh
 create mode 100644 rebase-interactive.c
 create mode 100644 rebase-interactive.h

-- 
2.18.0

Re: [RFC PATCH 1/5] packfile: make get_delta_base() non static

2018-07-24 Thread Junio C Hamano

Christian Couder  writes:

> From: Jeff King 
>
> As get_delta_base() will be used outside 'packfile.c' in
> a following commit, let's make it non static and let's
> declare it in 'packfile.h'.

As a public function in *.h, don't we want a bit of comment there to
help those who want to use it from outside packfile.c?


> Signed-off-by: Jeff King 
> Signed-off-by: Christian Couder 
> ---
>  packfile.c | 10 +-
>  packfile.h |  3 +++
>  2 files changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/packfile.c b/packfile.c
> index 7cd45aa4b2..4646bff5ff 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -1037,11 +1037,11 @@ const struct packed_git *has_packed_and_bad(const 
> unsigned char *sha1)
>   return NULL;
>  }
>  
> -static off_t get_delta_base(struct packed_git *p,
> - struct pack_window **w_curs,
> - off_t *curpos,
> - enum object_type type,
> - off_t delta_obj_offset)
> +off_t get_delta_base(struct packed_git *p,
> +  struct pack_window **w_curs,
> +  off_t *curpos,
> +  enum object_type type,
> +  off_t delta_obj_offset)
>  {
>   unsigned char *base_info = use_pack(p, w_curs, *curpos, NULL);
>   off_t base_offset;
> diff --git a/packfile.h b/packfile.h
> index cc7eaffe1b..30f0811382 100644
> --- a/packfile.h
> +++ b/packfile.h
> @@ -125,6 +125,9 @@ extern void *unpack_entry(struct repository *r, struct 
> packed_git *, off_t, enum
>  extern unsigned long unpack_object_header_buffer(const unsigned char *buf, 
> unsigned long len, enum object_type *type, unsigned long *sizep);
>  extern unsigned long get_size_from_delta(struct packed_git *, struct 
> pack_window **, off_t);
>  extern int unpack_object_header(struct packed_git *, struct pack_window **, 
> off_t *, unsigned long *);
> +extern off_t get_delta_base(struct packed_git *p, struct pack_window 
> **w_curs,
> + off_t *curpos, enum object_type type,
> + off_t delta_obj_offset);
>  
>  extern void release_pack_memory(size_t);

Guten Tag, Katie, bitte können wir reden?

2018-07-24 Thread katie higgins

Re: [PATCH v1 0/3] [RFC] Speeding up checkout (and merge, rebase, etc)

2018-07-24 Thread Duy Nguyen

On Tue, Jul 24, 2018 at 6:20 AM Jeff King  wrote:
> At least that's my view of it. unpack_trees() has always been a
> terrifying beast that I've avoided looking too closely at.

/me nods on the terrifying part.

> > After a quick look at the code, the only place I can find that tries to use
> > cache_tree_matches_traversal() is in unpack_callback() and that only happens
> > if n == 1 and in the "git checkout" case, n == 2. Am I missing something?

So we do not actually use cache-tree? Big optimization opportunity (if
we can make it!).

> Looks like it's trying to special-case "diff-index --cached". Which
> kind-of makes sense. In the non-cached case, we're thinking not only
> about the relationship between the index and the tree, but also whether
> the on-disk files are up to date.
>
> And that would be the same for checkout. We want to know not only
> whether there are changes to make to the index, but also whether the
> on-disk files need to be updated from the index.
>
> But I assume in your case that we've just refreshed the index quickly
> using fsmonitor. So I think in the long run what you want is:
>
>   1. fsmonitor tells us which index entries are not clean
>
>   2. based on the unclean list, we invalidate cache-tree entries for
>  those paths
>
>   3. if we have a valid cache-tree entry, we should be able to skip
>  digging into that tree; if not, then we walk the index and tree as
>  normal, adding/deleting index entries and updating (or complaining
>  about) modified on-disk files

If you tie this optimization to twoway_merge specifically (by checking
"fn" field), then I think we can do it even better. Since
cache_tree_matches_traversal() is one (hopefully not too costly)
lookup, we can do it without checking with fsmonitor or whatever and
only do so when we have found a cache tree.

Then if we write this new special code just for twoway_merge, we need
to tighten the checks a bit. I think in this case twoway_merge() will
be called with "oldtree" as same as "newtree" (and "current" may
contains dirty stuff from the index). Then

 - o->df_conflict_entry should be NULL (because we handle it slightly
differently in twoway_merge)
 - "current" should not have CE_CONFLICTED

then I believe we will fall into case /* 20 or 21 */ where
merged_entry() is suppoed to be called on all entries and it would
change nothing in the index since newtree is the same as oldtree, and
we could just jump over the whole tree in traverse_trees().

> I think the "n" adds an extra layer of complexity. n==2 means we're
> doing a "2-way" merge. Moving from tree X to tree Y, and dealing with
> the index as we go. Naively I _think_ we'd be OK to just extend the rule
> to "if both subtrees match each other _and_ match the valid cache-tree,
> then we can skip".
>
> Again, I'm a little out of my area of expertise here, but cargo-culting
> like this:
>
> diff --git a/sha1-file.c b/sha1-file.c
> index de4839e634..c105af70ce 100644
> --- a/sha1-file.c
> +++ b/sha1-file.c
> @@ -1375,6 +1375,7 @@ static void *read_object(const unsigned char *sha1, 
> enum object_type *type,
>
> if (oid_object_info_extended(the_repository, , , 0) < 0)
> return NULL;
> +   trace_printf("reading %s %s", type_name(*type), sha1_to_hex(sha1));
> return content;
>  }
>
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 66741130ae..cfdad4133d 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -1075,6 +1075,23 @@ static int unpack_callback(int n, unsigned long mask, 
> unsigned long dirmask, str
> o->cache_bottom += matches;
> return mask;
> }
> +   } else if (n == 2 && S_ISDIR(names->mode) &&
> +  names[0].mode == names[1].mode &&
> +  !strcmp(names[0].path, names[1].path) &&
> +  !oidcmp(names[0].oid, names[1].oid)
> +  /* && somehow account for modified on-disk files 
> */) {
> +   int matches;
> +
> +   /*
> +* we know that the two trees have the same oid, so we
> +* only need to look at one of them
> +*/
> +   matches = 
> cache_tree_matches_traversal(o->src_index->cache_tree,
> +  names, info);
> +   if (matches) {
> +   o->cache_bottom += matches;
> +   return mask;
> +   }
> }
>
> if (traverse_trees_recursive(n, dirmask, mask & ~dirmask,
>
> seems to avoid the tree reads when running "GIT_TRACE=1 git checkout".
> It also totally empties the index. ;) So clearly we have to do a bit
> more there. Probably rather than just bumping o->cache_bottom forward,
> we'd need to actually move

Re: [RFC PATCH] sequencer: fix quoting in write_author_script

2018-07-24 Thread Junio C Hamano

Phillip Wood  writes:

> From: Phillip Wood 
>
> Single quotes should be escaped as \' not \\'. Note that this only
> affects authors that contain a single quote and then only external
> scripts that read the author script and users whose git is upgraded from
> the shell version of rebase -i while rebase was stopped. This is because
> the parsing in read_env_script() expected the broken version and for
> some reason sq_dequote() called by read_author_ident() seems to handle
> the broken quoting correctly.
>
> Ideally write_author_script() would be rewritten to use
> split_ident_line() and sq_quote_buf() but this commit just fixes the
> immediate bug.
>
> Signed-off-by: Phillip Wood 
> ---
>
> This is untested, unfortuantely I don't have really have time to write a test 
> or
> follow this up at the moment, if someone else want to run with it then please
> do.

Any follow-up on this?

Dscho, I do agree with Phillip that the author-script this code
produces does not quote single quotes correctly (if we consider that
the author-script is meant to be compatible with shell script
version and what "git am" uses, that is), and Phillip's fix looks
like a reasonable first step going forward (the next step would
remove the transitional band-aid from the reading code, also added
by this patch).

Thanks.

>
> sequencer.c | 18 +-
>  1 file changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/sequencer.c b/sequencer.c
> index 5354d4d51e..0b78d1f100 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -638,21 +638,21 @@ static int write_author_script(const char *message)
>   else if (*message != '\'')
>   strbuf_addch(, *(message++));
>   else
> - strbuf_addf(, "'%c'", *(message++));
> + strbuf_addf(, "'\\%c'", *(message++));
>   strbuf_addstr(, "'\nGIT_AUTHOR_EMAIL='");
>   while (*message && *message != '\n' && *message != '\r')
>   if (skip_prefix(message, "> ", ))
>   break;
>   else if (*message != '\'')
>   strbuf_addch(, *(message++));
>   else
> - strbuf_addf(, "'%c'", *(message++));
> + strbuf_addf(, "'\\%c'", *(message++));
>   strbuf_addstr(, "'\nGIT_AUTHOR_DATE='@");
>   while (*message && *message != '\n' && *message != '\r')
>   if (*message != '\'')
>   strbuf_addch(, *(message++));
>   else
> - strbuf_addf(, "'%c'", *(message++));
> + strbuf_addf(, "'\\%c'", *(message++));
>   res = write_message(buf.buf, buf.len, rebase_path_author_script(), 1);
>   strbuf_release();
>   return res;
> @@ -666,13 +666,21 @@ static int read_env_script(struct argv_array *env)
>  {
>   struct strbuf script = STRBUF_INIT;
>   int i, count = 0;
> - char *p, *p2;
> + const char *p2;
> + char *p;
>  
>   if (strbuf_read_file(, rebase_path_author_script(), 256) <= 0)
>   return -1;
>  
>   for (p = script.buf; *p; p++)
> - if (skip_prefix(p, "'''", (const char **)))
> + /*
> +  * write_author_script() used to escape "'" incorrectly as
> +  * "'''" rather than "'\\''" so we check for the correct
> +  * version the incorrect version in case git was upgraded while
> +  * rebase was stopped.
> +  */
> + if (skip_prefix(p, "'\\''", ) ||
> + skip_prefix(p, "'''", ))
>   strbuf_splice(, p - script.buf, p2 - p, "'", 1);
>   else if (*p == '\'')
>   strbuf_splice(, p-- - script.buf, 1, "", 0);

[PATCH v1] config.c: fix msvc compile error

2018-07-24 Thread git

From: Jeff Hostetler 

In commit fb0dc3bac135e9f6243bd6d293e8c9293c73b9cd code was added
to builtin/config.c to define a new function and a forward declaration
for an array of unknown size.  This causes a compile error under MSVC.

Reorder the code to forward declare the function instead of the array.

Signed-off-by: Jeff Hostetler 
---
 builtin/config.c | 79 
 1 file changed, 40 insertions(+), 39 deletions(-)

diff --git a/builtin/config.c b/builtin/config.c
index b29d26d..564f18f 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -67,7 +67,46 @@ static int show_origin;
{ OPTION_CALLBACK, (s), (l), (v), NULL, (h), PARSE_OPT_NOARG | \
PARSE_OPT_NONEG, option_parse_type, (i) }
 
-static struct option builtin_config_options[];
+static int option_parse_type(const struct option *opt, const char *arg,
+int unset);
+
+static struct option builtin_config_options[] = {
+   OPT_GROUP(N_("Config file location")),
+   OPT_BOOL(0, "global", _global_config, N_("use global config file")),
+   OPT_BOOL(0, "system", _system_config, N_("use system config file")),
+   OPT_BOOL(0, "local", _local_config, N_("use repository config 
file")),
+   OPT_STRING('f', "file", _config_source.file, N_("file"), N_("use 
given config file")),
+   OPT_STRING(0, "blob", _config_source.blob, N_("blob-id"), 
N_("read config from given blob object")),
+   OPT_GROUP(N_("Action")),
+   OPT_BIT(0, "get", , N_("get value: name [value-regex]"), 
ACTION_GET),
+   OPT_BIT(0, "get-all", , N_("get all values: key 
[value-regex]"), ACTION_GET_ALL),
+   OPT_BIT(0, "get-regexp", , N_("get values for regexp: 
name-regex [value-regex]"), ACTION_GET_REGEXP),
+   OPT_BIT(0, "get-urlmatch", , N_("get value specific for the 
URL: section[.var] URL"), ACTION_GET_URLMATCH),
+   OPT_BIT(0, "replace-all", , N_("replace all matching variables: 
name value [value_regex]"), ACTION_REPLACE_ALL),
+   OPT_BIT(0, "add", , N_("add a new variable: name value"), 
ACTION_ADD),
+   OPT_BIT(0, "unset", , N_("remove a variable: name 
[value-regex]"), ACTION_UNSET),
+   OPT_BIT(0, "unset-all", , N_("remove all matches: name 
[value-regex]"), ACTION_UNSET_ALL),
+   OPT_BIT(0, "rename-section", , N_("rename section: old-name 
new-name"), ACTION_RENAME_SECTION),
+   OPT_BIT(0, "remove-section", , N_("remove a section: name"), 
ACTION_REMOVE_SECTION),
+   OPT_BIT('l', "list", , N_("list all"), ACTION_LIST),
+   OPT_BIT('e', "edit", , N_("open an editor"), ACTION_EDIT),
+   OPT_BIT(0, "get-color", , N_("find the color configured: slot 
[default]"), ACTION_GET_COLOR),
+   OPT_BIT(0, "get-colorbool", , N_("find the color setting: slot 
[stdout-is-tty]"), ACTION_GET_COLORBOOL),
+   OPT_GROUP(N_("Type")),
+   OPT_CALLBACK('t', "type", , "", N_("value is given this type"), 
option_parse_type),
+   OPT_CALLBACK_VALUE(0, "bool", , N_("value is \"true\" or 
\"false\""), TYPE_BOOL),
+   OPT_CALLBACK_VALUE(0, "int", , N_("value is decimal number"), 
TYPE_INT),
+   OPT_CALLBACK_VALUE(0, "bool-or-int", , N_("value is --bool or 
--int"), TYPE_BOOL_OR_INT),
+   OPT_CALLBACK_VALUE(0, "path", , N_("value is a path (file or 
directory name)"), TYPE_PATH),
+   OPT_CALLBACK_VALUE(0, "expiry-date", , N_("value is an expiry 
date"), TYPE_EXPIRY_DATE),
+   OPT_GROUP(N_("Other")),
+   OPT_BOOL('z', "null", _null, N_("terminate values with NUL byte")),
+   OPT_BOOL(0, "name-only", _values, N_("show variable names only")),
+   OPT_BOOL(0, "includes", _includes_opt, N_("respect include 
directives on lookup")),
+   OPT_BOOL(0, "show-origin", _origin, N_("show origin of config 
(file, standard input, blob, command line)")),
+   OPT_STRING(0, "default", _value, N_("value"), N_("with --get, 
use default value when missing entry")),
+   OPT_END(),
+};
 
 static int option_parse_type(const struct option *opt, const char *arg,
 int unset)
@@ -119,44 +158,6 @@ static int option_parse_type(const struct option *opt, 
const char *arg,
return 0;
 }
 
-static struct option builtin_config_options[] = {
-   OPT_GROUP(N_("Config file location")),
-   OPT_BOOL(0, "global", _global_config, N_("use global config file")),
-   OPT_BOOL(0, "system", _system_config, N_("use system config file")),
-   OPT_BOOL(0, "local", _local_config, N_("use repository config 
file")),
-   OPT_STRING('f', "file", _config_source.file, N_("file"), N_("use 
given config file")),
-   OPT_STRING(0, "blob", _config_source.blob, N_("blob-id"), 
N_("read config from given blob object")),
-   OPT_GROUP(N_("Action")),
-   OPT_BIT(0, "get", , N_("get value: name [value-regex]"), 
ACTION_GET),
-   OPT_BIT(0, "get-all", , N_("get all values: key 
[value-regex]"), ACTION_GET_ALL),
-   OPT_BIT(0, "get-regexp", , N_("get values for regexp:

Re: [PATCH v1 0/3] [RFC] Speeding up checkout (and merge, rebase, etc)

2018-07-24 Thread Duy Nguyen

On Mon, Jul 23, 2018 at 04:51:38PM -0400, Ben Peart wrote:
> >>> What's the current state of the index before this checkout?
> >>
> >> This was after running "git checkout" multiple times so there was really
> >> nothing for git to do.
> > 
> > Hmm.. this means cache-tree is fully valid, unless you have changes in
> > index. We're quite aggressive in repairing cache-tree since aecf567cbf
> > (cache-tree: create/update cache-tree on checkout - 2014-07-05). If we
> > have very good cache-tree records and still spend 33s on
> > traverse_trees, maybe there's something else.
> > 
> 
> I'm not at all familiar with the cache-tree and couldn't find any 
> documentation on it other than index-format.txt which says "it helps 
> speed up tree object generation for a new commit."

I guess you have the starting points you need after Jeff's and Junio's
explanation (and it would be great if cache-tree could actually be for
for this two-way merge). But to make it easier for new people in
future, maybe we should add this?

This is basically a ripoff of Junio's explanation with starting points
(write-tree and index-format.txt). I wanted to incorporate some pieces
from Jeff's too but I think Junio's already covered it well.

-- 8< --
Subject: [PATCH] cache-tree.h: more description of what it is and what's it 
used for

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 cache-tree.h | 29 +
 1 file changed, 29 insertions(+)

diff --git a/cache-tree.h b/cache-tree.h
index cfd5328cc9..d25a800a72 100644
--- a/cache-tree.h
+++ b/cache-tree.h
@@ -5,6 +5,35 @@
 #include "tree.h"
 #include "tree-walk.h"
 
+/*
+ * cache-tree is an index extension that records tree object names for
+ * subdirectories you see in the index. It is mainly used for
+ * generating trees from the index before you create a new commit (see
+ * builtin/write-tree.c as starting point) but it's also used in "git
+ * diff-index --cached $TREE" as an optimization. See index-format.txt
+ * for on-disk format.
+ *
+ * Every time you write the contents of the index as a tree object, we
+ * need to collect the object name for each top-level paths and write
+ * a new top-level tree object out, after doing the same recursively
+ * for any modified subdirectory. Whenever you add, remove or modify a
+ * path in the index, the cache-tree entry for enclosing directories
+ * are invalidated, so a cache-tree entry that is still valid means
+ * that all the paths in the index under that directory match the
+ * contents of the tree object that the cache-tree entry holds.
+ *
+ * And that property is used by "diff-index --cached $TREE" that is
+ * run internally.  When we find that the subdirectory "D"'s
+ * cache-tree entry is valid in the index, and the tree object
+ * recorded in the cache-tree for that subdirectory matches the
+ * subtree D in the tree object $TREE, then "diff-index --cached"
+ * ignores the entire subdirectory D (which saves relatively little in
+ * the index as it only needs to scan what is already in the memory
+ * forward, but on the $TREE traversal side, it does not have to even
+ * open a subtree, that can save a lot), and with a well-populated
+ * cache-tree, it can save a significant processing.
+ */
+
 struct cache_tree;
 struct cache_tree_sub {
struct cache_tree *cache_tree;
-- 
2.18.0.656.gda699b98b3

-- 8< --

for your photos

2018-07-24 Thread Roland


I would like to speak with the person that managing photos for your
company?

We provide image editing like – photos cutting out and retouching.

Enhancing your images is just a part of what we can do for your business.
Whether you’re an ecommerce
store or portrait photographer, real estate professional, or an e-Retailer,
we are your personal team
of photo editors that integrate seamlessly with your business.

Our mainly services are:

. Cut out, masking, clipping path, deep etching, transparent background
. Colour correction, black and white, light and shadows etc.
. Dust cleaning, spot cleaning
. Beauty retouching, skin retouching, face retouching, body retouching
. Fashion/Beauty Image Retouching
. Product image Retouching
. Real estate image Retouching
. Wedding & Event Album Design.
. Restoration and repair old images
. Vector Conversion
. Portrait image Retouching

We can provide you editing test on your photos.
Please reply if you are interested.

Thanks,
Roland

your photos

2018-07-24 Thread Roland


I would like to speak with the person that managing photos for your
company?

We provide image editing like – photos cutting out and retouching.

Enhancing your images is just a part of what we can do for your business.
Whether you’re an ecommerce
store or portrait photographer, real estate professional, or an e-Retailer,
we are your personal team
of photo editors that integrate seamlessly with your business.

Our mainly services are:

. Cut out, masking, clipping path, deep etching, transparent background
. Colour correction, black and white, light and shadows etc.
. Dust cleaning, spot cleaning
. Beauty retouching, skin retouching, face retouching, body retouching
. Fashion/Beauty Image Retouching
. Product image Retouching
. Real estate image Retouching
. Wedding & Event Album Design.
. Restoration and repair old images
. Vector Conversion
. Portrait image Retouching

We can provide you editing test on your photos.
Please reply if you are interested.

Thanks,
Roland

[PATCH v1] msvc: fix non-standard escape sequence in source

2018-07-24 Thread git

From: Jeff Hostetler 

Replace non-standard "\e" escape sequence with "\x1B".

In commit 7a17918c34f4e83982456ffe22d880c3cda5384f a trace message with
several "\e" escape sequences was added.  This causes a compiler warning
under MSVC.

According to [1], the "\e" sequence is an extension supported by GCC,
clang, and tcc.

[1] https://en.wikipedia.org/wiki/Escape_sequences_in_C

Signed-off-by: Jeff Hostetler 
---
 convert.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/convert.c b/convert.c
index 56cfe31..52092be 100644
--- a/convert.c
+++ b/convert.c
@@ -335,7 +335,7 @@ static void trace_encoding(const char *context, const char 
*path,
strbuf_addf(, "%s (%s, considered %s):\n", context, path, 
encoding);
for (i = 0; i < len && buf; ++i) {
strbuf_addf(
-   ,"| \e[2m%2i:\e[0m %2x \e[2m%c\e[0m%c",
+   ,"| \x1B[2m%2i:\x1B[0m %2x \x1B[2m%c\x1B[0m%c",
i,
(unsigned char) buf[i],
(buf[i] > 32 && buf[i] < 127 ? buf[i] : ' '),
-- 
2.9.3

Re: [PATCH v2] Makefile: add a DEVOPTS flag to get pedantic compilation

2018-07-24 Thread Junio C Hamano

Beat Bolli  writes:

> On 23.07.18 20:53, Junio C Hamano wrote:
>> 
>>> This is the convenience knob for all developers that led to the series
>>> bb/pedantic[1]. It does not depend on this series, though.
>> 
>> Yup, but "make DEVELOPER=Yes" build won't pass unless this patch is
>> queued after those clean-up ;-)
>
> Then there's a bug in this patch. It should only have an effect if we
> "make DEVELOPER=Yes DEVOPTS=pedantic". Did you try this?

Sorry, "exercising the feature" is what I meant to say with that
"make" invocation, so I should have also spelled DEVOPTS bit, too.

[PATCH 6/6] strbuf_humanise: use unsigned variables

2018-07-24 Thread Jeff King

All of the numeric formatting done by this function uses
"%u", but we pass in a signed "int". The actual range
doesn't matter here, since the conditional makes sure we're
always showing reasonably small numbers. And even gcc's
format-checker does not seem to mind. But it's potentially
confusing to a reader of the code to see the mismatch.

Signed-off-by: Jeff King 
---
 strbuf.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/strbuf.c b/strbuf.c
index db9069c937..54f29bbb23 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -734,18 +734,18 @@ void strbuf_humanise_bytes(struct strbuf *buf, off_t 
bytes)
 {
if (bytes > 1 << 30) {
strbuf_addf(buf, "%u.%2.2u GiB",
-   (int)(bytes >> 30),
-   (int)(bytes & ((1 << 30) - 1)) / 10737419);
+   (unsigned)(bytes >> 30),
+   (unsigned)(bytes & ((1 << 30) - 1)) / 10737419);
} else if (bytes > 1 << 20) {
-   int x = bytes + 5243;  /* for rounding */
+   unsigned x = bytes + 5243;  /* for rounding */
strbuf_addf(buf, "%u.%2.2u MiB",
x >> 20, ((x & ((1 << 20) - 1)) * 100) >> 20);
} else if (bytes > 1 << 10) {
-   int x = bytes + 5;  /* for rounding */
+   unsigned x = bytes + 5;  /* for rounding */
strbuf_addf(buf, "%u.%2.2u KiB",
x >> 10, ((x & ((1 << 10) - 1)) * 100) >> 10);
} else {
-   strbuf_addf(buf, "%u bytes", (int)bytes);
+   strbuf_addf(buf, "%u bytes", (unsigned)bytes);
}
 }
 
-- 
2.18.0.542.g2bf2fc4f7e

[PATCH 5/6] pass st.st_size as hint for strbuf_readlink()

2018-07-24 Thread Jeff King

When we initially added the strbuf_readlink() function in
b11b7e13f4 (Add generic 'strbuf_readlink()' helper function,
2008-12-17), the point was that we generally have a _guess_
as to the correct size based on the stat information, but we
can't necessarily trust it.

Over the years, a few callers have grown up that simply pass
in 0, even though they have the stat information. Let's have
them pass in their hint for consistency (and in theory
efficiency, since it may avoid an extra resize/syscall loop,
but neither location is probably performance critical).

Note that st.st_size is actually an off_t, so in theory we
need xsize_t() here. But none of the other callsites use it,
and since this is just a hint, it doesn't matter either way
(if we wrap we'll simply start with a too-small hint and
then eventually complain when we cannot allocate the
memory).

Signed-off-by: Jeff King 
---
 builtin/init-db.c| 3 ++-
 refs/files-backend.c | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/builtin/init-db.c b/builtin/init-db.c
index 4ecf909368..12ddda7e7b 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -73,7 +73,8 @@ static void copy_templates_1(struct strbuf *path, struct 
strbuf *template_path,
continue;
else if (S_ISLNK(st_template.st_mode)) {
struct strbuf lnk = STRBUF_INIT;
-   if (strbuf_readlink(, template_path->buf, 0) < 0)
+   if (strbuf_readlink(, template_path->buf,
+   st_template.st_size) < 0)
die_errno(_("cannot readlink '%s'"), 
template_path->buf);
if (symlink(lnk.buf, path->buf))
die_errno(_("cannot symlink '%s' '%s'"),
diff --git a/refs/files-backend.c b/refs/files-backend.c
index a9a066dcfb..c110c2520c 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -363,7 +363,7 @@ static int files_read_raw_ref(struct ref_store *ref_store,
/* Follow "normalized" - ie "refs/.." symlinks by hand */
if (S_ISLNK(st.st_mode)) {
strbuf_reset(_contents);
-   if (strbuf_readlink(_contents, path, 0) < 0) {
+   if (strbuf_readlink(_contents, path, st.st_size) < 0) {
if (errno == ENOENT || errno == EINVAL)
/* inconsistent with lstat; retry */
goto stat_ref;
-- 
2.18.0.542.g2bf2fc4f7e

[PATCH 4/6] strbuf_readlink: use ssize_t

2018-07-24 Thread Jeff King

The return type of readlink() is ssize_t, not int. This
probably doesn't matter in practice, as it would require a
2GB symlink destination, but it doesn't hurt to be careful.

Signed-off-by: Jeff King 
---
 strbuf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/strbuf.c b/strbuf.c
index 6ff1f80129..db9069c937 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -469,7 +469,7 @@ int strbuf_readlink(struct strbuf *sb, const char *path, 
size_t hint)
hint = 32;
 
while (hint < STRBUF_MAXLINK) {
-   int len;
+   ssize_t len;
 
strbuf_grow(sb, hint);
len = readlink(path, sb->buf, hint);
-- 
2.18.0.542.g2bf2fc4f7e

[PATCH 3/6] strbuf: use size_t for length in intermediate variables

2018-07-24 Thread Jeff King

A few strbuf functions store the length of a strbuf in a
temporary variable. We should always use size_t for this, as
it's possible for a strbuf to exceed an "int" (e.g., a 2GB
string on a 64-bit system). This is unlikely in practice,
but we should try to behave sensibly on silly or malicious
input.

Signed-off-by: Jeff King 
---
 strbuf.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/strbuf.c b/strbuf.c
index e79758b942..6ff1f80129 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -209,7 +209,7 @@ void strbuf_list_free(struct strbuf **sbs)
 
 int strbuf_cmp(const struct strbuf *a, const struct strbuf *b)
 {
-   int len = a->len < b->len ? a->len: b->len;
+   size_t len = a->len < b->len ? a->len: b->len;
int cmp = memcmp(a->buf, b->buf, len);
if (cmp)
return cmp;
@@ -389,7 +389,7 @@ size_t strbuf_expand_dict_cb(struct strbuf *sb, const char 
*placeholder,
 
 void strbuf_addbuf_percentquote(struct strbuf *dst, const struct strbuf *src)
 {
-   int i, len = src->len;
+   size_t i, len = src->len;
 
for (i = 0; i < len; i++) {
if (src->buf[i] == '%')
@@ -960,7 +960,7 @@ static size_t cleanup(char *line, size_t len)
  */
 void strbuf_stripspace(struct strbuf *sb, int skip_comments)
 {
-   int empties = 0;
+   size_t empties = 0;
size_t i, j, len, newlen;
char *eol;
 
-- 
2.18.0.542.g2bf2fc4f7e

[PATCH 2/6] reencode_string: use size_t for string lengths

2018-07-24 Thread Jeff King

The iconv interface takes a size_t, which is the appropriate
type for an in-memory buffer. But our reencode_string_*
functions use integers, meaning we may get confusing results
when the sizes exceed INT_MAX. Let's use size_t
consistently.

Signed-off-by: Jeff King 
---
 convert.c |  6 +++---
 pretty.c  |  2 +-
 strbuf.c  |  2 +-
 utf8.c|  6 +++---
 utf8.h| 10 +-
 5 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/convert.c b/convert.c
index 56cfe31ec5..530a43cf63 100644
--- a/convert.c
+++ b/convert.c
@@ -390,7 +390,7 @@ static int encode_to_git(const char *path, const char *src, 
size_t src_len,
 struct strbuf *buf, const char *enc, int conv_flags)
 {
char *dst;
-   int dst_len;
+   size_t dst_len;
int die_on_error = conv_flags & CONV_WRITE_OBJECT;
 
/*
@@ -453,7 +453,7 @@ static int encode_to_git(const char *path, const char *src, 
size_t src_len,
 */
if (die_on_error && check_roundtrip(enc)) {
char *re_src;
-   int re_src_len;
+   size_t re_src_len;
 
re_src = reencode_string_len(dst, dst_len,
 enc, default_encoding,
@@ -481,7 +481,7 @@ static int encode_to_worktree(const char *path, const char 
*src, size_t src_len,
  struct strbuf *buf, const char *enc)
 {
char *dst;
-   int dst_len;
+   size_t dst_len;
 
/*
 * No encoding is specified or there is nothing to encode.
diff --git a/pretty.c b/pretty.c
index 703fa6ff7b..e1e4060243 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1538,7 +1538,7 @@ void format_commit_message(const struct commit *commit,
}
 
if (output_enc) {
-   int outsz;
+   size_t outsz;
char *out = reencode_string_len(sb->buf, sb->len,
output_enc, utf8, );
if (out)
diff --git a/strbuf.c b/strbuf.c
index b0716ac585..e79758b942 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -134,7 +134,7 @@ void strbuf_ltrim(struct strbuf *sb)
 int strbuf_reencode(struct strbuf *sb, const char *from, const char *to)
 {
char *out;
-   int len;
+   size_t len;
 
if (same_encoding(from, to))
return 0;
diff --git a/utf8.c b/utf8.c
index a2fd24c70a..edcd1e835a 100644
--- a/utf8.c
+++ b/utf8.c
@@ -470,7 +470,7 @@ int utf8_fprintf(FILE *stream, const char *format, ...)
 #else
typedef char * iconv_ibp;
 #endif
-char *reencode_string_iconv(const char *in, size_t insz, iconv_t conv, int 
*outsz_p)
+char *reencode_string_iconv(const char *in, size_t insz, iconv_t conv, size_t 
*outsz_p)
 {
size_t outsz, outalloc;
char *out, *outpos;
@@ -534,9 +534,9 @@ static const char *fallback_encoding(const char *name)
return name;
 }
 
-char *reencode_string_len(const char *in, int insz,
+char *reencode_string_len(const char *in, size_t insz,
  const char *out_encoding, const char *in_encoding,
- int *outsz)
+ size_t *outsz)
 {
iconv_t conv;
char *out;
diff --git a/utf8.h b/utf8.h
index db73a2d8d3..ce1c2696e0 100644
--- a/utf8.h
+++ b/utf8.h
@@ -25,14 +25,14 @@ void strbuf_utf8_replace(struct strbuf *sb, int pos, int 
width,
 
 #ifndef NO_ICONV
 char *reencode_string_iconv(const char *in, size_t insz,
-   iconv_t conv, int *outsz);
-char *reencode_string_len(const char *in, int insz,
+   iconv_t conv, size_t *outsz);
+char *reencode_string_len(const char *in, size_t insz,
  const char *out_encoding,
  const char *in_encoding,
- int *outsz);
+ size_t *outsz);
 #else
-static inline char *reencode_string_len(const char *a, int b,
-   const char *c, const char *d, int *e)
+static inline char *reencode_string_len(const char *a, size_t b,
+   const char *c, const char *d, size_t *e)
 { if (e) *e = 0; return NULL; }
 #endif
 
-- 
2.18.0.542.g2bf2fc4f7e

[PATCH 1/6] reencode_string: use st_add/st_mult helpers

2018-07-24 Thread Jeff King

When converting a string with iconv, if the output buffer
isn't big enough, we grow it. But our growth is done without
any concern for integer overflow. So when we add:

  outalloc = sofar + insz * 2 + 32;

we may end up wrapping outalloc (which is a size_t), and
allocating a too-small buffer. We then manipulate it
further:

  outsz = outalloc - sofar - 1;

and feed outsz back to iconv. If outalloc is wrapped and
smaller than sofar, we'll end up with a small allocation but
feed a very large outsz to iconv, which could result in it
overflowing the buffer.

Can we use this to construct an attack wherein the victim
clones a repository with a very large commit object with an
encoding header, and running "git log" reencodes it into
utf8, causing an overflow?

An attack of this sort is likely impossible in practice.
"sofar" is how many output bytes we've written total, and
"insz" is the number of input bytes remaining. Imagine our
input doubles in size as we output it (which is easy to do
by converting latin1 to utf8, for example), and that we
start with N input bytes. Our initial output buffer also
starts at N bytes, so after the first call we'd have N/2
input bytes remaining (insz), and have written N bytes
(sofar). That means our next allocation will be
(N + N/2 * 2 + 32) bytes, or (2N + 32).

We can therefore overflow a 32-bit size_t with a commit
message that's just under 2^31 bytes, assuming it consists
mostly of "doubling" sequences (e.g., latin1 0xe1 which
becomes utf8 0xc3 0xa1).

But we'll never make it that far with such a message. We'll
be spending 2^31 bytes on the original string. And our
initial output buffer will also be 2^31 bytes. Which is not
going to succeed on a system with a 32-bit size_t, since
there will be other things using the address space, too. The
initial malloc will fail.

If we imagine instead that we can triple the size when
converting, then our second allocation becomes
(N + 2/3N * 2 + 32), or (7/3N + 32). That still requires two
allocations of 3/7 of our address space (6/7 of the total)
to succeed.

If we imagine we can quadruple, it becomes (5/2N + 32); we
need to be able to allocate 4/5 of the address space to
succeed.

This might start to get plausible. But is it possible to get
a 4-to-1 increase in size? Probably if you're converting to
some obscure encoding. But since git defaults to utf8 for
its output, that's the likely destination encoding for an
attack. And while there are 4-character utf8 sequences, it's
unlikely that you'd be able find a single-byte source
sequence in any encoding.

So this is certainly buggy code which should be fixed, but
it is probably not a useful attack vector.

Signed-off-by: Jeff King 
---
 utf8.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/utf8.c b/utf8.c
index d55e20c641..a2fd24c70a 100644
--- a/utf8.c
+++ b/utf8.c
@@ -477,7 +477,7 @@ char *reencode_string_iconv(const char *in, size_t insz, 
iconv_t conv, int *outs
iconv_ibp cp;
 
outsz = insz;
-   outalloc = outsz + 1; /* for terminating NUL */
+   outalloc = st_add(outsz, 1); /* for terminating NUL */
out = xmalloc(outalloc);
outpos = out;
cp = (iconv_ibp)in;
@@ -497,7 +497,7 @@ char *reencode_string_iconv(const char *in, size_t insz, 
iconv_t conv, int *outs
 * converting the rest.
 */
sofar = outpos - out;
-   outalloc = sofar + insz * 2 + 32;
+   outalloc = st_add3(sofar, st_mult(insz, 2), 32);
out = xrealloc(out, outalloc);
outpos = out + sofar;
outsz = outalloc - sofar - 1;
-- 
2.18.0.542.g2bf2fc4f7e

[PATCH 0/6] use size_t in iconv/strbuf

2018-07-24 Thread Jeff King

This series is primarily about the first two patches to convert our
iconv helpers to use size_t consistently. I posted them to the
git-security list a while back, wondering if there was something sneaky
you could do here. But after some discussion, the consensus was no, you
can't.

The other four patches are just semi-related cleanups I saw while poking
around the strbuf code. I doubt any of them fixes a user-visible bug,
but I think they're worth doing (and they don't seem to conflict with
anything on pu).

  [1/6]: reencode_string: use st_add/st_mult helpers
  [2/6]: reencode_string: use size_t for string lengths
  [3/6]: strbuf: use size_t for length in intermediate variables
  [4/6]: strbuf_readlink: use ssize_t
  [5/6]: pass st.st_size as hint for strbuf_readlink()
  [6/6]: strbuf_humanise: use unsigned variables

 builtin/init-db.c|  3 ++-
 convert.c|  6 +++---
 pretty.c |  2 +-
 refs/files-backend.c |  2 +-
 strbuf.c | 20 ++--
 utf8.c   | 10 +-
 utf8.h   | 10 +-
 7 files changed, 27 insertions(+), 26 deletions(-)

-Peff

Re: [RFC PATCH 5/5] t: add t9930-delta-islands.sh

2018-07-24 Thread Jeff King

On Sun, Jul 22, 2018 at 07:48:36AM +0200, Christian Couder wrote:

> From: Jeff King 
> 
> Signed-off-by: Jeff King 
> Signed-off-by: Christian Couder 
> ---
>  t/t9930-delta-islands.sh | 143 +++

For topics that I'm not immediately sending upstream, I usually stick
them in the t99xx range, so they don't conflict with upstream tests. But
for upstream, this should probably be in the t53xx range.

-Peff

Re: [RFC PATCH 0/5] Add delta islands support

2018-07-24 Thread Jeff King

On Sun, Jul 22, 2018 at 07:48:31AM +0200, Christian Couder wrote:

> I kept Peff as the author and took the liberty to add his
> Signed-off-by to all the patches.

Yes, that's fine. This is a topic I've been meaning to upstream for a
long time, so I'm happy to see it progressing in that direction.

A few reasons I've been hesitant:

 - this produces worse packs for people cloning. The packs we serve at
   GitHub for heavily-forked repositories can be significantly larger
   than what you could get by repacking in isolation.

   I think this is inherent in the scheme (we're losing some delta
   opportunities). But I think it's also made worse because the delta
   window gets clogged with candidates that are forbidden by the island
   config. Repacking with a big --window helps (and doesn't take as long
   as it otherwise might because we can reject some object pairs based
   on islands before doing any computation on the content).

   I also haven't done much modeling on how this scheme progresses over
   time. Imagine users tend to create forks at random points in time and
   then let them get stale. You're going to accumulate a bunch of
   islands that increasingly don't overlap with each other. And as time
   goes on, your delta opportunities get worse and worse.

   Which isn't to say the scheme doesn't work to some degree. It has
   been running in production at GitHub for several years. But it's very
   cobbled together, and I had always hoped to give it some more careful
   thought before sending it upstream. ;) That may be "the perfect is
   the enemy of the good" though (or at least the enemy of the
   mediocre).

 - the whole regex island-naming configuration feels complicated and
   hacky to me. I haven't been able to come up with a better solution,
   though.

 - the whole islandCore thing is it's own weirdness (more below)

> The patches look good to me except perhaps that "pack.islandCore" is
> not documented. Maybe something could be added in
> Documentation/technical/ too, or maybe improving commit messages could
> be enough.

So basically what islandCore does is make one island "special", and we
write all of its objects out first in the packfile. The idea here was
that it would interact well with the pack-reuse code which tries to send
the early part of the pack out verbatim. And if you could pick the fork
which is most-often cloned, then that would be a win (so you'd pick
torvalds/linux.git here).

But:

 - the pack-reuse code we have upstream is crap; it doesn't kick in for
   this instance anyway, because it's too concerned with whether you're
   sending all of the pack. So if you have a 10GB pack and the "core"
   island is the entire first 1GB, you should be able to ship out that
   first GB easily. But the code says "well, that's only 10%!", which is
   not enough to trigger.

   I have replacement code (which we have been running in production)
   that is more clever about the threshold, and also can handle gaps in
   the continuity (so we might realize we need to send objects 1-5000,
   then skip a few, then 5037-8000, and so on). And after that, it's not
   at all clear to me if the islandCore thing is actually still helpful.

 - it's hard to configure, because it may actually change over time
   (whereas all the other island config is basically static and just impacts
   how we consider the refs that are there)

 - there may actually be _multiple_ forks that are "special". In
   torvalds/linux, for example, there's a fair bit of hierarchy, and
   there are people who forked Linus and have their own active
   sub-community. The actual island concept handles this kind of
   layering OK, but the islandCore thing is pretty static.

So I'm not sure if islandCore is even needed or not. But somebody would
have to do some experiments to figure it out (and obviously I should
share the replacement pack-reuse patches).

-Peff

Re: [RFC PATCH 3/5] pack-objects: add delta-islands support

2018-07-24 Thread Jeff King

On Mon, Jul 23, 2018 at 11:52:49AM -0700, Stefan Beller wrote:

> > +DELTA ISLANDS
> > +-
> [...]
> 
> I had to read all of this [background information] to understand the
> concept and I think it is misnamed, as my gut instinct first told me
> to have deltas only "within an island and no island hopping is allowed".
> (This message reads a bit like a commit message, not as documentation
> as it is long winded, too).

I'm not sure if I'm parsing your sentence correctly, but the reason I
called them "islands" is exactly that you'd have deltas within an island
and want to forbid island hopping. So I wasn't sure if you were saying
"that's what I think, but not how the feature works".

There _is_ a tricky thing, which is that a given object is going to
appear in many islands. So the rule is really "you cannot hop to a base
that is not in all of your islands".

> What about renaming this feature to
> 
> [pack]
> excludePartialReach = refs/virtual/[0-9]]+/tags/
> 
>   "By setting `pack.excludePartialReach`, object deltafication is
>   prohibited for objects that are not reachable from all
>   manifestations of the given regex"
> 
> Cryptic, but it explains it in my mind in a shorter, more concise way. ;-)

So I'm hopelessly biased at this point, having developed and worked with
the feature under the "island" name for several years. But I find your
name and explanation pretty confusing. :)

Worse, though, it does not have any noun to refer to the reachable set.
The regex capture and the island names are an important part of the
feature, because it lets you make a single island out of:

  refs/virtual/([0-9]+)/heads
  refs/virtual/([0-9]+)/tags

but exclude:

  refs/virtual/([0-9]+)/(foo)

which goes into its own island ("123-foo" instead of "123"). So what's
the equivalent nomenclature to "island name"?

> > @@ -3182,6 +3225,8 @@ int cmd_pack_objects(int argc, const char **argv, 
> > const char *prefix)
> >   option_parse_missing_action },
> > OPT_BOOL(0, "exclude-promisor-objects", 
> > _promisor_objects,
> >  N_("do not pack objects in promisor packfiles")),
> > +   OPT_BOOL(0, "delta-islands", _delta_islands,
> > +N_("enable islands for delta compression")),
> 
> We enable this feature, but we disallow certain patterns to be used in 
> packing,
> so it sounds weird to me as we tell Git to *not* explore the full design 
> space,
> so we're not enabling it, but rather restricting it?

Yeah, perhaps "respect islands during delta compression" makes more sense.

-Peff

Re: refs/notes/amlog problems, was Re: [PATCH v3 01/20] linear-assignment: a function to solve least-cost assignment problems

2018-07-24 Thread Jeff King

On Mon, Jul 23, 2018 at 06:50:46PM -0700, Junio C Hamano wrote:

>   Side Note: there are a few workflow elements I do want to
>   keep using but they currently *lose* the mapping info.  An
>   obvious one is
> 
> $ git checkout -b to/pic master &&
> ... review in MUA and then ...
> $ git am -s mbox &&
> ... review in tree, attempt to build, tweak, etc.
>   $ git format-patch --stdout master..to/pic >P &&
>   $ edit P &&
>   $ git reset --hard master &&
>   $ git am P
> 
>   which is far more versatile and efficient when doing certain
>   transformations on the series than running "rebase -i" and
>   reopening and editing the target files of the patches one by
>   one in each step.  But because format-patch does not
>   generate Message-Id header of the original one out of the
>   commit, the post-applypatch hook run by "am" at the end of
>   the steps would not have a chance to record that for the
>   newly created commit.
> 
>   For this one, I think I can use "format-patch --notes=amlog"
>   to produce the patch file and then teach post-applypatch
>   script to pay attention to the Notes annotation without
>   changing anything else to record the message id of the
>   original.

Yes. I wonder if it would make sense to teach format-patch/am a
micro-format to automatically handle this case. I.e., some
machine-readable way of passing the notes in the email message.
Of course it's easy to design a format that covers the relatively
restricted form of these amlog notes, and much harder to cover the
general case.

>   Other workflow elements that lose the notes need
>   to be identified and either a fix implemented or a
>   workaround found for each of them.  For example, I suspect
>   there is no workaround for "cherry-pick" and it would take a
>   real fix.

I think the existing notes.rewriteRef is probably a good match here. I
can definitely think of notes you wouldn't want to cherry-pick, but I'm
having trouble coming up with an example that should survive a rebase
but not a cherry-pick.

> And these (incomplete) reverse mapping entries get in the way to
> maintain and correct the forward mapping.  When a commit that got
> unreachable gets expired, I want "git notes prune" to remove notes
> on them, and I do not want to even think about what should happen to
> the entries in the notes tree that abuse the mechanism to map blobs
> that are otherwise *not* even reachable from the main history.

Right, I think the notes tree is a poor distribution method for that
reason.

> > Though personally, I do not know if there is much point in pushing it
> > out, given that receivers can reverse the mapping themselves.
> 
> Before this thread, I was planning to construct and publish the
> reverse mapping at the end of the day, but do so on a separate notes
> ref (see above---the hacky abuse gets in the way of maintaining and
> debugging the forward mapping, but a separate notes-ref that only
> contains hacks is less worrysome).  But I have changed my mind and
> decided not to generate or publish one.  It is sort of similar to
> the way the pack .idx is constructed only by the receiver [*1*].

Yes, the pack .idx was the same mental model I had when writing my
earlier message.

> > Or is there some argument that there is information in the reverse map
> > that _cannot_ be generated from the forward map?
> 
> I know there is no information loss (after all I was the only one
> who ran that experimental hack), but there is one objection that is
> still possible, even though I admit that is a weak argument.

I wondered if you might have a case like this (building as we go):

 - message-id M becomes commit X
   - we write the forward map X->M
   - we write the reverse map M->X
 - during a rewrite (e.g., --amend), commit X becomes commit Y
   - we write the forward map Y->M
   - we write the reverse map M->Y

The difference between that result and an inverted map created at the
end is that we know that M->Y is the final result. Whereas by looking at
the inverted map, we do not know which of M->X and M->Y is correct. In
fact they are _both_ correct. But only one of X and Y would eventually
get merged (both may make it into the repo's of people fetching from you
if we imagine that X is on "pu" and you push between the two steps).

So I think the inverted mapping is not actually one-to-one, and in
either case you'd want to retain all possible matches (pruning only when
a commit is eventually dropped from the forward mapping, which rewritten
things from "pu" would eventually do). And in that case it does not
matter if you generate it incrementally or all at once.

> If a plumbing "diff-{files,tree,index}" family had a sibling
> "diff-notes" to compare two notes-shaped trees while pretending that
> the object-name fan-out did not exist (i.e. instead, the trees being
> compared is

1 2 >

1 - 100 of 108 matches

Mail list logo