Re: Remove help advice text from git editors for interactive rebase and reword

2017-07-24 Thread Jeff King
On Mon, Jul 24, 2017 at 01:26:09AM +0300, Kirill Likhodedov wrote:

> > Not at all interested, as that would mean your tool will tell its
> > users to set such a configuration variable and their interactive use
> > of Git outside your tool will behave differently from other people
> > who use vanilla Git, and they will complain to us.
> 
> That's not true, since the tool can (and would) use the `git -c
> config.var=value rebase -i` syntax to set the configuration variable
> just for this particular command, without affecting the environment.

Yes, but if you are adding a config variable that is only ever meant to
be used from the command line, it probably makes sense to just add a
command-line option.

> Btw, if my proposal is so uninteresting, why the existing advice.*
> variables were previously introduced? I don't know the motivation, but
> assume that it was about making Git less wordy for experienced users.
> So I don't see any difference here.

That is exactly what advice.* is for, but it is about the _user_
deciding that they don't care about seeing that text. Not a tool that is
calling Git deciding that in one particular context, it would like to
suppress the hint text.

So I actually would be OK with having an advice.* option to squelch
rebase and/or commit instructions. But only if users decide they would
never like to see that text. So yes, your tool could piggy-back on that
config option, but it would be a slight abuse of the intent.

> > But stepping back a bit, as you said in the parentheses, your tool
> > would need to grab these "hints" from Git, instead of having a
> > separate hardcoded hints that will go stale while the underlying Git
> > command improves, to be able to show them "separately".  
> 
> There is no need to call Git to get these "hints". They are quite
> obvious, well-known and can be hardcoded. However, I don't plan to use
> these hints anyway, since they are a bit foreign to the GUI of the
> tool I develop. For instance, for reword I'd like to show an editor
> containing just the plain commit message that the user is about to
> change.

If this is all scripted anyway, wouldn't it be an option to just process
the commit message in your program?  The format is well-known, with
hints and instructions on lines marked by core.commentChar ("#" by
default).

I'm not sure exactly of the flow in which the user sees the commit
message buffer (i.e., if you are invoking the editor yourself, or if you
are relying on git-commit to do so). But even in the latter case, you
can hook the editor invocation to do whatever you like. For example:

  GIT_EDITOR='f() { sed -i /^#/d "$1"; $EDITOR "$1"; }; f' git commit

That allows you not only to strip out the existing instructions, but to
insert whatever other instructions you choose.

-Peff


Re: recursive grep doesn't respect --color=always inside submodules

2017-07-24 Thread Brandon Williams
On 07/23, Jacob Keller wrote:
> On Sat, Jul 22, 2017 at 11:02 PM, Orgad Shaneh  wrote:
> > Hi,
> >
> > When git grep --color=always is used, and the output is redirected to
> > a file or a pipe, results inside submodules are not colored. Results
> > in the supermodule are colored correctly.
> >
> > - Orgad
> 
> This occurs because color isn't passed to the recursive grep submodule
> process we launch. It might be fixed if/when we switch to using the
> repository object to run grep in-process. We could also patch grep to
> pass the color option into the submodule.

This is one of the many downsides to the multi-process approach as it is
very easy to miss passing configuration like '--color' down to the
submodules.  As Jacob pointed out this problem is solved when the
recursion is handled in-process.  The series to implement this is
currently under-review.  The latest version of the patch series you can
find here
(https://public-inbox.org/git/20170718190527.78049-1-bmw...@google.com/)
and it looks like Junio has it as a branch on github at
(https://github.com/gitster/git/tree/bw/grep-recurse-submodules).

-- 
Brandon Williams


Re: [PATCH v2] refs: use skip_prefix() in ref_is_hidden()

2017-07-24 Thread Jeff King
On Sat, Jul 22, 2017 at 06:39:12AM +0200, Christian Couder wrote:

> @@ -1175,10 +1175,9 @@ int ref_is_hidden(const char *refname, const char 
> *refname_full)
>   }
>  
>   /* refname can be NULL when namespaces are used. */
> - if (!subject || !starts_with(subject, match))
> - continue;
> - len = strlen(match);
> - if (!subject[len] || subject[len] == '/')
> + if (subject &&
> + skip_prefix(subject, match, ) &&
> + (!*p || *p == '/'))
>   return !neg;

This looks good to me.

I seem to recall running across a similar pattern elsewhere, where a
caller wanted either an exact match, or a match ending with a particular
character. We could add one more helper like:

  if (subject && skip_prefix_with_sep(subject, match, '/', ))
return !neg;

But I don't offhand recall where that other place was, and it's not like
it's saving a huge amount of complexity in the caller. So I'll file it
away in the back of my mind and see if it comes up a third time. ;)

-Peff


[L10N] Kickoff of translation for Git 2.14.0 round 2

2017-07-24 Thread Jiang Xin
Hi,

In the last round of l10n, some l10n messages from "date.c" are
disappeared because of the l10n unfriendly PRItime macro.  This issue
has been fixed by commit fc0fd5b23b (Makefile: help gettext tools to
cope with our custom PRItime format), so let's start new round of l10n
based on the new generated "po/git.pot" file.

This time there are 9 updated messages need to be translated since last
update:

l10n: git.pot: v2.14.0 round 2 (9 new, 2 removed)

Generate po/git.pot from v2.14.0-rc0-40-g5eada8987e for git
v2.14.0 l10n round 2.

Signed-off-by: Jiang Xin 

You can get it from the usual place:

https://github.com/git-l10n/git-po/

As how to update your XX.po and help to translate Git, please see
"Updating a XX.po file" and other sections in “po/README" file.

--
Jiang Xin


Re: fetch-any-blob / ref-in-want proposal

2017-07-24 Thread Jonathan Tan
On Sun, 23 Jul 2017 09:41:50 +0300
Orgad Shaneh  wrote:

> Hi,
> 
> Jonathan Tan proposed a design and a patch series for requesting a
> specific ref on fetch 4 months ago[1].
> 
> Is there any progress with this?
> 
> - Orgad
> 
> [1] 
> https://public-inbox.org/git/ffd92ad9-39fe-c76b-178d-6e3d6a425...@google.com/

Do you mean requesting a specific blob (as referenced by your link)? If yes, it 
is still being discussed. One such discussion is here: [1]

If you mean ref-in-want, I don't recall anything being done since then.

[1] https://public-inbox.org/git/cover.1499800530.git.jonathanta...@google.com/


Change in output as a result of patch

2017-07-24 Thread Kaartic Sivaraam
The patch in the previous mail results in a change in output as
specified below.

$ git branch
* master
  foo
  bar

Before patch,

$ git branch -m hypothet master
fatal: A branch named 'master' already exists.

$ git branch -m hypothet real
error: refname refs/heads/hypothet not found
fatal: Branch rename failed

After patch,

$ git branch -m hypothet master
fatal: Branch 'hypothet' does not exist.

$ git branch -m hypothet real
fatal: Branch 'hypothet' does not exist.

-- 
Kaartic


Re: Expected behavior of "git check-ignore"...

2017-07-24 Thread Junio C Hamano
John Szakmeister  writes:

> Correct, it appears that if any line in the ignore matches, then it
> exits with 0.  So it's not that it's ignored, but that there is a
> matching line in an ignore file somewhere.  I can see the logic in
> this if it's meant to be a debugging tools, especially combined with
> -v.  Simply changing it does affect quite a few tests, but I'm not
> sure that it was intentional for negation to be treated this way.

I am reasonably sure that the command started its life as a pure
debugging aid.  

The treatment of the negation _might_ impose conflicting goals to
its purpose as a debugging aid---a user who debugs his .gitignore
file would want to know what causes a thing that wants to be ignored
is not or vice versa, and use of the exit status to indicate if it
is ignored may not mesh well with its goal as a debugging aid, but I
didn't think about the potential issues deeply myself while writing
this response.  As you mentioned, use of (or not using) "-v" could
be used as a sign to see which behaviour the end-user expects, I
guess.




Re: change the filetype from binary to text after the file is commited to a git repo

2017-07-24 Thread Jeff King
On Mon, Jul 24, 2017 at 09:02:12PM +0200, tonka3...@gmail.com wrote:

> There is no .gitattributes file in the repo. I think that the git
> heuristic will also detect utf-16 files as binary (in windows), so i
> think that is the reason why my file is binary (i have to check that
> tomorrow).

Correct. UTF-16 _is_ binary, if you are trying to include it alongside
ASCII content (like the rest of the text diff headers). The two cannot
mix.

> If i add a .gitattribute file i have the problem that git
> diff will treat the old and the new blob as utf-8, which generate
> garbage.

Git's diff doesn't look at encodings at all; it does a diff of the
actual bytes without respect to any encoding. So yes, if you use "-a" or
a gitattribute to ask git to show you the bytes, the UTF-16 is likely to
look like garbage (and a commit rewriting from utf-16 to utf-8 will
basically be a rewrite of the whole file contents).

> Do you have another idea?  Could it be possible to add only a space in
> code (utf-8) and then add the real content in a second commit, so the
> old and the new one are both utf-8?

I'm not sure exactly what you're trying to accomplish. If you're unhappy
with the file as utf-16, then you should probably convert to utf-8 as a
single commit (since the diff will otherwise be unreadable) and then
make further changes in utf-8.

If you need the file to remain utf-16 but you want more readable diffs
for those versions, you can ask git to convert to utf-8 before
performing the diff. Such a diff couldn't be applied, but would be
useful for reading. E.g., try:

  echo 'file diff=utf16' >.gitattributes
  git config diff.utf16.textconv 'iconv -f utf16 -t utf8'

You can read more about how this works in the "textconv" section of "git
help attributes".

Note that I'm relying on the external "iconv" tool to do the conversion
there. It's pretty standard on most Unix systems, but I don't know what
would be the best tool on Windows.

-Peff


Re: [PATCH] git-contacts: Add recognition of Reported-by

2017-07-24 Thread Junio C Hamano
Jeff King  writes:

>> I also should have mentioned the need for a way to say "remove all
>> hardcoded default and start from scratch".
>
> There's already some prior art around trailers in the trailer.* config.
> I wonder if it would make sense to claim a new key there, like:
>
>   git config trailer.Reported-by.autocc true
>
> If "Reported-by" is a trailer that your project uses, then there may be
> some benefit to setting up other config related to it, and this would
> mesh nicely. And then potentially other programs besides git-contacts
> would want to respect that flag (perhaps send-email would even want to
> do it itself; I think it already respects cc and s-o-b headers).

Sounds like a good suggestion.  But...

If I understand your proposal, the trailer stuff would still not
care what value .autocc is set to while doing its own thing, but the
programs that read the text file that the trailer can work on would
pay attention to it, and they individually have to do so?  Perhaps
there is a need for another mode "interpret-trailers" is told to run
in, where it is given a text file with trailers in it and is told to
show only the value that has .autocc bit on?  Alternatively, yield
 pairs so that the user of the tool can further process
the value differently depending on the key, or something?


Re: [PATCH] git-contacts: Add recognition of Reported-by

2017-07-24 Thread Jeff King
On Mon, Jul 24, 2017 at 12:29:04PM -0700, Junio C Hamano wrote:

> > There's already some prior art around trailers in the trailer.* config.
> > I wonder if it would make sense to claim a new key there, like:
> >
> >   git config trailer.Reported-by.autocc true
> >
> > If "Reported-by" is a trailer that your project uses, then there may be
> > some benefit to setting up other config related to it, and this would
> > mesh nicely. And then potentially other programs besides git-contacts
> > would want to respect that flag (perhaps send-email would even want to
> > do it itself; I think it already respects cc and s-o-b headers).
> 
> Sounds like a good suggestion.  But...
> 
> If I understand your proposal, the trailer stuff would still not
> care what value .autocc is set to while doing its own thing, but the
> programs that read the text file that the trailer can work on would
> pay attention to it, and they individually have to do so?  Perhaps
> there is a need for another mode "interpret-trailers" is told to run
> in, where it is given a text file with trailers in it and is told to
> show only the value that has .autocc bit on?  Alternatively, yield
>  pairs so that the user of the tool can further process
> the value differently depending on the key, or something?

Yeah, I don't think interpret-trailers is currently a useful tool for
looking at an extra config key like that. I'd expect it would need to be
extended, or a new tool added, or perhaps existing tools would need to
learn more about how trailers work (e.g., it would be nice if git-log
could do matching or even print them via %(trailer:reported-by)
placeholders or something). I think there's a lot of potential work in
that area.

Of course git-contacts (or send-email) _can_ just look look at all of
the trailer.*.autocc config and try to match those manually. But the
point of having trailer config, I think, is that we should stop doing
ad-hoc parsing and have a tool for manipulating and querying trailers.
If interpret-trailers isn't up to the task yet, I'd rather see work go
there.

But that (manual parsing) is basically how the current cc and s-o-b
trailers implemented inside of git-send-email, so I don't think it would
be the end of the world as a quick hack that could later be expanded to
use the trailer infrastructure.

-Peff


[PATCH/RFC] branch: warn user about non-existent branch

2017-07-24 Thread Kaartic Sivaraam
The inexistence of the branch trying to be renamed wasn't checked
and was left for 'rename_ref' to point out. It's better to do it
explicitly as it leads to unconventional behaviour in the following
case,

$ git branch -m foo master
fatal: A branch named 'master' already exists.

It's conventional to report that the 'foo' doesn't exist rather than
repoting that 'master' exists, the same way the 'mv' command does.

$ mv foo existing_file
mv: cannot stat 'foo': No such file or directory

Further, there's no way for 'master' being overwritten with 'foo',
as it doesn't exist. Reporting the existence of 'master' is germane
only when 'master' is *really* going to be overwritten.

So, report the inexistence of the branch explicitly  before reporting
existence of new branch name to be consistent with it's counterpart,
the widely used, the 'mv' command.

Signed-off-by: Kaartic Sivaraam 
---
 I'm sending this patch as I didn't want to leave this thread
 open ended. I'm not yet sure if this is a good thing to do.
 This patch is open to comments, as the prvious ones I've sent
 have been.

 builtin/branch.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/builtin/branch.c b/builtin/branch.c
index a3bd2262b..0a9112335 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -473,6 +473,10 @@ static void rename_branch(const char *oldname, const char 
*newname, int force)
die(_("Invalid branch name: '%s'"), oldname);
}
 
+   /* Check for existence of oldref before proceeding */
+   if(!ref_exists(oldref.buf))
+   die(_("Branch '%s' does not exist."), oldname);
+
/*
 * A command like "git branch -M currentbranch currentbranch" cannot
 * cause the worktree to become inconsistent with HEAD, so allow it.
-- 
2.13.2.23.g14d9f4c6d



Re: fetch-any-blob / ref-in-want proposal

2017-07-24 Thread Orgad Shaneh
On Mon, Jul 24, 2017 at 7:13 PM, Jonathan Tan  wrote:
> On Sun, 23 Jul 2017 09:41:50 +0300
> Orgad Shaneh  wrote:
>
>> Hi,
>>
>> Jonathan Tan proposed a design and a patch series for requesting a
>> specific ref on fetch 4 months ago[1].
>>
>> Is there any progress with this?
>>
>> - Orgad
>>
>> [1] 
>> https://public-inbox.org/git/ffd92ad9-39fe-c76b-178d-6e3d6a425...@google.com/
>
> Do you mean requesting a specific blob (as referenced by your link)? If yes, 
> it is still being discussed. One such discussion is here: [1]
>
> If you mean ref-in-want, I don't recall anything being done since then.
>
> [1] 
> https://public-inbox.org/git/cover.1499800530.git.jonathanta...@google.com/

Sorry, I thought it's the same thing. I mean ref-in-want[1]. This
issue in gerrit[2] was closed, claiming that ref-in-want will solve
it. I just want to know if this is likely to be merged soon enough (I
consider several months "soon enough"), or should I look for other
solutions.

- Orgad

[1] https://public-inbox.org/git/cover.1485381677.git.jonathanta...@google.com/
[2] https://bugs.chromium.org/p/gerrit/issues/detail?id=175#c24


Re: Should I store large text files on Git LFS?

2017-07-24 Thread Junio C Hamano
Jeff King  writes:

> On Mon, Jul 24, 2017 at 02:58:38PM +1000, Andrew Ardill wrote:
>
>> On 24 July 2017 at 13:45, Farshid Zavareh  wrote:
>> > I'll probably test this myself, but would modifying and committing a 4GB
>> > text file actually add 4GB to the repository's size? I anticipate that it
>> > won't, since Git keeps track of the changes only, instead of storing a copy
>> > of the whole file (whereas this is not the case with binary files, hence 
>> > the
>> > need for LFS).
>> 
>> I decided to do a little test myself. I add three versions of the same
>> data set (sometimes slightly different cuts of the parent data set,
>> which I don't have) each between 2 and 4GB in size.
>> Each time I added a new version it added ~500MB to the repository, and
>> operations on the repository took 35-45 seconds to complete.
>> Running `git gc` compressed the objects fairly well, saving ~400MB of
>> space. I would imagine that even more space would be saved
>> (proportionally) if there were a lot more similar files in the repo.
>
> Did you tweak core.bigfilethreshold? Git won't actually try to find
> deltas on files larger than that (500MB by default). So you might be
> seeing just the effects of zlib compression, and not deltas.
>
> You can always check the delta status after a gc by running:
>
>   git rev-list --objects --all |
>   git cat-file --batch-check='%(objectsize:disk) %(objectsize) %(deltabase) 
> %(rest)'
>
> That should give you a sense of how much you're saving due to zlib (by
> comparing the first two numbers for a copy that isn't a delta; i.e.,
> with an all-zeros delta base) and how much due to deltas (how much
> smaller the first number is for an entry that _is_ a delta).

In addition to that, people need to take into account that "binary
vs text" is a secondary criteria when considering how effective our
deltifying algorithm works on their data.

We use the same xdelta algorithm that is oblivious to line breaks,
so given two pairs of input files (T1, T2) and (B1, B2), where T1
and B1 are comparative sizes and T2 and B2 are comparative sizes,
and the change made to T1 to produce T2 (e.g. copy byte range X-Y of
T1 to byte ranage starting from offset O of T2, insert this literal
byte string of length L, etc.) and the change made to B1 to produce
B2 are of comparative sizes (i.e. X-Ys and Os are similar), when T's
are text and B's are binary, you should get similarly sized delta to
represent T2 as a delta to T1 and B2 as a delta to B1.  

The reason why typical "binary" file does not delta well is not
inherent to their "binary"-ness but lies elsewhere.  It is because
tools that produce "binary" files tend not to care too much about
preserving original and only effect changes to limited part.  That
is what makes their data not delta well across versions.

Exceptions are editing exif data without changing the actual image
bits in jpeg files or editing id3 data without changing the actual
sound bits in mp3 files.  Binary files across these kind of
operations delta very well with Git, as "edit" is not done by
completely rewriting everything but is confined in a small area of
the file.




Re: Remove help advice text from git editors for interactive rebase and reword

2017-07-24 Thread SZEDER Gábor

> So I actually would be OK with having an advice.* option to squelch
> rebase and/or commit instructions. But only if users decide they would
> never like to see that text. So yes, your tool could piggy-back on that
> config option, but it would be a slight abuse of the intent.

I don't mind the advice in the interactive rebase TODO list.  It's at
the end of the file, nothing comes after that, so it's never in the
way.

However, I do care about the advices in the commit message template,
because they are _between_ the commit message I'm writing and the diff
(using commit.verbose=true) I'm writing about.  So I build git
for my own use with the patch below for a couple of years now, but
never submitted it.


On a related note, when committing a merge or cherry-pick the commit
message templates includes this:

  # It looks like you may be committing a merge.
  # If this is not correct, please remove the file
  #   .git/MERGE_HEAD
  # and try again.

This text traces back almost to the dawn of time, to commit 9c065315f
(Make "git commit" work correctly in the presense of a manual merge,
2005-06-08).  Now, I can well imagine that stray MERGE_HEAD files
caused troubles back then, especially with those "manual merges"...  
But is it really an issue with modern git?!  I think this is long
outdated and could be removed.


  -- >8 --

Subject: [PATCH] commit: allow suppression of commit message template advices

The commit message template includes a lot of advices:

  - The default commit message template asks the user nicely to write
a commit message and tells about comments and how to abort.
  - It includes some outdated hints about merges and cherry-picks.
  - Finally, in case of 'git commit -v' it reminds about the role of
the scissors line separating the commit message from the diff.

While these reminders are useful for new users, with time they learn
what the score is, and experienced users might find these advices are
just wasting a couple of lines' worth of screen real estate.

Make displaying these advices configurable via the 'advice.commitMsg'
config variable.

Signed-off-by: SZEDER Gábor 
---
 Documentation/config.txt |  2 ++
 advice.c |  2 ++
 advice.h |  1 +
 builtin/commit.c | 69 ++--
 wt-status.c  | 14 +-
 5 files changed, 50 insertions(+), 38 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index d5c9c4cab..29c8736b1 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -331,6 +331,8 @@ advice.*::
commitBeforeMerge::
Advice shown when linkgit:git-merge[1] refuses to
merge to avoid overwriting local changes.
+   commitMsg::
+   Advices shown in the commit message template.
resolveConflict::
Advice shown by various commands when conflicts
prevent the operation from being performed.
diff --git a/advice.c b/advice.c
index d81e1cb74..7851bb20c 100644
--- a/advice.c
+++ b/advice.c
@@ -10,6 +10,7 @@ int advice_push_needs_force = 1;
 int advice_status_hints = 1;
 int advice_status_u_option = 1;
 int advice_commit_before_merge = 1;
+int advice_commit_msg = 1;
 int advice_resolve_conflict = 1;
 int advice_implicit_identity = 1;
 int advice_detached_head = 1;
@@ -31,6 +32,7 @@ static struct {
{ "statushints", _status_hints },
{ "statusuoption", _status_u_option },
{ "commitbeforemerge", _commit_before_merge },
+   { "commitmsg", _commit_msg },
{ "resolveconflict", _resolve_conflict },
{ "implicitidentity", _implicit_identity },
{ "detachedhead", _detached_head },
diff --git a/advice.h b/advice.h
index c84a44531..92c9937d6 100644
--- a/advice.h
+++ b/advice.h
@@ -12,6 +12,7 @@ extern int advice_push_needs_force;
 extern int advice_status_hints;
 extern int advice_status_u_option;
 extern int advice_commit_before_merge;
+extern int advice_commit_msg;
 extern int advice_resolve_conflict;
 extern int advice_implicit_identity;
 extern int advice_detached_head;
diff --git a/builtin/commit.c b/builtin/commit.c
index 8e9380251..ead7bf5ef 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -810,38 +810,42 @@ static int prepare_to_commit(const char *index_file, 
const char *prefix,
if (whence != FROM_COMMIT) {
if (cleanup_mode == CLEANUP_SCISSORS)
wt_status_add_cut_line(s->fp);
-   status_printf_ln(s, GIT_COLOR_NORMAL,
-   whence == FROM_MERGE
-   ? _("\n"
-   "It looks like you may be committing a 
merge.\n"
-   "If this is not correct, please remove 
the file\n"
-   "   %s\n"
-   "and try 

Re: [PATCH] PRItime: wrap PRItime for better l10n compatibility

2017-07-24 Thread Junio C Hamano
Jiang Xin  writes:

>> So let's go with Junio's patch.
>
> I agree.  We just go with the sed-then-cleanup version until we meet
> ambiguities (I mean some words other than PRItime need to be
> replaced).

OK, thanks for all involved to get us to a conclusion.  Jiang, I saw
you already made an announcement for the second round.  Thank you
very much for doing so without waiting me---I was stuck on something
else and my morning was blown X-<.  I'll still try to tag -rc1 by
the end of business today.



Re: [PATCH] PRItime: wrap PRItime for better l10n compatibility

2017-07-24 Thread Junio C Hamano
Jiang Xin  writes:

> 2017-07-23 10:33 GMT+08:00 Jean-Noël AVILA :
>> Plus, I hope that some day, instead of translators finding afterwards
>> that a change broke i18n capabilities, developpers would have some kind
>> of sanity check. Requiring special versions of i18n tooling stops this hope.
>
> It would be fun to create some tools to help l10n guys finding l10n
> changes on every git commit.

It is not just fun but would be useful, I would think.


Re: change the filetype from binary to text after the file is commited to a git repo

2017-07-24 Thread tonka3...@gmail.com
Hey Jeff,

Thx for your answer.

There is no .gitattributes file in the repo. I think that the git heuristic 
will also detect utf-16 files as binary (in windows), so i think that is the 
reason why my file is binary (i have to check that tomorrow). If i add a 
.gitattribute file i have the problem that git diff will treat the old and the 
new blob as utf-8, which generate garbage.

Do you have another idea?
Could it be possible to add only a space in code (utf-8) and then add the real 
content in a second commit, so the old and the new one are both utf-8?

> Am 24.07.2017 um 20:18 schrieb Jeff King :
> 
>> On Mon, Jul 24, 2017 at 07:11:06AM +0200, tonka tonka wrote:
>> 
>> I have a problem with an already committed file into my repo. This git
>> repo was converted from svn to git some years ago. Last week I have
>> change some lines in a file and I saw in the diff that it is marked as
>> binary (it's a simple .cpp file). I think on the first commit it was
>> detected as an utf-16 file (on windows). But no matter what I do I
>> can't get it back to a "normal text" text file (git does not detect
>> that), but I is now only utf-8. I also replace the whole content of
>> the file with just 'a' and git say it's binary.
> 
> Git doesn't store a flag for "binary-ness" on each file (though see
> below). As the diffs are generated on the fly when you ask to compare
> two versions, so too is the determination of "is this binary".
> 
> The default heuristic looks at file size (by default, if the file is
> over 500MB it's considered binary) and whether it has any zero-byte
> characters in the first few kilobytes. But note that if _either_ side of
> a diff is considered binary, then Git won't show a text diff.
> 
> If you want a particular diff to show all content, even if it doesn't
> look like text, add "-a" to your git invocation (e.g., "git show -a").
> 
> That said, you can also use .gitattributes (see "git help attributes")
> to mark a file as binary or not-binary, skipping the heuristic check.
> I'm guessing since you converted from svn that you don't have a
> .gitattributes file, but it's possible that somebody later added one
> that marks the file as binary (and so the solution would be to drop that
> entry).
> 
> -Peff


Re: [PATCH] recursive submodules: detach HEAD from new state

2017-07-24 Thread Stefan Beller
On Mon, Jul 24, 2017 at 11:03 AM, Jonathan Nieder  wrote:
> Hi,
>
> Stefan Beller wrote:
>
>> When a submodule is on a branch and in its superproject you run a
>> recursive checkout, the branch of the submodule is updated to what the
>> superproject checks out. This is very unexpected in the current model of
>> Git as e.g. 'submodule update' always detaches the submodule HEAD.
>>
>> Despite having plans to have submodule HEADS not detached in the future,
>> the current behavior is really bad as it doesn't match user expectations
>> and it is not checking for loss of commits (only to be recovered via the
>> reflog).
>
> I think the corrected behavior doesn't match user expectations,
> either.

Well, what is the user expectation?

>
> Could this patch include some documentation to help users know what to
> expect?

Sure, once we figured out what is reasonable.

>
>> Detach the HEAD unconditionally in the submodule when updating it.
>>
>> Signed-off-by: Stefan Beller 
>> ---
>> This is a resend of [1], which did not receive any attention.
>
> Yikes.  Yes, this bug looks problematic.  Thanks for working on it.
>
>> I improved the commit message laying out the current state of affairs,
>> arguing that any future plan should not weigh in as much as the current
>> possible data loss.
>
> Can you elaborate on what you mean about data loss?

Assume we have a submodule 'sub' inside the superproject 'super',
then

git -C super/sub checkout 
git -C super checkout 

modifies my-unrelated-branch in the submodule, which is not related
to the superproject in any way.

This patch would detach from that branch and have the HEAD contain
the desired sha1. To think that further we'd still have potential data loss:

git -C super/sub checkout 
git -C super checkout 
# fine so far as sub is in detached HEAD, but:
 ... hack hack hack ... in 'sub'
git -C super/sub commit -m "work"
git -C super checkout 
# subs work is only to be recovered via reflog!

However this matches the current behavior of
"submodule update" which also tips, that are
not reachable from any ref.

> At first glance
> it would seem to me that detaching HEAD could lead to data loss since
> there isn't a branch to keep track of the user's work.

yes, but that is the same with "submodule update", which is what
people may have in mind?

>  Are you saying
> the current behavior of updating whatever branch HEAD is on (which,
> don't get me wrong, is a wrong behavior that needs fixing) bypassed
> the reflog?

No, I am not saying that.
I am saying that updating an unrelated branch (which is dragged into
the affair just because HEAD points at it) is very subtle thing, as any
commits on that branch can be considered safe (it is on a branch, right?)
but the detached HEAD is the default unsafe mode we currently have.

Thanks,
Stefan


Re: [PATCH v3] merge: add a --signoff flag.

2017-07-24 Thread lukaszgryglicki
Hi, what is the state of this patch?
What else should I do to get it merged into git?
Thanks.

> On 4 Jul 2017, at 11:33, Łukasz Gryglicki  wrote:
> 
> Some projects require every commit to be signed off.
> Our workflow is to create feature branches and require every commit to
> be signed off. When feature is finally approved we need to merge it into
> master. Merge itself is usually trivial and is done by
> `git merge origin/master`.
> 
> Unfortunatelly `merge` command have no --signoff
> flag, so we need to either add signoff line manually or use
> `git commit --amend -s` after the merge.
> 
> First solution is not ideal because not all developers are familiar with
> exact sign-off syntax. The second solution works, but is obviously tedious.
> 
> This patch adds --signoff support to `git-merge` command.
> It works just like --signoff in `git-commit` command.
> 
> More details can be found here:
> https://public-inbox.org/git/CAHv71zK5SqbwrBFX=a8-dy9h3kt4feymgv__p2gzznr0wua...@mail.gmail.com/T/#u
> 
> Signed-off-by: lukaszgryglicki 
> ---
> Documentation/git-merge.txt |  8 ++
> builtin/merge.c |  4 +++
> t/t7614-merge-signoff.sh| 69 +
> 3 files changed, 81 insertions(+)
> create mode 100755 t/t7614-merge-signoff.sh
> 
> diff --git a/Documentation/git-merge.txt b/Documentation/git-merge.txt
> index 04fdd8cf086db..6b308ab6d0b52 100644
> --- a/Documentation/git-merge.txt
> +++ b/Documentation/git-merge.txt
> @@ -64,6 +64,14 @@ OPTIONS
> ---
> include::merge-options.txt[]
> 
> +--signoff::
> + Add Signed-off-by line by the committer at the end of the commit
> + log message.  The meaning of a signoff depends on the project,
> + but it typically certifies that committer has
> + the rights to submit this work under the same license and
> + agrees to a Developer Certificate of Origin
> + (see http://developercertificate.org/ for more information).
> +
> -S[]::
> --gpg-sign[=]::
>   GPG-sign the resulting merge commit. The `keyid` argument is
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 900bafdb45d0b..78c36e9bf353b 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -70,6 +70,7 @@ static int continue_current_merge;
> static int allow_unrelated_histories;
> static int show_progress = -1;
> static int default_to_upstream = 1;
> +static int signoff;
> static const char *sign_commit;
> 
> static struct strategy all_strategy[] = {
> @@ -233,6 +234,7 @@ static struct option builtin_merge_options[] = {
>   { OPTION_STRING, 'S', "gpg-sign", _commit, N_("key-id"),
> N_("GPG sign commit"), PARSE_OPT_OPTARG, NULL, (intptr_t) "" },
>   OPT_BOOL(0, "overwrite-ignore", _ignore, N_("update ignored 
> files (default)")),
> + OPT_BOOL(0, "signoff", , N_("add Signed-off-by:")),
>   OPT_END()
> };
> 
> @@ -763,6 +765,8 @@ static void prepare_to_commit(struct commit_list 
> *remoteheads)
>   strbuf_addch(, '\n');
>   if (0 < option_edit)
>   strbuf_commented_addf(, _(merge_editor_comment), 
> comment_line_char);
> + if (signoff)
> + append_signoff(, ignore_non_trailer(msg.buf, msg.len), 0);
>   write_file_buf(git_path_merge_msg(), msg.buf, msg.len);
>   if (run_commit_hook(0 < option_edit, get_index_file(), 
> "prepare-commit-msg",
>   git_path_merge_msg(), "merge", NULL))
> diff --git a/t/t7614-merge-signoff.sh b/t/t7614-merge-signoff.sh
> new file mode 100755
> index 0..c1b8446f491dc
> --- /dev/null
> +++ b/t/t7614-merge-signoff.sh
> @@ -0,0 +1,69 @@
> +#!/bin/sh
> +
> +test_description='git merge --signoff
> +
> +This test runs git merge --signoff and makes sure that it works.
> +'
> +
> +. ./test-lib.sh
> +
> +# Setup test files
> +test_setup() {
> + # Expected commit message after merge --signoff
> + cat >expected-signed < +Merge branch 'master' into other-branch
> +
> +Signed-off-by: $(git var GIT_COMMITTER_IDENT | sed -e "s/>.*/>/")
> +EOF
> +
> + # Expected commit message after merge without --signoff (or with 
> --no-signoff)
> + cat >expected-unsigned < +Merge branch 'master' into other-branch
> +EOF
> +
> + # Initial commit and feature branch to merge master into it.
> + git commit --allow-empty -m "Initial empty commit" &&
> + git checkout -b other-branch &&
> + test_commit other-branch file1 1
> +}
> +
> +# Setup repository, files & feature branch
> +# This step must be run if You want to test 2,3 or 4
> +# Order of 2,3,4 is not important, but 1 must be run before
> +# For example `-r 1,4` or `-r 1,4,2 -v` etc
> +# But not `-r 2` or `-r 4,3,2,1`
> +test_expect_success 'setup' '
> + test_setup
> +'
> +
> +# Test with --signoff flag
> +test_expect_success 'git merge --signoff adds a sign-off line' '
> + git checkout master &&
> + test_commit master-branch-2 file2 2 &&
> + git checkout other-branch &&
> + git 

Re: change the filetype from binary to text after the file is commited to a git repo

2017-07-24 Thread Jeff King
On Mon, Jul 24, 2017 at 10:26:22PM +0200, tonka3...@gmail.com wrote:

> > I'm not sure exactly what you're trying to accomplish. If you're unhappy
> > with the file as utf-16, then you should probably convert to utf-8 as a
> > single commit (since the diff will otherwise be unreadable) and then
> > make further changes in utf-8.

> That was exactly what i'm searching for. The utf-16 back in the days
> was by accident (thx to visual studio). So if the last commit and the
> acutal change are both utf-8 the diff should work again.  Just for my
> understanding. Git just take the bytes of the whole file on every
> commit, so there is no general problem with that, the size of the
> utf-16 is just twice as big as the utf-8 one, is that correct?

Right. The diff switching the encodings will be listed as "binary" (and
you should write a good commit message explaining what's going on!), but
then after that the changes to the utf-8 version will display as normal
text.  Git only looks at the actual bytes being diffed, not older
versions of the file.

-Peff


Re: [PATCH v3] merge: add a --signoff flag.

2017-07-24 Thread Junio C Hamano
Junio C Hamano  writes:

> lukaszgryglicki  writes:
>
>> Hi, what is the state of this patch?
>
> I was expecting you to respond to Ævar's <87tw2sbnyl@gmail.com>
> to explain how your new version addresses his concerns, or him to
> respond to your new patch to say that it addresses his concerns
> adequately.  I think neither has happened, so the topic is still in
> limbo, I guess...

Sorry, Ævar's message I meant was <87fueferd4@gmail.com>.


Re: [GSoC][PATCH 04/13] submodule: port submodule subcommand 'status' from shell to C

2017-07-24 Thread Brandon Williams
On 07/25, Prathamesh Chavan wrote:
> This aims to make git-submodule 'status' a built-in. Hence, the function
> cmd_status() is ported from shell to C. This is done by introducing
> three functions: module_status(), submodule_status() and print_status().
> 
> The function module_status() acts as the front-end of the subcommand.
> It parses subcommand's options and then calls the function
> module_list_compute() for computing the list of submodules. Then
> this functions calls for_each_submodule_list() looping through the
> list obtained.
> 
> Then for_each_submodule_list() calls submodule_status() for each of the
> submodule in its list. The function submodule_status() is responsible
> for generating the status each submodule it is called for, and
> then calls print_status().
> 
> Finally, the function print_status() handles the printing of submodule's
> status.
> 
> Mentored-by: Christian Couder 
> Mentored-by: Stefan Beller 
> Signed-off-by: Prathamesh Chavan 
> ---
> In this new version of patch, following changes were made:
> * instead of using the ce_match_stat(), cmd_diff_files is used.
> * currently, no comment about future scope of optimization wrt the
>   cmd_diff_files() usage was added as currently, I'm not fully sure of
>   the way to optimize the function.
> 
>  builtin/submodule--helper.c | 151 
> 
>  git-submodule.sh|  49 +-
>  2 files changed, 152 insertions(+), 48 deletions(-)
> 
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index 80f744407..b39828174 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -560,6 +560,156 @@ static int module_init(int argc, const char **argv, 
> const char *prefix)
>   return 0;
>  }
>  
> +struct status_cb {
> + const char *prefix;
> + unsigned int quiet: 1;
> + unsigned int recursive: 1;
> + unsigned int cached: 1;
> +};
> +#define STATUS_CB_INIT { NULL, 0, 0, 0 }
> +
> +static void print_status(struct status_cb *info, char state, const char 
> *path,
> +  char *sub_sha1, char *displaypath)

Lets mark these strings as 'const char *' since you aren't modifying
them in the function, only using them.  Also it may make more sense to
pass the 'struct object_id' instead of a hex string of the object id.
Then you can call the necessary 'oid_to_hex' function when you are
adding the object id to the argv array.  That would also eliminate an
allocation in 'status_submodule'.

Also eliminates the need to use the word 'sha1' as we want to eliminate
its usage in the codebase.

> +{
> + if (info->quiet)
> + return;
> +
> + printf("%c%s %s", state, sub_sha1, displaypath);
> +
> + if (state == ' ' || state == '+') {
> + struct argv_array name_rev_args = ARGV_ARRAY_INIT;
> +
> + argv_array_pushl(_rev_args, "print-name-rev",
> +  path, sub_sha1, NULL);
> + print_name_rev(name_rev_args.argc, name_rev_args.argv,
> +info->prefix);
> + } else {
> + printf("\n");
> + }
> +}
> +
> +static int handle_submodule_head_ref(const char *refname,
> +  const struct object_id *oid, int flags,
> +  void *cb_data)
> +{
> + struct strbuf *output = cb_data;
> + if (oid)
> + strbuf_addstr(output, oid_to_hex(oid));

Since we're going to be working with 'struct object_id' instead of
strings this would need to change slightly.  Instead of copying into a
strbuf we could just pass a pointer to a 'struct object_id' and use
'oidcpy' to copy the contents of oid to output.

  struct object_id *output = cb_data;
  if (oid)
oidcpy(output, oid);

> + return 0;
> +}
> +
> +static void status_submodule(const struct cache_entry *list_item, void 
> *cb_data)
> +{
> + struct status_cb *info = cb_data;
> + char *sub_sha1 = xstrdup(oid_to_hex(_item->oid));
> + char *displaypath;
> + struct argv_array diff_files_args = ARGV_ARRAY_INIT;

'diff_files_args' needs to be cleared at the end of the function when
doing cleanup to prevent a memory leak.

> +
> + if (!submodule_from_path(null_sha1, list_item->name))
> + die(_("no submodule mapping found in .gitmodules for path 
> '%s'"),
> +   list_item->name);
> +
> + displaypath = get_submodule_displaypath(list_item->name, info->prefix);
> +
> + if (list_item->ce_flags) {

Is there a particular flag we are interested in here or only that a flag
is set?

> + print_status(info, 'U', list_item->name,
> +  sha1_to_hex(null_sha1), displaypath);

Since we are already using OID's in other parts of this code lets be
consistant and use null_oid instead like 'oid_to_hex(_oid)'.

> + goto cleanup;
> + }
> +
> + if 

Re: Bug^Feature? fetch protects only current working tree branch

2017-07-24 Thread Andreas Heiduk
Am 24.07.2017 um 00:13 schrieb Junio C Hamano:
> Andreas Heiduk  writes:
> 
>> A `git fetch . origin/master:master` protects the currently checked out 
>> branch (HEAD) unless the `-u/--update-head-ok` is supplied. This avoids a
>> mismatch between the index and HEAD. BUT branches which are HEADs in other
>> working trees do not get that care - their state is silently screwed up.
>>
>> Is this intended behaviour or and just an oversight while implementing
>> `git worktree`?
> 
> The latter.  

Ok, so I can adjust some of my helper scripts to catch and forbid this case.

> [...]"git worktree" is an interesting feature and has
> potential to become useful in wider variety of workflows than it
> currently is, but end users should consider it still experimental as
> it still is with many such small rough edges like this one.
> 
> Patches to help improving the feature is of course very welcome.

Since the core of the check is in C, I'll pass on this one. I could
supply a patch adding this to the "BUGS" section of git-worktree(1)
though :-)




Re: [PATCH] sub-process: refactor handshake to common function

2017-07-24 Thread Jonathan Nieder
Hi,

Jonathan Tan wrote:

> Refactor, into a common function, the version and capability negotiation
> done when invoking a long-running process as a clean or smudge filter.
> This will be useful for other Git code that needs to interact similarly
> with a long-running process.
>
> Signed-off-by: Jonathan Tan 
> ---

Sounds like a sensible thing to do.

[...]
> --- a/sub-process.h
> +++ b/sub-process.h
> @@ -18,6 +18,11 @@ struct subprocess_entry {
>   struct child_process process;
>  };
>  
> +struct subprocess_capability {
> + const char *name;
> + unsigned int flag;

What does this flag represent?  What values can it have?  A comment
might help.

[...]
> @@ -41,6 +46,19 @@ static inline struct child_process 
> *subprocess_get_child_process(
>   return >process;
>  }
>  
> +/*
> + * Perform the handshake to a long-running process as described in the
> + * gitattributes documentation using the given requested versions and
> + * capabilities. The "versions" and "capabilities" parameters are arrays
> + * terminated by a 0 or blank struct.
> + */
> +int subprocess_handshake(struct subprocess_entry *entry,
> +  const char *welcome_prefix,
> +  int *versions,
> +  int *chosen_version,
> +  struct subprocess_capability *capabilities,
> +  unsigned int *supported_capabilities);
> +

Is there a more precise pointer within the gitattributes documentation
that describes what this does?  I searched for "handshake" and found
nothing.  Is the "Long Running Filter Process" section where I should
have started?

The API docs for this header are currently in
Documentation/technical/api-sub-process.txt.  Perhaps these docs
should also go there, or, even better, the docs in
Documentation/technical/ could move to this header in a preparatory
patch.

Aside (not about this patch): why is the subprocess object called
struct subprocess_entry?  Would it make sense to rename it to struct
subprocess?

Back to this function.  Is this a function I should only call once,
when first launching a subprocess, or can I call it again later?  A
note about suggested usage might help.

[...]
> --- a/t/t0021-conversion.sh
> +++ b/t/t0021-conversion.sh
> @@ -697,7 +697,7 @@ test_expect_success PERL 'invalid process filter must 
> fail (and not hang!)' '
>  
>   cp "$TEST_ROOT/test.o" test.r &&
>   test_must_fail git add . 2>git-stderr.log &&
> - grep "does not support filter protocol version" git-stderr.log
> + grep "expected git-filter-server" git-stderr.log

This error message looks a little less friendly than the old one.
Is that okay because it is not supposed to happen in practice?

[...]
> --- a/convert.c
> +++ b/convert.c
> @@ -512,62 +512,15 @@ static struct hashmap subprocess_map;
>  
>  static int start_multi_file_filter_fn(struct subprocess_entry *subprocess)
>  {
> - int err;
> - struct cmd2process *entry = (struct cmd2process *)subprocess;
[... many lines snipped ...]
> - return err;
> + static int versions[] = {2, 0};
> + static struct subprocess_capability capabilities[] = {
> + {"clean", CAP_CLEAN}, {"smudge", CAP_SMUDGE}, {NULL, 0}
> + };
> + struct cmd2process *entry = (struct cmd2process *)subprocess;
> +
> + return subprocess_handshake(subprocess, "git-filter-", versions, NULL,
> + capabilities,
> + >supported_capabilities);
>  }

API looks nice.

[...]
> --- a/sub-process.c
> +++ b/sub-process.c
> @@ -105,3 +105,97 @@ int subprocess_start(struct hashmap *hashmap, struct 
> subprocess_entry *entry, co

Implementation looks sane from a quick glance.

Thanks,
Jonathan


What's cooking in git.git (Jul 2017, #07; Mon, 24)

2017-07-24 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

With help from Jiang, Dscho and Jean-Noël, we agreed on the way to
handle i18n issues around our custom timestamp_t type and its format
PRItime, and tagging of -rc1 has been delayed a bit.  But now it has
happened.  There is a small unrelated change still cooking somewhere
else that may have to go in the final release, but other than that
one, no topic is expected to go from 'next' to 'master' until the
final.  Regression fixes and reverts are possible but let's hope
there are no need for them.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* jc/po-pritime-fix (2017-07-20) 1 commit
  (merged to 'next' on 2017-07-21 at 61f0e3b37f)
 + Makefile: help gettext tools to cope with our custom PRItime format

 We started using "%" PRItime, imitating "%" PRIuMAX and friends, as
 a way to format the internal timestamp value, but this does not
 play well with gettext(1) i18n framework, and causes "make pot"
 that is run by the l10n coordinator to create a broken po/git.pot
 file.  This is a possible workaround for that problem.


* ks/doc-fixes (2017-07-18) 2 commits
  (merged to 'next' on 2017-07-20 at c34b00d0a0)
 + doc: reformat the paragraph containing the 'cut-line'
 + doc: camelCase the i18n config variables to improve readability

 Doc clean-up.


* rj/cygwin-fread-reads-directories (2017-07-21) 1 commit
  (merged to 'next' on 2017-07-21 at 28694cf254)
 + config.mak.uname: set FREAD_READS_DIRECTORIES for cygwin

 It turns out that Cygwin also needs the fopen() wrapper that
 returns failure when a directory is opened for reading.

--
[New Topics]

* js/run-process-parallel-api-fix (2017-07-21) 1 commit
 - run_processes_parallel: change confusing task_cb convention

 API fix.

 Will merge to and cook in 'next'.


* cc/ref-is-hidden-microcleanup (2017-07-24) 1 commit
 - refs: use skip_prefix() in ref_is_hidden()

 Code cleanup.

 Will merge to and cook in 'next'.


* js/blame-lib (2017-07-24) 1 commit
 - blame: fix memory corruption scrambling revision name in error message

 A hotfix to a topic already in 'master'.

 Will merge to 'next'.

--
[Stalled]

* mg/status-in-progress-info (2017-05-10) 2 commits
 - status --short --inprogress: spell it as --in-progress
 - status: show in-progress info for short status

 "git status" learns an option to report various operations
 (e.g. "merging") that the user is in the middle of.

 cf. 


* nd/worktree-move (2017-04-20) 6 commits
 - worktree remove: new command
 - worktree move: refuse to move worktrees with submodules
 - worktree move: accept destination as directory
 - worktree move: new command
 - worktree.c: add update_worktree_location()
 - worktree.c: add validate_worktree()

 "git worktree" learned move and remove subcommands.

 Expecting a reroll.
 cf. <20170420101024.7593-1-pclo...@gmail.com>
 cf. <20170421145916.mknekgqzhxffu...@sigill.intra.peff.net>
 cf. 


* sg/clone-refspec-from-command-line-config (2017-06-16) 2 commits
 - Documentation/clone: document ignored configuration variables
 - clone: respect additional configured fetch refspecs during initial fetch
 (this branch is used by sg/remote-no-string-refspecs.)

 "git clone -c var=val" is a way to set configuration variables in
 the resulting repository, but it is more useful to also make these
 variables take effect while the initial clone is happening,
 e.g. these configuration variables could be fetch refspecs.

 Waiting for a response.
 cf. <20170617112228.vugswym4o4owf...@sigill.intra.peff.net>
 cf. 


* js/rebase-i-final (2017-06-15) 10 commits
 - rebase -i: rearrange fixup/squash lines using the rebase--helper
 - t3415: test fixup with wrapped oneline
 - rebase -i: skip unnecessary picks using the rebase--helper
 - rebase -i: check for missing commits in the rebase--helper
 - t3404: relax rebase.missingCommitsCheck tests
 - rebase -i: also expand/collapse the SHA-1s via the rebase--helper
 - rebase -i: do not invent onelines when expanding/collapsing SHA-1s
 - rebase -i: remove useless indentation
 - rebase -i: generate the script via rebase--helper
 - t3415: verify that an empty instructionFormat is handled as before

 The final batch to "git rebase -i" updates to move more code from
 the shell script to C.

 Expecting a reroll.
 This is at its v5.
 cf. 

Re: reftable [v3]: new ref storage format

2017-07-24 Thread Shawn Pearce
On Mon, Jul 24, 2017 at 3:22 PM, Stefan Beller  wrote:
> On Sat, Jul 22, 2017 at 11:29 AM, Shawn Pearce  wrote:
>> 3rd iteration of the reftable storage format.
>>
>> You can read a rendered version of this here:
>> https://googlers.googlesource.com/sop/jgit/+/reftable/Documentation/technical/reftable.md
>>
>> Significant changes from v2:
>> - efficient lookup by SHA-1 for allow-tip-sha1-in-want.
>
> I'll focus on that in the review, it sounds exciting.


>> ### Ref block format
> ...
>
>> A variable number of 4-byte `restart_offset` values follow the
>> records.  Offsets are relative to the start of the block and refer to
>> the first byte of any `ref_record` whose name has not been prefix
>> compressed.  Readers can start linear scans from any of these records.
>> Offsets in the first block are relative to the start of the file
>> (position 0), and include the file header.  This requires the first
>> restart in the first block to be at offset 8.
>>
>> The 2-byte `restart_count_m1` stores *one less* than the number of
>> entries in the `restart_offset` list.  There is always a restart
>> corresponding to the first ref record. Readers are responsible for
>> computing `restart_count = restart_count_m1 + 1`.
>
> I had to reread these two paragraphs a couple of times as it calls out
> the uninteresting things[1] and the interesting part is the logical
> conclusion that one has to make themselves,
> here is how I would write it:
>
> Readers can start linear scans from any record whose name has
> not been prefix compressed. The first record of a block must not
> be prefix-compressed.
>
> To aid finding the entry points for linear scans, a variable number
> of 4-byte `restart_offset` values follow the records. Offsets are
> relative to the start of the block and refer to the first byte of any
> such `ref_record` that is not prefix compressed.
> The first record can be omitted in the `restart_offset` values as it
> is implicit.

No, the first record must be listed in the restart_offset table. Its
not implicit. The count of it is implicit, to allow up to 65536
restart_offset entries using only a uint16 restart_count. Maybe that
is being too cute, and the count should just be the entry count.

> The `restart_offset' values must be sorted (ascending,
> descending?).

Sorted, ascending.


>> The `value` follows.  Its format is determined by `value_type`, one of
>> the following:
>>
>> - `0x0`: deletion; no value data (see transactions, below)
>> - `0x1`: one 20-byte object id; value of the ref
>> - `0x2`: two 20-byte object ids; value of the ref, peeled target
>
> Up to here it is easy to spot a pattern:
> The number indicates how many oids follow.
>
>> - `0x3`: symbolic reference: `varint( target_len ) target`
>> - `0x4`: length delimited extension: `varint( data_len ) data`
>
> This breaks the pattern.
>
> Instead of hardcoding the numbers here, I wonder if we rather
> want to make the bits more meaningful:
>
>   bit 0, 1: number of oids iff bit 2 unset
>
>   Iff bit 2 set, then we have a "varint (len) data"
>   that follows, bits 0,1 are used for a different purpose,
>   00 indicates 'symlink' and data is the string
>   01 indicates 'multihead' such as FETCH_HEAD
>   1* is reserved for now.
>
> This *may* be neat micro optimization, but hardcoding all bits
> to a lookup table is fine, too.

The problem I have with this bit-based rule is it breaks down later as
you use additional codes, or you find yourself limited by the bit
scheme and don't have the full range of 8 values available. So I
decided to be very literal about what the codes mean, and use a lookup
table.


>   Note that symrefs and multiheads could share the same
>   type, iff we had dissallowed '\n' in refnames. (we do?
>   otherwise FETCH_HEAD would be broken)
>   The differentiator would be the '\n' or '\0' at the end of the
>   first target.

True, \n is not allowed in a ref name, and neither is \0.

Symrefs in loose ref format use "ref: \n" as their content. We
could use that format in reftable, and then HEAD and FETCH_HEAD could
use the same value code, 0x3.


>>  index record
>>
>> An index record describes the last entry in another block.
>> Index records are written as:
>>
>> varint( prefix_length )
>> varint( (suffix_length << 3) | 0 )
>> suffix
>> varint( block_offset )
>>
>> Index records use prefix compression exactly like `ref_record`.
>>
>> Index records store `block_offset` after the suffix, specifying the
>> offset in bytes (from the start of the file) of the block that ends
>> with this reference.
>
> Instead of hardcoding the "0" in the last 3 bits, maybe pick one
> of the reserved bit patterns to be there? I would imagine this
> makes debugging easier:
>
> 0x5? Hah that must be an index block I have been
> looking at the wrong block!

This is an excellent suggestion. I'll include it in the next iteration.


>> ### Obj 

Re: [GSoC][PATCH 03/13] submodule: port set_name_rev() from shell to C

2017-07-24 Thread Brandon Williams
On 07/25, Prathamesh Chavan wrote:
> +static int print_name_rev(int argc, const char **argv, const char *prefix)
> +{
> + char *namerev;
> + if (argc != 3)
> + die("print-name-rev only accepts two arguments:  ");
> +
> + namerev = get_name_rev(argv[1], argv[2]);
> + if (namerev && namerev[0])
> + printf(" (%s)", namerev);
> + printf("\n");
> +

'namerev' needs to be freed at the end of this function.

> + return 0;
> +}
> +

-- 
Brandon Williams


Re: [PATCH/RFC] rebase: make resolve message clearer for inexperienced users

2017-07-24 Thread Junio C Hamano
Phillip Wood  writes:

> git rebase --continue requiring one to git add first confuses/annoys me
> too. I started a patch to autostage unstaged changes if they don't
> contain conflict markers a couple of weeks ago, I'll clean it up and
> post it later this week.

As long as "git rebase" will keep refusing to start in a working
tree with dirty files and/or index, this could be a good change.

But people _may_ be annoyed because they expect "--continue" to
remind them that some conflicts are not concluded with an explicit
"git add", and they would even feel that you made the command unsafe
if "--continue" just goes ahead by auto-adding their change that is
still work-in-progress.  Lack of conflict markers is not a sign that
a file is fully resolved (which they are used to signal by "git
add", and they do so per set of paths).

> I also find it confusing that it asks me to edit the commit message for
> picks, fixups and non-final squashes after conflicts. I can see that
> perhaps one might want to amend the message to reflect any changes that
> were made while resolving the conflicts but I've never had too. I'd
> rather be able to pass --edit to rebase --continue if I needed to edit
> the message in those cases. Looking through the code I think it would
> require saving some extra state when rebase bails out on conflicts so
> rebase --continue could tell if it should be asking the user to amend
> the message.

This is disruptive if done without a careful transition plan and
you'll annoy existing users who expect to be able to edit by
default.  Especially since "rebase" keeps going and potentially
rebuild many commits on top, by the time they realize the mistake of
not passing "--edit", it is too late and they will hate you for
forcing them rebase many commits again.

If these suggestions above were given while "rebase -i" was
developed, it might have made the end-user experience a better one
than what it currently is, but transitioning after the current
behaviour has long been established makes it much harder.




Re: [PATCHv2] blame: fix memory corruption scrambling revision name in error message

2017-07-24 Thread Junio C Hamano
SZEDER Gábor  writes:

> When attempting to blame a non-existing path, git should show an error
> message like this:
>
>   $ git blame e83c51633 -- nonexisting-file
>   fatal: no such path nonexisting-file in e83c51633
>
> Since the recent commit 835c49f7d (blame: rework methods that
> determine 'final' commit, 2017-05-24) the revision name is either
> missing or some scrambled characters are shown instead.  The reason is
> that the revision name must be duplicated, because it is invalidated
> when the pending objects array is cleared in the meantime, but this
> commit dropped the duplication.
>
> Restore the duplication of the revision name in the affected functions
> (find_single_final() and find_single_initial()).
>
> Signed-off-by: SZEDER Gábor 
> ---

Thanks.  

I vaguely recall we had the same breakage in the code that was
already fixed long time ago; that's a strange coincidence if the
js/blame-lib topic reintroduced the same bug ;-).

Will queue.

>
> Use xstrdup_or_null() in the first hunk.
>
>  blame.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/blame.c b/blame.c
> index 91e26e93e..f575e9cbf 100644
> --- a/blame.c
> +++ b/blame.c
> @@ -1663,7 +1663,7 @@ static struct commit *find_single_final(struct rev_info 
> *revs,
>   name = revs->pending.objects[i].name;
>   }
>   if (name_p)
> - *name_p = name;
> + *name_p = xstrdup_or_null(name);
>   return found;
>  }
>  
> @@ -1735,7 +1735,7 @@ static struct commit *find_single_initial(struct 
> rev_info *revs,
>   die("No commit to dig up from?");
>  
>   if (name_p)
> - *name_p = name;
> + *name_p = xstrdup(name);
>   return found;
>  }
>  
> @@ -1843,6 +1843,8 @@ void setup_scoreboard(struct blame_scoreboard *sb, 
> const char *path, struct blam
>  
>   if (orig)
>   *orig = o;
> +
> + free((char *)final_commit_name);
>  }


[PATCH] doc: add missing "none" value for diff.wsErrorHighlight

2017-07-24 Thread Andreas Heiduk
The value has not eluded documentation so far.

Signed-off-by: Andreas Heiduk 
---
 Documentation/diff-config.txt  | 2 +-
 Documentation/diff-options.txt | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
index cbce8ec63..c84ced8f6 100644
--- a/Documentation/diff-config.txt
+++ b/Documentation/diff-config.txt
@@ -200,7 +200,7 @@ diff.algorithm::
 +
 
 diff.wsErrorHighlight::
-   A comma separated list of `old`, `new`, `context`, that
+   A comma separated list of `old`, `new`, `context` and `none`, that
specifies how whitespace errors on lines are highlighted
with `color.diff.whitespace`.  Can be overridden by the
command line option `--ws-error-highlight=`
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 89cc0f48d..903d68eb7 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -302,7 +302,7 @@ ifndef::git-format-patch[]
 --ws-error-highlight=::
Highlight whitespace errors on lines specified by 
in the color specified by `color.diff.whitespace`.  
-   is a comma separated list of `old`, `new`, `context`.  When
+   is a comma separated list of `old`, `new`, `context` and `none`.  When
this option is not given, only whitespace errors in `new`
lines are highlighted.  E.g. `--ws-error-highlight=new,old`
highlights whitespace errors on both deleted and added lines.
-- 
2.13.3



Re: [GSoC][PATCH 06/13] submodule: port submodule subcommand 'deinit' from shell to C

2017-07-24 Thread Brandon Williams
On 07/25, Prathamesh Chavan wrote:
> The same mechanism is used even for porting this submodule
> subcommand, as used in the ported subcommands till now.
> The function cmd_deinit in split up after porting into three
> functions: module_deinit(), for_each_submodule_list() and
> deinit_submodule().
> 
> Mentored-by: Christian Couder 
> Mentored-by: Stefan Beller 
> Signed-off-by: Prathamesh Chavan 
> ---
>  builtin/submodule--helper.c | 141 
> 
>  git-submodule.sh|  55 +
>  2 files changed, 142 insertions(+), 54 deletions(-)
> 
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index 2d1d3984d..5e84fc42d 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -910,6 +910,146 @@ static int module_sync(int argc, const char **argv, 
> const char *prefix)
>   return 0;
>  }
>  
> +struct deinit_cb {
> + const char *prefix;
> + unsigned int quiet: 1;
> + unsigned int force: 1;
> + unsigned int all: 1;
> +};
> +#define DEINIT_CB_INIT { NULL, 0, 0, 0 }
> +
> +static void deinit_submodule(const struct cache_entry *list_item,
> +  void *cb_data)
> +{
> + struct deinit_cb *info = cb_data;
> + const struct submodule *sub;
> + char *displaypath = NULL;
> + struct child_process cp_config = CHILD_PROCESS_INIT;
> + struct strbuf sb_config = STRBUF_INIT;
> + char *sub_git_dir = xstrfmt("%s/.git", list_item->name);
> + struct stat st;
> +
> + sub = submodule_from_path(null_sha1, list_item->name);
> +
> + if (!sub || !sub->name)
> + goto cleanup;
> +
> + displaypath = get_submodule_displaypath(list_item->name, info->prefix);
> +
> + /* remove the submodule work tree (unless the user already did it) */
> + if (is_directory(list_item->name)) {
> + /* protect submodules containing a .git directory */
> + if (is_git_directory(sub_git_dir))

This may be too strict of a test.  The original code simply checks if
'submodule/.git' is a directory and dies if it is.  This adds additional
checks ensuring it is a gitdir.  If we want to have a straight
conversion from the shell code we should have this only check if it is a
directory.

> + die(_("Submodule work tree '%s' contains a .git "
> +   "directory use 'rm -rf' if you really want "
> +   "to remove it including all of its history"),
> +   displaypath);
> +
> + if (!info->force) {
> + struct child_process cp_rm = CHILD_PROCESS_INIT;
> + cp_rm.git_cmd = 1;
> + argv_array_pushl(_rm.args, "rm", "-qn",
> +  list_item->name, NULL);
> +
> + if (run_command(_rm))
> + die(_("Submodule work tree '%s' contains local "
> +   "modifications; use '-f' to discard 
> them"),
> +   displaypath);
> + }
> +
> + if (!lstat(list_item->name, )) {

What's the purpose of the lstat call here?

> + struct strbuf sb_rm = STRBUF_INIT;
> + const char *format;
> +
> + strbuf_addstr(_rm, list_item->name);
> +
> + if (!remove_dir_recursively(_rm, 0))
> + format = _("Cleared directory '%s'\n");
> + else
> + format = _("Could not remove submodule work 
> tree '%s'\n");
> +
> + if (!info->quiet)
> + printf(format, displaypath);
> +
> + strbuf_release(_rm);
> + }
> + }
> +
> + if (mkdir(list_item->name, st.st_mode))

What should the mode be when making the empty directory? Right now you
have the potential for it to be garbage as 'st' has the potential of not
being set by the lstat call above (since it happens inside the above if
statement).  Also we may not want it to depend on what the permissions
were set as on the filesystem assuming the user didn't already remove
the submodule themselves.

> + die(_("could not create empty submodule directory %s"),
> +   displaypath);
> +
> + cp_config.git_cmd = 1;
> + argv_array_pushl(_config.args, "config", "--get-regexp", NULL);
> + argv_array_pushf(_config.args, "submodule.%s\\.", sub->name);
> +
> + /* remove the .git/config entries (unless the user already did it) */
> + if (!capture_command(_config, _config, 0) && sb_config.len) {
> + char *sub_key = xstrfmt("submodule.%s", sub->name);
> + /*
> +  * remove the whole section so we have a clean state when
> +  * the user later decides to init this 

[GSoC][PATCH 04/13] submodule: port submodule subcommand 'status' from shell to C

2017-07-24 Thread Prathamesh Chavan
This aims to make git-submodule 'status' a built-in. Hence, the function
cmd_status() is ported from shell to C. This is done by introducing
three functions: module_status(), submodule_status() and print_status().

The function module_status() acts as the front-end of the subcommand.
It parses subcommand's options and then calls the function
module_list_compute() for computing the list of submodules. Then
this functions calls for_each_submodule_list() looping through the
list obtained.

Then for_each_submodule_list() calls submodule_status() for each of the
submodule in its list. The function submodule_status() is responsible
for generating the status each submodule it is called for, and
then calls print_status().

Finally, the function print_status() handles the printing of submodule's
status.

Mentored-by: Christian Couder 
Mentored-by: Stefan Beller 
Signed-off-by: Prathamesh Chavan 
---
In this new version of patch, following changes were made:
* instead of using the ce_match_stat(), cmd_diff_files is used.
* currently, no comment about future scope of optimization wrt the
  cmd_diff_files() usage was added as currently, I'm not fully sure of
  the way to optimize the function.

 builtin/submodule--helper.c | 151 
 git-submodule.sh|  49 +-
 2 files changed, 152 insertions(+), 48 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 80f744407..b39828174 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -560,6 +560,156 @@ static int module_init(int argc, const char **argv, const 
char *prefix)
return 0;
 }
 
+struct status_cb {
+   const char *prefix;
+   unsigned int quiet: 1;
+   unsigned int recursive: 1;
+   unsigned int cached: 1;
+};
+#define STATUS_CB_INIT { NULL, 0, 0, 0 }
+
+static void print_status(struct status_cb *info, char state, const char *path,
+char *sub_sha1, char *displaypath)
+{
+   if (info->quiet)
+   return;
+
+   printf("%c%s %s", state, sub_sha1, displaypath);
+
+   if (state == ' ' || state == '+') {
+   struct argv_array name_rev_args = ARGV_ARRAY_INIT;
+
+   argv_array_pushl(_rev_args, "print-name-rev",
+path, sub_sha1, NULL);
+   print_name_rev(name_rev_args.argc, name_rev_args.argv,
+  info->prefix);
+   } else {
+   printf("\n");
+   }
+}
+
+static int handle_submodule_head_ref(const char *refname,
+const struct object_id *oid, int flags,
+void *cb_data)
+{
+   struct strbuf *output = cb_data;
+   if (oid)
+   strbuf_addstr(output, oid_to_hex(oid));
+   return 0;
+}
+
+static void status_submodule(const struct cache_entry *list_item, void 
*cb_data)
+{
+   struct status_cb *info = cb_data;
+   char *sub_sha1 = xstrdup(oid_to_hex(_item->oid));
+   char *displaypath;
+   struct argv_array diff_files_args = ARGV_ARRAY_INIT;
+
+   if (!submodule_from_path(null_sha1, list_item->name))
+   die(_("no submodule mapping found in .gitmodules for path 
'%s'"),
+ list_item->name);
+
+   displaypath = get_submodule_displaypath(list_item->name, info->prefix);
+
+   if (list_item->ce_flags) {
+   print_status(info, 'U', list_item->name,
+sha1_to_hex(null_sha1), displaypath);
+   goto cleanup;
+   }
+
+   if (!is_submodule_active(the_repository, list_item->name)) {
+   print_status(info, '-', list_item->name, sub_sha1, displaypath);
+   goto cleanup;
+   }
+
+   argv_array_pushl(_files_args, "diff-files",
+"--ignore-submodules=dirty", "--quiet", "--",
+list_item->name, NULL);
+
+   if (!cmd_diff_files(diff_files_args.argc, diff_files_args.argv,
+   info->prefix)) {
+   print_status(info, ' ', list_item->name, sub_sha1, displaypath);
+   } else {
+   if (!info->cached) {
+   struct strbuf sb = STRBUF_INIT;
+   if (head_ref_submodule(list_item->name,
+  handle_submodule_head_ref, ))
+   die(_("could not resolve HEAD ref inside the"
+ "submodule '%s'"), list_item->name);
+   print_status(info, '+', list_item->name, sb.buf,
+displaypath);
+   strbuf_release();
+   } else {
+   print_status(info, '+', list_item->name, sub_sha1,
+displaypath);
+   }
+   }
+
+   

[GSoC][PATCH 01/13] submodule--helper: introduce get_submodule_displaypath()

2017-07-24 Thread Prathamesh Chavan
Introduce function get_submodule_displaypath() to replace the code
occurring in submodule_init() for generating displaypath of the
submodule with a call to it.

This new function will also be used in other parts of the system
in later patches.

Mentored-by: Christian Couder 
Mentored-by: Stefan Beller 
Signed-off-by: Prathamesh Chavan 
---
 builtin/submodule--helper.c | 33 ++---
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 6abdad329..7af4de09b 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -220,6 +220,27 @@ static int resolve_relative_url_test(int argc, const char 
**argv, const char *pr
return 0;
 }
 
+static char *get_submodule_displaypath(const char *path, const char *prefix)
+{
+   const char *super_prefix = get_super_prefix();
+
+   if (prefix && super_prefix) {
+   BUG("cannot have prefix '%s' and superprefix '%s'",
+   prefix, super_prefix);
+   } else if (prefix) {
+   struct strbuf sb = STRBUF_INIT;
+   char *displaypath = xstrdup(relative_path(path, prefix, ));
+   strbuf_release();
+   return displaypath;
+   } else if (super_prefix) {
+   int len = strlen(super_prefix);
+   const char *format = is_dir_sep(super_prefix[len - 1]) ? "%s%s" 
: "%s/%s";
+   return xstrfmt(format, super_prefix, path);
+   } else {
+   return xstrdup(path);
+   }
+}
+
 struct module_list {
const struct cache_entry **entries;
int alloc, nr;
@@ -339,16 +360,7 @@ static void init_submodule(const char *path, const char 
*prefix, int quiet)
 
/* Only loads from .gitmodules, no overlay with .git/config */
gitmodules_config();
-
-   if (prefix && get_super_prefix())
-   die("BUG: cannot have prefix and superprefix");
-   else if (prefix)
-   displaypath = xstrdup(relative_path(path, prefix, ));
-   else if (get_super_prefix()) {
-   strbuf_addf(, "%s%s", get_super_prefix(), path);
-   displaypath = strbuf_detach(, NULL);
-   } else
-   displaypath = xstrdup(path);
+   displaypath = get_submodule_displaypath(path, prefix);
 
sub = submodule_from_path(null_sha1, path);
 
@@ -363,7 +375,6 @@ static void init_submodule(const char *path, const char 
*prefix, int quiet)
 * Set active flag for the submodule being initialized
 */
if (!is_submodule_active(the_repository, path)) {
-   strbuf_reset();
strbuf_addf(, "submodule.%s.active", sub->name);
git_config_set_gently(sb.buf, "true");
}
-- 
2.13.0



[GSoC][PATCH 03/13] submodule: port set_name_rev() from shell to C

2017-07-24 Thread Prathamesh Chavan
Function set_name_rev() is ported from git-submodule to the
submodule--helper builtin. The function get_name_rev() generates the
value of the revision name as required, and the function
print_name_rev() handles the formating and printing of the obtained
revision name.

Mentored-by: Christian Couder 
Mentored-by: Stefan Beller 
Signed-off-by: Prathamesh Chavan 
---
 builtin/submodule--helper.c | 63 +
 git-submodule.sh| 16 ++--
 2 files changed, 65 insertions(+), 14 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index e41572f7a..80f744407 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -244,6 +244,68 @@ static char *get_submodule_displaypath(const char *path, 
const char *prefix)
}
 }
 
+static char *get_name_rev(const char *sub_path, const char* object_id)
+{
+   struct strbuf sb = STRBUF_INIT;
+   const char ***d;
+
+   static const char *describe_bare[] = {
+   NULL
+   };
+
+   static const char *describe_tags[] = {
+   "--tags", NULL
+   };
+
+   static const char *describe_contains[] = {
+   "--contains", NULL
+   };
+
+   static const char *describe_all_always[] = {
+   "--all", "--always", NULL
+   };
+
+   static const char **describe_argv[] = {
+   describe_bare, describe_tags, describe_contains,
+   describe_all_always, NULL
+   };
+
+   for (d = describe_argv; *d; d++) {
+   struct child_process cp = CHILD_PROCESS_INIT;
+   prepare_submodule_repo_env(_array);
+   cp.dir = sub_path;
+   cp.git_cmd = 1;
+   cp.no_stderr = 1;
+
+   argv_array_push(, "describe");
+   argv_array_pushv(, *d);
+   argv_array_push(, object_id);
+
+   if (!capture_command(, , 0) && sb.len) {
+   strbuf_strip_suffix(, "\n");
+   return strbuf_detach(, NULL);
+   }
+
+   }
+
+   strbuf_release();
+   return NULL;
+}
+
+static int print_name_rev(int argc, const char **argv, const char *prefix)
+{
+   char *namerev;
+   if (argc != 3)
+   die("print-name-rev only accepts two arguments:  ");
+
+   namerev = get_name_rev(argv[1], argv[2]);
+   if (namerev && namerev[0])
+   printf(" (%s)", namerev);
+   printf("\n");
+
+   return 0;
+}
+
 struct module_list {
const struct cache_entry **entries;
int alloc, nr;
@@ -1242,6 +1304,7 @@ static struct cmd_struct commands[] = {
{"relative-path", resolve_relative_path, 0},
{"resolve-relative-url", resolve_relative_url, 0},
{"resolve-relative-url-test", resolve_relative_url_test, 0},
+   {"print-name-rev", print_name_rev, 0},
{"init", module_init, SUPPORT_SUPER_PREFIX},
{"remote-branch", resolve_remote_submodule_branch, 0},
{"push-check", push_check, 0},
diff --git a/git-submodule.sh b/git-submodule.sh
index e131760ee..e988167e0 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -759,18 +759,6 @@ cmd_update()
}
 }
 
-set_name_rev () {
-   revname=$( (
-   sanitize_submodule_env
-   cd "$1" && {
-   git describe "$2" 2>/dev/null ||
-   git describe --tags "$2" 2>/dev/null ||
-   git describe --contains "$2" 2>/dev/null ||
-   git describe --all --always "$2"
-   }
-   ) )
-   test -z "$revname" || revname=" ($revname)"
-}
 #
 # Show commit summary for submodules in index or working tree
 #
@@ -1042,14 +1030,14 @@ cmd_status()
fi
if git diff-files --ignore-submodules=dirty --quiet -- 
"$sm_path"
then
-   set_name_rev "$sm_path" "$sha1"
+   revname=$(git submodule--helper print-name-rev 
"$sm_path" "$sha1")
say " $sha1 $displaypath$revname"
else
if test -z "$cached"
then
sha1=$(sanitize_submodule_env; cd "$sm_path" && 
git rev-parse --verify HEAD)
fi
-   set_name_rev "$sm_path" "$sha1"
+   revname=$(git submodule--helper print-name-rev 
"$sm_path" "$sha1")
say "+$sha1 $displaypath$revname"
fi
 
-- 
2.13.0



[GSoC][PATCH 02/13] submodule--helper: introduce for_each_submodule_list()

2017-07-24 Thread Prathamesh Chavan
Introduce function for_each_submodule_list() and
replace a loop in module_init() with a call to it.

The new function will also be used in other parts of the
system in later patches.

Mentored-by: Christian Couder 
Mentored-by: Stefan Beller 
Signed-off-by: Prathamesh Chavan 
---
 builtin/submodule--helper.c | 39 +--
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 7af4de09b..e41572f7a 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -14,6 +14,9 @@
 #include "refs.h"
 #include "connect.h"
 
+typedef void (*submodule_list_func_t)(const struct cache_entry *list_item,
+ void *cb_data);
+
 static char *get_default_remote(void)
 {
char *dest = NULL, *ret;
@@ -352,17 +355,30 @@ static int module_list(int argc, const char **argv, const 
char *prefix)
return 0;
 }
 
-static void init_submodule(const char *path, const char *prefix, int quiet)
+static void for_each_submodule_list(const struct module_list list,
+   submodule_list_func_t fn, void *cb_data)
 {
+   int i;
+   for (i = 0; i < list.nr; i++)
+   fn(list.entries[i], cb_data);
+}
+
+struct init_cb {
+   const char *prefix;
+   unsigned int quiet: 1;
+};
+#define INIT_CB_INIT { NULL, 0 }
+
+static void init_submodule(const struct cache_entry *list_item, void *cb_data)
+{
+   struct init_cb *info = cb_data;
const struct submodule *sub;
struct strbuf sb = STRBUF_INIT;
char *upd = NULL, *url = NULL, *displaypath;
 
-   /* Only loads from .gitmodules, no overlay with .git/config */
-   gitmodules_config();
-   displaypath = get_submodule_displaypath(path, prefix);
+   displaypath = get_submodule_displaypath(list_item->name, info->prefix);
 
-   sub = submodule_from_path(null_sha1, path);
+   sub = submodule_from_path(null_sha1, list_item->name);
 
if (!sub)
die(_("No url found for submodule path '%s' in .gitmodules"),
@@ -374,7 +390,7 @@ static void init_submodule(const char *path, const char 
*prefix, int quiet)
 *
 * Set active flag for the submodule being initialized
 */
-   if (!is_submodule_active(the_repository, path)) {
+   if (!is_submodule_active(the_repository, list_item->name)) {
strbuf_addf(, "submodule.%s.active", sub->name);
git_config_set_gently(sb.buf, "true");
}
@@ -416,7 +432,7 @@ static void init_submodule(const char *path, const char 
*prefix, int quiet)
if (git_config_set_gently(sb.buf, url))
die(_("Failed to register url for submodule path '%s'"),
displaypath);
-   if (!quiet)
+   if (!info->quiet)
fprintf(stderr,
_("Submodule '%s' (%s) registered for path 
'%s'\n"),
sub->name, url, displaypath);
@@ -445,10 +461,10 @@ static void init_submodule(const char *path, const char 
*prefix, int quiet)
 
 static int module_init(int argc, const char **argv, const char *prefix)
 {
+   struct init_cb info = INIT_CB_INIT;
struct pathspec pathspec;
struct module_list list = MODULE_LIST_INIT;
int quiet = 0;
-   int i;
 
struct option module_init_options[] = {
OPT__QUIET(, N_("Suppress output for initializing a 
submodule")),
@@ -473,8 +489,11 @@ static int module_init(int argc, const char **argv, const 
char *prefix)
if (!argc && git_config_get_value_multi("submodule.active"))
module_list_active();
 
-   for (i = 0; i < list.nr; i++)
-   init_submodule(list.entries[i]->name, prefix, quiet);
+   info.prefix = prefix;
+   info.quiet = !!quiet;
+
+   gitmodules_config();
+   for_each_submodule_list(list, init_submodule, );
 
return 0;
 }
-- 
2.13.0



[GSoC][PATCH 05/13] submodule: port submodule subcommand 'sync' from shell to C

2017-07-24 Thread Prathamesh Chavan
Port the submodule subcommand 'sync' from shell to C using the same
mechanism as that used for porting submodule subcommand 'status'.
Hence, here the function cmd_sync() is ported from shell to C.
This is done by introducing three functions: module_sync(),
sync_submodule() and print_default_remote().

The function print_default_remote() is introduced for getting
the default remote as stdout.

Mentored-by: Christian Couder 
Mentored-by: Stefan Beller 
Signed-off-by: Prathamesh Chavan 
---
In this new version of patch, following changes were made:
* the code use to die when sub->url was found to be NULL. This was not a
  correct translation of code. It was corrected by using an empty string
  instead of sub->url.
* a process was used in the previous patch for registering the submodule
  url. This was avoided by the suggested changes on the previous patch.
* some nits were also corrected.
 
 builtin/submodule--helper.c | 183 
 git-submodule.sh|  56 +-
 2 files changed, 184 insertions(+), 55 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index b39828174..2d1d3984d 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -44,6 +44,20 @@ static char *get_default_remote(void)
return ret;
 }
 
+static int print_default_remote(int argc, const char **argv, const char 
*prefix)
+{
+   const char *remote;
+
+   if (argc != 1)
+   die(_("submodule--helper print-default-remote takes no 
arguments"));
+
+   remote = get_default_remote();
+   if (remote)
+   puts(remote);
+
+   return 0;
+}
+
 static int starts_with_dot_slash(const char *str)
 {
return str[0] == '.' && is_dir_sep(str[1]);
@@ -379,6 +393,25 @@ static void module_list_active(struct module_list *list)
*list = active_modules;
 }
 
+static char *get_up_path(const char *path)
+{
+   int i;
+   struct strbuf sb = STRBUF_INIT;
+
+   for (i = count_slashes(path); i; i--)
+   strbuf_addstr(, "../");
+
+   /*
+* Check if 'path' ends with slash or not
+* for having the same output for dir/sub_dir
+* and dir/sub_dir/
+*/
+   if (!is_dir_sep(path[strlen(path) - 1]))
+   strbuf_addstr(, "../");
+
+   return strbuf_detach(, NULL);
+}
+
 static int module_list(int argc, const char **argv, const char *prefix)
 {
int i;
@@ -729,6 +762,154 @@ static int module_name(int argc, const char **argv, const 
char *prefix)
return 0;
 }
 
+struct sync_cb {
+   const char *prefix;
+   unsigned int quiet: 1;
+   unsigned int recursive: 1;
+};
+#define SYNC_CB_INIT { NULL, 0, 0 }
+
+static void sync_submodule(const struct cache_entry *list_item, void *cb_data)
+{
+   struct sync_cb *info = cb_data;
+   const struct submodule *sub;
+   char *sub_key, *remote_key;
+   char *sub_origin_url, *super_config_url, *displaypath;
+   struct strbuf sb = STRBUF_INIT;
+   struct child_process cp = CHILD_PROCESS_INIT;
+   struct strbuf sub_repo_path = STRBUF_INIT;
+   char *sub_config_path = NULL;
+
+   if (!is_submodule_active(the_repository, list_item->name))
+   return;
+
+   sub = submodule_from_path(null_sha1, list_item->name);
+
+   if (sub && sub->url) {
+   if (starts_with_dot_dot_slash(sub->url) || 
starts_with_dot_slash(sub->url)) {
+   char *remote_url, *up_path;
+   char *remote = get_default_remote();
+   char *remote_key = xstrfmt("remote.%s.url", remote);
+
+   if (git_config_get_string(remote_key, _url))
+   remote_url = xgetcwd();
+
+   up_path = get_up_path(list_item->name);
+   sub_origin_url = relative_url(remote_url, sub->url, 
up_path);
+   super_config_url = relative_url(remote_url, sub->url, 
NULL);
+
+   free(remote);
+   free(remote_key);
+   free(up_path);
+   free(remote_url);
+   } else {
+   sub_origin_url = xstrdup(sub->url);
+   super_config_url = xstrdup(sub->url);
+   }
+   } else {
+   sub_origin_url = "";
+   super_config_url = "";
+   }
+
+   displaypath = get_submodule_displaypath(list_item->name, info->prefix);
+
+   if (!info->quiet)
+   printf(_("Synchronizing submodule url for '%s'\n"),
+displaypath);
+
+   sub_key = xstrfmt("submodule.%s.url", sub->name);
+   if (git_config_set_gently(sub_key, super_config_url))
+   die(_("failed to register url for submodule path '%s'"),
+ displaypath);
+
+   if 

[GSoC][PATCH 06/13] submodule: port submodule subcommand 'deinit' from shell to C

2017-07-24 Thread Prathamesh Chavan
The same mechanism is used even for porting this submodule
subcommand, as used in the ported subcommands till now.
The function cmd_deinit in split up after porting into three
functions: module_deinit(), for_each_submodule_list() and
deinit_submodule().

Mentored-by: Christian Couder 
Mentored-by: Stefan Beller 
Signed-off-by: Prathamesh Chavan 
---
 builtin/submodule--helper.c | 141 
 git-submodule.sh|  55 +
 2 files changed, 142 insertions(+), 54 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 2d1d3984d..5e84fc42d 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -910,6 +910,146 @@ static int module_sync(int argc, const char **argv, const 
char *prefix)
return 0;
 }
 
+struct deinit_cb {
+   const char *prefix;
+   unsigned int quiet: 1;
+   unsigned int force: 1;
+   unsigned int all: 1;
+};
+#define DEINIT_CB_INIT { NULL, 0, 0, 0 }
+
+static void deinit_submodule(const struct cache_entry *list_item,
+void *cb_data)
+{
+   struct deinit_cb *info = cb_data;
+   const struct submodule *sub;
+   char *displaypath = NULL;
+   struct child_process cp_config = CHILD_PROCESS_INIT;
+   struct strbuf sb_config = STRBUF_INIT;
+   char *sub_git_dir = xstrfmt("%s/.git", list_item->name);
+   struct stat st;
+
+   sub = submodule_from_path(null_sha1, list_item->name);
+
+   if (!sub || !sub->name)
+   goto cleanup;
+
+   displaypath = get_submodule_displaypath(list_item->name, info->prefix);
+
+   /* remove the submodule work tree (unless the user already did it) */
+   if (is_directory(list_item->name)) {
+   /* protect submodules containing a .git directory */
+   if (is_git_directory(sub_git_dir))
+   die(_("Submodule work tree '%s' contains a .git "
+ "directory use 'rm -rf' if you really want "
+ "to remove it including all of its history"),
+ displaypath);
+
+   if (!info->force) {
+   struct child_process cp_rm = CHILD_PROCESS_INIT;
+   cp_rm.git_cmd = 1;
+   argv_array_pushl(_rm.args, "rm", "-qn",
+list_item->name, NULL);
+
+   if (run_command(_rm))
+   die(_("Submodule work tree '%s' contains local "
+ "modifications; use '-f' to discard 
them"),
+ displaypath);
+   }
+
+   if (!lstat(list_item->name, )) {
+   struct strbuf sb_rm = STRBUF_INIT;
+   const char *format;
+
+   strbuf_addstr(_rm, list_item->name);
+
+   if (!remove_dir_recursively(_rm, 0))
+   format = _("Cleared directory '%s'\n");
+   else
+   format = _("Could not remove submodule work 
tree '%s'\n");
+
+   if (!info->quiet)
+   printf(format, displaypath);
+
+   strbuf_release(_rm);
+   }
+   }
+
+   if (mkdir(list_item->name, st.st_mode))
+   die(_("could not create empty submodule directory %s"),
+ displaypath);
+
+   cp_config.git_cmd = 1;
+   argv_array_pushl(_config.args, "config", "--get-regexp", NULL);
+   argv_array_pushf(_config.args, "submodule.%s\\.", sub->name);
+
+   /* remove the .git/config entries (unless the user already did it) */
+   if (!capture_command(_config, _config, 0) && sb_config.len) {
+   char *sub_key = xstrfmt("submodule.%s", sub->name);
+   /*
+* remove the whole section so we have a clean state when
+* the user later decides to init this submodule again
+*/
+   git_config_rename_section_in_file(NULL, sub_key, NULL);
+   if (!info->quiet)
+   printf(_("Submodule '%s' (%s) unregistered for path 
'%s'\n"),
+sub->name, sub->url, displaypath);
+   free(sub_key);
+   }
+
+cleanup:
+   free(displaypath);
+   free(sub_git_dir);
+   strbuf_release(_config);
+}
+
+static int module_deinit(int argc, const char **argv, const char *prefix)
+{
+   struct deinit_cb info = DEINIT_CB_INIT;
+   struct pathspec pathspec;
+   struct module_list list = MODULE_LIST_INIT;
+   int quiet = 0;
+   int force = 0;
+   int all = 0;
+
+   struct option module_deinit_options[] = {
+   OPT__QUIET(, N_("Suppress submodule status output")),
+ 

[GSoC][PATCH 00/13] Update: Week 10

2017-07-24 Thread Prathamesh Chavan
SUMMARY OF MY PROJECT:

Git submodule subcommands are currently implemented by using shell script
'git-submodule.sh'. There are several reasons why we'll prefer not to
use the shell script. My project intends to convert the subcommands into
C code, thus making them builtins. This will increase Git's portability
and hence the efficiency of working with the git-submodule commands.
Link to the complete proposal: [1]

Mentors:
Stefan Beller 
Christian Couder 

UPDATES:

Following are the updates about my ongoing project:

* status: some of the optimization changes which were suggested
  earlier were reverted, as I wasn't sure it fulfilled the
  purpose.

* sync & summary: some changes to the code were suggested after the
  previous review. Those changes were implemented successfully.

* add: the porting of this subcommand is underway. In the function
  cmd_add(), the value of sm_path undergoes quite a few changes
  which are taken care by the sed command. I'm currently working
  on porting them.

* foreach: as said in the previous update, the former patch [2] was
  spilt up into 4 patches, to get a clear picture of different changes
  made to submodule foreach. All of these patches are attached with
  this update. Also, the subcommand foreach is ported in accordance
  with this new changes as well.

PLAN FOR WEEK-11 (25 July 2017 to 31 July 2017):

* As all the patches prepared so far are posted on the mailing list, I'll
  focus on getting these patches merged and do the needful improvisions.

* Apart from this, as stated before, the porting of submodule subcommand
  is underway and will try my best to finish its porting and discuss it
  with the mentors as well in this following week. 

Also, a complete build report of this work is available at [3].
Branch: week-10
Build #140

And the work has also been pushed on github. It is available at [4].

[1]: 
https://docs.google.com/document/d/1krxVLooWl--75Pot3dazhfygR3wCUUWZWzTXtK1L-xU/
[2]: https://public-inbox.org/git/20170603003710.5558-1-sbel...@google.com/
[3]: https://travis-ci.org/pratham-pc/git/builds/
[4]: https://github.com/pratham-pc/git/commits/week-10

Prathamesh Chavan (13):
  submodule--helper: introduce get_submodule_displaypath()
  submodule--helper: introduce for_each_submodule_list()
  submodule: port set_name_rev() from shell to C
  submodule: port submodule subcommand 'status' from shell to C
  submodule: port submodule subcommand 'sync' from shell to C
  submodule: port submodule subcommand 'deinit' from shell to C
  diff: change scope of the function count_lines()
  submodule: port submodule subcommand 'summary' from shell to C
  submodule foreach: correct '$path' in nested submodules from a
subdirectory
  submodule foreach: document '$sm_path' instead of '$path'
  submodule foreach: clarify the '$toplevel' variable documentation
  submodule foreach: document variable '$displaypath'
  submodule: port submodule subcommand 'foreach' from shell to C

 Documentation/git-submodule.txt |   15 +-
 builtin/submodule--helper.c | 1165 ++-
 diff.c  |2 +-
 diff.h  |1 +
 git-submodule.sh|  394 +
 t/t7407-submodule-foreach.sh|   38 +-
 6 files changed, 1197 insertions(+), 418 deletions(-)

-- 
2.13.0



[GSoC][PATCH 07/13] diff: change scope of the function count_lines()

2017-07-24 Thread Prathamesh Chavan
Change the scope of function count_lines for allowing the function
to be reused in other parts of the code as well.

Mentored-by: Christian Couder 
Mentored-by: Stefan Beller 
Signed-off-by: Prathamesh Chavan 
---
 diff.c | 2 +-
 diff.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index 85e714f6c..03ed64f93 100644
--- a/diff.c
+++ b/diff.c
@@ -425,7 +425,7 @@ struct emit_callback {
struct strbuf *header;
 };
 
-static int count_lines(const char *data, int size)
+int count_lines(const char *data, int size)
 {
int count, ch, completely_empty = 1, nl_just_seen = 0;
count = 0;
diff --git a/diff.h b/diff.h
index 2d442e296..8522514e9 100644
--- a/diff.h
+++ b/diff.h
@@ -273,6 +273,7 @@ extern struct diff_filepair *diff_unmerge(struct 
diff_options *, const char *pat
 extern int parse_long_opt(const char *opt, const char **argv,
 const char **optarg);
 
+extern int count_lines(const char *data, int size);
 extern int git_diff_basic_config(const char *var, const char *value, void *cb);
 extern int git_diff_heuristic_config(const char *var, const char *value, void 
*cb);
 extern void init_diff_ui_defaults(void);
-- 
2.13.0



Re: [PATCH 21/28] commit_packed_refs(): use a staging file separate from the lockfile

2017-07-24 Thread Junio C Hamano
Jeff King  writes:

> On Mon, Jul 24, 2017 at 10:09:15AM -0700, Jonathan Nieder wrote:
>
>> Jeff King wrote:
>> 
>> > This seems like the correct path to me. If the existing behavior is to
>> > lock the referring symref, that seems like a violation of the lock
>> > procedure in the first place. Because if "A" points to "B", we take
>> > "A.lock" and then modify "B". But "B" may have any number of "A" links
>> > pointing to it, eliminating the purpose of the lock.
>> >
>> > I thought we already did this, though. And that modifying HEAD (which
>> > might be a symlink) required LOCK_NO_DEREF.
>> 
>> Yes, I believe the lockfile API already does so.  Since this patch
>> creates a ".new" file, not using the lockfile API, it doesn't benefit
>> from that, though.
>
> Ah, I see. This bug makes much more sense, then. And I agree doubly that
> matching the lock-code's behavior is the right thing to do.

OK.  Anybody wants to take a crack at it (it is of lower priority to
me during the -rc freeze to deal with problems in topics on 'next'
or higher)?

Thanks.


Re: Remove help advice text from git editors for interactive rebase and reword

2017-07-24 Thread Junio C Hamano
SZEDER Gábor  writes:

> While these reminders are useful for new users, with time they learn
> what the score is, and experienced users might find these advices are
> just wasting a couple of lines' worth of screen real estate.
>
> Make displaying these advices configurable via the 'advice.commitMsg'
> config variable.

It may not be a bad idea, but the code after the patch does look
ugly with too deep indentation levels.  Can some refactoring help, I
wonder?

Is that advice.commitMsg?  It looks more like commitEditor advice to
me but it may be just me.


Re: [GSoC][PATCH 05/13] submodule: port submodule subcommand 'sync' from shell to C

2017-07-24 Thread Brandon Williams
On 07/25, Prathamesh Chavan wrote:
> Port the submodule subcommand 'sync' from shell to C using the same
> mechanism as that used for porting submodule subcommand 'status'.
> Hence, here the function cmd_sync() is ported from shell to C.
> This is done by introducing three functions: module_sync(),
> sync_submodule() and print_default_remote().
> 
> The function print_default_remote() is introduced for getting
> the default remote as stdout.
> 
> Mentored-by: Christian Couder 
> Mentored-by: Stefan Beller 
> Signed-off-by: Prathamesh Chavan 
> ---
> In this new version of patch, following changes were made:
> * the code use to die when sub->url was found to be NULL. This was not a
>   correct translation of code. It was corrected by using an empty string
>   instead of sub->url.
> * a process was used in the previous patch for registering the submodule
>   url. This was avoided by the suggested changes on the previous patch.
> * some nits were also corrected.
>  
>  builtin/submodule--helper.c | 183 
> 
>  git-submodule.sh|  56 +-
>  2 files changed, 184 insertions(+), 55 deletions(-)
> 
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index b39828174..2d1d3984d 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -44,6 +44,20 @@ static char *get_default_remote(void)
>   return ret;
>  }
>  
> +static int print_default_remote(int argc, const char **argv, const char 
> *prefix)
> +{
> + const char *remote;
> +
> + if (argc != 1)
> + die(_("submodule--helper print-default-remote takes no 
> arguments"));
> +
> + remote = get_default_remote();
> + if (remote)
> + puts(remote);

Any reason why puts is used instead of a printf function?

> +
> + return 0;
> +}
> +
>  static int starts_with_dot_slash(const char *str)
>  {
>   return str[0] == '.' && is_dir_sep(str[1]);
> @@ -379,6 +393,25 @@ static void module_list_active(struct module_list *list)
>   *list = active_modules;
>  }
>  
> +static char *get_up_path(const char *path)
> +{
> + int i;
> + struct strbuf sb = STRBUF_INIT;
> +
> + for (i = count_slashes(path); i; i--)
> + strbuf_addstr(, "../");
> +
> + /*
> +  * Check if 'path' ends with slash or not
> +  * for having the same output for dir/sub_dir
> +  * and dir/sub_dir/
> +  */
> + if (!is_dir_sep(path[strlen(path) - 1]))
> + strbuf_addstr(, "../");
> +
> + return strbuf_detach(, NULL);
> +}
> +
>  static int module_list(int argc, const char **argv, const char *prefix)
>  {
>   int i;
> @@ -729,6 +762,154 @@ static int module_name(int argc, const char **argv, 
> const char *prefix)
>   return 0;
>  }
>  
> +struct sync_cb {
> + const char *prefix;
> + unsigned int quiet: 1;
> + unsigned int recursive: 1;
> +};
> +#define SYNC_CB_INIT { NULL, 0, 0 }
> +
> +static void sync_submodule(const struct cache_entry *list_item, void 
> *cb_data)
> +{
> + struct sync_cb *info = cb_data;
> + const struct submodule *sub;
> + char *sub_key, *remote_key;
> + char *sub_origin_url, *super_config_url, *displaypath;
> + struct strbuf sb = STRBUF_INIT;
> + struct child_process cp = CHILD_PROCESS_INIT;
> + struct strbuf sub_repo_path = STRBUF_INIT;

Not the most important but you could be a lot more efficient here with
your variables.  You could allocate a single strbuf and reuse it for
'sub_key'. 'remote_key', 'sb', and 'sub_repo_path' as it doesn't look
like you need any of those two at the same time.

> + char *sub_config_path = NULL;
> +
> + if (!is_submodule_active(the_repository, list_item->name))
> + return;
> +
> + sub = submodule_from_path(null_sha1, list_item->name);

Since is_submodule_active also calls into the submodule-config subsystem
to retrieve a submodule this call should never return a NULL ptr...so it
may be safer to return instead of proceeding if 'sub' ends up being null
here.

> +
> + if (sub && sub->url) {
> + if (starts_with_dot_dot_slash(sub->url) || 
> starts_with_dot_slash(sub->url)) {
> + char *remote_url, *up_path;
> + char *remote = get_default_remote();
> + char *remote_key = xstrfmt("remote.%s.url", remote);
> +
> + if (git_config_get_string(remote_key, _url))
> + remote_url = xgetcwd();
> +
> + up_path = get_up_path(list_item->name);
> + sub_origin_url = relative_url(remote_url, sub->url, 
> up_path);
> + super_config_url = relative_url(remote_url, sub->url, 
> NULL);
> +
> + free(remote);
> + free(remote_key);
> + free(up_path);
> + free(remote_url);

[ANNOUNCE] Git v2.14.0-rc1

2017-07-24 Thread Junio C Hamano
A release candidate Git v2.14.0-rc1 is now available for testing
at the usual places.  It is comprised of 708 non-merge commits
since v2.13.0, contributed by 61 people, 14 of which are new faces.

The tarballs are found at:

https://www.kernel.org/pub/software/scm/git/testing/

The following public repositories all have a copy of the
'v2.14.0-rc1' tag and the 'master' branch that the tag points at:

  url = https://kernel.googlesource.com/pub/scm/git/git
  url = git://repo.or.cz/alt-git.git
  url = https://github.com/gitster/git

New contributors whose contributions weren't in v2.13.0 are as follows.
Welcome to the Git development community!

  A. Wilcox, Ben Peart, Brian Malehorn, James Clarke, Jeff Smith,
  Kaartic Sivaraam, Liam Beguin, Phillip Wood, Rikard Falkeborn,
  Sahil Dua, Samuel Lijin, Stephen Kent, Tyler Brazier, and
  xiaoqiang zhao.

Returning contributors who helped this release are as follows.
Thanks for your continued support.

  Adam Dinwoodie, Ævar Arnfjörð Bjarmason, Alejandro R. Sedeño,
  Alexander Shopov, Andreas Heiduk, Beat Bolli, Brandon Williams,
  brian m. carlson, Changwoo Ryu, Christian Couder, David Aguilar,
  David Turner, Dennis Kaarsemaker, Dimitriy Ryazantcev, Eric Wong,
  Jean-Noel Avila, Jeff Hostetler, Jeff King, Jiang Xin, Johannes
  Schindelin, Johannes Sixt, Jonathan Nieder, Jonathan Tan, Jordi
  Mas, Junio C Hamano, Kyle J. McKay, Kyle Meyer, Lars Schneider,
  Marc Branchaud, Michael Haggerty, Miguel Torroja, Mike Hommey,
  Nguyễn Thái Ngọc Duy, Patrick Steinhardt, Peter Krefting,
  Prathamesh Chavan, Ralf Thielow, Ramsay Jones, René Scharfe,
  Stefan Beller, Štěpán Němec, Sven Strickroth, SZEDER Gábor,
  Thomas Gummerer, Torsten Bögershausen, Trần Ngọc Quân,
  and Ville Skyttä.



Git 2.14 Release Notes (draft)
==

Backward compatibility notes and other notable changes.

 * Use of an empty string as a pathspec element that is used for
   'everything matches' is still warned and Git asks users to use a
   more explicit '.' for that instead.  The hope is that existing
   users will not mind this change, and eventually the warning can be
   turned into a hard error, upgrading the deprecation into removal of
   this (mis)feature.  That is not scheduled to happen in the upcoming
   release (yet).

 * Git now avoids blindly falling back to ".git" when the setup
   sequence said we are _not_ in Git repository.  A corner case that
   happens to work right now may be broken by a call to die("BUG").
   We've tried hard to locate such cases and fixed them, but there
   might still be cases that need to be addressed--bug reports are
   greatly appreciated.

 * The experiment to improve the hunk-boundary selection of textual
   diff output has finished, and the "indent heuristics" has now
   become the default.

 * Git can now be built with PCRE v2 instead of v1 of the PCRE
   library. Replace USE_LIBPCRE=YesPlease with USE_LIBPCRE2=YesPlease
   in existing build scripts to build against the new version.  As the
   upstream PCRE maintainer has abandoned v1 maintenance for all but
   the most critical bug fixes, use of v2 is recommended.


Updates since v2.13
---

UI, Workflows & Features

 * The colors in which "git status --short --branch" showed the names
   of the current branch and its remote-tracking branch are now
   configurable.

 * "git clone" learned the "--no-tags" option not to fetch all tags
   initially, and also set up the tagopt not to follow any tags in
   subsequent fetches.

 * "git archive --format=zip" learned to use zip64 extension when
   necessary to go beyond the 4GB limit.

 * "git reset" learned "--recurse-submodules" option.

 * "git diff --submodule=diff" now recurses into nested submodules.

 * "git repack" learned to accept the --threads= option and pass it
   to pack-objects.

 * "git send-email" learned to run sendemail-validate hook to inspect
   and reject a message before sending it out.

 * There is no good reason why "git fetch $there $sha1" should fail
   when the $sha1 names an object at the tip of an advertised ref,
   even when the other side hasn't enabled allowTipSHA1InWant.

 * The "[includeIf "gitdir:$dir"] path=..." mechanism introduced in
   2.13.0 would canonicalize the path of the gitdir being matched,
   and did not match e.g. "gitdir:~/work/*" against a repo in
   "~/work/main" if "~/work" was a symlink to "/mnt/storage/work".
   Now we match both the resolved canonical path and what "pwd" would
   show. The include will happen if either one matches.

 * The "indent" heuristics is now the default in "diff". The
   diff.indentHeuristic configuration variable can be set to "false"
   for those who do not want it.

 * Many commands learned to pay attention to submodule.recurse
   configuration.

 * The convention for a command line is to follow "git cmdname
   --options" with revisions followed by an optional "--"
   

[GSoC][PATCH 09/13] submodule foreach: correct '$path' in nested submodules from a subdirectory

2017-07-24 Thread Prathamesh Chavan
When running 'git submodule foreach' from a subdirectory of your
repository, nested submodules get a bogus value for $sm_path:
For a submodule 'sub' that contains a nested submodule 'nested',
running 'git -C dir submodule foreach echo $path' would report
path='../nested' for the nested submodule. The first part '../' is
derived from the logic computing the relative path from $pwd to the
root of the superproject. The second part is the submodule path inside
the submodule. This value is of little use and is hard to document.

There are two different possible solutions that have more value:
(a) The path value is documented as the path from the toplevel of the
superproject to the mount point of the submodule.
In this case we would want to have path='sub/nested'.

(b) As Ramsay noticed the documented value is wrong. For the non-nested
case the path is equal to the relative path from $pwd to the
submodules working directory. When following this model,
the expected value would be path='../sub/nested'.

The behavior for (b) was introduced in 091a6eb0fe (submodule: drop the
top-level requirement, 2013-06-16) the intent for $path seemed to be
relative to $cwd to the submodule worktree, but that did not work for
nested submodules, as the intermittent submodules were not included in
the path.

If we were to fix the meaning of the $path using (a) such that "path"
is "the path from the toplevel of the superproject to the mount point
of the submodule", we would break any existing submodule user that runs
foreach from non-root of the superproject as the non-nested submodule
'../sub' would change its path to 'sub'.

If we would fix the meaning of the $path using (b), such that "path"
is "the relative path from $pwd to the submodule", then we would break
any user that uses nested submodules (even from the root directory) as
the 'nested' would become 'sub/nested'.

Both groups can be found in the wild.  The author has no data if one group
outweighs the other by large margin, and offending each one seems equally
bad at first.  However in the authors imagination it is better to go with
(a) as running from a sub directory sounds like it is carried out
by a human rather than by some automation task.  With a human on
the keyboard the feedback loop is short and the changed behavior can be
adapted to quickly unlike some automation that can break silently.

Discussed-with: Ramsay Jones 
Signed-off-by: Prathamesh Chavan 
Signed-off-by: Stefan Beller 
---
 git-submodule.sh |  1 -
 t/t7407-submodule-foreach.sh | 36 ++--
 2 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index a427ddafd..493a64372 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -320,7 +320,6 @@ cmd_foreach()
prefix="$prefix$sm_path/"
sanitize_submodule_env
cd "$sm_path" &&
-   sm_path=$(git submodule--helper relative-path 
"$sm_path" "$wt_prefix") &&
# we make $path available to scripts ...
path=$sm_path &&
if test $# -eq 1
diff --git a/t/t7407-submodule-foreach.sh b/t/t7407-submodule-foreach.sh
index 6ba5daf42..0663622a4 100755
--- a/t/t7407-submodule-foreach.sh
+++ b/t/t7407-submodule-foreach.sh
@@ -82,9 +82,9 @@ test_expect_success 'test basic "submodule foreach" usage' '
 
 cat >expect  expect <

[GSoC][PATCH 08/13] submodule: port submodule subcommand 'summary' from shell to C

2017-07-24 Thread Prathamesh Chavan
The submodule subcommand 'summary' is ported in the process of
making git-submodule a builtin. The function cmd_summary() from
git-submodule.sh is ported to functions module_summary(),
compute_summary_module_list(), prepare_submodule_summary() and
print_submodule_summary().

The first function module_summary() parses the options of submodule
subcommand and also acts as the front-end of this subcommand.
After parsing them, it calls the compute_summary_module_list()

The functions compute_summary_module_list() runs the diff_cmd,
and generates the modules list, as required by the subcommand.
The generation of this module list is done by the using the
callback function submodule_summary_callback(), and stored in the
structure module_cb.

Once the module list is generated, prepare_submodule_summary()
further goes through the list and filters the list, for
eventually calling the print_submodule_summary() function.

Finally, the print_submodule_summary() takes care of generating
and printing the summary for each submodule.

Mentored-by: Christian Couder 
Mentored-by: Stefan Beller 
Signed-off-by: Prathamesh Chavan 
---
In this new version of patch, following changes were made:
* the way of generating sub_sha1_src and sub_sha1_dst (abbrev of sha1_src
  and sha1_dst resp.) were changed. Since there was no direct way of
  abbrevating a string(sha1_dst), in this patch sha1_dst was converted first
  to an object id (converting to sha1 was avoided) and then abbrevated using
  find_unique_abbrev().
* A few big if() statements were reduced.
* for reducing the two big if() statements, a new function
  verify_submodule_object_name() was introduced.
* this new version also corrects a few other nits.

 builtin/submodule--helper.c | 428 
 git-submodule.sh| 182 +--
 2 files changed, 429 insertions(+), 181 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 5e84fc42d..94d6254f0 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -13,6 +13,9 @@
 #include "remote.h"
 #include "refs.h"
 #include "connect.h"
+#include "revision.h"
+#include "diffcore.h"
+#include "diff.h"
 
 typedef void (*submodule_list_func_t)(const struct cache_entry *list_item,
  void *cb_data);
@@ -762,6 +765,430 @@ static int module_name(int argc, const char **argv, const 
char *prefix)
return 0;
 }
 
+struct module_cb {
+   unsigned int mod_src;
+   unsigned int mod_dst;
+   struct object_id oid_src;
+   struct object_id oid_dst;
+   char status;
+   const char *sm_path;
+};
+#define MODULE_CB_INIT { 0, 0, NULL, NULL, '\0', NULL }
+
+struct module_cb_list {
+   struct module_cb **entries;
+   int alloc, nr;
+};
+#define MODULE_CB_LIST_INIT { NULL, 0, 0 }
+
+struct summary_cb {
+   int argc;
+   const char **argv;
+   const char *prefix;
+   char *diff_cmd;
+   unsigned int cached: 1;
+   unsigned int for_status: 1;
+   unsigned int quiet: 1;
+   unsigned int files: 1;
+   int summary_limits;
+};
+#define SUMMARY_CB_INIT { 0, NULL, NULL, NULL, 0, 0, 0, 0, 0 }
+
+static int verify_submodule_object_name(const char *sm_path, const char *sha1)
+{
+   struct child_process cp_rev_parse = CHILD_PROCESS_INIT;
+
+   cp_rev_parse.git_cmd = 1;
+   cp_rev_parse.no_stdout = 1;
+   cp_rev_parse.dir = sm_path;
+   prepare_submodule_repo_env(_rev_parse.env_array);
+
+   argv_array_pushl(_rev_parse.args, "rev-parse", "-q",
+"--verify", NULL);
+   argv_array_pushf(_rev_parse.args, "%s^0", sha1);
+
+   if (run_command(_rev_parse))
+   return 1;
+
+   return 0;
+}
+
+static void print_submodule_summary(struct summary_cb *info,
+   struct module_cb *p)
+{
+   int missing_src = 0;
+   int missing_dst = 0;
+   char *displaypath;
+   const char *sha1_abbr_src;
+   const char *sha1_abbr_dst;
+   struct object_id oid_dst;
+   int errmsg = 0;
+   int total_commits = -1;
+   const char *sha1_dst = oid_to_hex(>oid_dst);
+   const char *sha1_src = oid_to_hex(>oid_src);
+   char *sm_git_dir = xstrfmt("%s/.git", p->sm_path);
+   int is_sm_git_dir = 0;
+
+   if (!info->cached && !strcmp(sha1_dst, sha1_to_hex(null_sha1))) {
+   if (S_ISGITLINK(p->mod_dst)) {
+   struct child_process cp_rev_parse = CHILD_PROCESS_INIT;
+   struct strbuf sb_rev_parse = STRBUF_INIT;
+
+   cp_rev_parse.git_cmd = 1;
+   cp_rev_parse.no_stderr = 1;
+   cp_rev_parse.dir = p->sm_path;
+   prepare_submodule_repo_env(_rev_parse.env_array);
+
+   argv_array_pushl(_rev_parse.args,
+

[GSoC][PATCH 12/13] submodule foreach: document variable '$displaypath'

2017-07-24 Thread Prathamesh Chavan
It was observer that the variable '$displaypath' was accessible but
undocumented. Hence, document it.

Discussed-with: Ramsay Jones 
Signed-off-by: Stefan Beller 
Signed-off-by: Prathamesh Chavan 
---
 Documentation/git-submodule.txt |  6 --
 t/t7407-submodule-foreach.sh| 22 +++---
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index 8e7930ebc..0cca702cb 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -183,10 +183,12 @@ information too.
 
 foreach [--recursive] ::
Evaluates an arbitrary shell command in each checked out submodule.
-   The command has access to the variables $name, $sm_path, $sha1 and
-   $toplevel:
+   The command has access to the variables $name, $sm_path, $displaypath,
+   $sha1 and $toplevel:
$name is the name of the relevant submodule section in `.gitmodules`,
$sm_path is the path of the submodule as recorded in the superproject,
+   $displaypath contains the relative path from the current working
+   directory to the submodules root directory,
$sha1 is the commit as recorded in the superproject, and
$toplevel is the absolute path to its superproject, such that
$toplevel/$sm_path is the absolute path of the submodule.
diff --git a/t/t7407-submodule-foreach.sh b/t/t7407-submodule-foreach.sh
index 0663622a4..6ad57e061 100755
--- a/t/t7407-submodule-foreach.sh
+++ b/t/t7407-submodule-foreach.sh
@@ -82,16 +82,16 @@ test_expect_success 'test basic "submodule foreach" usage' '
 
 cat >expect <../../actual
+   git submodule foreach "echo 
\$toplevel-\$name-\$sm_path-\$displaypath-\$sha1" >../../actual
) &&
test_i18ncmp expect actual
 '
@@ -206,25 +206,25 @@ submodulesha1=$(cd 
clone2/nested1/nested2/nested3/submodule && git rev-parse HEA
 
 cat >expect <../../actual
+   git submodule foreach --recursive "echo 
\$toplevel-\$name-\$sm_path-\$displaypath-\$sha1" >../../actual
) &&
test_i18ncmp expect actual
 '
-- 
2.13.0



[GSoC][PATCH 13/13] submodule: port submodule subcommand 'foreach' from shell to C

2017-07-24 Thread Prathamesh Chavan
This aims to make git-submodule foreach a builtin. This is the very
first step taken in this direction. Hence, 'foreach' is ported to
submodule--helper, and submodule--helper is called from git-submodule.sh.
The code is split up to have one function to obtain all the list of
submodules. This function acts as the front-end of git-submodule foreach
subcommand. It calls the function for_each_submodule_list, which basically
loops through the list and calls function fn, which in this case is
runcommand_in_submodule. This third function is a calling function that
takes care of running the command in that submodule, and recursively
perform the same when --recursive is flagged.

The first function module_foreach first parses the options present in
argv, and then with the help of module_list_compute, generates the list of
submodules present in the current working tree.

The second function for_each_submodule_list traverses through the
list, and calls function fn (which in case of submodule subcommand
foreach is runcommand_in_submodule) is called for each entry.

The third function runcommand_in_submodule, generates a submodule struct sub
for $name, value and then later prepends name=sub->name; and other
value assignment to the env argv_array structure of a child_process.
Also the  of submodule-foreach is push to args argv_array
structure and finally, using run_command the commands are executed
using a shell.

The third function also takes care of the recursive flag, by creating
a separate child_process structure and prepending "--super-prefix displaypath",
to the args argv_array structure. Other required arguments and the
input  of submodule-foreach is also appended to this argv_array.

Helped-by: Brandon Williams 
Mentored-by: Christian Couder 
Mentored-by: Stefan Beller 
Signed-off-by: Prathamesh Chavan 
---
 builtin/submodule--helper.c | 129 
 git-submodule.sh|  39 +-
 2 files changed, 130 insertions(+), 38 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 94d6254f0..be278bf8d 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -765,6 +765,134 @@ static int module_name(int argc, const char **argv, const 
char *prefix)
return 0;
 }
 
+struct cb_foreach {
+   int argc;
+   const char **argv;
+   const char *prefix;
+   unsigned int quiet: 1;
+   unsigned int recursive: 1;
+};
+#define CB_FOREACH_INIT { 0, NULL, NULL, 0, 0 }
+
+static void runcommand_in_submodule(const struct cache_entry *list_item,
+   void *cb_data)
+{
+   struct cb_foreach *info = cb_data;
+   const struct submodule *sub;
+   struct child_process cp = CHILD_PROCESS_INIT;
+   char *displaypath;
+
+   displaypath = get_submodule_displaypath(list_item->name, info->prefix);
+
+   sub = submodule_from_path(null_sha1, list_item->name);
+
+   if (!sub)
+   die(_("No url found for submodule path '%s' in .gitmodules"),
+ displaypath);
+
+   if (!is_submodule_populated_gently(list_item->name, NULL))
+   goto cleanup;
+
+   prepare_submodule_repo_env(_array);
+   /* For the purpose of executing  in the submodule,
+* separate shell is used for the purpose of running the
+* child process.
+*/
+   cp.use_shell = 1;
+   cp.dir = list_item->name;
+
+   if (info->argc == 1) {
+   char *toplevel = xgetcwd();
+
+   argv_array_pushf(_array, "name=%s", sub->name);
+   argv_array_pushf(_array, "sm_path=%s", list_item->name);
+   argv_array_pushf(_array, "displaypath=%s", displaypath);
+   argv_array_pushf(_array, "sha1=%s",
+oid_to_hex(_item->oid));
+   argv_array_pushf(_array, "toplevel=%s", toplevel);
+
+   /*
+* Since still the path variable was accessible from the
+* script before porting, it is also made available.
+*/
+   argv_array_pushf(, "path=%s; %s",
+list_item->name, info->argv[0]);
+   free(toplevel);
+   } else {
+   argv_array_pushv(, info->argv);
+   }
+
+   if (!info->quiet)
+   printf(_("Entering '%s'\n"), displaypath);
+
+   if (info->argv[0] && run_command())
+   die(_("run_command returned non-zero status for %s\n."),
+ displaypath);
+
+   if (info->recursive) {
+   struct child_process cpr = CHILD_PROCESS_INIT;
+
+   cpr.git_cmd = 1;
+   cpr.dir = list_item->name;
+   prepare_submodule_repo_env(_array);
+
+   argv_array_pushl(, "--super-prefix", displaypath,
+

[GSoC][PATCH 11/13] submodule foreach: clarify the '$toplevel' variable documentation

2017-07-24 Thread Prathamesh Chavan
It does not contain the topmost superproject as the author assumed,
but the direct superproject, such that $toplevel/$sm_path is the
actual absolute path of the submodule.

Discussed-with: Ramsay Jones 
Signed-off-by: Stefan Beller 
Signed-off-by: Prathamesh Chavan 
---
 Documentation/git-submodule.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index a23baef62..8e7930ebc 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -188,7 +188,8 @@ foreach [--recursive] ::
$name is the name of the relevant submodule section in `.gitmodules`,
$sm_path is the path of the submodule as recorded in the superproject,
$sha1 is the commit as recorded in the superproject, and
-   $toplevel is the absolute path to the top-level of the superproject.
+   $toplevel is the absolute path to its superproject, such that
+   $toplevel/$sm_path is the absolute path of the submodule.
Note that to avoid conflicts with '$PATH' on Windows, the '$path'
variable is now a deprecated synonym of '$sm_path' variable.
Any submodules defined in the superproject but not checked out are
-- 
2.13.0



[GSoC][PATCH 10/13] submodule foreach: document '$sm_path' instead of '$path'

2017-07-24 Thread Prathamesh Chavan
As using a variable '$path' may be harmful to users due to
capitalization issues, see 64394e3ae9 (git-submodule.sh: Don't
use $path variable in eval_gettext string, 2012-04-17). Adjust
the documentation to advocate for using $sm_path,  which contains
the same value. We still make the 'path' variable available and
document it as a deprecated synonym of 'sm_path'.

Discussed-with: Ramsay Jones 
Signed-off-by: Stefan Beller 
Signed-off-by: Prathamesh Chavan 
---
 Documentation/git-submodule.txt | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index ff612001d..a23baef62 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -183,12 +183,14 @@ information too.
 
 foreach [--recursive] ::
Evaluates an arbitrary shell command in each checked out submodule.
-   The command has access to the variables $name, $path, $sha1 and
+   The command has access to the variables $name, $sm_path, $sha1 and
$toplevel:
$name is the name of the relevant submodule section in `.gitmodules`,
-   $path is the name of the submodule directory relative to the
-   superproject, $sha1 is the commit as recorded in the superproject,
-   and $toplevel is the absolute path to the top-level of the superproject.
+   $sm_path is the path of the submodule as recorded in the superproject,
+   $sha1 is the commit as recorded in the superproject, and
+   $toplevel is the absolute path to the top-level of the superproject.
+   Note that to avoid conflicts with '$PATH' on Windows, the '$path'
+   variable is now a deprecated synonym of '$sm_path' variable.
Any submodules defined in the superproject but not checked out are
ignored by this command. Unless given `--quiet`, foreach prints the name
of each submodule before evaluating the command.
-- 
2.13.0



Re: Change in output as a result of patch

2017-07-24 Thread Junio C Hamano
Kaartic Sivaraam  writes:

> The patch in the previous mail results in a change in output as
> specified below.
>
> $ git branch
> * master
>   foo
>   bar
>
> Before patch,
>
> $ git branch -m hypothet master
> fatal: A branch named 'master' already exists.
>
> $ git branch -m hypothet real
> error: refname refs/heads/hypothet not found
> fatal: Branch rename failed
>
> After patch,
>
> $ git branch -m hypothet master
> fatal: Branch 'hypothet' does not exist.
>
> $ git branch -m hypothet real
> fatal: Branch 'hypothet' does not exist.

Imagine this scenario instead, which I think is more realistic
example of making a typo.  The set of existing branches are like
this:

 $ git branch
   devel
 * test

And then you get these with your patch:

 $ git branch -m tets devel
 fatal: Branch 'tets' does not exist.

 $ git branch -m test devel
 fatal: A branch named 'devel' already exists.

My reaction to the above exchange would be a moderately strong
annoyance.  If I were told that I am using 'devel' for something
else already, my "corrected" attempt to turn the 'test' branch to a
real development branch after getting the first error would have
been:

 $ git branch -m test devel2

and I didn't have to get the second error.

I think your patch points in the right direction---if an operation
can fail due to two reasons, reordering the two checks and still
fail with a report for just the first one you happened to check
first does not give us a real improvement.  If it is easy to check
the other one after you noticed one of the condition is violated,
then you could give a more useful diagnosis, namely, "There is
branch 'tets' to rename, and there already is 'devel' branch".

I suspect that with a moderately-sized refactoring around
validate_new_branchname() function, this should be doable.  Instead
of passing two "int" parameters force and attr_only, make them into
a single "unsigned flag" and allow the caller to pass another option
to tell the function "do not die, do not issue a message, but tell
the caller what is wrong with a return value".  And by using that
updated API, rename_branch() could tell four cases apart and fail
the three bad cases in a more sensible way.

In any case, the illustrations of interaction before and after the
change is a very good thing to have when discussing a patch, and you
are encouraged to put them in your proposed log message.





Re: [PATCH] recursive submodules: detach HEAD from new state

2017-07-24 Thread Junio C Hamano
Jonathan Nieder  writes:

> Yikes.  Yes, this bug looks problematic.  Thanks for working on it.
>
>> I improved the commit message laying out the current state of affairs,
>> arguing that any future plan should not weigh in as much as the current
>> possible data loss.
>
> Can you elaborate on what you mean about data loss?  At first glance
> it would seem to me that detaching HEAD could lead to data loss since
> there isn't a branch to keep track of the user's work.  Are you saying
> the current behavior of updating whatever branch HEAD is on (which,
> don't get me wrong, is a wrong behavior that needs fixing) bypassed
> the reflog?

Also, while I do agree with you that the problem exists, it is
unclear why this patch is a solution and not a hack that sweeps a
problem under the rug.  

It is unclear why this "silently detach HEAD without telling the
user" is a better solution than erroring out, for example [*1*].


[Footnote]

*1* For example, I would imagine that the problem can also be
"fixed" by detecting that the HEAD is on a branch, and noticing
that it will force rewinding of that branch if we did update-ref
in this codepath, and signal the failure to the caller of
submodule_move_head() without making the damage larger.  And
tell the user what is going on, and perhaps suggest to detach
HEAD to avoid clobbering their branch.

>
> Thanks, Jonathan
>
>> [1] https://public-inbox.org/git/20170630003851.17288-1-sbel...@google.com/
> [...]
>> --- a/submodule.c
>> +++ b/submodule.c
>> @@ -1653,7 +1653,8 @@ int submodule_move_head(const char *path,
>>  cp.dir = path;
>>  
>>  prepare_submodule_repo_env(_array);
>> -argv_array_pushl(, "update-ref", "HEAD", new, 
>> NULL);
>> +argv_array_pushl(, "update-ref", "HEAD",
>> + "--no-deref", new, NULL);
>>  
>>  if (run_command()) {
>>  ret = -1;


[PATCH] sub-process: refactor handshake to common function

2017-07-24 Thread Jonathan Tan
Refactor, into a common function, the version and capability negotiation
done when invoking a long-running process as a clean or smudge filter.
This will be useful for other Git code that needs to interact similarly
with a long-running process.

Signed-off-by: Jonathan Tan 
---
This will be useful for anyone implementing functionality like that in
[1].

It is unfortunate that the code grew larger because it had to be more
generic, but I think this is better than having (in the future) 2 or
more separate handshake functions.

I also don't think that the protocol should be permissive - I think
things would be simpler if all protocol errors were fatal errors. As it
is, most parts are permissive, but packet_read_line() already dies
anyway upon a malformed packet, so it may not be too drastic a change to
change this. For reference, the original protocol comes from [2].

[1] https://public-inbox.org/git/20170714132651.170708-2-benpe...@microsoft.com/
[2] edcc858 ("convert: add filter..process option", 2016-10-17)
---
 convert.c | 61 -
 pkt-line.c| 19 ---
 pkt-line.h|  2 --
 sub-process.c | 94 +++
 sub-process.h | 18 ++
 t/t0021-conversion.sh |  2 +-
 6 files changed, 120 insertions(+), 76 deletions(-)

diff --git a/convert.c b/convert.c
index deaf0ba7b..ec8ecc2ea 100644
--- a/convert.c
+++ b/convert.c
@@ -512,62 +512,15 @@ static struct hashmap subprocess_map;
 
 static int start_multi_file_filter_fn(struct subprocess_entry *subprocess)
 {
-   int err;
+   static int versions[] = {2, 0};
+   static struct subprocess_capability capabilities[] = {
+   {"clean", CAP_CLEAN}, {"smudge", CAP_SMUDGE}, {NULL, 0}
+   };
struct cmd2process *entry = (struct cmd2process *)subprocess;
-   struct string_list cap_list = STRING_LIST_INIT_NODUP;
-   char *cap_buf;
-   const char *cap_name;
-   struct child_process *process = >process;
-   const char *cmd = subprocess->cmd;
-
-   sigchain_push(SIGPIPE, SIG_IGN);
-
-   err = packet_writel(process->in, "git-filter-client", "version=2", 
NULL);
-   if (err)
-   goto done;
-
-   err = strcmp(packet_read_line(process->out, NULL), "git-filter-server");
-   if (err) {
-   error("external filter '%s' does not support filter protocol 
version 2", cmd);
-   goto done;
-   }
-   err = strcmp(packet_read_line(process->out, NULL), "version=2");
-   if (err)
-   goto done;
-   err = packet_read_line(process->out, NULL) != NULL;
-   if (err)
-   goto done;
-
-   err = packet_writel(process->in, "capability=clean", 
"capability=smudge", NULL);
-
-   for (;;) {
-   cap_buf = packet_read_line(process->out, NULL);
-   if (!cap_buf)
-   break;
-   string_list_split_in_place(_list, cap_buf, '=', 1);
-
-   if (cap_list.nr != 2 || strcmp(cap_list.items[0].string, 
"capability"))
-   continue;
-
-   cap_name = cap_list.items[1].string;
-   if (!strcmp(cap_name, "clean")) {
-   entry->supported_capabilities |= CAP_CLEAN;
-   } else if (!strcmp(cap_name, "smudge")) {
-   entry->supported_capabilities |= CAP_SMUDGE;
-   } else {
-   warning(
-   "external filter '%s' requested unsupported 
filter capability '%s'",
-   cmd, cap_name
-   );
-   }
-
-   string_list_clear(_list, 0);
-   }
-
-done:
-   sigchain_pop(SIGPIPE);
 
-   return err;
+   return subprocess_handshake(subprocess, "git-filter-", versions, NULL,
+   capabilities,
+   >supported_capabilities);
 }
 
 static int apply_multi_file_filter(const char *path, const char *src, size_t 
len,
diff --git a/pkt-line.c b/pkt-line.c
index 9d845ecc3..7db911957 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -171,25 +171,6 @@ int packet_write_fmt_gently(int fd, const char *fmt, ...)
return status;
 }
 
-int packet_writel(int fd, const char *line, ...)
-{
-   va_list args;
-   int err;
-   va_start(args, line);
-   for (;;) {
-   if (!line)
-   break;
-   if (strlen(line) > LARGE_PACKET_DATA_MAX)
-   return -1;
-   err = packet_write_fmt_gently(fd, "%s\n", line);
-   if (err)
-   return err;
-   line = va_arg(args, const char*);
-   }
-   va_end(args);
-   return packet_flush_gently(fd);
-}
-
 static int packet_write_gently(const int fd_out, const char *buf, size_t size)
 {
static char 

Re: change the filetype from binary to text after the file is commited to a git repo

2017-07-24 Thread tonka3...@gmail.com
Thx jeff, i will try it tomorrow.

> Am 24.07.2017 um 22:32 schrieb Jeff King :
> 
> On Mon, Jul 24, 2017 at 10:26:22PM +0200, tonka3...@gmail.com wrote:
> 
>>> I'm not sure exactly what you're trying to accomplish. If you're unhappy
>>> with the file as utf-16, then you should probably convert to utf-8 as a
>>> single commit (since the diff will otherwise be unreadable) and then
>>> make further changes in utf-8.
> 
>> That was exactly what i'm searching for. The utf-16 back in the days
>> was by accident (thx to visual studio). So if the last commit and the
>> acutal change are both utf-8 the diff should work again.  Just for my
>> understanding. Git just take the bytes of the whole file on every
>> commit, so there is no general problem with that, the size of the
>> utf-16 is just twice as big as the utf-8 one, is that correct?
> 
> Right. The diff switching the encodings will be listed as "binary" (and
> you should write a good commit message explaining what's going on!), but
> then after that the changes to the utf-8 version will display as normal
> text.  Git only looks at the actual bytes being diffed, not older
> versions of the file.
> 
> -Peff


Re: [PATCH] recursive submodules: detach HEAD from new state

2017-07-24 Thread Junio C Hamano
Stefan Beller  writes:

>>  Are you saying
>> the current behavior of updating whatever branch HEAD is on (which,
>> don't get me wrong, is a wrong behavior that needs fixing) bypassed
>> the reflog?
>
> No, I am not saying that.
> I am saying that updating an unrelated branch (which is dragged into
> the affair just because HEAD points at it) is very subtle thing, as any
> commits on that branch can be considered safe (it is on a branch, right?)
> but the detached HEAD is the default unsafe mode we currently have.

Then it is not a data loss as you claimed in the proposed log
message, but is something else, no?


[PATCHv2] blame: fix memory corruption scrambling revision name in error message

2017-07-24 Thread SZEDER Gábor
When attempting to blame a non-existing path, git should show an error
message like this:

  $ git blame e83c51633 -- nonexisting-file
  fatal: no such path nonexisting-file in e83c51633

Since the recent commit 835c49f7d (blame: rework methods that
determine 'final' commit, 2017-05-24) the revision name is either
missing or some scrambled characters are shown instead.  The reason is
that the revision name must be duplicated, because it is invalidated
when the pending objects array is cleared in the meantime, but this
commit dropped the duplication.

Restore the duplication of the revision name in the affected functions
(find_single_final() and find_single_initial()).

Signed-off-by: SZEDER Gábor 
---

Use xstrdup_or_null() in the first hunk.

 blame.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/blame.c b/blame.c
index 91e26e93e..f575e9cbf 100644
--- a/blame.c
+++ b/blame.c
@@ -1663,7 +1663,7 @@ static struct commit *find_single_final(struct rev_info 
*revs,
name = revs->pending.objects[i].name;
}
if (name_p)
-   *name_p = name;
+   *name_p = xstrdup_or_null(name);
return found;
 }
 
@@ -1735,7 +1735,7 @@ static struct commit *find_single_initial(struct rev_info 
*revs,
die("No commit to dig up from?");
 
if (name_p)
-   *name_p = name;
+   *name_p = xstrdup(name);
return found;
 }
 
@@ -1843,6 +1843,8 @@ void setup_scoreboard(struct blame_scoreboard *sb, const 
char *path, struct blam
 
if (orig)
*orig = o;
+
+   free((char *)final_commit_name);
 }
 
 
-- 
2.14.0.rc0.88.ge338f4246



Re: reftable [v3]: new ref storage format

2017-07-24 Thread Stefan Beller
On Sat, Jul 22, 2017 at 11:29 AM, Shawn Pearce  wrote:
> 3rd iteration of the reftable storage format.
>
> You can read a rendered version of this here:
> https://googlers.googlesource.com/sop/jgit/+/reftable/Documentation/technical/reftable.md
>
> Significant changes from v2:
> - efficient lookup by SHA-1 for allow-tip-sha1-in-want.

I'll focus on that in the review, it sounds exciting.

> - type 0x4 for FETCH_HEAD, MERGE_HEAD.

So we'd have
varint( (suffix_length << 3) | value_type )
which leaves us with 5 bits for the varint. One bit is needed
to indicate if ew have more bytes coming in the varint, so
only 4 bits in the first byte to encode the suffix length.
That means we can have suffixes up to 15 and we'd still be fine
with just one byte IIUC. That should be enough, specifically for
contiguous numbers as ref names.

> ### Block size
>
> The `block_size` is arbitrarily determined by the writer, and does not
> have to be a power of 2.  The block size must be larger than the
> longest reference name or deflated log entry used in the repository,
> as references cannot span blocks.
>
> Powers of two that are friendly to the virtual memory system or
> filesystem (such as 4k or 8k) are recommended.  Larger sizes (64k) can
> yield better compression, with a possible increased cost incurred by
> readers during access.
>
> The largest block size is `16777215` bytes (15.99 MiB).

Thanks for calling this out, 16MB ought to be enough for everyone.
(Not a joke, as we can have multiple blocks)

> ### Ref block format
...

> A variable number of 4-byte `restart_offset` values follow the
> records.  Offsets are relative to the start of the block and refer to
> the first byte of any `ref_record` whose name has not been prefix
> compressed.  Readers can start linear scans from any of these records.
> Offsets in the first block are relative to the start of the file
> (position 0), and include the file header.  This requires the first
> restart in the first block to be at offset 8.
>
> The 2-byte `restart_count_m1` stores *one less* than the number of
> entries in the `restart_offset` list.  There is always a restart
> corresponding to the first ref record. Readers are responsible for
> computing `restart_count = restart_count_m1 + 1`.

I had to reread these two paragraphs a couple of times as it calls out
the uninteresting things[1] and the interesting part is the logical
conclusion that one has to make themselves,
here is how I would write it:

Readers can start linear scans from any record whose name has
not been prefix compressed. The first record of a block must not
be prefix-compressed.

To aid finding the entry points for linear scans, a variable number
of 4-byte `restart_offset` values follow the records. Offsets are
relative to the start of the block and refer to the first byte of any
such `ref_record` that is not prefix compressed.
The first record can be omitted in the `restart_offset` values as it
is implicit.

The `restart_offset' values must be sorted (ascending,
descending?).

The 2-byte `restart_count_m1` stores the number of optional
entry points, i.e. the number of values in `restart_offset'.
As the first record is omitted in the offset table, there is at
least one entry at 0. the `restart_count_m1` can be zero for
best compression.

[1] The last paragraph in my proposal sounds less like an
off-by-one error to me, but just states what the number is and how
it relates to the surroundings in the file format.

...

> The `value` follows.  Its format is determined by `value_type`, one of
> the following:
>
> - `0x0`: deletion; no value data (see transactions, below)
> - `0x1`: one 20-byte object id; value of the ref
> - `0x2`: two 20-byte object ids; value of the ref, peeled target

Up to here it is easy to spot a pattern:
The number indicates how many oids follow.

> - `0x3`: symbolic reference: `varint( target_len ) target`
> - `0x4`: length delimited extension: `varint( data_len ) data`

This breaks the pattern.

Instead of hardcoding the numbers here, I wonder if we rather
want to make the bits more meaningful:

  bit 0, 1: number of oids iff bit 2 unset

  Iff bit 2 set, then we have a "varint (len) data"
  that follows, bits 0,1 are used for a different purpose,
  00 indicates 'symlink' and data is the string
  01 indicates 'multihead' such as FETCH_HEAD
  1* is reserved for now.

This *may* be neat micro optimization, but hardcoding all bits
to a lookup table is fine, too.

  Note that symrefs and multiheads could share the same
  type, iff we had dissallowed '\n' in refnames. (we do?
  otherwise FETCH_HEAD would be broken)
  The differentiator would be the '\n' or '\0' at the end of the
  first target.

> Types `0x5..0x7` are reserved.

The proposal above would make 0x3, 0x6, 0x7 be reserved.

> ### Ref index
...
>
> An index block should only be written if there are at least 4 blocks
> in the file, as cold 

Re: [PATCH] recursive submodules: detach HEAD from new state

2017-07-24 Thread Junio C Hamano
Junio C Hamano  writes:

> Also, while I do agree with you that the problem exists, it is
> unclear why this patch is a solution and not a hack that sweeps a
> problem under the rug.  
>
> It is unclear why this "silently detach HEAD without telling the
> user" is a better solution than erroring out, for example [*1*].

Just to avoid possible confusion; I am not claiming that it would be
more (or less for that matter) sensible to error out than silently
detaching HEAD, because I am not giving the reason to substantiate
the claim and I do not have a strong opinion to favour which one (or
another potential solution, if any).  

I am just saying that the patch that proposes a solution should be
backed with an explanation why it is a good idea, especially when
there are obvious alternatives that are not so clearly inferior.

Thanks.


Re: [PATCH 21/28] commit_packed_refs(): use a staging file separate from the lockfile

2017-07-24 Thread Jeff King
On Thu, Jul 20, 2017 at 11:01:45PM -0700, Junio C Hamano wrote:

> The general strategy we take for these atomic update of a file
> entity at $path is to:
> 
>  (1) try to create $path.lock exclusively; if we cannot, somebody
>  else holds the lock so fail (or backoff and retry, which
>  amounts to the same thing in the larger picture).
> 
>  (2) update $path.lock and close the fd.
> 
>  (3) rename $path.lock to $path atomically to unlock.
> 
> Would it be sufficent to tweak the above procedure with a new,
> zeroth step?
> 
>  (0) see $path is a symlink; if so, readlink it and assign the
>  result to $path.  Then go on to step (1) above.
> 
> I do not think such an enhancement would break atomicity guarantee
> we want from the locking.  When updating .git/packed-refs in an
> ancient new-workdir'ed directory, we would end up locking the
> corresponding file in the real repository, which is exactly what we
> want anyway.  As the basic assumption of having a symbolic link in
> the new-workdir'ed directory is that a symlink can stay the same
> forever while the linked target will be updated, I think this would
> be a reasonable short-term "fix".

This seems like the correct path to me. If the existing behavior is to
lock the referring symref, that seems like a violation of the lock
procedure in the first place. Because if "A" points to "B", we take
"A.lock" and then modify "B". But "B" may have any number of "A" links
pointing to it, eliminating the purpose of the lock.

I thought we already did this, though. And that modifying HEAD (which
might be a symlink) required LOCK_NO_DEREF.

-Peff


Re: [PATCH] objects: scope count variable to loop

2017-07-24 Thread Jeff King
On Mon, Jul 24, 2017 at 10:12:59AM -0700, Stefan Beller wrote:

> > Interestingly I have no problems compiling it here. I wonder if Stefan's
> > config.mak is supplying -std=c89 or some other restrictive flag. Or if
> > his compiler is a different version (though I tried with gcc-6, gcc-4.9,
> > and clang-3.8).
> 
> Before this patch, I only had
>   CFLAGS += -g -O0
> in config.mak (as I switched working directories recently), I'll throw in
>   DEVELOPER=1
> 
> My compiler version is ancient (gcc 4.8.4-2ubuntu1~14.04.3)
> apparently (why did I never check in this environment?)

Ah, indeed, it's the compiler version. And I actually screwed up my
gcc-4.9 test. It complains, too. It looks like the default for gcc
bumped from gnu90 to gnu11 in gcc 5.

-Peff


Re: [PATCH 21/28] commit_packed_refs(): use a staging file separate from the lockfile

2017-07-24 Thread Jonathan Nieder
Hi,

Jeff King wrote:

> This seems like the correct path to me. If the existing behavior is to
> lock the referring symref, that seems like a violation of the lock
> procedure in the first place. Because if "A" points to "B", we take
> "A.lock" and then modify "B". But "B" may have any number of "A" links
> pointing to it, eliminating the purpose of the lock.
>
> I thought we already did this, though. And that modifying HEAD (which
> might be a symlink) required LOCK_NO_DEREF.

Yes, I believe the lockfile API already does so.  Since this patch
creates a ".new" file, not using the lockfile API, it doesn't benefit
from that, though.

Thanks,
Jonathan


Re: [PATCH] git-contacts: Add recognition of Reported-by

2017-07-24 Thread Jeff King
On Fri, Jul 21, 2017 at 09:03:16AM -0700, Junio C Hamano wrote:

> Eric Blake  writes:
> 
> > You mean, something like
> >
> > git config --add contacts.autocc Reported-by
> > git config --add contacts.autocc Suggested-by
> >
> > where contacts.autocc would be a new multi-valued config option
> > specifying additional Tag: patterns to scrape out of the commit message?
> 
> Yes, something along that line, and you are correct to point out
> that I should have mentioned the need for command-line override.
> 
> In fact, if you anticipate that the primary use of this contributed
> script is as "send-email --cccmd", then we probably are better off
> doing this without any configuration variables, but just add the
> mechanism for command-line override of the hardcoded default.
> 
> I also should have mentioned the need for a way to say "remove all
> hardcoded default and start from scratch".

There's already some prior art around trailers in the trailer.* config.
I wonder if it would make sense to claim a new key there, like:

  git config trailer.Reported-by.autocc true

If "Reported-by" is a trailer that your project uses, then there may be
some benefit to setting up other config related to it, and this would
mesh nicely. And then potentially other programs besides git-contacts
would want to respect that flag (perhaps send-email would even want to
do it itself; I think it already respects cc and s-o-b headers).

-Peff


Re: [PATCH] objects: scope count variable to loop

2017-07-24 Thread Jeff King
On Wed, Jul 19, 2017 at 11:23:42AM -0700, Brandon Williams wrote:

> > object.c: In function ‘object_array_remove_duplicates’:
> > object.c:404:2: error: ‘for’ loop initial declarations are only allowed in 
> > C99 mode
> >   for (unsigned src = 0; src < nr; src++) {
> >   ^
> > object.c:404:2: note: use option -std=c99 or -std=gnu99 to compile your code
> > 
> > Using -std=c99 works for me.
> 
> This would need a change to the makefile then wouldn't it?

Actually, it complicates things even more, I'd think. We probably can't
just blindly add "-std=c99" to CFLAGS, as not all compilers would
support it (even if they _do_ support this construct).

Interestingly I have no problems compiling it here. I wonder if Stefan's
config.mak is supplying -std=c89 or some other restrictive flag. Or if
his compiler is a different version (though I tried with gcc-6, gcc-4.9,
and clang-3.8).

-Peff


[PATCH] blame: fix memory corruption scrambling revision name in error message

2017-07-24 Thread SZEDER Gábor
When attempting to blame a non-existing path, git should show an error
message like this:

  $ git blame e83c51633 -- nonexisting-file
  fatal: no such path nonexisting-file in e83c51633

Since the recent commit 835c49f7d (blame: rework methods that
determine 'final' commit, 2017-05-24) the revision name is either
missing or some scrambled characters are shown instead.  The reason is
that the revision name must be duplicated, because it is invalidated
when the pending objects array is cleared in the meantime, but this
commit dropped the duplication.

Restore the duplication of the revision name in the affected functions
(find_single_final() and find_single_initial()).

Signed-off-by: SZEDER Gábor 
---
 blame.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/blame.c b/blame.c
index 91e26e93e..9ee0a375f 100644
--- a/blame.c
+++ b/blame.c
@@ -1662,8 +1662,8 @@ static struct commit *find_single_final(struct rev_info 
*revs,
found = (struct commit *)obj;
name = revs->pending.objects[i].name;
}
-   if (name_p)
-   *name_p = name;
+   if (name_p && name)
+   *name_p = xstrdup(name);
return found;
 }
 
@@ -1735,7 +1735,7 @@ static struct commit *find_single_initial(struct rev_info 
*revs,
die("No commit to dig up from?");
 
if (name_p)
-   *name_p = name;
+   *name_p = xstrdup(name);
return found;
 }
 
@@ -1843,6 +1843,8 @@ void setup_scoreboard(struct blame_scoreboard *sb, const 
char *path, struct blam
 
if (orig)
*orig = o;
+
+   free((char *)final_commit_name);
 }
 
 
-- 
2.14.0.rc0.88.ge338f4246



Re: [PATCH 21/28] commit_packed_refs(): use a staging file separate from the lockfile

2017-07-24 Thread Jeff King
On Mon, Jul 24, 2017 at 10:09:15AM -0700, Jonathan Nieder wrote:

> Jeff King wrote:
> 
> > This seems like the correct path to me. If the existing behavior is to
> > lock the referring symref, that seems like a violation of the lock
> > procedure in the first place. Because if "A" points to "B", we take
> > "A.lock" and then modify "B". But "B" may have any number of "A" links
> > pointing to it, eliminating the purpose of the lock.
> >
> > I thought we already did this, though. And that modifying HEAD (which
> > might be a symlink) required LOCK_NO_DEREF.
> 
> Yes, I believe the lockfile API already does so.  Since this patch
> creates a ".new" file, not using the lockfile API, it doesn't benefit
> from that, though.

Ah, I see. This bug makes much more sense, then. And I agree doubly that
matching the lock-code's behavior is the right thing to do.

-Peff


Re: change the filetype from binary to text after the file is commited to a git repo

2017-07-24 Thread Jeff King
On Mon, Jul 24, 2017 at 07:11:06AM +0200, tonka tonka wrote:

> I have a problem with an already committed file into my repo. This git
> repo was converted from svn to git some years ago. Last week I have
> change some lines in a file and I saw in the diff that it is marked as
> binary (it's a simple .cpp file). I think on the first commit it was
> detected as an utf-16 file (on windows). But no matter what I do I
> can't get it back to a "normal text" text file (git does not detect
> that), but I is now only utf-8. I also replace the whole content of
> the file with just 'a' and git say it's binary.

Git doesn't store a flag for "binary-ness" on each file (though see
below). As the diffs are generated on the fly when you ask to compare
two versions, so too is the determination of "is this binary".

The default heuristic looks at file size (by default, if the file is
over 500MB it's considered binary) and whether it has any zero-byte
characters in the first few kilobytes. But note that if _either_ side of
a diff is considered binary, then Git won't show a text diff.

If you want a particular diff to show all content, even if it doesn't
look like text, add "-a" to your git invocation (e.g., "git show -a").

That said, you can also use .gitattributes (see "git help attributes")
to mark a file as binary or not-binary, skipping the heuristic check.
I'm guessing since you converted from svn that you don't have a
.gitattributes file, but it's possible that somebody later added one
that marks the file as binary (and so the solution would be to drop that
entry).

-Peff


Re: Handling of paths

2017-07-24 Thread Jeff King
On Fri, Jul 21, 2017 at 08:15:17AM -0700, Junio C Hamano wrote:

> In general, I (and other experienced reviewers here) prefer to give
> chances to people who are new to the Git development community and
> are inclined to do so to scratch their own itch, by giving analysis
> of the problem and a suggested route to solve it, but without giving
> the final solution in a patch form.  After all, many developers
> (including me) started from small changes before getting involved
> more deeply to the project and starting to play more important
> roles.

This is a good point, and I should remember to do it more, too.
It's often faster to do a small patch yourself than to help walk a
first-timer through it, but keeping the community healthy is an
important step.

At any rate, your patch to use config_pathname() looks like the right
thing to me.

> Having said all that, I suspect that your original problem
> description might point at another thing we may want to look into.
> 
> The patch under discussion may have solved the "~[username]/" prefix
> issue, but I offhand am not sure if a path-like variable that holds
> a relative path behaves sensibly when they appear in configuration
> files and in a file that has configuration snippets that is included
> with the "[include] path=..." thing, and if there is a need to clarify
> and/or update the rules.

The "[include]path" behavior is intentional and documented: it takes the
path relative to the including file. I think that would be a reasonable
behavior for path-like variables in general (and a path-like variable in
an included file would be relative to that included file; this should
Just Work because the include mechanism keeps a stack of files).

I could probably sketch out a patch, but per the above discussion I'll
leave it for now. Also, I'm lazy. ;)

-Peff


Re: [PATCH] objects: scope count variable to loop

2017-07-24 Thread Stefan Beller
On Mon, Jul 24, 2017 at 10:08 AM, Jeff King  wrote:
> On Wed, Jul 19, 2017 at 11:23:42AM -0700, Brandon Williams wrote:
>
>> > object.c: In function ‘object_array_remove_duplicates’:
>> > object.c:404:2: error: ‘for’ loop initial declarations are only allowed in 
>> > C99 mode
>> >   for (unsigned src = 0; src < nr; src++) {
>> >   ^
>> > object.c:404:2: note: use option -std=c99 or -std=gnu99 to compile your 
>> > code
>> >
>> > Using -std=c99 works for me.
>>
>> This would need a change to the makefile then wouldn't it?
>
> Actually, it complicates things even more, I'd think. We probably can't
> just blindly add "-std=c99" to CFLAGS, as not all compilers would
> support it (even if they _do_ support this construct).
>
> Interestingly I have no problems compiling it here. I wonder if Stefan's
> config.mak is supplying -std=c89 or some other restrictive flag. Or if
> his compiler is a different version (though I tried with gcc-6, gcc-4.9,
> and clang-3.8).

Before this patch, I only had
  CFLAGS += -g -O0
in config.mak (as I switched working directories recently), I'll throw in
  DEVELOPER=1

My compiler version is ancient (gcc 4.8.4-2ubuntu1~14.04.3)
apparently (why did I never check in this environment?)


[PATCH] recursive submodules: detach HEAD from new state

2017-07-24 Thread Stefan Beller
When a submodule is on a branch and in its superproject you run a
recursive checkout, the branch of the submodule is updated to what the
superproject checks out. This is very unexpected in the current model of
Git as e.g. 'submodule update' always detaches the submodule HEAD.

Despite having plans to have submodule HEADS not detached in the future,
the current behavior is really bad as it doesn't match user expectations
and it is not checking for loss of commits (only to be recovered via the
reflog).

Detach the HEAD unconditionally in the submodule when updating it.

Signed-off-by: Stefan Beller 
---

This is a resend of [1], which did not receive any attention.
I improved the commit message laying out the current state of affairs,
arguing that any future plan should not weigh in as much as the current
possible data loss.

[1] https://public-inbox.org/git/20170630003851.17288-1-sbel...@google.com/

Thanks,
Stefan

 submodule.c   |  3 ++-
 t/lib-submodule-update.sh | 17 +
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/submodule.c b/submodule.c
index ef83c2a9ee..4b7c0a4c82 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1653,7 +1653,8 @@ int submodule_move_head(const char *path,
cp.dir = path;
 
prepare_submodule_repo_env(_array);
-   argv_array_pushl(, "update-ref", "HEAD", new, 
NULL);
+   argv_array_pushl(, "update-ref", "HEAD",
+"--no-deref", new, NULL);
 
if (run_command()) {
ret = -1;
diff --git a/t/lib-submodule-update.sh b/t/lib-submodule-update.sh
index 2d26f86800..fc406b95d7 100755
--- a/t/lib-submodule-update.sh
+++ b/t/lib-submodule-update.sh
@@ -848,6 +848,23 @@ test_submodule_switch_recursing_with_args () {
test_submodule_content sub1 origin/add_sub1
)
'
+   test_expect_success "$command: submodule branch is not changed, detach 
HEAD instead" '
+   prolog &&
+   reset_work_tree_to_interested add_sub1 &&
+   (
+   cd submodule_update &&
+   git -C sub1 checkout -b keep_branch &&
+   git -C sub1 rev-parse HEAD >expect &&
+   git branch -t check-keep origin/modify_sub1 &&
+   $command check-keep &&
+   test_superproject_content origin/modify_sub1 &&
+   test_submodule_content sub1 origin/modify_sub1 &&
+   git -C sub1 rev-parse keep_branch >actual &&
+   test_cmp expect actual &&
+   test_must_fail git -C sub1 symbolic-ref HEAD
+   )
+   '
+
# Replacing a tracked file with a submodule produces a checked out 
submodule
test_expect_success "$command: replace tracked file with submodule 
checks out submodule" '
prolog &&
-- 
2.14.0.rc0.3.g6c2e499285



Re: [PATCH] recursive submodules: detach HEAD from new state

2017-07-24 Thread Jonathan Nieder
Hi,

Stefan Beller wrote:

> When a submodule is on a branch and in its superproject you run a
> recursive checkout, the branch of the submodule is updated to what the
> superproject checks out. This is very unexpected in the current model of
> Git as e.g. 'submodule update' always detaches the submodule HEAD.
>
> Despite having plans to have submodule HEADS not detached in the future,
> the current behavior is really bad as it doesn't match user expectations
> and it is not checking for loss of commits (only to be recovered via the
> reflog).

I think the corrected behavior doesn't match user expectations,
either.

Could this patch include some documentation to help users know what to
expect?

> Detach the HEAD unconditionally in the submodule when updating it.
>
> Signed-off-by: Stefan Beller 
> ---
> This is a resend of [1], which did not receive any attention.

Yikes.  Yes, this bug looks problematic.  Thanks for working on it.

> I improved the commit message laying out the current state of affairs,
> arguing that any future plan should not weigh in as much as the current
> possible data loss.

Can you elaborate on what you mean about data loss?  At first glance
it would seem to me that detaching HEAD could lead to data loss since
there isn't a branch to keep track of the user's work.  Are you saying
the current behavior of updating whatever branch HEAD is on (which,
don't get me wrong, is a wrong behavior that needs fixing) bypassed
the reflog?

Thanks,
Jonathan

> [1] https://public-inbox.org/git/20170630003851.17288-1-sbel...@google.com/
[...]
> --- a/submodule.c
> +++ b/submodule.c
> @@ -1653,7 +1653,8 @@ int submodule_move_head(const char *path,
>   cp.dir = path;
>  
>   prepare_submodule_repo_env(_array);
> - argv_array_pushl(, "update-ref", "HEAD", new, 
> NULL);
> + argv_array_pushl(, "update-ref", "HEAD",
> +  "--no-deref", new, NULL);
>  
>   if (run_command()) {
>   ret = -1;


Re: Should I store large text files on Git LFS?

2017-07-24 Thread Jeff King
On Mon, Jul 24, 2017 at 02:58:38PM +1000, Andrew Ardill wrote:

> On 24 July 2017 at 13:45, Farshid Zavareh  wrote:
> > I'll probably test this myself, but would modifying and committing a 4GB
> > text file actually add 4GB to the repository's size? I anticipate that it
> > won't, since Git keeps track of the changes only, instead of storing a copy
> > of the whole file (whereas this is not the case with binary files, hence the
> > need for LFS).
> 
> I decided to do a little test myself. I add three versions of the same
> data set (sometimes slightly different cuts of the parent data set,
> which I don't have) each between 2 and 4GB in size.
> Each time I added a new version it added ~500MB to the repository, and
> operations on the repository took 35-45 seconds to complete.
> Running `git gc` compressed the objects fairly well, saving ~400MB of
> space. I would imagine that even more space would be saved
> (proportionally) if there were a lot more similar files in the repo.

Did you tweak core.bigfilethreshold? Git won't actually try to find
deltas on files larger than that (500MB by default). So you might be
seeing just the effects of zlib compression, and not deltas.

You can always check the delta status after a gc by running:

  git rev-list --objects --all |
  git cat-file --batch-check='%(objectsize:disk) %(objectsize) %(deltabase) 
%(rest)'

That should give you a sense of how much you're saving due to zlib (by
comparing the first two numbers for a copy that isn't a delta; i.e.,
with an all-zeros delta base) and how much due to deltas (how much
smaller the first number is for an entry that _is_ a delta).

-Peff


Re: [GSoC][PATCH 13/13] submodule: port submodule subcommand 'foreach' from shell to C

2017-07-24 Thread Brandon Williams
On 07/25, Prathamesh Chavan wrote:
> This aims to make git-submodule foreach a builtin. This is the very
> first step taken in this direction. Hence, 'foreach' is ported to
> submodule--helper, and submodule--helper is called from git-submodule.sh.
> The code is split up to have one function to obtain all the list of
> submodules. This function acts as the front-end of git-submodule foreach
> subcommand. It calls the function for_each_submodule_list, which basically
> loops through the list and calls function fn, which in this case is
> runcommand_in_submodule. This third function is a calling function that
> takes care of running the command in that submodule, and recursively
> perform the same when --recursive is flagged.
> 
> The first function module_foreach first parses the options present in
> argv, and then with the help of module_list_compute, generates the list of
> submodules present in the current working tree.
> 
> The second function for_each_submodule_list traverses through the
> list, and calls function fn (which in case of submodule subcommand
> foreach is runcommand_in_submodule) is called for each entry.
> 
> The third function runcommand_in_submodule, generates a submodule struct sub
> for $name, value and then later prepends name=sub->name; and other
> value assignment to the env argv_array structure of a child_process.
> Also the  of submodule-foreach is push to args argv_array
> structure and finally, using run_command the commands are executed
> using a shell.
> 
> The third function also takes care of the recursive flag, by creating
> a separate child_process structure and prepending "--super-prefix 
> displaypath",
> to the args argv_array structure. Other required arguments and the
> input  of submodule-foreach is also appended to this argv_array.
> 
> Helped-by: Brandon Williams 
> Mentored-by: Christian Couder 
> Mentored-by: Stefan Beller 
> Signed-off-by: Prathamesh Chavan 
> ---
>  builtin/submodule--helper.c | 129 
> 
>  git-submodule.sh|  39 +-
>  2 files changed, 130 insertions(+), 38 deletions(-)
> 
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index 94d6254f0..be278bf8d 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -765,6 +765,134 @@ static int module_name(int argc, const char **argv, 
> const char *prefix)
>   return 0;
>  }
>  
> +struct cb_foreach {
> + int argc;
> + const char **argv;
> + const char *prefix;
> + unsigned int quiet: 1;
> + unsigned int recursive: 1;
> +};
> +#define CB_FOREACH_INIT { 0, NULL, NULL, 0, 0 }
> +
> +static void runcommand_in_submodule(const struct cache_entry *list_item,
> + void *cb_data)
> +{
> + struct cb_foreach *info = cb_data;
> + const struct submodule *sub;
> + struct child_process cp = CHILD_PROCESS_INIT;
> + char *displaypath;
> +
> + displaypath = get_submodule_displaypath(list_item->name, info->prefix);
> +
> + sub = submodule_from_path(null_sha1, list_item->name);
> +
> + if (!sub)
> + die(_("No url found for submodule path '%s' in .gitmodules"),
> +   displaypath);
> +
> + if (!is_submodule_populated_gently(list_item->name, NULL))
> + goto cleanup;
> +
> + prepare_submodule_repo_env(_array);
> + /* For the purpose of executing  in the submodule,
> +  * separate shell is used for the purpose of running the
> +  * child process.
> +  */

comment style

> + cp.use_shell = 1;
> + cp.dir = list_item->name;
> +
> + if (info->argc == 1) {

Why are you only exposing these variables if argc == 1?

> + char *toplevel = xgetcwd();
> +
> + argv_array_pushf(_array, "name=%s", sub->name);
> + argv_array_pushf(_array, "sm_path=%s", list_item->name);
> + argv_array_pushf(_array, "displaypath=%s", displaypath);
> + argv_array_pushf(_array, "sha1=%s",
> +  oid_to_hex(_item->oid));
> + argv_array_pushf(_array, "toplevel=%s", toplevel);
> +
> + /*
> +  * Since still the path variable was accessible from the
> +  * script before porting, it is also made available.
> +  */
> + argv_array_pushf(, "path=%s; %s",
> +  list_item->name, info->argv[0]);

This bit looks odd. Why are you appending argv[0] after a semicolon?
Oh...its to handle the funny path stuff.  I'd add a comment indicating
why you have to expose path  via the args argv_array and not the
env_array.

> + free(toplevel);
> + } else {
> + argv_array_pushv(, info->argv);
> + }
> +
> + if (!info->quiet)
> + printf(_("Entering '%s'\n"), displaypath);
> +
> + if (info->argv[0] && 

Re: [GSoC][PATCH 09/13] submodule foreach: correct '$path' in nested submodules from a subdirectory

2017-07-24 Thread Brandon Williams
On 07/25, Prathamesh Chavan wrote:
> When running 'git submodule foreach' from a subdirectory of your
> repository, nested submodules get a bogus value for $sm_path:
> For a submodule 'sub' that contains a nested submodule 'nested',
> running 'git -C dir submodule foreach echo $path' would report
> path='../nested' for the nested submodule. The first part '../' is
> derived from the logic computing the relative path from $pwd to the
> root of the superproject. The second part is the submodule path inside
> the submodule. This value is of little use and is hard to document.
> 
> There are two different possible solutions that have more value:
> (a) The path value is documented as the path from the toplevel of the
> superproject to the mount point of the submodule.
> In this case we would want to have path='sub/nested'.
> 
> (b) As Ramsay noticed the documented value is wrong. For the non-nested
> case the path is equal to the relative path from $pwd to the
> submodules working directory. When following this model,
> the expected value would be path='../sub/nested'.
> 
> The behavior for (b) was introduced in 091a6eb0fe (submodule: drop the
> top-level requirement, 2013-06-16) the intent for $path seemed to be
> relative to $cwd to the submodule worktree, but that did not work for
> nested submodules, as the intermittent submodules were not included in
> the path.
> 
> If we were to fix the meaning of the $path using (a) such that "path"
> is "the path from the toplevel of the superproject to the mount point
> of the submodule", we would break any existing submodule user that runs
> foreach from non-root of the superproject as the non-nested submodule
> '../sub' would change its path to 'sub'.
> 
> If we would fix the meaning of the $path using (b), such that "path"
> is "the relative path from $pwd to the submodule", then we would break
> any user that uses nested submodules (even from the root directory) as
> the 'nested' would become 'sub/nested'.
> 
> Both groups can be found in the wild.  The author has no data if one group
> outweighs the other by large margin, and offending each one seems equally
> bad at first.  However in the authors imagination it is better to go with
> (a) as running from a sub directory sounds like it is carried out
> by a human rather than by some automation task.  With a human on
> the keyboard the feedback loop is short and the changed behavior can be
> adapted to quickly unlike some automation that can break silently.

Great explanation, and I agree with going with choice (a).

> 
> Discussed-with: Ramsay Jones 
> Signed-off-by: Prathamesh Chavan 
> Signed-off-by: Stefan Beller 
> ---
>  git-submodule.sh |  1 -
>  t/t7407-submodule-foreach.sh | 36 ++--
>  2 files changed, 34 insertions(+), 3 deletions(-)
> 
> diff --git a/git-submodule.sh b/git-submodule.sh
> index a427ddafd..493a64372 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -320,7 +320,6 @@ cmd_foreach()
>   prefix="$prefix$sm_path/"
>   sanitize_submodule_env
>   cd "$sm_path" &&
> - sm_path=$(git submodule--helper relative-path 
> "$sm_path" "$wt_prefix") &&
>   # we make $path available to scripts ...
>   path=$sm_path &&
>   if test $# -eq 1
> diff --git a/t/t7407-submodule-foreach.sh b/t/t7407-submodule-foreach.sh
> index 6ba5daf42..0663622a4 100755
> --- a/t/t7407-submodule-foreach.sh
> +++ b/t/t7407-submodule-foreach.sh
> @@ -82,9 +82,9 @@ test_expect_success 'test basic "submodule foreach" usage' '
>  
>  cat >expect <  Entering '../sub1'
> -$pwd/clone-foo1-../sub1-$sub1sha1
> +$pwd/clone-foo1-sub1-$sub1sha1
>  Entering '../sub3'
> -$pwd/clone-foo3-../sub3-$sub3sha1
> +$pwd/clone-foo3-sub3-$sub3sha1
>  EOF
>  
>  test_expect_success 'test "submodule foreach" from subdirectory' '
> @@ -196,6 +196,38 @@ test_expect_success 'test messages from "foreach 
> --recursive" from subdirectory'
>   ) &&
>   test_i18ncmp expect actual
>  '
> +sub1sha1=$(cd clone2/sub1 && git rev-parse HEAD)
> +sub2sha1=$(cd clone2/sub2 && git rev-parse HEAD)
> +sub3sha1=$(cd clone2/sub3 && git rev-parse HEAD)
> +nested1sha1=$(cd clone2/nested1 && git rev-parse HEAD)
> +nested2sha1=$(cd clone2/nested1/nested2 && git rev-parse HEAD)
> +nested3sha1=$(cd clone2/nested1/nested2/nested3 && git rev-parse HEAD)
> +submodulesha1=$(cd clone2/nested1/nested2/nested3/submodule && git rev-parse 
> HEAD)
> +
> +cat >expect < +Entering '../nested1'
> +$pwd/clone2-nested1-nested1-$nested1sha1
> +Entering '../nested1/nested2'
> +$pwd/clone2/nested1-nested2-nested2-$nested2sha1
> +Entering '../nested1/nested2/nested3'
> +$pwd/clone2/nested1/nested2-nested3-nested3-$nested3sha1
> +Entering 

Re: [GSoC][PATCH 10/13] submodule foreach: document '$sm_path' instead of '$path'

2017-07-24 Thread Brandon Williams
On 07/25, Prathamesh Chavan wrote:
> As using a variable '$path' may be harmful to users due to
> capitalization issues, see 64394e3ae9 (git-submodule.sh: Don't
> use $path variable in eval_gettext string, 2012-04-17). Adjust
> the documentation to advocate for using $sm_path,  which contains
> the same value. We still make the 'path' variable available and
> document it as a deprecated synonym of 'sm_path'.

I assume then at some point we would want to drop support for 'path' via
a normal deprecation cycle (whatever that might be, 6 months or a year
or more).

> 
> Discussed-with: Ramsay Jones 
> Signed-off-by: Stefan Beller 
> Signed-off-by: Prathamesh Chavan 
> ---
>  Documentation/git-submodule.txt | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
> index ff612001d..a23baef62 100644
> --- a/Documentation/git-submodule.txt
> +++ b/Documentation/git-submodule.txt
> @@ -183,12 +183,14 @@ information too.
>  
>  foreach [--recursive] ::
>   Evaluates an arbitrary shell command in each checked out submodule.
> - The command has access to the variables $name, $path, $sha1 and
> + The command has access to the variables $name, $sm_path, $sha1 and
>   $toplevel:
>   $name is the name of the relevant submodule section in `.gitmodules`,
> - $path is the name of the submodule directory relative to the
> - superproject, $sha1 is the commit as recorded in the superproject,
> - and $toplevel is the absolute path to the top-level of the superproject.
> + $sm_path is the path of the submodule as recorded in the superproject,
> + $sha1 is the commit as recorded in the superproject, and
> + $toplevel is the absolute path to the top-level of the superproject.
> + Note that to avoid conflicts with '$PATH' on Windows, the '$path'
> + variable is now a deprecated synonym of '$sm_path' variable.
>   Any submodules defined in the superproject but not checked out are
>   ignored by this command. Unless given `--quiet`, foreach prints the name
>   of each submodule before evaluating the command.
> -- 
> 2.13.0
> 

-- 
Brandon Williams


Re: [GSoC][PATCH 08/13] submodule: port submodule subcommand 'summary' from shell to C

2017-07-24 Thread Brandon Williams
On 07/25, Prathamesh Chavan wrote:
> The submodule subcommand 'summary' is ported in the process of
> making git-submodule a builtin. The function cmd_summary() from
> git-submodule.sh is ported to functions module_summary(),
> compute_summary_module_list(), prepare_submodule_summary() and
> print_submodule_summary().
> 
> The first function module_summary() parses the options of submodule
> subcommand and also acts as the front-end of this subcommand.
> After parsing them, it calls the compute_summary_module_list()
> 
> The functions compute_summary_module_list() runs the diff_cmd,
> and generates the modules list, as required by the subcommand.
> The generation of this module list is done by the using the
> callback function submodule_summary_callback(), and stored in the
> structure module_cb.
> 
> Once the module list is generated, prepare_submodule_summary()
> further goes through the list and filters the list, for
> eventually calling the print_submodule_summary() function.
> 
> Finally, the print_submodule_summary() takes care of generating
> and printing the summary for each submodule.
> 
> Mentored-by: Christian Couder 
> Mentored-by: Stefan Beller 
> Signed-off-by: Prathamesh Chavan 

This is a big one, hopefully I can try to understand everything that is
happening.

> ---
> In this new version of patch, following changes were made:
> * the way of generating sub_sha1_src and sub_sha1_dst (abbrev of sha1_src
>   and sha1_dst resp.) were changed. Since there was no direct way of
>   abbrevating a string(sha1_dst), in this patch sha1_dst was converted first
>   to an object id (converting to sha1 was avoided) and then abbrevated using
>   find_unique_abbrev().
> * A few big if() statements were reduced.
> * for reducing the two big if() statements, a new function
>   verify_submodule_object_name() was introduced.
> * this new version also corrects a few other nits.
> 
>  builtin/submodule--helper.c | 428 
> 
>  git-submodule.sh| 182 +--
>  2 files changed, 429 insertions(+), 181 deletions(-)
> 
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index 5e84fc42d..94d6254f0 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -13,6 +13,9 @@
>  #include "remote.h"
>  #include "refs.h"
>  #include "connect.h"
> +#include "revision.h"
> +#include "diffcore.h"
> +#include "diff.h"
>  
>  typedef void (*submodule_list_func_t)(const struct cache_entry *list_item,
> void *cb_data);
> @@ -762,6 +765,430 @@ static int module_name(int argc, const char **argv, 
> const char *prefix)
>   return 0;
>  }
>  
> +struct module_cb {
> + unsigned int mod_src;
> + unsigned int mod_dst;
> + struct object_id oid_src;
> + struct object_id oid_dst;
> + char status;
> + const char *sm_path;
> +};
> +#define MODULE_CB_INIT { 0, 0, NULL, NULL, '\0', NULL }
> +
> +struct module_cb_list {
> + struct module_cb **entries;
> + int alloc, nr;
> +};
> +#define MODULE_CB_LIST_INIT { NULL, 0, 0 }
> +
> +struct summary_cb {
> + int argc;
> + const char **argv;
> + const char *prefix;
> + char *diff_cmd;
> + unsigned int cached: 1;
> + unsigned int for_status: 1;
> + unsigned int quiet: 1;
> + unsigned int files: 1;
> + int summary_limits;
> +};
> +#define SUMMARY_CB_INIT { 0, NULL, NULL, NULL, 0, 0, 0, 0, 0 }
> +
> +static int verify_submodule_object_name(const char *sm_path, const char 
> *sha1)
> +{
> + struct child_process cp_rev_parse = CHILD_PROCESS_INIT;
> +
> + cp_rev_parse.git_cmd = 1;
> + cp_rev_parse.no_stdout = 1;
> + cp_rev_parse.dir = sm_path;
> + prepare_submodule_repo_env(_rev_parse.env_array);
> +
> + argv_array_pushl(_rev_parse.args, "rev-parse", "-q",
> +  "--verify", NULL);
> + argv_array_pushf(_rev_parse.args, "%s^0", sha1);
> +
> + if (run_command(_rev_parse))
> + return 1;
> +
> + return 0;
> +}
> +
> +static void print_submodule_summary(struct summary_cb *info,
> + struct module_cb *p)
> +{

This is one large function so it may be difficult to keep track of
everything and I don't know how easy it would be to split.

> + int missing_src = 0;
> + int missing_dst = 0;
> + char *displaypath;
> + const char *sha1_abbr_src;
> + const char *sha1_abbr_dst;
> + struct object_id oid_dst;
> + int errmsg = 0;
> + int total_commits = -1;
> + const char *sha1_dst = oid_to_hex(>oid_dst);
> + const char *sha1_src = oid_to_hex(>oid_src);

You have these two variables which are defined as 'const char *' yet you
allocate memory for them at different points via 'xstrdup' and then
never free it.  Really what you should be trying to do is use 'struct
object_id' as much as you can instead of storing the 

Re: [GSoC][PATCH 12/13] submodule foreach: document variable '$displaypath'

2017-07-24 Thread Brandon Williams
On 07/25, Prathamesh Chavan wrote:
> It was observer that the variable '$displaypath' was accessible but

s/observer/observed 

> undocumented. Hence, document it.
> 
> Discussed-with: Ramsay Jones 
> Signed-off-by: Stefan Beller 
> Signed-off-by: Prathamesh Chavan 
> ---
>  Documentation/git-submodule.txt |  6 --
>  t/t7407-submodule-foreach.sh| 22 +++---
>  2 files changed, 15 insertions(+), 13 deletions(-)
> 
> diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
> index 8e7930ebc..0cca702cb 100644
> --- a/Documentation/git-submodule.txt
> +++ b/Documentation/git-submodule.txt
> @@ -183,10 +183,12 @@ information too.
>  
>  foreach [--recursive] ::
>   Evaluates an arbitrary shell command in each checked out submodule.
> - The command has access to the variables $name, $sm_path, $sha1 and
> - $toplevel:
> + The command has access to the variables $name, $sm_path, $displaypath,
> + $sha1 and $toplevel:
>   $name is the name of the relevant submodule section in `.gitmodules`,
>   $sm_path is the path of the submodule as recorded in the superproject,
> + $displaypath contains the relative path from the current working
> + directory to the submodules root directory,
>   $sha1 is the commit as recorded in the superproject, and
>   $toplevel is the absolute path to its superproject, such that
>   $toplevel/$sm_path is the absolute path of the submodule.
> diff --git a/t/t7407-submodule-foreach.sh b/t/t7407-submodule-foreach.sh
> index 0663622a4..6ad57e061 100755
> --- a/t/t7407-submodule-foreach.sh
> +++ b/t/t7407-submodule-foreach.sh
> @@ -82,16 +82,16 @@ test_expect_success 'test basic "submodule foreach" 
> usage' '
>  
>  cat >expect <  Entering '../sub1'
> -$pwd/clone-foo1-sub1-$sub1sha1
> +$pwd/clone-foo1-sub1-../sub1-$sub1sha1
>  Entering '../sub3'
> -$pwd/clone-foo3-sub3-$sub3sha1
> +$pwd/clone-foo3-sub3-../sub3-$sub3sha1
>  EOF
>  
>  test_expect_success 'test "submodule foreach" from subdirectory' '
>   mkdir clone/sub &&
>   (
>   cd clone/sub &&
> - git submodule foreach "echo \$toplevel-\$name-\$sm_path-\$sha1" 
> >../../actual
> + git submodule foreach "echo 
> \$toplevel-\$name-\$sm_path-\$displaypath-\$sha1" >../../actual
>   ) &&
>   test_i18ncmp expect actual
>  '
> @@ -206,25 +206,25 @@ submodulesha1=$(cd 
> clone2/nested1/nested2/nested3/submodule && git rev-parse HEA
>  
>  cat >expect <  Entering '../nested1'
> -$pwd/clone2-nested1-nested1-$nested1sha1
> +$pwd/clone2-nested1-nested1-../nested1-$nested1sha1
>  Entering '../nested1/nested2'
> -$pwd/clone2/nested1-nested2-nested2-$nested2sha1
> +$pwd/clone2/nested1-nested2-nested2-../nested1/nested2-$nested2sha1
>  Entering '../nested1/nested2/nested3'
> -$pwd/clone2/nested1/nested2-nested3-nested3-$nested3sha1
> +$pwd/clone2/nested1/nested2-nested3-nested3-../nested1/nested2/nested3-$nested3sha1
>  Entering '../nested1/nested2/nested3/submodule'
> -$pwd/clone2/nested1/nested2/nested3-submodule-submodule-$submodulesha1
> +$pwd/clone2/nested1/nested2/nested3-submodule-submodule-../nested1/nested2/nested3/submodule-$submodulesha1
>  Entering '../sub1'
> -$pwd/clone2-foo1-sub1-$sub1sha1
> +$pwd/clone2-foo1-sub1-../sub1-$sub1sha1
>  Entering '../sub2'
> -$pwd/clone2-foo2-sub2-$sub2sha1
> +$pwd/clone2-foo2-sub2-../sub2-$sub2sha1
>  Entering '../sub3'
> -$pwd/clone2-foo3-sub3-$sub3sha1
> +$pwd/clone2-foo3-sub3-../sub3-$sub3sha1
>  EOF
>  
>  test_expect_success 'test "submodule foreach --recursive" from subdirectory' 
> '
>   (
>   cd clone2/untracked &&
> - git submodule foreach --recursive "echo 
> \$toplevel-\$name-\$sm_path-\$sha1" >../../actual
> + git submodule foreach --recursive "echo 
> \$toplevel-\$name-\$sm_path-\$displaypath-\$sha1" >../../actual
>   ) &&
>   test_i18ncmp expect actual
>  '
> -- 
> 2.13.0
> 

-- 
Brandon Williams


Re: [PATCH v3] merge: add a --signoff flag.

2017-07-24 Thread Stefan Beller
On Mon, Jul 24, 2017 at 10:00 PM, lukaszgryglicki  wrote:
> I can't reach Him, every time I send an email it is returned "no such 
> address".
> Can You please ask Him to take a look?

Junio referred to
https://public-inbox.org/git/87fueferd4@gmail.com/
His actual email address is in the cc list.



>> On 24 Jul 2017, at 22:42, Junio C Hamano  wrote:
>>
>> Junio C Hamano  writes:
>>
>>> lukaszgryglicki  writes:
>>>
 Hi, what is the state of this patch?
>>>
>>> I was expecting you to respond to Ævar's <87tw2sbnyl@gmail.com>
>>> to explain how your new version addresses his concerns, or him to
>>> respond to your new patch to say that it addresses his concerns
>>> adequately.  I think neither has happened, so the topic is still in
>>> limbo, I guess...
>>
>> Sorry, Ævar's message I meant was <87fueferd4@gmail.com>.
>


Re: Request for git merge --signoff

2017-07-24 Thread Christian Couder
On Sat, Jul 1, 2017 at 5:24 PM, Dan Kohn  wrote:
> https://github.com/coreinfrastructure/best-practices-badge is a user
> of the https://github.com/probot/dco bot which checks that commits
> have a signoff. The issue is that there is no `--signoff` option in
> git for merge commits, which is a standard part of our workflow with
> feature branches. Here is a workflow where we currently get stuck:

[...]

> Or, I could manually add the Signoff line to the proposed git merge
> commit message, which would allow me to skip the `--amend` step.

Perhaps you could use a prepare-commit-msg hook (or maybe a commit-msg
hook) to automatically add your Signoff line to any commit message if
it isn't there already.

In the prepare-commit-msg hook sample there is already commented out
code to do that:

https://github.com/git/git/blob/master/templates/hooks--prepare-commit-msg.sample#L35-L36

Alternatively you might want to use `git interpret-trailers` to do
that or more fancy trailer related things.

> Could you please add a `--signoff` option to `git merge`?

I am not opposed to add a `--signoff` option to `git merge`, but I
think the main plan to improve git in this area has been to first make
it possible for git commands that can create commits to accept options
like "--trailer 'Signed-off-by: Alice '" and to
pass them to `git interpret-trailers` (or its underlying code).


Greetings||||||

2017-07-24 Thread Dr Paul Benk
Greetings!

I have a proposal for you, this however is not mandatory nor will I in
any manner compel you to honor against your will. I am Dr.Paul Benk a
former executive director with Arab Tunisian Bank here in Tunis; I
retired 1 year 7 months ago after putting in 28 years of meticulous
service. During my days with Arab Tunisian Bank, I was the personal
account officer and one of the financial advisers to Zine Al-Abidine
Ben Ali the past Tunisian President in self exile at Saudi Arabia.
During his trying period, he instructed me to move all his investment
in my care which consists of US$115M and 767KG of gold out of the Gulf
States for safe keeping; and that I successfully did by moving US$50
Million to Madrid Spain, US$50 Million to Dubai United Arab Emirate,
US$15 Million to Burkina Faso and the 767KG of gold to Burkina Faso in
West Africa as an anonymous deposits, so that the funds will in no way
be traced to him. He has instructed me to find an investor who would
stand as the anonymous depositor of the fund and gold; and claim it
for further investment.

Consequent upon the above, my proposal is that I would like you as a
foreigner to stand in as the anonymous depositor of this fund and gold
that I have successfully moved outside the country and provide an
account overseas where this said fund will be transferred into. It is
a careful network and my voluntary retirement from the Arab Tunisian
Bank is to ensure a hitch-free operation as all modalities for you to
stand as beneficiary and owner of the deposits has been perfected by
me. Mr. Zine al-Abidine Ben Ali will be offering you 20% of the total
investment if you can be the investor and claim this deposits in Spain
and Burkina Faso as the anonymous depositor.

Now my questions are:-

1. Can you handle this transaction?
2. Can I give you this trust?

Consider this and get back to me as soon as possible on my private
email Only here: paulben...@gmail.com, so that I can give you more
details regarding this transaction. Finally, it is my humble request
that the information as contained herein be accorded the necessary
attention, urgency as well as the secrecy it deserves.

I expect your urgent response if you can handle this project.

Respectfully yours,

From:Dr.Paul Benk.
Email: paulben...@gmail.com

--
--


Re: fetch-any-blob / ref-in-want proposal

2017-07-24 Thread Jonathan Tan
On Mon, Jul 24, 2017 at 12:38 PM, Orgad Shaneh  wrote:
> Sorry, I thought it's the same thing. I mean ref-in-want[1]. This
> issue in gerrit[2] was closed, claiming that ref-in-want will solve
> it. I just want to know if this is likely to be merged soon enough (I
> consider several months "soon enough"), or should I look for other
> solutions.
>
> - Orgad
>
> [1] 
> https://public-inbox.org/git/cover.1485381677.git.jonathanta...@google.com/
> [2] https://bugs.chromium.org/p/gerrit/issues/detail?id=175#c24

Right now, we have been working on other things - so it is unlikely
that it will be merged soon enough.


Re: [PATCH v3] merge: add a --signoff flag.

2017-07-24 Thread lukaszgryglicki
I can't reach Him, every time I send an email it is returned "no such address".
Can You please ask Him to take a look?
> On 24 Jul 2017, at 22:42, Junio C Hamano  wrote:
> 
> Junio C Hamano  writes:
> 
>> lukaszgryglicki  writes:
>> 
>>> Hi, what is the state of this patch?
>> 
>> I was expecting you to respond to Ævar's <87tw2sbnyl@gmail.com>
>> to explain how your new version addresses his concerns, or him to
>> respond to your new patch to say that it addresses his concerns
>> adequately.  I think neither has happened, so the topic is still in
>> limbo, I guess...
> 
> Sorry, Ævar's message I meant was <87fueferd4@gmail.com>.



[PATCH] hash: Allow building with the external sha1dc library

2017-07-24 Thread Takashi Iwai
Some distros provide SHA1 collision detect code as a shared library.
It's the very same code as we have in git tree, and git can link with
it as well; at least, it may make maintenance easier, according to our
security guys.

This patch allows user to build git linking with the external sha1dc
library instead of the built-in sha1dc code.  User needs to define
DC_SHA1_EXTERNAL explicitly.  As default, the built-in sha1dc code is
used like before.

Signed-off-by: Takashi Iwai 
---
 Makefile | 12 
 hash.h   |  4 +++-
 sha1dc_git_ext.c | 11 +++
 sha1dc_git_ext.h | 25 +
 4 files changed, 51 insertions(+), 1 deletion(-)
 create mode 100644 sha1dc_git_ext.c
 create mode 100644 sha1dc_git_ext.h

diff --git a/Makefile b/Makefile
index 461c845d33cb..f1a262d56254 100644
--- a/Makefile
+++ b/Makefile
@@ -162,6 +162,12 @@ all::
 # algorithm. This is slower, but may detect attempted collision attacks.
 # Takes priority over other *_SHA1 knobs.
 #
+# Define DC_SHA1_EXTERNAL in addition to DC_SHA1 if you want to build / link
+# git with the external sha1collisiondetection library.
+# Without this option, i.e. the default behavior is to build git with its
+# own sha1dc code.  If any extra linker option is required, define them in
+# DC_SHA1_LINK variable in addition.
+#
 # Define DC_SHA1_SUBMODULE in addition to DC_SHA1 to use the
 # sha1collisiondetection shipped as a submodule instead of the
 # non-submodule copy in sha1dc/. This is an experimental option used
@@ -1472,6 +1478,11 @@ ifdef APPLE_COMMON_CRYPTO
BASIC_CFLAGS += -DSHA1_APPLE
 else
DC_SHA1 := YesPlease
+ifdef DC_SHA1_EXTERNAL
+   LIB_OBJS += sha1dc_git_ext.o
+   BASIC_CFLAGS += -DSHA1_DC -DDC_SHA1_EXTERNAL
+   EXTLIBS += $(DC_SHA1_LINK) -lsha1detectcoll
+else
 ifdef DC_SHA1_SUBMODULE
LIB_OBJS += sha1collisiondetection/lib/sha1.o
LIB_OBJS += sha1collisiondetection/lib/ubc_check.o
@@ -1492,6 +1503,7 @@ endif
 endif
 endif
 endif
+endif
 
 ifdef SHA1_MAX_BLOCK_SIZE
LIB_OBJS += compat/sha1-chunked.o
diff --git a/hash.h b/hash.h
index bef3e630a093..dce327d58d07 100644
--- a/hash.h
+++ b/hash.h
@@ -8,7 +8,9 @@
 #elif defined(SHA1_OPENSSL)
 #include 
 #elif defined(SHA1_DC)
-#ifdef DC_SHA1_SUBMODULE
+#if defined(DC_SHA1_EXTERNAL)
+#include "sha1dc_git_ext.h"
+#elif defined(DC_SHA1_SUBMODULE)
 #include "sha1collisiondetection/lib/sha1.h"
 #else
 #include "sha1dc/sha1.h"
diff --git a/sha1dc_git_ext.c b/sha1dc_git_ext.c
new file mode 100644
index ..359439fc3d93
--- /dev/null
+++ b/sha1dc_git_ext.c
@@ -0,0 +1,11 @@
+/* Only for DC_SHA1_EXTERNAL; sharing the same hooks as built-in sha1dc */
+
+#include "cache.h"
+#include 
+#include "sha1dc_git.c"
+
+void git_SHA1DCInit(SHA1_CTX *ctx)
+{
+   SHA1DCInit(ctx);
+   SHA1DCSetSafeHash(ctx, 0);
+}
diff --git a/sha1dc_git_ext.h b/sha1dc_git_ext.h
new file mode 100644
index ..d0ea8ce518db
--- /dev/null
+++ b/sha1dc_git_ext.h
@@ -0,0 +1,25 @@
+/*
+ * This file is included by hash.h for DC_SHA1_EXTERNAL
+ */
+
+#include 
+
+/*
+ * Same as SHA1DCInit, but with default save_hash=0
+ */
+void git_SHA1DCInit(SHA1_CTX *);
+
+/*
+ * Same as SHA1DCFinal, but convert collision attack case into a verbose die().
+ */
+void git_SHA1DCFinal(unsigned char [20], SHA1_CTX *);
+
+/*
+ * Same as SHA1DCUpdate, but adjust types to match git's usual interface.
+ */
+void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, unsigned long len);
+
+#define platform_SHA_CTX SHA1_CTX
+#define platform_SHA1_Init git_SHA1DCInit
+#define platform_SHA1_Update git_SHA1DCUpdate
+#define platform_SHA1_Final git_SHA1DCFinal
-- 
2.13.3



Re: Expected behavior of "git check-ignore"...

2017-07-24 Thread John Szakmeister
On Sun, Jul 23, 2017 at 12:33 PM, Philip Oakley  wrote:
[snip]
>>
>>
>> git init .
>> echo 'foo/*' > .gitignore
>> echo '!foo/bar' > .gitignore
>
>
> Is this missing the >> append to get the full two line .gitignore?
> adding in a `cat .gitignore` would help check.

Yes, sorry about that.

>
>> mkdir foo
>> touch foo/bar
>
> I don't think you need these. It's the given pathnames that are checked, not
> the file system content.

It was there so you could see that `git status` ignores foo/bar
(though that wasn't part of the little script).

>> git check-ignore foo/bar
>
> Does this need the `-q` option to set the exit status?

No, it's always set.

> echo $? # to display the status.

Sure.  So, to recap the update reproduction recipe would be:


git init .
echo 'foo/*' > .gitignore
echo '!foo/bar' >> .gitignore
mkdir foo
touch foo/bar
git status # foo/ shows as untracked because bar is present
git check-ignore foo/bar
echo $? # show the exit status


It seems like it should print "1", but it prints "0".

>> I expect the last command to return 1 (no files are ignored), but it
>> doesn't.  The StackOverflow user had the same expectation, and imagine
>> others do as well.  OTOH, it looks like the command is really meant to
>> be a debugging tool--to show me the line in a .gitignore associated
>> with this file, if there is one.  In which case, the behavior is
>> correct but the return code description is a bit misleading (0 means
>> the file is ignored, which isn't true here).
>
>
> Maybe the logic isn't that clear? Maybe it is simply detecting if any one of
> the ignore lines is active, and doesn't reset the status for a negation?
>
> I appear to get the same response as yourself, but I haven't spent much time
> on it - I'm clearing a backlog of work at the moment.

Correct, it appears that if any line in the ignore matches, then it
exits with 0.  So it's not that it's ignored, but that there is a
matching line in an ignore file somewhere.  I can see the logic in
this if it's meant to be a debugging tools, especially combined with
-v.  Simply changing it does affect quite a few tests, but I'm not
sure that it was intentional for negation to be treated this way.

> I also tried the -v -n options, and if I swap the ignore lines around it
> still says line 2 is the one that ignores.
> It gets more interesting if two paths are given `foo/bar foo/baz`, to see
> which line picks up which pathname (and with the swapped ignore lines).
>
> Is there a test for this in the test suite?

There are several.  But line 427, test_expect_success_multi 'nested
include', is one that I think is pretty direct about testing this.  I
imagine what happened is that gitignores used to contain only things
you wanted to ignore and when the ability to negate came along the
semantics of this was never changed--and possibly for good reason.
I'm just wondering if it should change, or if the documentation should
be updated to reflect how it actually behaves (the file may not be
ignored, but a line is present in a gitignore that affects its
status).  The behavior is definitely a little unexpected as it stands,
given the documentation though.

Thanks for taking a look Philip!

-John


Re: [PATCH v3] merge: add a --signoff flag.

2017-07-24 Thread Junio C Hamano
lukaszgryglicki  writes:

> Hi, what is the state of this patch?

I was expecting you to respond to Ævar's <87tw2sbnyl@gmail.com>
to explain how your new version addresses his concerns, or him to
respond to your new patch to say that it addresses his concerns
adequately.  I think neither has happened, so the topic is still in
limbo, I guess...



[PATCH v3 3/3] interpret-trailers: add options for actions

2017-07-24 Thread Paolo Bonzini
From: Paolo Bonzini 

Allow using non-default values for trailers without having to set
them up in .gitconfig first.  For example, if you have the following
configuration

 trailer.signed-off-by.where = end

you may use "--where before" when a patch author forgets his
Signed-off-by and provides it in a separate email.  Likewise for
--if-exists and --if-missing

Reverting to the behavior specified by .gitconfig is done with
--no-where, --no-if-exists and --no-if-missing.

Signed-off-by: Paolo Bonzini 
---
 Documentation/git-interpret-trailers.txt | 23 +++
 builtin/interpret-trailers.c | 32 
 t/t7513-interpret-trailers.sh| 66 
 trailer.c| 27 ++---
 trailer.h|  7 
 5 files changed, 149 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-interpret-trailers.txt 
b/Documentation/git-interpret-trailers.txt
index 31cdeaecd..76d5fdfaf 100644
--- a/Documentation/git-interpret-trailers.txt
+++ b/Documentation/git-interpret-trailers.txt
@@ -80,6 +80,29 @@ OPTIONS
trailer to the input messages. See the description of this
command.
 
+--where ::
+--no-where::
+   Specify where all new trailers will be added.  A setting
+   provided with '--where' overrides all configuration variables
+   and applies to all '--trailer' options until the next occurrence of
+   '--where' or '--no-where'.
+
+--if-exists ::
+--no-if-exists::
+   Specify what action will be performed when there is already at
+   least one trailer with the same  in the message.  A setting
+   provided with '--if-exists' overrides all configuration variables
+   and applies to all '--trailer' options until the next occurrence of
+   '--if-exists' or '--no-if-exists'.
+
+--if-missing ::
+--no-if-missing::
+   Specify what action will be performed when there is no other
+   trailer with the same  in the message.  A setting
+   provided with '--if-missing' overrides all configuration variables
+   and applies to all '--trailer' options until the next occurrence of
+   '--if-missing' or '--no-if-missing'.
+
 CONFIGURATION VARIABLES
 ---
 
diff --git a/builtin/interpret-trailers.c b/builtin/interpret-trailers.c
index 8f38fa318..83249e3eb 100644
--- a/builtin/interpret-trailers.c
+++ b/builtin/interpret-trailers.c
@@ -16,6 +16,28 @@ static const char * const git_interpret_trailers_usage[] = {
NULL
 };
 
+static enum trailer_where where;
+static enum trailer_if_exists if_exists;
+static enum trailer_if_missing if_missing;
+
+static int option_parse_where(const struct option *opt,
+ const char *arg, int unset)
+{
+   return trailer_set_where(, arg);
+}
+
+static int option_parse_if_exists(const struct option *opt,
+ const char *arg, int unset)
+{
+   return trailer_set_if_exists(_exists, arg);
+}
+
+static int option_parse_if_missing(const struct option *opt,
+  const char *arg, int unset)
+{
+   return trailer_set_if_missing(_missing, arg);
+}
+
 static void new_trailers_clear(struct list_head *trailers)
 {
struct list_head *pos, *tmp;
@@ -44,6 +66,9 @@ static int option_parse_trailer(const struct option *opt,
 
item = xmalloc(sizeof *item);
item->text = arg;
+   item->where = where;
+   item->if_exists = if_exists;
+   item->if_missing = if_missing;
list_add_tail(>list, trailers);
return 0;
 }
@@ -58,6 +83,13 @@ int cmd_interpret_trailers(int argc, const char **argv, 
const char *prefix)
OPT_BOOL(0, "in-place", _place, N_("edit files in place")),
OPT_BOOL(0, "trim-empty", _empty, N_("trim empty 
trailers")),
 
+   OPT_CALLBACK(0, "where", NULL, N_("action"),
+N_("where to place the new trailer"), 
option_parse_where),
+   OPT_CALLBACK(0, "if-exists", NULL, N_("action"),
+N_("action if trailer already exists"), 
option_parse_if_exists),
+   OPT_CALLBACK(0, "if-missing", NULL, N_("action"),
+N_("action if trailer is missing"), 
option_parse_if_missing),
+
OPT_CALLBACK(0, "trailer", , N_("trailer"),
N_("trailer(s) to add"), option_parse_trailer),
OPT_END()
diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh
index 0c6f91c43..adbdf54f8 100755
--- a/t/t7513-interpret-trailers.sh
+++ b/t/t7513-interpret-trailers.sh
@@ -681,6 +681,36 @@ test_expect_success 'using "where = before"' '
test_cmp expected actual
 '
 
+test_expect_success 'overriding configuration with "--where after"' '
+   git config trailer.ack.where "before" &&
+   cat complex_message_body >expected 

[PATCH v3 1/3] trailers: export action enums and corresponding lookup functions

2017-07-24 Thread Paolo Bonzini
From: Paolo Bonzini 

Separate the mechanical changes out of the next patch.  The functions
are changed to take a pointer to enum, because struct conf_info is not
going to be public.

Set the default values explicitly in default_conf_info, since they are
not anymore close to default_conf_info and it's not obvious which
constant has value 0.  With the next patches, in fact, the values will
not be zero anymore!

Signed-off-by: Paolo Bonzini 
---
 trailer.c | 65 ---
 trailer.h | 22 +
 2 files changed, 55 insertions(+), 32 deletions(-)

diff --git a/trailer.c b/trailer.c
index 751b56c00..f02895373 100644
--- a/trailer.c
+++ b/trailer.c
@@ -10,18 +10,13 @@
  * Copyright (c) 2013, 2014 Christian Couder 
  */
 
-enum action_where { WHERE_END, WHERE_AFTER, WHERE_BEFORE, WHERE_START };
-enum action_if_exists { EXISTS_ADD_IF_DIFFERENT_NEIGHBOR, 
EXISTS_ADD_IF_DIFFERENT,
-   EXISTS_ADD, EXISTS_REPLACE, EXISTS_DO_NOTHING };
-enum action_if_missing { MISSING_ADD, MISSING_DO_NOTHING };
-
 struct conf_info {
char *name;
char *key;
char *command;
-   enum action_where where;
-   enum action_if_exists if_exists;
-   enum action_if_missing if_missing;
+   enum trailer_where where;
+   enum trailer_if_exists if_exists;
+   enum trailer_if_missing if_missing;
 };
 
 static struct conf_info default_conf_info;
@@ -63,7 +58,7 @@ static const char *git_generated_prefixes[] = {
pos != (head); \
pos = is_reverse ? pos->prev : pos->next)
 
-static int after_or_end(enum action_where where)
+static int after_or_end(enum trailer_where where)
 {
return (where == WHERE_AFTER) || (where == WHERE_END);
 }
@@ -201,7 +196,7 @@ static int check_if_different(struct trailer_item *in_tok,
  int check_all,
  struct list_head *head)
 {
-   enum action_where where = arg_tok->conf.where;
+   enum trailer_where where = arg_tok->conf.where;
struct list_head *next_head;
do {
if (same_trailer(in_tok, arg_tok))
@@ -306,7 +301,7 @@ static void apply_arg_if_exists(struct trailer_item *in_tok,
 static void apply_arg_if_missing(struct list_head *head,
 struct arg_item *arg_tok)
 {
-   enum action_where where;
+   enum trailer_where where;
struct trailer_item *to_add;
 
switch (arg_tok->conf.if_missing) {
@@ -331,7 +326,7 @@ static int find_same_and_apply_arg(struct list_head *head,
struct trailer_item *in_tok;
struct trailer_item *on_tok;
 
-   enum action_where where = arg_tok->conf.where;
+   enum trailer_where where = arg_tok->conf.where;
int middle = (where == WHERE_AFTER) || (where == WHERE_BEFORE);
int backwards = after_or_end(where);
struct trailer_item *start_tok;
@@ -373,44 +368,44 @@ static void process_trailers_lists(struct list_head *head,
}
 }
 
-static int set_where(struct conf_info *item, const char *value)
+int trailer_set_where(enum trailer_where *item, const char *value)
 {
if (!strcasecmp("after", value))
-   item->where = WHERE_AFTER;
+   *item = WHERE_AFTER;
else if (!strcasecmp("before", value))
-   item->where = WHERE_BEFORE;
+   *item = WHERE_BEFORE;
else if (!strcasecmp("end", value))
-   item->where = WHERE_END;
+   *item = WHERE_END;
else if (!strcasecmp("start", value))
-   item->where = WHERE_START;
+   *item = WHERE_START;
else
return -1;
return 0;
 }
 
-static int set_if_exists(struct conf_info *item, const char *value)
+int trailer_set_if_exists(enum trailer_if_exists *item, const char *value)
 {
if (!strcasecmp("addIfDifferent", value))
-   item->if_exists = EXISTS_ADD_IF_DIFFERENT;
+   *item = EXISTS_ADD_IF_DIFFERENT;
else if (!strcasecmp("addIfDifferentNeighbor", value))
-   item->if_exists = EXISTS_ADD_IF_DIFFERENT_NEIGHBOR;
+   *item = EXISTS_ADD_IF_DIFFERENT_NEIGHBOR;
else if (!strcasecmp("add", value))
-   item->if_exists = EXISTS_ADD;
+   *item = EXISTS_ADD;
else if (!strcasecmp("replace", value))
-   item->if_exists = EXISTS_REPLACE;
+   *item = EXISTS_REPLACE;
else if (!strcasecmp("doNothing", value))
-   item->if_exists = EXISTS_DO_NOTHING;
+   *item = EXISTS_DO_NOTHING;
else
return -1;
return 0;
 }
 
-static int set_if_missing(struct conf_info *item, const char *value)
+int trailer_set_if_missing(enum trailer_if_missing *item, const char *value)
 {
if (!strcasecmp("doNothing", value))
-   

[PATCH v3 2/3] trailers: introduce struct new_trailer_item

2017-07-24 Thread Paolo Bonzini
From: Paolo Bonzini 

This will provide a place to store the current state of the
--where, --if-exists and --if-missing options.

Signed-off-by: Paolo Bonzini 
---
 builtin/interpret-trailers.c | 41 +
 trailer.c| 21 -
 trailer.h| 14 +-
 3 files changed, 62 insertions(+), 14 deletions(-)

diff --git a/builtin/interpret-trailers.c b/builtin/interpret-trailers.c
index 175f14797..8f38fa318 100644
--- a/builtin/interpret-trailers.c
+++ b/builtin/interpret-trailers.c
@@ -16,17 +16,50 @@ static const char * const git_interpret_trailers_usage[] = {
NULL
 };
 
+static void new_trailers_clear(struct list_head *trailers)
+{
+   struct list_head *pos, *tmp;
+   struct new_trailer_item *item;
+
+   list_for_each_safe(pos, tmp, trailers) {
+   item = list_entry(pos, struct new_trailer_item, list);
+   list_del(pos);
+   free(item);
+   }
+}
+
+static int option_parse_trailer(const struct option *opt,
+  const char *arg, int unset)
+{
+   struct list_head *trailers = opt->value;
+   struct new_trailer_item *item;
+
+   if (unset) {
+   new_trailers_clear(trailers);
+   return 0;
+   }
+
+   if (!arg)
+   return -1;
+
+   item = xmalloc(sizeof *item);
+   item->text = arg;
+   list_add_tail(>list, trailers);
+   return 0;
+}
+
 int cmd_interpret_trailers(int argc, const char **argv, const char *prefix)
 {
int in_place = 0;
int trim_empty = 0;
-   struct string_list trailers = STRING_LIST_INIT_NODUP;
+   LIST_HEAD(trailers);
 
struct option options[] = {
OPT_BOOL(0, "in-place", _place, N_("edit files in place")),
OPT_BOOL(0, "trim-empty", _empty, N_("trim empty 
trailers")),
-   OPT_STRING_LIST(0, "trailer", , N_("trailer"),
-   N_("trailer(s) to add")),
+
+   OPT_CALLBACK(0, "trailer", , N_("trailer"),
+   N_("trailer(s) to add"), option_parse_trailer),
OPT_END()
};
 
@@ -43,7 +76,7 @@ int cmd_interpret_trailers(int argc, const char **argv, const 
char *prefix)
process_trailers(NULL, in_place, trim_empty, );
}
 
-   string_list_clear(, 0);
+   new_trailers_clear();
 
return 0;
 }
diff --git a/trailer.c b/trailer.c
index f02895373..aed5fa1f8 100644
--- a/trailer.c
+++ b/trailer.c
@@ -669,14 +669,15 @@ static void add_arg_item(struct list_head *arg_head, char 
*tok, char *val,
 }
 
 static void process_command_line_args(struct list_head *arg_head,
- struct string_list *trailers)
+ struct list_head *new_trailer_head)
 {
-   struct string_list_item *tr;
+   struct new_trailer_item *tr;
struct arg_item *item;
struct strbuf tok = STRBUF_INIT;
struct strbuf val = STRBUF_INIT;
const struct conf_info *conf;
struct list_head *pos;
+   const char *string;
 
/*
 * In command-line arguments, '=' is accepted (in addition to the
@@ -695,18 +696,19 @@ static void process_command_line_args(struct list_head 
*arg_head,
}
 
/* Add an arg item for each trailer on the command line */
-   for_each_string_list_item(tr, trailers) {
-   int separator_pos = find_separator(tr->string, cl_separators);
+   list_for_each(pos, new_trailer_head) {
+   tr = list_entry(pos, struct new_trailer_item, list);
+   string = tr->text;
+   int separator_pos = find_separator(string, cl_separators);
if (separator_pos == 0) {
struct strbuf sb = STRBUF_INIT;
-   strbuf_addstr(, tr->string);
+   strbuf_addstr(, string);
strbuf_trim();
error(_("empty trailer token in trailer '%.*s'"),
  (int) sb.len, sb.buf);
strbuf_release();
} else {
-   parse_trailer(, , , tr->string,
- separator_pos);
+   parse_trailer(, , , string, separator_pos);
add_arg_item(arg_head,
 strbuf_detach(, NULL),
 strbuf_detach(, NULL),
@@ -969,7 +971,8 @@ static FILE *create_in_place_tempfile(const char *file)
return outfile;
 }
 
-void process_trailers(const char *file, int in_place, int trim_empty, struct 
string_list *trailers)
+void process_trailers(const char *file, int in_place, int trim_empty,
+ struct list_head *new_trailer_head)
 {
LIST_HEAD(head);

Re: [PATCH/RFC] rebase: make resolve message clearer for inexperienced users

2017-07-24 Thread Phillip Wood
On 16/07/17 12:39, Philip Oakley wrote:
> 
> From: "Junio C Hamano" 
> Sent: Wednesday, July 12, 2017 10:29 PM
>> William Duclot  writes:
>>
  - The original said "When you have resolved this problem", without
giving a guidance how to resolve, and without saying what the
problem is.  The updated one says "conflict" to clarify the
"problem", and suggests "git add" as the tool to use after a
manual resolition.

Modulo that there are cases where "git rm" is the right tool, the
updated one is strict improvement.
>>>
>>> I also wrote "" when there could be several. Maybe
>>> 'mark it as resolved with "git add/rm"' would be a better (and shorter)
>>> formulation?
>>
>> Another potential source of confusion is if we are seeing "a"
>> conflict, or multiple ones.  I'd say it is OK to treat the whole
>> thing as "a conflict" that Git needs help with by the user editing
>> multiple files and using multiple "git add" or "git rm".  So "mark
>> it as resolved with 'git add/rm'" is fine, I would think, but
>> anything that I say about UI's understandability to new people needs
>> to be taken with a large grain of salt ;-).
>>
>>> ... I feel like a lot of git messages could be improved this way
>>> to offer a UI more welcoming to inexperienced user (which is a
>>> *broad* segment of users). But I am not aware of the cost of
>>> translation of this kind of patch: would several patches like this
>>> one be welcomed?
>>
>> Surely, as long as I can depend on other reviewers who are more
>> passionate about end-user experience than I am, I'll take such
>> patches with their help.
>>
>> Thanks.
> 
> One of the other confusions I had / have (and I have a saved note to
> remind me) is when rebase stops with a conflict, and asks the user to
> "fix" it, then ues "--continue".
> 
> I always expect that Git will do the 'add' of the resolved conflict
> because that is what it would do normally as the next step after the merge.
> 
> I also had a similar issue with the --allow-empty case of 'nothing added
> to commit but untracked files present' where I had been expecting the
> commit to be simply omitted. You have to go through a reset dance before
> continuing.
> 
> Philip
> [I'll be off line till Friday]

git rebase --continue requiring one to git add first confuses/annoys me
too. I started a patch to autostage unstaged changes if they don't
contain conflict markers a couple of weeks ago, I'll clean it up and
post it later this week.

I also find it confusing that it asks me to edit the commit message for
picks, fixups and non-final squashes after conflicts. I can see that
perhaps one might want to amend the message to reflect any changes that
were made while resolving the conflicts but I've never had too. I'd
rather be able to pass --edit to rebase --continue if I needed to edit
the message in those cases. Looking through the code I think it would
require saving some extra state when rebase bails out on conflicts so
rebase --continue could tell if it should be asking the user to amend
the message.

Best Wishes

Phillip


[PATCH v3 0/3] interpret-trailers: add --where, --if-exists, --if-missing

2017-07-24 Thread Paolo Bonzini
From: Paolo Bonzini 

These options are useful to experiment with "git interpret-trailers"
without having to tinker with .gitconfig (Junio said git should ahve
done this first and only added configuration afterwards).  It can
be useful in the case where you want a different placement for the trailer,
or for scripts/aliases that don't want to rely on specific .gitconfig
settings.

Compared to v2, the main change is that option order on the command-line
is respected.  That is,

--trailer 'acked-by: foo' --where end --trailer 'signed-off-by: me'

will only apply where=end to the second trailer.  Likewise,

--where end --trailer 'signed-off-by: me' --no-where \
--trailer 'acked-by: foo'

will only apply it to the first, reverting to trailer.*.where for the
"acked-by" trailer.

Paolo

v1->v2: support --no-* options, minor code fixes

v2->v3: largely rewritten to respect option order on the command-line;
keep trailer.h namespace clean (Christian)

Paolo Bonzini (3):
  trailers: export action enums and corresponding lookup functions
  trailers: introduce struct new_trailer_item
  interpret-trailers: add options for actions

 Documentation/git-interpret-trailers.txt |  24 +++
 builtin/interpret-trailers.c |  73 ++--
 t/t7513-interpret-trailers.sh|  66 ++
 trailer.c| 113 ++-
 trailer.h|  43 +++-
 5 files changed, 267 insertions(+), 52 deletions(-)

-- 
2.13.3



Re: git gc seems to break --symbolic-full-name

2017-07-24 Thread Stas Sergeev

24.07.2017 07:02, Jacob Keller пишет:

generally, I'd suggest using "git describe" to output a version based
on tag, and as part of your build system set that in some sort of
--version output of some kind.

I am not sure I understand that suggestion.
I can use 'git describe' (and that seems to be
what linux kernel does too), but I don't see
how can I get away without a temporary file,
maually comparing current 'git describe' with
the one stored, and updating that file in case
of a mismatch.
Could you please clarify? Maybe I am missing
some makefile trick which you assume I am
aware about. :)