date:20170814

Re: [PATCH v2] commit: skip discarding the index if there is no pre-commit hook

2017-08-14 Thread Jeff King

On Tue, Aug 15, 2017 at 04:23:50AM +, Kevin Willford wrote:

> > This read_cache_from() should be a noop, right, because it immediately
> > sees istate->initialized is set? So it shouldn't matter that it is not
> > in the conditional with discard_cache(). Though if its only purpose is
> > to re-read the just-discarded contents, perhaps it makes sense to put it
> > there for readability.
> 
> I thought about that and didn't know if there were cases when this would be 
> called
> and the cache has not been loaded.  It didn't look like it since it is only 
> called from 
> cmd_commit and prepare_index is called before it.  Also if in the future this 
> call would
> be made when it had not read the index yet so thought it was safest just to 
> leave
> this as always being called since it is basically a noop if the 
> istate->initialized is set.

Yeah, I agree it might be safer as you have it. And hopefully the
comment above the discard would point anybody in the right direction.

> Also based on this commit
> https://github.com/git/git/commit/2888605c649ccd423232161186d72c0e6c458a48
> it looked like the discard_cache was added independent of the read_cache_from 
> call,
> which made me think that the two were not tied together.

Yeah, I tried to dig in the history and figure it out, but it was not
nearly as clear as I would have liked. :)

-Peff

RE: [PATCH v2] commit: skip discarding the index if there is no pre-commit hook

2017-08-14 Thread Kevin Willford

> On Mon, Aug 14, 2017 at 03:54:25PM -0600, Kevin Willford wrote:
> 
> > If there is not a pre-commit hook, there is no reason to discard
> > the index and reread it.
> >
> > This change checks to presence of a pre-commit hook and then only
> > discards the index if there was one.
> >
> > Signed-off-by: Kevin Willford 
> > ---
> >  builtin/commit.c | 15 +--
> >  1 file changed, 9 insertions(+), 6 deletions(-)
> 
> Thanks, this looks nice and simple.
> 
> > diff --git a/builtin/commit.c b/builtin/commit.c
> > index e7a2cb6285..ab71b93518 100644
> > --- a/builtin/commit.c
> > +++ b/builtin/commit.c
> > @@ -940,12 +940,15 @@ static int prepare_to_commit(const char *index_file,
> const char *prefix,
> > return 0;
> > }
> >
> > -   /*
> > -* Re-read the index as pre-commit hook could have updated it,
> > -* and write it out as a tree.  We must do this before we invoke
> > -* the editor and after we invoke run_status above.
> > -*/
> > -   discard_cache();
> > +   if (!no_verify && find_hook("pre-commit")) {
> > +   /*
> > +* Re-read the index as pre-commit hook could have updated it,
> > +* and write it out as a tree.  We must do this before we invoke
> > +* the editor and after we invoke run_status above.
> > +*/
> > +   discard_cache();
> > +   }
> > +
> > read_cache_from(index_file);
> 
> This read_cache_from() should be a noop, right, because it immediately
> sees istate->initialized is set? So it shouldn't matter that it is not
> in the conditional with discard_cache(). Though if its only purpose is
> to re-read the just-discarded contents, perhaps it makes sense to put it
> there for readability.
> 
> -Peff

I thought about that and didn't know if there were cases when this would be 
called
and the cache has not been loaded.  It didn't look like it since it is only 
called from 
cmd_commit and prepare_index is called before it.  Also if in the future this 
call would
be made when it had not read the index yet so thought it was safest just to 
leave
this as always being called since it is basically a noop if the 
istate->initialized is set.

Also based on this commit
https://github.com/git/git/commit/2888605c649ccd423232161186d72c0e6c458a48
it looked like the discard_cache was added independent of the read_cache_from 
call,
which made me think that the two were not tied together.

Kevin

Re: reftable [v5]: new ref storage format

2017-08-14 Thread Jeff King

On Mon, Aug 14, 2017 at 04:05:05PM +, David Turner wrote:

> > All that aside, we could simply add an EXCLUSIVE open-flag to LMDB, and
> > prevent multiple processes from using the DB concurrently. In that case,
> > maintaining coherence with other NFS clients is a non-issue. It strikes me 
> > that git
> > doesn't require concurrent multi-process access anyway, and any particular
> > process would only use the DB for a short time before closing it and going 
> > away.
> 
> Git, in general, does require concurrent multi-process access, depending on 
> what 
> that means.
> 
> For example, a post-receive hook might call some git command which opens the 
> ref database.  This means that git receive-pack would have to close and 
> re-open the ref database.  More generally, a fair number of git commands are
> implemented in terms of other git commands, and might need the same treatment.
> We could, in general, close and re-open the database around fork/exec, but I 
> am
> not sure that this solves the general problem -- by mere happenstance, one 
> might
> be e.g. pushing in one terminal while running git checkout in another.  This 
> is 
> especially true with git worktrees, which share one ref database across 
> multiple
> working directories.

Yeah, I'd agree that git's multi-process way of working would probably
cause some headaches if there were a broad lock.

I had the impression that Howard meant we would lock for _read_
operations, too. If so, I think that's probably going to cause a
noticeable performance problem for servers.  A repository which is
serving fetches to a lot of clients (even if some of those are noops)
has to send the current ref state out to each client. I don't think we'd
want to add a serial bottleneck to that portion of each process, which
can otherwise happen totally in parallel.

Serializing writes is probably not so big a deal as long as it is kept
to the portion where the process is actively writing out values. And as
long as there's a reasonable backoff/retry protocol; right now we don't
generally bother retrying ref locks because they're taken individually,
so racing on a lock almost certainly[1] means that you've lost the
sha1-lease and need to restart the larger operation.

-Peff

[1] Actually, we've found this isn't always true. Things like ref
packing require taking locks for correctness, which means they can
interfere with actual ref updates. That's yet another thing it would
be nice to get rid of when moving away from the loose/packed
storage.

Re: [PATCH v2 0/2] clang-format

2017-08-14 Thread Jeff King

On Mon, Aug 14, 2017 at 08:03:19PM -0700, Junio C Hamano wrote:

> > I'm tempted to say that "config --list" should normalize this case into:
> >
> >   mysection.mykey=true
> >
> > Normally we avoid coercing values without knowing the context in which
> > they'll be used. But the syntax in the original file means the user is
> > telling us it's a boolean and they expect it to be treated that way.
> >
> > The only downside is if the user is wrong, it might be coerced into
> > the string "true" instead of throwing an error. That seems like a minor
> > drawback for eliminating a potentially confusing corner case from the
> > plumbing output.
> 
> Because we cannot sensibly always normalize a variable set to 'yes',
> 'on', etc. to all "true", the degree it can help the reading scripts
> is quite limited, as they need to be prepared to see other
> representation of the truth values anyway.  Even though I too found
> the approach somewhat tempting, because there is no ambiguity in
> "[section] var" that it means a boolean "true", I doubt it would
> help very much.

Good point. This is the only case that is _syntactically_ a problem at
the key/value level, which is why we noticed it (the reader barfed on
the unknown input). But that same reader could be interpreting values
incorrectly and we'd have no idea. And that applies to types beyond
booleans (you showed numbers like "2k" below, but there are others, like
--path or --get-color).

The one nice thing about fixing this syntactic issue is that the current
behavior affected this reader even though it didn't care about
particular config key in question. I.e., in this output:

  my.boolVal
  my.intVal=2k
  my.valueWeCareAbout=some string

if we don't care about the meaning of boolVal or intVal, we could still
parse this output fine if not for the syntactic irregularity of
my.boolVal.

That's what might tempt me to fix this independent of the deeper
problem. But I really think we should try to address that deeper
problem.

> The way they pass "non_string_options" dict to the loader is quite
> sensible for that purpose, as it allows the reader to say "I care
> about this and that variables, and I know I want them interpreted as
> int (e.g. 1M) and bool (e.g. 'on') and returned in a normalized
> form".
> 
> I do not mind adding "git config --list --autotype" option, though,
> with which the reading script tells us that it accepts the chance of
> false conversion, so that
> [...]

I think an "--autotype" is always going to have false positives.
Integers and booleans we can make guesses at. But "--path" or "--color"
are much tougher.

The right answer is to make it easier for that non_string_options
information to make it to git-config so it can do the interpretation for
the caller. The way that happens now is:

  git config --int my.intVal
  git config --bool my.boolVal
  git config --path my.pathVal

and so on. But I think one reason people turn to --list is that it
requires only a single process invocation, rather than repeatedly
calling git-config for each variable which might be of interest.  I know
that diff-highlight's startup is measurably slower due to the six "git
config --get-color" calls it must make, and I've been looking for a way
to do it with a single invocation.

I suspect we need a "--get-stdin" which can accept (key,type) tuples and
return the result over stdout. And in some cases it's more than just a
pair; for example, colors need an extra "parse this default" argument).

-Peff

Re: [PATCH v2 0/2] clang-format

2017-08-14 Thread Junio C Hamano

Jeff King  writes:

> On Mon, Aug 14, 2017 at 04:15:40PM -0700, Stefan Beller wrote:
>
>> + llvm-...@lists.llvm.org
>> 
>> The Git community is currently discussing adopting a coding style
>> defined by clang-format, here is a bug report:
>
> Since we've added a cc, let me try to give a little more context.

Thanks for relaying; I've dropped them from CC: for this message, as
the remainder is not of interest to them.

> I'm tempted to say that "config --list" should normalize this case into:
>
>   mysection.mykey=true
>
> Normally we avoid coercing values without knowing the context in which
> they'll be used. But the syntax in the original file means the user is
> telling us it's a boolean and they expect it to be treated that way.
>
> The only downside is if the user is wrong, it might be coerced into
> the string "true" instead of throwing an error. That seems like a minor
> drawback for eliminating a potentially confusing corner case from the
> plumbing output.

Because we cannot sensibly always normalize a variable set to 'yes',
'on', etc. to all "true", the degree it can help the reading scripts
is quite limited, as they need to be prepared to see other
representation of the truth values anyway.  Even though I too found
the approach somewhat tempting, because there is no ambiguity in
"[section] var" that it means a boolean "true", I doubt it would
help very much.

The way they pass "non_string_options" dict to the loader is quite
sensible for that purpose, as it allows the reader to say "I care
about this and that variables, and I know I want them interpreted as
int (e.g. 1M) and bool (e.g. 'on') and returned in a normalized
form".

I do not mind adding "git config --list --autotype" option, though,
with which the reading script tells us that it accepts the chance of
false conversion, so that

[my]
intVal = 2k
boolVal
someVal = on
otherVal = 1
moreVal = 2
anotherVal = 0
no = no

might be listed as

my.intval=2048
my.boolval=true
my.someval=true
my.otherval=1
my.moreval=2
my.anotherval=0
my.no=false

Disambiguation rules are up to debate; the above illustrates an
extreme position that coerses anything that could be taken as an int
and a string that can be taken as a bool as such, but it may not
help very much if the reader wants to see boolean values.

Re: Bug?: git archive exclude pathspec and gitattributes export-ignore

2017-08-14 Thread David Adam

On Mon, 14 Aug 2017, René Scharfe wrote:
> Am 13.08.2017 um 06:53 schrieb David Adam:
> > I think I have a bug in git (tested 2.11.0 on Debian 8, 2.14.1 on OS X and
> > 2.14.1.145.gb3622a4 on OS X).
> > 
> > Given a repository with an export-ignore directive for a subdirectory in
> > .gitattributes, `git archive` with a pathspec that excludes a different
> > subdirectory produces no output file and git exits with -1 as the return
> > status.
> 
> Thanks for the thoughtful bug report!
> 
> The problem seems to be that archive.c::write_archive_entry() returns 0
> instead of READ_TREE_RECURSIVE for directories with the attribute
> "export-ignore", and archive.c::write_directory() gets caught by
> surprise by that and returns -1, which ends up causing git archive to
> exit with return code 255 without actually writing anything.
> 
> This should only happen if you use wildcards like "*", i.e. git archive
> should behave as expected if you spell out the full name of the
> directory.  Can you confirm that?

Yes - that's definitely the case.

The reason I am trying to use the wildcard is that using an ":(exclude)b" 
pathspec excludes the contents of, but not the actual b directory 
itself:

> git archive HEAD ':(top)' ':(exclude)b' | tar t
.gitattributes
b/

Whereas I would like to export the archive without the b directory 
entirely.

> The real solution is probably to teach tree-walk.c::do_match() how to
> handle attributes and then inject ":(attr:-export-ignore)" as a default
> internal pathspec in archive.c::parse_pathspec_arg() instead of handling
> it in archive.c::write_archive_entry().

Many thanks

David Adam
zanc...@ucc.gu.uwa.edu.au

Re: [PATCH v3 3/3] diff: define block by number of non-space chars

2017-08-14 Thread Stefan Beller

On Mon, Aug 14, 2017 at 4:57 PM, Jonathan Tan  wrote:
> The existing behavior of diff --color-moved=zebra does not define the
> minimum size of a block at all, instead relying on a heuristic applied
> later to filter out sets of adjacent moved lines that are shorter than 3
> lines long. This can be confusing, because a block could thus be colored
> as moved at the source but not at the destination (or vice versa),
> depending on its neighbors.
>
> Instead, teach diff that the minimum size of a block is 10
> non-whitespace characters. This allows diff to still exclude
> uninteresting lines appearing on their own (such as those solely
> consisting of one or a few closing braces), as was the intention of the
> adjacent-moved-line heuristic.

After some thought, I really like this heuristic, however allow me
a moment to bikeshed 10 as a number here.

One could think that 10 equals roughly 3 lines a 3 characters and
in C based languages the shortest meaningful lines have more than
3 characters ("i++;", "a();", "int i;" have 4 or 5 each), but I would still
think that 10 is too much, as we'd want to detect the closing braces
in their own lines.

>  dimmed_zebra::
> Similar to 'zebra', but additional dimming of uninteresting parts
> of moved code is performed. The bordering lines of two adjacent
> diff --git a/diff.c b/diff.c
> index f598d8a3a..305ce4126 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -864,19 +864,28 @@ static int shrink_potential_moved_blocks(struct 
> moved_entry **pmb,
>  /*
>   * If o->color_moved is COLOR_MOVED_PLAIN, this function does nothing.
>   *
> - * Otherwise, if the last block has fewer lines than
> - * COLOR_MOVED_MIN_BLOCK_LENGTH, unset DIFF_SYMBOL_MOVED_LINE on all lines in
> - * that block.
> + * Otherwise, if the last block has fewer non-space characters than
> + * COLOR_MOVED_MIN_NON_SPACE_COUNT, unset DIFF_SYMBOL_MOVED_LINE on all lines
> + * in that block.
>   *
>   * The last block consists of the (n - block_length)'th line up to but not
>   * including the nth line.
>   */
>  static void adjust_last_block(struct diff_options *o, int n, int 
> block_length)
>  {
> -   int i;
> -   if (block_length >= COLOR_MOVED_MIN_BLOCK_LENGTH ||
> -   o->color_moved == COLOR_MOVED_PLAIN)
> +   int i, non_space_count = 0;
> +   if (o->color_moved == COLOR_MOVED_PLAIN)
> return;
> +   for (i = 1; i < block_length + 1; i++) {
> +   const char *c = o->emitted_symbols->buf[n - i].line;
> +   for (; *c; c++) {
> +   if (isspace(*c))
> +   continue;
> +   non_space_count++;
> +   if (non_space_count >= 
> COLOR_MOVED_MIN_NON_SPACE_COUNT)
> +   return;

When we do this counting, we could count the lines ourselves here as well.
`n-block_count` should be equal to the line that has a different
(flags & (DIFF_SYMBOL_MOVED_LINE | DIFF_SYMBOL_MOVED_LINE_ALT))
pattern than those before. (although we'd also have to check for i > 0, too)
Your choice.

> +   }
> +   }
> for (i = 1; i < block_length + 1; i++)
> o->emitted_symbols->buf[n - i].flags &= 
> ~DIFF_SYMBOL_MOVED_LINE;
>  }
> @@ -923,7 +932,6 @@ static void mark_color_as_moved(struct diff_options *o,
> }
>
> l->flags |= DIFF_SYMBOL_MOVED_LINE;
> -   block_length++;
>
> if (o->color_moved == COLOR_MOVED_PLAIN)
> continue;
> @@ -953,8 +961,13 @@ static void mark_color_as_moved(struct diff_options *o,
> }
>
> flipped_block = (flipped_block + 1) % 2;
> +
> +   adjust_last_block(o, n, block_length);
> +   block_length = 0;
> }
>
> +   block_length++;
> +
> if (flipped_block)
> l->flags |= DIFF_SYMBOL_MOVED_LINE_ALT;
> }
> diff --git a/diff.h b/diff.h
> index 5755f465d..9e2fece5b 100644
> --- a/diff.h
> +++ b/diff.h
> @@ -195,7 +195,7 @@ struct diff_options {
> COLOR_MOVED_ZEBRA_DIM = 3,
> } color_moved;
> #define COLOR_MOVED_DEFAULT COLOR_MOVED_ZEBRA
> -   #define COLOR_MOVED_MIN_BLOCK_LENGTH 3
> +   #define COLOR_MOVED_MIN_NON_SPACE_COUNT 10
>  };
>
>  void diff_emit_submodule_del(struct diff_options *o, const char *line);
> diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
> index 6f7758e5c..d8e7b77b9 100755
> --- a/t/t4015-diff-whitespace.sh
> +++ b/t/t4015-diff-whitespace.sh
> @@ -1101,9 +1101,9 @@ test_expect_success 'detect malicious moved code, 
> inside file' '
> -{
> -if (!u->is_allowed_foo)
> -return;
> -   -foo(u);
> -   -}
> -   -
> +   -foo(u);
> +   -}
> +   -

Here we have 2 blocks, the first has 7 character,
which we may want to

Re: [PATCH v2 2/2] Makefile: add style build rule

2017-08-14 Thread Jeff King

On Mon, Aug 14, 2017 at 04:47:45PM -0700, Junio C Hamano wrote:

> > By the way, I do not know which vintage of /usr/bin/git-clang-format
> > I happen to have on my box, but I needed a crude workaround patch
> > (attached at the end) ...
> 
> I guess you hit the same thing while our messages crossing ;-)

Yep. Our solutions were at opposite ends of the spectrum, though. :)

> > As to what it does, the first example I tried may not have been a
> > great one.  I got this:
> >
> > git clang-format --style file --diff --extensions c,h
> > diff --git a/cache.h b/cache.h
> > index 73e0085186..6462fe25bc 100644
> > --- a/cache.h
> > +++ b/cache.h
> > @@ -1498,11 +1498,8 @@ struct checkout {
> > const char *base_dir;
> > int base_dir_len;
> > struct delayed_checkout *delayed_checkout;
> > -   unsigned force:1,
> > -quiet:1,
> > -not_new:1,
> > -   a_new_field:1,
> > -refresh_cache:1;
> > +   unsigned force : 1, quiet : 1, not_new : 1, a_new_field : 1,
> > +   refresh_cache : 1;
> >  };
> >  #define CHECKOUT_INIT { NULL, "" }
> >  
> > which is not wrong per-se, but I have a mixed feelings.  I do not
> > want it to complain if the original tried to fit many items on a
> > single line, but if the original wanted to have one item per line,
> > I'd rather see it kept as-is.
> 
> To clarify, the above is after I added a_new_field that is one-bit
> wide without doing anything else.  I do not mind the checker
> complaining the existing force, quiet, etc. whose widths are all
> spelled without SP around ':', because they appear near-by, as a
> collateral damage.  My only gripe is that the result got squished
> into a single line.

Yes, agreed. My personal rule with a list like this is often "once you
have to start breaking across multiple lines, you should put one per
line". I don't know if there's a way to codify that in clang-format,
though.

The case I fed it (which is just nonsense I made up that does not fit
our style) also left me a bit confused at first, but I think it was
because the .clang-format parser was bailing as soon as it found an
unrecognized entry, but then formatting according to bogus rules. With
the original file from Brandon I got:

diff --git a/builtin/merge-file.c b/builtin/merge-file.c
index 7e8371670b..8994450e0c 100644
--- a/builtin/merge-file.c
+++ b/builtin/merge-file.c
@@ -21,10 +21,7 @@ static int label_cb(const struct option *opt, const char 
*arg, int unset)
return 0;
 }
 
-static
-int foo(void* bar,int baz) {
-   /* nothing */
-}
+static int foo(void *bar, int baz) { /* nothing */ }
 
 int cmd_merge_file(int argc, const char **argv, const char *prefix)
 {

which is clearly not our style. And then after removing the entries I
mentioned elsewhere, I get:

diff --git a/builtin/merge-file.c b/builtin/merge-file.c
index 7e8371670b..574ba6d86f 100644
--- a/builtin/merge-file.c
+++ b/builtin/merge-file.c
@@ -21,8 +21,8 @@ static int label_cb(const struct option *opt, const char 
*arg, int unset)
return 0;
 }
 
-static
-int foo(void* bar,int baz) {
+static int foo(void *bar, int baz)
+{
/* nothing */
 }

which looks right. So you might want to double-check that it was
respecting our settings, and there were no warnings to stderr.

-Peff

Re: [PATCH v2 0/2] clang-format

2017-08-14 Thread Jeff King

On Mon, Aug 14, 2017 at 04:15:40PM -0700, Stefan Beller wrote:

> + llvm-...@lists.llvm.org
> 
> The Git community is currently discussing adopting a coding style
> defined by clang-format, here is a bug report:

Since we've added a cc, let me try to give a little more context.

> > One more oddity I found while playing with this that Git folks might run
> > into:
> >
> >   $ git init tmp && cd tmp
> >   $ git commit --allow-empty -m foo
> >   $ echo "[mysection]mykey" >>.git/config
> >   $ git clang-format-5.0
> >   Traceback (most recent call last):
> > File "/usr/bin/git-clang-format-5.0", line 579, in 
> >   main()
> > File "/usr/bin/git-clang-format-5.0", line 62, in main
> >   config = load_git_config()
> > File "/usr/bin/git-clang-format-5.0", line 194, in load_git_config
> >   name, value = entry.split('\n', 1)
> >   ValueError: need more than 1 value to unpack
> >
> >   $ sed -i 's/mykey/&=true/' .git/config
> >   $ git clang-format-5.0
> >   no modified files to format
> >
> > So it looks like they do their own config parsing and it's not quite
> > compatible. :(

In Git's config files, doing this:

  [mysection]
  mykey

is a shorthand for setting mysection.mkykey to "true". And the output
from "git config --list" will show just the keyname without a value,
like:

  mysection.mykey

instead of:

  some.key=this one has a value

There's a possible patch elsewhere in the thread:

  https://public-inbox.org/git/xmqqzib1sp6z@gitster.mtv.corp.google.com/

I'm happy to see it is running "git config --list", which means it's
responding to syntactic funny-ness in the output of that command, not in
the original config (and other features like includes should Just Work
without the script caring).

I'm tempted to say that "config --list" should normalize this case into:

  mysection.mykey=true

Normally we avoid coercing values without knowing the context in which
they'll be used. But the syntax in the original file means the user is
telling us it's a boolean and they expect it to be treated that way.

The only downside is if the user is wrong, it might be coerced into
the string "true" instead of throwing an error. That seems like a minor
drawback for eliminating a potentially confusing corner case from the
plumbing output.

-Peff

Re: [RFC] clang-format: outline the git project's coding style

2017-08-14 Thread Jonathan Nieder

Hi,

brian m. carlson wrote:
>> On Wed, Aug 09, 2017 at 03:53:17PM -0700, Stefan Beller wrote:

>>> We may have different opinions on what is readable/beautiful code.
>>> If we were to follow a mutual agreed style that is produced by a tool,
>>> we could use clean/smudge filters with different settings each.

I think this is a long way away --- long enough away that by the time
such a change could be a serious possibility, a lot may have changed
and the project is likely to know a lot more.  In other words, I don't
see speculating about that future as being likely to produce useful
results today.

It would be a different story if we were writing a new codebase from
scratch.  In that case, I would be all for the gofmt approach. :)

> On Wed, Aug 09, 2017 at 07:19:00PM -0400, Jeff King wrote:

>> I'm less worried about a difference of opinion between humans. My
>> concern is that there are cases that the tool's formatting makes _worse_
>> than what any human would write. And either we accept ugly code because
>> the tool sucks, or we spend a bunch of time fighting with the tool to
>> try to make its output look good.
>
> This has been my issue with clang-format in the past.  I have an SHA-256
> implementation with an array of 64 32-bit hex integers.  These fit six
> to a line, but for neatness and consistency reasons, I'd like them four
> to a line (4 divides 64, but 6 does not).  Last I checked, clang-format
> didn't allow me that option: it reordered them because it could fit six
> on a line.  This is not the only issue I discovered, just the most
> memorable.

In case it comes up again for you in a project that has adopted the
gofmt approach: you can signify that your line breaks are intentional
by putting line comments at the end of each line and clang-format will
respect them.

The clang-format documentation also mentions[1] that you can do

  /* clang-format off */
  const double kIdentityMatrix[] = {
1, 0, 0,
0, 1, 0,
0, 0, 1,
  };
  /* clang-format on */

Thanks,
Jonathan

[1] 
http://clang.llvm.org/docs/ClangFormatStyleOptions.html#disabling-formatting-on-a-piece-of-code

Re: [PATCH] t1002: stop using sum(1)

2017-08-14 Thread Jonathan Nieder

Hi,

René Scharfe wrote:

> sum(1) is a command for calculating checksums of the contents of files.
> It was part of early editions of Unix ("Research Unix", 1972/1973, [1]).
> cksum(1) appeared in 4.4BSD (1993) as a replacement [2], and became part
> of POSIX.1-2008 [3].  OpenBSD 5.6 (2014) removed sum(1).
>
> We only use sum(1) in t1002 to check for changes in three files.  On
> MinGW we use md5sum(1) instead.  We could switch to the standard command
> cksum(1) for all platforms; MinGW comes with GNU coreutils now, which
> provides sum(1), cksum(1) and md5sum(1).  Use our standard method for
> checking for file changes instead: test_cmp.
>
> It's more convenient because it shows differences nicely, it's faster on
> MinGW because we have a special implementation there based only on
> shell-internal commands, it's simpler as it allows us to avoid stripping
> out unnecessary entries from the checksum file using grep(1), and it's
> more consistent with the rest of the test suite.
>
> We already compare changed files with their expected new contents using
> diff(1), so we don't need to check with "test_must_fail test_cmp" if
> they differ from their original state.  A later patch could convert the
> direct diff(1) calls to test_cmp as well.
>
> With all sum(1) calls gone, remove the MinGW-specific implementation
> from test-lib.sh as well.
>
> [1] http://minnie.tuhs.org/cgi-bin/utree.pl?file=V3/man/man1/sum.1
> [2] 
> http://minnie.tuhs.org/cgi-bin/utree.pl?file=4.4BSD/usr/share/man/cat1/cksum.0
> [3] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/cksum.html
> ---
>  t/t1002-read-tree-m-u-2way.sh | 67 
> ++-
>  t/test-lib.sh |  3 --
>  2 files changed, 35 insertions(+), 35 deletions(-)

Nicely analyzed.  May we forge your sign-off?

[...]
> --- a/t/t1002-read-tree-m-u-2way.sh
> +++ b/t/t1002-read-tree-m-u-2way.sh
[...]
> @@ -132,8 +138,8 @@ test_expect_success \
>   git ls-files --stage >7.out &&
>   test_cmp M.out 7.out &&
>   check_cache_at frotz dirty &&
> - sum bozbar frotz nitfol >actual7.sum &&
> - if cmp M.sum actual7.sum; then false; else :; fi &&
> + test_cmp bozbar.M bozbar &&
> + test_cmp nitfol.M nitfol &&

This one is strange.  What is that '! cmp' trying to check for?
Does the replacement capture the same thing?

E.g., does it need a '! test_cmp frotz.M frotz &&' line?

I haven't looked at the context closely --- another option could be a
note in the commit message about how that '! cmp' line was not testing
anything useful in the first place.

[...]
> @@ -209,11 +217,8 @@ test_expect_success \
>   git ls-files --stage >14.out &&
>   test_must_fail git diff -U0 --no-index M.out 14.out >14diff.out &&
>   compare_change 14diff.out expected &&
> - sum bozbar frotz >actual14.sum &&
> - grep -v nitfol M.sum > expected14.sum &&
> - cmp expected14.sum actual14.sum &&
> - sum bozbar frotz nitfol >actual14a.sum &&
> - if cmp M.sum actual14a.sum; then false; else :; fi &&
> + test_cmp bozbar.M bozbar &&
> + test_cmp frotz.M frotz &&

Same question here: the preimage seems to be a stricter test than the
postimage.

[...]
> @@ -231,11 +236,8 @@ test_expect_success \
>   test_must_fail git diff -U0 --no-index M.out 15.out >15diff.out &&
>   compare_change 15diff.out expected &&
>   check_cache_at nitfol dirty &&
> - sum bozbar frotz >actual15.sum &&
> - grep -v nitfol M.sum > expected15.sum &&
> - cmp expected15.sum actual15.sum &&
> - sum bozbar frotz nitfol >actual15a.sum &&
> - if cmp M.sum actual15a.sum; then false; else :; fi &&
> + test_cmp bozbar.M bozbar &&
> + test_cmp frotz.M frotz &&

Likewise.

[...]
> @@ -281,11 +285,8 @@ test_expect_success \
>   git ls-files --stage >19.out &&
>   test_cmp M.out 19.out &&
>   check_cache_at bozbar dirty &&
> - sum frotz nitfol >actual19.sum &&
> - grep -v bozbar  M.sum > expected19.sum &&
> - cmp expected19.sum actual19.sum &&
> - sum bozbar frotz nitfol >actual19a.sum &&
> - if cmp M.sum actual19a.sum; then false; else :; fi &&
> + test_cmp frotz.M frotz &&
> + test_cmp nitfol.M nitfol &&

Likewise.

The rest looks good.

Thanks,
Jonathan

Re: [RFC] clang-format: outline the git project's coding style

2017-08-14 Thread brian m. carlson

On Wed, Aug 09, 2017 at 07:19:00PM -0400, Jeff King wrote:
> On Wed, Aug 09, 2017 at 03:53:17PM -0700, Stefan Beller wrote:
> 
> > >> Right, the reason I stopped pursuing it was that I couldn't find a way
> > >> to have it make suggestions for new code without nagging about existing
> > >> code. If we were to aggressively reformat to match the tool for existing
> > >> code, that would help. But I'm a bit worried that there would always be
> > >> suggestions from the tool that we don't agree with (i.e., where the
> > >> guiding principle is "do what is readable").
> > 
> > We may have different opinions on what is readable/beautiful code.
> > If we were to follow a mutual agreed style that is produced by a tool,
> > we could use clean/smudge filters with different settings each.
> 
> I'm less worried about a difference of opinion between humans. My
> concern is that there are cases that the tool's formatting makes _worse_
> than what any human would write. And either we accept ugly code because
> the tool sucks, or we spend a bunch of time fighting with the tool to
> try to make its output look good.

This has been my issue with clang-format in the past.  I have an SHA-256
implementation with an array of 64 32-bit hex integers.  These fit six
to a line, but for neatness and consistency reasons, I'd like them four
to a line (4 divides 64, but 6 does not).  Last I checked, clang-format
didn't allow me that option: it reordered them because it could fit six
on a line.  This is not the only issue I discovered, just the most
memorable.

Other tools, such as perltidy, have traditionally honored existing line
breaks better (although not perfectly), which lets humans optimize for
readability.

Of course, clang-format could have dramatically improved since I last
looked (which was around clang 3.4 or 3.6, I think).

Overall, I do like the idea of using tidy tools, because it does reduce
quibbling over style quite a bit.  I just like the tools to be more
responsive to the whitespace they're given on input.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204

signature.asc
Description: PGP signature

Re: [PATCH 1/9] Convert pack-objects to size_t

2017-08-14 Thread Ramsay Jones



On 14/08/17 21:32, Torsten Bögershausen wrote:
> In general, yes.
> I had a patch that allowed a 32 bit linux to commit a file >4GB.
> There is however a restriction: The file must not yet be known to the
> index. Otherwise the "diff" machinery kicks in, and tries to
> compare the "old" and the "new" versions, which means that -both-
> must fit into "memory" at the same time. Memory means here the available
> address space rather then real memory.
> So there may be room for improvements for 32 bit systems, but that is another
> story, which can be developped independent of the ulong->size_t conversion.

Oh, I absolutely agree.

> Thanks to Martin for working on this.

Indeed! Thanks Martin.

ATB,
Ramsay Jones

Re: [PATCH v2 2/2] Makefile: add style build rule

2017-08-14 Thread Brandon Williams

On 08/14, Junio C Hamano wrote:
> Junio C Hamano  writes:
> 
> > By the way, I do not know which vintage of /usr/bin/git-clang-format
> > I happen to have on my box, but I needed a crude workaround patch
> > (attached at the end) ...
> 
> I guess you hit the same thing while our messages crossing ;-)
> 
> 
> > As to what it does, the first example I tried may not have been a
> > great one.  I got this:
> >
> > git clang-format --style file --diff --extensions c,h
> > diff --git a/cache.h b/cache.h
> > index 73e0085186..6462fe25bc 100644
> > --- a/cache.h
> > +++ b/cache.h
> > @@ -1498,11 +1498,8 @@ struct checkout {
> > const char *base_dir;
> > int base_dir_len;
> > struct delayed_checkout *delayed_checkout;
> > -   unsigned force:1,
> > -quiet:1,
> > -not_new:1,
> > -   a_new_field:1,
> > -refresh_cache:1;
> > +   unsigned force : 1, quiet : 1, not_new : 1, a_new_field : 1,
> > +   refresh_cache : 1;
> >  };
> >  #define CHECKOUT_INIT { NULL, "" }
> >  
> > which is not wrong per-se, but I have a mixed feelings.  I do not
> > want it to complain if the original tried to fit many items on a
> > single line, but if the original wanted to have one item per line,
> > I'd rather see it kept as-is.
> 
> To clarify, the above is after I added a_new_field that is one-bit
> wide without doing anything else.  I do not mind the checker
> complaining the existing force, quiet, etc. whose widths are all
> spelled without SP around ':', because they appear near-by, as a
> collateral damage.  My only gripe is that the result got squished
> into a single line.

Yeah the result doesn't look too good and I'm not sure which option to
tweak for that.  I'm sure that the problem would fix itself if all the
bit fields where defined on their own lines:

  unsigned force : 1;
  unsigned not_new : 1; 
  ... etc ...

I'm sure there are a bunch of other things that we'd need to tweak
before this is ready to be used by all contributors.  Specifically the
penalties to help determine when to break a line.

> 
> > Anyway, we cannot have perfect checker from the day one, and
> > considering this is an initial attempt, I'd say it is a good start.

-- 
Brandon Williams

[PATCH v3 3/3] diff: define block by number of non-space chars

2017-08-14 Thread Jonathan Tan

The existing behavior of diff --color-moved=zebra does not define the
minimum size of a block at all, instead relying on a heuristic applied
later to filter out sets of adjacent moved lines that are shorter than 3
lines long. This can be confusing, because a block could thus be colored
as moved at the source but not at the destination (or vice versa),
depending on its neighbors.

Instead, teach diff that the minimum size of a block is 10
non-whitespace characters. This allows diff to still exclude
uninteresting lines appearing on their own (such as those solely
consisting of one or a few closing braces), as was the intention of the
adjacent-moved-line heuristic.

This requires a change in the test "detect malicious moved code, inside
file" in that some of its lines are no longer considered to be part of a
block, because they are too short. The malicious move can still be
observed, however.

Signed-off-by: Jonathan Tan 
---
 Documentation/diff-options.txt |  8 ++--
 diff.c | 27 +---
 diff.h |  2 +-
 t/t4015-diff-whitespace.sh | 96 +++---
 4 files changed, 113 insertions(+), 20 deletions(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index bc52bd0b9..d0bdc849f 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -254,13 +254,11 @@ plain::
moved line, but it is not very useful in a review to determine
if a block of code was moved without permutation.
 zebra::
-   Blocks of moved code are detected greedily. The detected blocks are
+   Blocks of moved text of at least 10 non-whitespace characters
+   are detected greedily. The detected blocks are
painted using either the 'color.diff.{old,new}Moved' color or
'color.diff.{old,new}MovedAlternative'. The change between
-   the two colors indicates that a new block was detected. If there
-   are fewer than 3 adjacent moved lines, they are not marked up
-   as moved, but the regular colors 'color.diff.{old,new}' will be
-   used.
+   the two colors indicates that a new block was detected.
 dimmed_zebra::
Similar to 'zebra', but additional dimming of uninteresting parts
of moved code is performed. The bordering lines of two adjacent
diff --git a/diff.c b/diff.c
index f598d8a3a..305ce4126 100644
--- a/diff.c
+++ b/diff.c
@@ -864,19 +864,28 @@ static int shrink_potential_moved_blocks(struct 
moved_entry **pmb,
 /*
  * If o->color_moved is COLOR_MOVED_PLAIN, this function does nothing.
  *
- * Otherwise, if the last block has fewer lines than
- * COLOR_MOVED_MIN_BLOCK_LENGTH, unset DIFF_SYMBOL_MOVED_LINE on all lines in
- * that block.
+ * Otherwise, if the last block has fewer non-space characters than
+ * COLOR_MOVED_MIN_NON_SPACE_COUNT, unset DIFF_SYMBOL_MOVED_LINE on all lines
+ * in that block.
  *
  * The last block consists of the (n - block_length)'th line up to but not
  * including the nth line.
  */
 static void adjust_last_block(struct diff_options *o, int n, int block_length)
 {
-   int i;
-   if (block_length >= COLOR_MOVED_MIN_BLOCK_LENGTH ||
-   o->color_moved == COLOR_MOVED_PLAIN)
+   int i, non_space_count = 0;
+   if (o->color_moved == COLOR_MOVED_PLAIN)
return;
+   for (i = 1; i < block_length + 1; i++) {
+   const char *c = o->emitted_symbols->buf[n - i].line;
+   for (; *c; c++) {
+   if (isspace(*c))
+   continue;
+   non_space_count++;
+   if (non_space_count >= COLOR_MOVED_MIN_NON_SPACE_COUNT)
+   return;
+   }
+   }
for (i = 1; i < block_length + 1; i++)
o->emitted_symbols->buf[n - i].flags &= ~DIFF_SYMBOL_MOVED_LINE;
 }
@@ -923,7 +932,6 @@ static void mark_color_as_moved(struct diff_options *o,
}
 
l->flags |= DIFF_SYMBOL_MOVED_LINE;
-   block_length++;
 
if (o->color_moved == COLOR_MOVED_PLAIN)
continue;
@@ -953,8 +961,13 @@ static void mark_color_as_moved(struct diff_options *o,
}
 
flipped_block = (flipped_block + 1) % 2;
+
+   adjust_last_block(o, n, block_length);
+   block_length = 0;
}
 
+   block_length++;
+
if (flipped_block)
l->flags |= DIFF_SYMBOL_MOVED_LINE_ALT;
}
diff --git a/diff.h b/diff.h
index 5755f465d..9e2fece5b 100644
--- a/diff.h
+++ b/diff.h
@@ -195,7 +195,7 @@ struct diff_options {
COLOR_MOVED_ZEBRA_DIM = 3,
} color_moved;
#define COLOR_MOVED_DEFAULT COLOR_MOVED_ZEBRA
-   #define COLOR_MOVED_MIN_BLOCK_LENGTH 3
+   #define COLOR_MOVED_MIN_NON_SPACE_COUNT 10

[PATCH v3 0/3] "diff --color-moved" with different heuristic

2017-08-14 Thread Jonathan Tan

These patches are on sb/diff-color-move.

Patches 1 and 2 are unchanged.

If we're planning to cook a heuristic, we might as well try a better
one. What do you think of this? A heuristic of number of non-whitespace
characters, applied at the block level, and not dependent on the block's
neighbors.

As for the test, good point about testing the boundary conditions - I
have included another test that does that.

Jonathan Tan (3):
  diff: avoid redundantly clearing a flag
  diff: respect MIN_BLOCK_LENGTH for last block
  diff: define block by number of non-space chars

 Documentation/diff-options.txt |   8 +--
 diff.c |  44 +++---
 diff.h |   2 +-
 t/t4015-diff-whitespace.sh | 129 +++--
 4 files changed, 163 insertions(+), 20 deletions(-)

-- 
2.14.1.480.gb18f417b89-goog

[PATCH v3 1/3] diff: avoid redundantly clearing a flag

2017-08-14 Thread Jonathan Tan

No code in diff.c sets DIFF_SYMBOL_MOVED_LINE except in
mark_color_as_moved(), so it is redundant to clear it for the current
line. Therefore, clear it only for previous lines.

This makes a refactoring in a subsequent patch easier.

Signed-off-by: Jonathan Tan 
---
 diff.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index 4af73a7e0..23311f9c0 100644
--- a/diff.c
+++ b/diff.c
@@ -898,7 +898,7 @@ static void mark_color_as_moved(struct diff_options *o,
if (!match) {
if (block_length < COLOR_MOVED_MIN_BLOCK_LENGTH &&
o->color_moved != COLOR_MOVED_PLAIN) {
-   for (i = 0; i < block_length + 1; i++) {
+   for (i = 1; i < block_length + 1; i++) {
l = >emitted_symbols->buf[n - i];
l->flags &= ~DIFF_SYMBOL_MOVED_LINE;
}
-- 
2.14.1.480.gb18f417b89-goog

[PATCH v3 2/3] diff: respect MIN_BLOCK_LENGTH for last block

2017-08-14 Thread Jonathan Tan

Currently, MIN_BLOCK_LENGTH is only checked when diff encounters a line
that does not belong to the current block. In particular, this means
that MIN_BLOCK_LENGTH is not checked after all lines are encountered.

Perform that check.

Signed-off-by: Jonathan Tan 
---
 diff.c | 29 ++---
 t/t4015-diff-whitespace.sh | 35 +++
 2 files changed, 57 insertions(+), 7 deletions(-)

diff --git a/diff.c b/diff.c
index 23311f9c0..f598d8a3a 100644
--- a/diff.c
+++ b/diff.c
@@ -861,6 +861,26 @@ static int shrink_potential_moved_blocks(struct 
moved_entry **pmb,
return rp + 1;
 }
 
+/*
+ * If o->color_moved is COLOR_MOVED_PLAIN, this function does nothing.
+ *
+ * Otherwise, if the last block has fewer lines than
+ * COLOR_MOVED_MIN_BLOCK_LENGTH, unset DIFF_SYMBOL_MOVED_LINE on all lines in
+ * that block.
+ *
+ * The last block consists of the (n - block_length)'th line up to but not
+ * including the nth line.
+ */
+static void adjust_last_block(struct diff_options *o, int n, int block_length)
+{
+   int i;
+   if (block_length >= COLOR_MOVED_MIN_BLOCK_LENGTH ||
+   o->color_moved == COLOR_MOVED_PLAIN)
+   return;
+   for (i = 1; i < block_length + 1; i++)
+   o->emitted_symbols->buf[n - i].flags &= ~DIFF_SYMBOL_MOVED_LINE;
+}
+
 /* Find blocks of moved code, delegate actual coloring decision to helper */
 static void mark_color_as_moved(struct diff_options *o,
struct hashmap *add_lines,
@@ -896,13 +916,7 @@ static void mark_color_as_moved(struct diff_options *o,
}
 
if (!match) {
-   if (block_length < COLOR_MOVED_MIN_BLOCK_LENGTH &&
-   o->color_moved != COLOR_MOVED_PLAIN) {
-   for (i = 1; i < block_length + 1; i++) {
-   l = >emitted_symbols->buf[n - i];
-   l->flags &= ~DIFF_SYMBOL_MOVED_LINE;
-   }
-   }
+   adjust_last_block(o, n, block_length);
pmb_nr = 0;
block_length = 0;
continue;
@@ -944,6 +958,7 @@ static void mark_color_as_moved(struct diff_options *o,
if (flipped_block)
l->flags |= DIFF_SYMBOL_MOVED_LINE_ALT;
}
+   adjust_last_block(o, n, block_length);
 
free(pmb);
 }
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index c3b697411..6f7758e5c 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -1382,6 +1382,41 @@ EOF
test_cmp expected actual
 '
 
+test_expect_success '--color-moved block at end of diff output respects 
MIN_BLOCK_LENGTH' '
+   git reset --hard &&
+   >bar &&
+   cat <<-\EOF >foo &&
+   irrelevant_line
+   line1
+   EOF
+   git add foo bar &&
+   git commit -m x &&
+
+   cat <<-\EOF >bar &&
+   line1
+   EOF
+   cat <<-\EOF >foo &&
+   irrelevant_line
+   EOF
+
+   git diff HEAD --color-moved=zebra --no-renames | grep -v "index" | 
test_decode_color >actual &&
+   cat >expected <<-\EOF &&
+   diff --git a/bar b/bar
+   --- a/bar
+   +++ b/bar
+   @@ -0,0 +1 @@
+   +line1
+   diff --git a/foo b/foo
+   --- a/foo
+   +++ b/foo
+   @@ -1,2 +1 @@
+irrelevant_line
+   -line1
+   EOF
+
+   test_cmp expected actual
+'
+
 test_expect_success 'move detection with submodules' '
test_create_repo bananas &&
echo ripe >bananas/recipe &&
-- 
2.14.1.480.gb18f417b89-goog

What's cooking in git.git (Aug 2017, #03; Mon, 14)

2017-08-14 Thread Junio C Hamano

Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

The tip of 'next' has been rebuilt on top of the 'master' branch,
and a handful of new topics floating since v2.14 pre-release freeze
period have been picked up.

The top part of the draft release notes for the next cycle, which I
tentatively called Git 2.15, declares that "git add ''" will still
be supported up to this release but will become illegal after that.
Given that a typical release cycle is 8-12 weeks, that will happen
sometime early next year.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* ah/doc-wserrorhighlight (2017-07-25) 1 commit
  (merged to 'next' on 2017-07-27 at cd1bb28d95)
 + doc: add missing values "none" and "default" for diff.wsErrorHighlight

 Doc update.


* bc/object-id (2017-07-17) 12 commits
  (merged to 'next' on 2017-07-18 at fd161056e4)
 + sha1_name: convert uses of 40 to GIT_SHA1_HEXSZ
 + sha1_name: convert GET_SHA1* flags to GET_OID*
 + sha1_name: convert get_sha1* to get_oid*
 + Convert remaining callers of get_sha1 to get_oid.
 + builtin/unpack-file: convert to struct object_id
 + bisect: convert bisect_checkout to struct object_id
 + builtin/update_ref: convert to struct object_id
 + sequencer: convert to struct object_id
 + remote: convert struct push_cas to struct object_id
 + submodule: convert submodule config lookup to use object_id
 + builtin/merge-tree: convert remaining caller of get_sha1 to object_id
 + builtin/fsck: convert remaining caller of get_sha1 to object_id
 (this branch is used by bw/submodule-config-cleanup; uses sb/object-id.)

 Conversion from uchar[20] to struct object_id continues.


* bw/object-id (2017-07-17) 3 commits
  (merged to 'next' on 2017-07-18 at 90d27c0e7c)
 + receive-pack: don't access hash of NULL object_id pointer
 + notes: don't access hash of NULL object_id pointer
 + tree-diff: don't access hash of NULL object_id pointer

 Conversion from uchar[20] to struct object_id continues.


* cc/ref-is-hidden-microcleanup (2017-07-24) 1 commit
  (merged to 'next' on 2017-07-27 at 37af544e1c)
 + refs: use skip_prefix() in ref_is_hidden()

 Code cleanup.


* dc/fmt-merge-msg-microcleanup (2017-07-25) 1 commit
  (merged to 'next' on 2017-07-27 at 6df06eb788)
 + fmt-merge-msg: fix coding style

 Code cleanup.


* dl/credential-cache-socket-in-xdg-cache (2017-07-27) 1 commit
  (merged to 'next' on 2017-08-01 at 87687ae1d4)
 + credential-cache: interpret an ECONNRESET as an EOF

 A recently added test for the "credential-cache" helper revealed
 that EOF detection done around the time the connection to the cache
 daemon is torn down were flaky.  This was fixed by reacting to
 ECONNRESET and behaving as if we got an EOF.


* eb/contacts-reported-by (2017-07-27) 1 commit
  (merged to 'next' on 2017-08-01 at cca9972d6b)
 + git-contacts: also recognise "Reported-by:"

 "git contacts" (in contrib/) now lists the address on the
 "Reported-by:" trailer to its output, in addition to those on
 S-o-b: and other trailers, to make it easier to notify (and thank)
 the original bug reporter.


* hb/gitweb-project-list (2017-07-18) 1 commit
  (merged to 'next' on 2017-07-27 at c25d65ca25)
 + gitweb: skip unreadable subdirectories

 When a directory is not readable, "gitweb" fails to build the
 project list.  Work this around by skipping such a directory.

 It might end up hiding a problem under the rug and a better
 solution might be to loudly complain to the administrator pointing
 out the problematic directory, but this will at least make it
 "work".


* jb/t8008-cleanup (2017-07-26) 1 commit
  (merged to 'next' on 2017-08-01 at f979c9340d)
 + t8008: rely on rev-parse'd HEAD instead of sha1 value

 Code clean-up.


* jc/http-sslkey-and-ssl-cert-are-paths (2017-07-20) 1 commit
  (merged to 'next' on 2017-07-20 at 5489304b99)
 + http.c: http.sslcert and http.sslkey are both pathnames

 The http.{sslkey,sslCert} configuration variables are to be
 interpreted as a pathname that honors "~[username]/" prefix, but
 weren't, which has been fixed.


* jk/c99 (2017-07-18) 2 commits
  (merged to 'next' on 2017-07-18 at 1cfc30f7c1)
 + clean.c: use designated initializer
 + strbuf: use designated initializers in STRBUF_INIT

 Start using selected c99 constructs in small, stable and
 essentialpart of the system to catch people who care about
 older compilers that do not grok them.


* jk/ref-filter-colors (2017-07-13) 15 commits
  (merged to 'next' on 2017-07-18 at 75d4eb7ecf)
 + ref-filter: consult want_color() before emitting colors
 + pretty: respect color settings for %C placeholders
 + rev-list: pass

Re: [PATCH v2 2/2] Makefile: add style build rule

2017-08-14 Thread Junio C Hamano

Junio C Hamano  writes:

> By the way, I do not know which vintage of /usr/bin/git-clang-format
> I happen to have on my box, but I needed a crude workaround patch
> (attached at the end) ...

I guess you hit the same thing while our messages crossing ;-)


> As to what it does, the first example I tried may not have been a
> great one.  I got this:
>
> git clang-format --style file --diff --extensions c,h
> diff --git a/cache.h b/cache.h
> index 73e0085186..6462fe25bc 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1498,11 +1498,8 @@ struct checkout {
> const char *base_dir;
> int base_dir_len;
> struct delayed_checkout *delayed_checkout;
> - unsigned force:1,
> -  quiet:1,
> -  not_new:1,
> - a_new_field:1,
> -  refresh_cache:1;
> + unsigned force : 1, quiet : 1, not_new : 1, a_new_field : 1,
> + refresh_cache : 1;
>  };
>  #define CHECKOUT_INIT { NULL, "" }
>  
> which is not wrong per-se, but I have a mixed feelings.  I do not
> want it to complain if the original tried to fit many items on a
> single line, but if the original wanted to have one item per line,
> I'd rather see it kept as-is.

To clarify, the above is after I added a_new_field that is one-bit
wide without doing anything else.  I do not mind the checker
complaining the existing force, quiet, etc. whose widths are all
spelled without SP around ':', because they appear near-by, as a
collateral damage.  My only gripe is that the result got squished
into a single line.

> Anyway, we cannot have perfect checker from the day one, and
> considering this is an initial attempt, I'd say it is a good start.

Re: [PATCH v2 2/2] Makefile: add style build rule

2017-08-14 Thread Junio C Hamano

Jeff King  writes:

> I suspect the "-p" version is going to be the one people invoke the most
> often. Should it take the coveted "make style" slot, and the diff get
> pushed off to another target?
>
> I was also confused at first that the "-p" version requires you to stage
> the changes first. I don't know if we can make that less confusing via a
> "make style". Or if it's just something people would get used to. But
> sadly it makes the command not-quite orthogonal to "make test" in the
> workflow. You can't "make style && make test && git add -p".  You have
> to add first, then check style, then you'd want to test that result to
> make sure it didn't change the meaning of the code.

Perhaps.

By the way, I do not know which vintage of /usr/bin/git-clang-format
I happen to have on my box, but I needed a crude workaround patch
(attached at the end) to get it even run.  The first thing it does
is to call load_git_config() and it barfs because I have boolean
configuration variables set to true in the correct way, which it
does not seem to recognise.

As to what it does, the first example I tried may not have been a
great one.  I got this:

git clang-format --style file --diff --extensions c,h
diff --git a/cache.h b/cache.h
index 73e0085186..6462fe25bc 100644
--- a/cache.h
+++ b/cache.h
@@ -1498,11 +1498,8 @@ struct checkout {
const char *base_dir;
int base_dir_len;
struct delayed_checkout *delayed_checkout;
-   unsigned force:1,
-quiet:1,
-not_new:1,
-   a_new_field:1,
-refresh_cache:1;
+   unsigned force : 1, quiet : 1, not_new : 1, a_new_field : 1,
+   refresh_cache : 1;
 };
 #define CHECKOUT_INIT { NULL, "" }

which is not wrong per-se, but I have a mixed feelings.  I do not
want it to complain if the original tried to fit many items on a
single line, but if the original wanted to have one item per line,
I'd rather see it kept as-is.

Anyway, we cannot have perfect checker from the day one, and
considering this is an initial attempt, I'd say it is a good start.

Thanks.

diff --git a/git-clang-format b/git-clang-format
index 60cd4fb25b..e8429b2750 100755
--- a/usr/bin/git-clang-format
+++ 
b/usr/local/google/home/jch/g/Ubuntu-14.04-x86_64/gitstuff/bin/git-clang-format
@@ -191,10 +191,13 @@ def load_git_config(non_string_options=None):
   out = {}
   for entry in run('git', 'config', '--list', '--null').split('\0'):
 if entry:
-  name, value = entry.split('\n', 1)
-  if name in non_string_options:
-value = run('git', 'config', non_string_options[name], name)
-  out[name] = value
+  if '\n' in entry:
+name, value = entry.split('\n', 1)
+if name in non_string_options:
+  value = run('git', 'config', non_string_options[name], name)
+out[name] = value
+  else:
+   out[entry] = "true";
   return out

Re: [PATCH v2 1/2] format-patch: have progress option while generating patches

2017-08-14 Thread Junio C Hamano

Jeff King  writes:

> On Mon, Aug 14, 2017 at 03:42:14PM -0700, Junio C Hamano wrote:
>
>> Jeff King  writes:
>> 
>> > If it's smooth, the (50,1) case is slightly nicer in that it puts the
>> > progress in front of the user more quickly. I'm not sure if that's
>> > actually worth pushing an additional decision onto the person writing
>> > the calling code, though (especially when we are just now puzzling out
>> > the method for making such a decision from first principles).
>> >
>> > So I'd vote to drop that parameter entirely. And if 1 second seems
>> > noticeably snappier, then we should probably just move everything to a 1
>> > second delay (I don't have a strong feeling either way).
>> 
>> Sounds like a good idea to me.  
>> 
>> I've already locally tweaked Kevin's patch to use (0,2) instead of
>> (0,1) without introducing the simpler wrapper.  It should be trivial
>> to do a wrapper to catch and migrate all the (0,2) users to a
>> start_delayed_progress() that takes neither percentage or time with
>> mechanical replacement.
>
> I was actually proposing to move (50,1) cases to the simpler wrapper,
> too. IOW, drop the delayed_percent_treshold code entirely.

I should have mentioned that (50,1) should also be treated just like
(0,2) case--they should mean the same thing.  Other oddness like 95
might merit individual consideration, and I didn't want to add (0,1)
as another oddball to the mix.

Re: [PATCH v2 0/2] clang-format

2017-08-14 Thread Stefan Beller

+ llvm-...@lists.llvm.org

The Git community is currently discussing adopting a coding style
defined by clang-format, here is a bug report:

On Mon, Aug 14, 2017 at 4:06 PM, Jeff King  wrote:
>
> One more oddity I found while playing with this that Git folks might run
> into:
>
>   $ git init tmp && cd tmp
>   $ git commit --allow-empty -m foo
>   $ echo "[mysection]mykey" >>.git/config
>   $ git clang-format-5.0
>   Traceback (most recent call last):
> File "/usr/bin/git-clang-format-5.0", line 579, in 
>   main()
> File "/usr/bin/git-clang-format-5.0", line 62, in main
>   config = load_git_config()
> File "/usr/bin/git-clang-format-5.0", line 194, in load_git_config
>   name, value = entry.split('\n', 1)
>   ValueError: need more than 1 value to unpack
>
>   $ sed -i 's/mykey/&=true/' .git/config
>   $ git clang-format-5.0
>   no modified files to format
>
> So it looks like they do their own config parsing and it's not quite
> compatible. :(
>
> That's not the end of the world, and something we can try to fix
> upstream. I just wanted to mention it here so other people don't waste
> time trying to track down the problem.
>
> -Peff

Re: [PATCH v2 1/2] format-patch: have progress option while generating patches

2017-08-14 Thread Jeff King

On Mon, Aug 14, 2017 at 03:42:14PM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > If it's smooth, the (50,1) case is slightly nicer in that it puts the
> > progress in front of the user more quickly. I'm not sure if that's
> > actually worth pushing an additional decision onto the person writing
> > the calling code, though (especially when we are just now puzzling out
> > the method for making such a decision from first principles).
> >
> > So I'd vote to drop that parameter entirely. And if 1 second seems
> > noticeably snappier, then we should probably just move everything to a 1
> > second delay (I don't have a strong feeling either way).
> 
> Sounds like a good idea to me.  
> 
> I've already locally tweaked Kevin's patch to use (0,2) instead of
> (0,1) without introducing the simpler wrapper.  It should be trivial
> to do a wrapper to catch and migrate all the (0,2) users to a
> start_delayed_progress() that takes neither percentage or time with
> mechanical replacement.

I was actually proposing to move (50,1) cases to the simpler wrapper,
too. IOW, drop the delayed_percent_treshold code entirely.

-Peff

Re: [PATCH v2 0/2] clang-format

2017-08-14 Thread Jeff King

On Mon, Aug 14, 2017 at 02:30:44PM -0700, Brandon Williams wrote:

> Changes in v2:
>  * Changed a couple rules to be more inline with our coding style.
>  * Added a Makefile build rule to run git-clang-format on the diff of the
>working tree to suggest style changes.
> 
> I found that the llvm project also has the git-clang-format tool which will
> allow for doing formating changes based on diffs so that the parts of the code
> you didn't touch won't be formated.  It also has a nice '-p' option to only
> apply hunks of suggested changes.  I also saw what Ramsay did and attempted to
> add a Makefile rule to run clang-format.  This bit may need more tweaking to
> get it right.

One more oddity I found while playing with this that Git folks might run
into:

  $ git init tmp && cd tmp
  $ git commit --allow-empty -m foo
  $ echo "[mysection]mykey" >>.git/config
  $ git clang-format-5.0
  Traceback (most recent call last):
File "/usr/bin/git-clang-format-5.0", line 579, in 
  main()
File "/usr/bin/git-clang-format-5.0", line 62, in main
  config = load_git_config()
File "/usr/bin/git-clang-format-5.0", line 194, in load_git_config
  name, value = entry.split('\n', 1)
  ValueError: need more than 1 value to unpack

  $ sed -i 's/mykey/&=true/' .git/config
  $ git clang-format-5.0
  no modified files to format

So it looks like they do their own config parsing and it's not quite
compatible. :(

That's not the end of the world, and something we can try to fix
upstream. I just wanted to mention it here so other people don't waste
time trying to track down the problem.

-Peff

Re: [PATCH v2 1/2] clang-format: outline the git project's coding style

2017-08-14 Thread Jeff King

On Mon, Aug 14, 2017 at 03:54:30PM -0700, Brandon Williams wrote:

> > And removing that gives me a clean output. I have no idea why my clang
> > doesn't like these (but presumably yours does). It's clang-format-5.0 in
> > Debian unstable (and clang-format-3.8, etc).
> 
> Those must be features in version 6 (which is what I seem to have
> installed on my machine).

OK, that makes sense. The most recent one package for Debian unstable is
5.0. AFAICT 5.0 is actually in release freeze for another week or two,
and 6 is just bleeding-edge that moved on after the release freeze a few
weeks ago.

I'm not sure which version it makes sense to target as a minimum, but
probably not 6 yet. :)

-Peff

Re: [PATCH v2 2/2] Makefile: add style build rule

2017-08-14 Thread Jeff King

On Mon, Aug 14, 2017 at 03:28:50PM -0700, Stefan Beller wrote:

> >> +.PHONY: style
> >> +style:
> >> + git clang-format --style file --diff --extensions c,h
> >
> > Did we get "git clang-format" subcommand, or is "s/git //" implied
> > somewhere?
> 
> You need to have clang-format installed (debian/Ubuntu package, or
> however it is named in your distribution), which provides this command
> for us.

Sadly it is called git-clang-format-3.8, git-clang-format-5.0, etc, in
the Debian packages. I think we need a CLANG_FORMAT variable that can be
overridden.

I am surprised that there's no base "git-clang-format" symlink for the
default version. There is for "clang-format" and "clang" themselves. So
arguably this is a bug in the Debian packaging.

I suspect the "-p" version is going to be the one people invoke the most
often. Should it take the coveted "make style" slot, and the diff get
pushed off to another target?

I was also confused at first that the "-p" version requires you to stage
the changes first. I don't know if we can make that less confusing via a
"make style". Or if it's just something people would get used to. But
sadly it makes the command not-quite orthogonal to "make test" in the
workflow. You can't "make style && make test && git add -p".  You have
to add first, then check style, then you'd want to test that result to
make sure it didn't change the meaning of the code.

-Peff

Re: [PATCH v2 1/2] clang-format: outline the git project's coding style

2017-08-14 Thread Brandon Williams

On 08/14, Jeff King wrote:
> On Mon, Aug 14, 2017 at 06:48:31PM -0400, Jeff King wrote:
> 
> > On Mon, Aug 14, 2017 at 02:30:45PM -0700, Brandon Williams wrote:
> > 
> > > +# Align escaped newlines as far left as possible
> > > +# #define A   \
> > > +#   int ; \
> > > +#   int b;\
> > > +#   int ;
> > > +AlignEscapedNewlines: Left
> > 
> > I get:
> > 
> >   $ git clang-format-5.0 --style file -p --extensions c,h
> >   YAML:34:23: error: unknown key 'AlignEscapedNewlines'
> >   AlignEscapedNewlines: Left
> > ^~~~
> >   Error reading /home/peff/compile/git/.clang-format: Invalid argument
> > 
> > Same with clang-format-3.8.
> 
> And if I remove that line, I get:
> 
>   YAML:155:25: error: unknown key 'PenaltyBreakAssignment'
>   PenaltyBreakAssignment: 100
> 
> Removing that gives:
> 
>   YAML:86:22: error: unknown key 'BreakStringLiterals'
>   BreakStringLiterals: false
> 
> And removing that gives me a clean output. I have no idea why my clang
> doesn't like these (but presumably yours does). It's clang-format-5.0 in
> Debian unstable (and clang-format-3.8, etc).
> 
> -Peff

Those must be features in version 6 (which is what I seem to have
installed on my machine).

-- 
Brandon Williams

Add --ignore-missing to git-pack-objects?

2017-08-14 Thread ch


Hi.

Is it possible to add an option akin to git-rev-list's '--ignore-missing' to
git-pack-objects?

I use git bundles to (incrementally) backup my repositories. My script inspects
all bundles in the backup and passes their contained refs as excludes to
git-pack-objects to build the pack for the new bundle. This works fine as long
as none of these refs have been garbage-collected in the source repository.
Something like '--ignore-missing' would be really handy here to ask
git-pack-objects to simply ignore any of the passed revs that are not present
(anymore).

Thanks in advance.

Re: [PATCH v2 1/2] clang-format: outline the git project's coding style

2017-08-14 Thread Jeff King

On Mon, Aug 14, 2017 at 06:48:31PM -0400, Jeff King wrote:

> On Mon, Aug 14, 2017 at 02:30:45PM -0700, Brandon Williams wrote:
> 
> > +# Align escaped newlines as far left as possible
> > +# #define A   \
> > +#   int ; \
> > +#   int b;\
> > +#   int ;
> > +AlignEscapedNewlines: Left
> 
> I get:
> 
>   $ git clang-format-5.0 --style file -p --extensions c,h
>   YAML:34:23: error: unknown key 'AlignEscapedNewlines'
>   AlignEscapedNewlines: Left
> ^~~~
>   Error reading /home/peff/compile/git/.clang-format: Invalid argument
> 
> Same with clang-format-3.8.

And if I remove that line, I get:

  YAML:155:25: error: unknown key 'PenaltyBreakAssignment'
  PenaltyBreakAssignment: 100

Removing that gives:

  YAML:86:22: error: unknown key 'BreakStringLiterals'
  BreakStringLiterals: false

And removing that gives me a clean output. I have no idea why my clang
doesn't like these (but presumably yours does). It's clang-format-5.0 in
Debian unstable (and clang-format-3.8, etc).

-Peff

Re: [PATCH v2 1/2] clang-format: outline the git project's coding style

2017-08-14 Thread Jeff King

On Mon, Aug 14, 2017 at 02:30:45PM -0700, Brandon Williams wrote:

> +# Align escaped newlines as far left as possible
> +# #define A   \
> +#   int ; \
> +#   int b;\
> +#   int ;
> +AlignEscapedNewlines: Left

I get:

  $ git clang-format-5.0 --style file -p --extensions c,h
  YAML:34:23: error: unknown key 'AlignEscapedNewlines'
  AlignEscapedNewlines: Left
^~~~
  Error reading /home/peff/compile/git/.clang-format: Invalid argument

Same with clang-format-3.8.

-Peff

Re: [PATCH v2 3/3] diff: check MIN_BLOCK_LENGTH at start of new block

2017-08-14 Thread Stefan Beller

On Mon, Aug 14, 2017 at 2:31 PM, Jonathan Tan  wrote:
> The existing documentation states "If there are fewer than 3 adjacent
> moved lines, they are not marked up as moved", which is ambiguous as to
> whether "adjacent moved lines" must be adjacent both at the source and
> at the destination, or be adjacent merely at the source or merely at the
> destination. The behavior of the current code takes the latter
> interpretation, but the behavior of blocks being conceptually painted as
> blocks and then "unpainted" as lines is confusing to me.
>
> Therefore, clarify the ambiguity in the documentation in the stricter
> direction - a block is completely painted or not at all - and update the
> code accordingly.
>
> This requires a change in the test "detect malicious moved code, inside
> file" in that the malicious change is now marked without the move
> colors (because the blocks involved are too small), contrasting with
> the subsequent test where the non-malicious change is marked with move
> colors.
>
> Signed-off-by: Jonathan Tan 

I wonder if these changes ought to be a new mode
(C.f. "mountain zebra" and "imperial zebra" for slight
changes in coloring ;) or if we can settle on one true way.

The 3 lines heuristic is a bad heuristic IMHO (it works reasonable well
for little effort but the fact that we discuss this patch makes it a bad
heuristic as we discuss corner cases that are not relevant. The heuristic
originally wanted to filter out stray single braces that were "moved",
it did not want to suppress small original moved pieces of code),
which this covers up a bit.

Maybe we'll cook this in next for a while to see how people
react to it?

> ---
>  Documentation/diff-options.txt |  6 ++--
>  diff.c |  6 +++-
>  t/t4015-diff-whitespace.sh | 71 
> +-
>  3 files changed, 65 insertions(+), 18 deletions(-)
>
> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> index bc52bd0b9..1ee3ca3f6 100644
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -257,10 +257,10 @@ zebra::
> Blocks of moved code are detected greedily. The detected blocks are
> painted using either the 'color.diff.{old,new}Moved' color or
> 'color.diff.{old,new}MovedAlternative'. The change between
> -   the two colors indicates that a new block was detected. If there
> -   are fewer than 3 adjacent moved lines, they are not marked up
> +   the two colors indicates that a new block was detected. If a block
> +   has fewer than 3 adjacent moved lines, it is not marked up
> as moved, but the regular colors 'color.diff.{old,new}' will be
> -   used.
> +   used instead.

Thanks for clarifying the docs.

> diff --git a/diff.c b/diff.c
> index f598d8a3a..20b784736 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -923,7 +923,6 @@ static void mark_color_as_moved(struct diff_options *o,
> }
>
> l->flags |= DIFF_SYMBOL_MOVED_LINE;
> -   block_length++;
>
> if (o->color_moved == COLOR_MOVED_PLAIN)
> continue;
> @@ -953,8 +952,13 @@ static void mark_color_as_moved(struct diff_options *o,
> }
>
> flipped_block = (flipped_block + 1) % 2;
> +
> +   adjust_last_block(o, n, block_length);
> +   block_length = 0;
> }
>
> +   block_length++;
> +
> if (flipped_block)
> l->flags |= DIFF_SYMBOL_MOVED_LINE_ALT;
> }

This changes the algorithm in a non-obvious way.
When the min-length heuristic is strictly bound to each block,
the function can be simplified more than adding on these tweaks,

1) remove variable block_length, needing to count in the adjust function

2) assign DIFF_SYMBOL_MOVED_LINE either in
COLOR_MOVED_PLAIN case (and continue) or later (where
block_length is increased in this patch)

No need to do these, just as thoughts on how to reduce complexity.

> diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
> index 6f7758e5c..d0613a189 100755
> --- a/t/t4015-diff-whitespace.sh
> +++ b/t/t4015-diff-whitespace.sh
> @@ -1097,13 +1097,13 @@ test_expect_success 'detect malicious moved code, 
> inside file' '

This test would not 'detect malicious moved code, inside file'  any more,
I think instead we'd rather want to have a more realistic test case,
which has more lines in it? (This test is about the block detection
not about the omission of short blocks, which was an after thought)

> +test_expect_success '--color-moved treats adjacent blocks as separate for 
> MIN_BLOCK_LENGTH' '

Thanks for providing a test here! For testing MIN_BLOCK_LENGTH
for each block I would have imagined the tests would have a block of
length (1,)2,3(,4) lines and then we'd see that the

Re: [PATCH v2 1/2] format-patch: have progress option while generating patches

2017-08-14 Thread Junio C Hamano

Jeff King  writes:

> If it's smooth, the (50,1) case is slightly nicer in that it puts the
> progress in front of the user more quickly. I'm not sure if that's
> actually worth pushing an additional decision onto the person writing
> the calling code, though (especially when we are just now puzzling out
> the method for making such a decision from first principles).
>
> So I'd vote to drop that parameter entirely. And if 1 second seems
> noticeably snappier, then we should probably just move everything to a 1
> second delay (I don't have a strong feeling either way).

Sounds like a good idea to me.  

I've already locally tweaked Kevin's patch to use (0,2) instead of
(0,1) without introducing the simpler wrapper.  It should be trivial
to do a wrapper to catch and migrate all the (0,2) users to a
start_delayed_progress() that takes neither percentage or time with
mechanical replacement.

Re: [PATCH v2 1/2] format-patch: have progress option while generating patches

2017-08-14 Thread Jeff King

On Mon, Aug 14, 2017 at 11:35:33AM -0700, Junio C Hamano wrote:

> Junio C Hamano  writes:
> 
> > Perhaps we may want to replace the calls to progress_delay() with a
> > call to a simpler wrapper that does not let the callers give their
> > own delay threashold to simplify the API.
> 
> ... which does not look too bad, but because it makes me wonder if
> we even need to make the delay-threshold customizable per callers,
> I'll wait further discussions before committing to the approach.

For what it's worth, I had a similar "why is this even part of the API"
thought when writing earlier in the thread.

> For example, I do not quite understand why 95 is a good value for
> prune-packed while 0 is a good value for prune.

And I also wondered this same thing. I do not have a good answer.

> The rename detection in diffcore-rename, delay in blame, and
> checkout (via unpack-trees) all of which use 50-percent threshold
> with 1 second delay, sort of make sense to me in that if we
> completed 50 percent within 1 second, it is likely we will finish
> all in 2 seconds (which is the norm for everybody else), perhaps as
> a better version of 0-percent 2 seconds rule.

Just thinking what could go wrong for a moment.

If we reach 51% in one second, would we then never show progress? That
seems like a misfeature when the counter is spiky. E.g., consider
something like object transfer (which doesn't do a delayed progress, but
pretend for a moment as it makes a simple example). Imagine we have 100
objects, 99 of which are 10KB and one of which is 100MB. And that the
big object comes at slot 75.

No matter how the delay works, the counter is going to jump quickly to
75% as we send the small objects, and then appear to stall on the large
one. That's unavoidable without feeding the progress code more data
about the items. But what does the user see?

With (0,2), we start the progress meter at 2 seconds, the user sees it
stall at 75%, and then eventually it unclogs.

With (50,1), we check the percentage after 1 second and see we are
already at 75%. We then disable the meter totally. After 2 seconds, we
get the same stall but the user sees nothing.

So in that case we should always base our decision only on time taken.

And that gives me an answer for your question above: the difference is
whether we expect the counter to move smoothly, or to be spiky.

If it's smooth, the (50,1) case is slightly nicer in that it puts the
progress in front of the user more quickly. I'm not sure if that's
actually worth pushing an additional decision onto the person writing
the calling code, though (especially when we are just now puzzling out
the method for making such a decision from first principles).

So I'd vote to drop that parameter entirely. And if 1 second seems
noticeably snappier, then we should probably just move everything to a 1
second delay (I don't have a strong feeling either way).

-Peff

Re: [PATCH v2 2/2] Makefile: add style build rule

2017-08-14 Thread Stefan Beller

On Mon, Aug 14, 2017 at 3:25 PM, Junio C Hamano  wrote:
> Brandon Williams  writes:
>
>> Add the 'style' build rule which will run git-clang-format on the diff
>> between HEAD and the current worktree.  The result is a diff of
>> suggested changes.
>>
>> Signed-off-by: Brandon Williams 
>> ---
>>  Makefile | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/Makefile b/Makefile
>> index ba4359ef8..acfd096b7 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -2414,6 +2414,10 @@ $(SP_OBJ): %.sp: %.c GIT-CFLAGS FORCE
>>  .PHONY: sparse $(SP_OBJ)
>>  sparse: $(SP_OBJ)
>>
>> +.PHONY: style
>> +style:
>> + git clang-format --style file --diff --extensions c,h
>
> Did we get "git clang-format" subcommand, or is "s/git //" implied
> somewhere?

You need to have clang-format installed (debian/Ubuntu package, or
however it is named in your distribution), which provides this command
for us.

Re: [PATCH v2 2/2] Makefile: add style build rule

2017-08-14 Thread Junio C Hamano

Brandon Williams  writes:

> Add the 'style' build rule which will run git-clang-format on the diff
> between HEAD and the current worktree.  The result is a diff of
> suggested changes.
>
> Signed-off-by: Brandon Williams 
> ---
>  Makefile | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/Makefile b/Makefile
> index ba4359ef8..acfd096b7 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -2414,6 +2414,10 @@ $(SP_OBJ): %.sp: %.c GIT-CFLAGS FORCE
>  .PHONY: sparse $(SP_OBJ)
>  sparse: $(SP_OBJ)
>  
> +.PHONY: style
> +style:
> + git clang-format --style file --diff --extensions c,h

Did we get "git clang-format" subcommand, or is "s/git //" implied
somewhere?

> +
>  check: common-cmds.h
>   @if sparse; \
>   then \

Re: [PATCH v2] doc: clarify "config --bool" behaviour with empty values

2017-08-14 Thread Junio C Hamano

Andreas Heiduk  writes:

> `git config --bool xxx.yyy` returns `true` for `[xxx]yyy` but
> `false` for `[xxx]yyy=` or `[xxx]yyy=""`.  This is tested in
> t1300-repo-config.sh since 09bc098c2.
>
> Signed-off-by: Andreas Heiduk 
> ---
>  Documentation/config.txt | 10 +-
>  Documentation/git.txt|  3 ++-
>  2 files changed, 7 insertions(+), 6 deletions(-)

This looks good to me.  Will queue.



> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index d5c9c4cab..478b9431e 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -216,15 +216,15 @@ boolean::
> synonyms are accepted for 'true' and 'false'; these are all
> case-insensitive.
>  
> -   true;; Boolean true can be spelled as `yes`, `on`, `true`,
> - or `1`.  Also, a variable defined without `= `
> + true;; Boolean true literals are `yes`, `on`, `true`,
> + and `1`.  Also, a variable defined without `= `
>   is taken as true.
>  
> -   false;; Boolean false can be spelled as `no`, `off`,
> - `false`, or `0`.
> + false;; Boolean false literals are `no`, `off`, `false`,
> + `0` and the empty string.
>  +
>  When converting value to the canonical form using `--bool` type
> -specifier; 'git config' will ensure that the output is "true" or
> +specifier, 'git config' will ensure that the output is "true" or
>  "false" (spelled in lowercase).
>  
>  integer::
> diff --git a/Documentation/git.txt b/Documentation/git.txt
> index 7dd5e0328..6e3a6767e 100644
> --- a/Documentation/git.txt
> +++ b/Documentation/git.txt
> @@ -75,7 +75,8 @@ example the following invocations are equivalent:
>  Note that omitting the `=` in `git -c foo.bar ...` is allowed and sets
>  `foo.bar` to the boolean true value (just like `[foo]bar` would in a
>  config file). Including the equals but with an empty value (like `git -c
> -foo.bar= ...`) sets `foo.bar` to the empty string.
> +foo.bar= ...`) sets `foo.bar` to the empty string which ` git config
> +--bool` will convert to `false`.
>  
>  --exec-path[=]::
>   Path to wherever your core Git programs are installed.

Re: [PATCH] doc: clarify "config --bool" behaviour with empty values

2017-08-14 Thread Junio C Hamano

Andreas Heiduk  writes:

>> However, I think this "no value (but still with '=')" is making it
>> more confusing than necessary for two reasons.
> [...]
>  
>> I notice that in this Values section (where the boolean:: is the
>> first entry) there is no mention on how to spell a string value.
>
> I assumed this is due to the pretext of the definition list:
>
>   Values of many variables are treated as a simple string, but there
>   are variables that take values of specific types and there are rules
>   as to how to spell them.

I assumed so too.  

But if you knew that "[section] var =" is a valid way to spell an
empty string, I'd thought that you wouldn't have written "no value
but still with '=')" there.

The description for "true" (i.e. "[section] var" and nothing else)
is also spelled out perfectly well in the Syntax section, but it is
duplicated in Values section.  I think that it is a good thing to
have the complete picture in a single Values section, without
assuming readers to know what is in the other Syntax section.

So if it bothers you to have a non-specific "string" description in
the Values section, I think it would be more helpful to update the
pretext so that including the description of a simple string there
does not look unnatural, IMHO.

Re: [PATCH v2] commit: skip discarding the index if there is no pre-commit hook

2017-08-14 Thread Jeff King

On Mon, Aug 14, 2017 at 03:54:25PM -0600, Kevin Willford wrote:

> If there is not a pre-commit hook, there is no reason to discard
> the index and reread it.
> 
> This change checks to presence of a pre-commit hook and then only
> discards the index if there was one.
> 
> Signed-off-by: Kevin Willford 
> ---
>  builtin/commit.c | 15 +--
>  1 file changed, 9 insertions(+), 6 deletions(-)

Thanks, this looks nice and simple.

> diff --git a/builtin/commit.c b/builtin/commit.c
> index e7a2cb6285..ab71b93518 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -940,12 +940,15 @@ static int prepare_to_commit(const char *index_file, 
> const char *prefix,
>   return 0;
>   }
>  
> - /*
> -  * Re-read the index as pre-commit hook could have updated it,
> -  * and write it out as a tree.  We must do this before we invoke
> -  * the editor and after we invoke run_status above.
> -  */
> - discard_cache();
> + if (!no_verify && find_hook("pre-commit")) {
> + /*
> +  * Re-read the index as pre-commit hook could have updated it,
> +  * and write it out as a tree.  We must do this before we invoke
> +  * the editor and after we invoke run_status above.
> +  */
> + discard_cache();
> + }
> +
>   read_cache_from(index_file);

This read_cache_from() should be a noop, right, because it immediately
sees istate->initialized is set? So it shouldn't matter that it is not
in the conditional with discard_cache(). Though if its only purpose is
to re-read the just-discarded contents, perhaps it makes sense to put it
there for readability.

-Peff

[PATCH v2] doc: clarify "config --bool" behaviour with empty values

2017-08-14 Thread Andreas Heiduk

`git config --bool xxx.yyy` returns `true` for `[xxx]yyy` but
`false` for `[xxx]yyy=` or `[xxx]yyy=""`.  This is tested in
t1300-repo-config.sh since 09bc098c2.

Signed-off-by: Andreas Heiduk 
---
 Documentation/config.txt | 10 +-
 Documentation/git.txt|  3 ++-
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index d5c9c4cab..478b9431e 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -216,15 +216,15 @@ boolean::
synonyms are accepted for 'true' and 'false'; these are all
case-insensitive.
 
-   true;; Boolean true can be spelled as `yes`, `on`, `true`,
-   or `1`.  Also, a variable defined without `= `
+   true;; Boolean true literals are `yes`, `on`, `true`,
+   and `1`.  Also, a variable defined without `= `
is taken as true.
 
-   false;; Boolean false can be spelled as `no`, `off`,
-   `false`, or `0`.
+   false;; Boolean false literals are `no`, `off`, `false`,
+   `0` and the empty string.
 +
 When converting value to the canonical form using `--bool` type
-specifier; 'git config' will ensure that the output is "true" or
+specifier, 'git config' will ensure that the output is "true" or
 "false" (spelled in lowercase).
 
 integer::
diff --git a/Documentation/git.txt b/Documentation/git.txt
index 7dd5e0328..6e3a6767e 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -75,7 +75,8 @@ example the following invocations are equivalent:
 Note that omitting the `=` in `git -c foo.bar ...` is allowed and sets
 `foo.bar` to the boolean true value (just like `[foo]bar` would in a
 config file). Including the equals but with an empty value (like `git -c
-foo.bar= ...`) sets `foo.bar` to the empty string.
+foo.bar= ...`) sets `foo.bar` to the empty string which ` git config
+--bool` will convert to `false`.
 
 --exec-path[=]::
Path to wherever your core Git programs are installed.
-- 
2.14.1

Re: [PATCH] doc: clarify "config --bool" behaviour with empty values

2017-08-14 Thread Andreas Heiduk

Am 14.08.2017 um 19:53 schrieb Junio C Hamano:
> Andreas Heiduk  writes:
> 
>> `git config --bool xxx.yyy` returns `true` for `[xxx]yyy` but
>> `false` for `[xxx]yyy=` or `[xxx]yyy=""`.  This is tested in
>> t1300-repo-config.sh since 09bc098c2.
>>
>> Signed-off-by: Andreas Heiduk 
>> ---
>>  Documentation/config.txt | 3 ++-
>>  Documentation/git.txt| 3 ++-
>>  2 files changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/config.txt b/Documentation/config.txt
>> index d5c9c4cab..d3261006b 100644
>> --- a/Documentation/config.txt
>> +++ b/Documentation/config.txt
>> @@ -221,7 +221,8 @@ boolean::
>>  is taken as true.
>>  
>> false;; Boolean false can be spelled as `no`, `off`,
>> -`false`, or `0`.
>> +`false`, `0`, no value (but still with `=`) or the
>> +empty string.
> 
[...]

> However, I think this "no value (but still with '=')" is making it
> more confusing than necessary for two reasons.
[...]

> I notice that in this Values section (where the boolean:: is the
> first entry) there is no mention on how to spell a string value.

I assumed this is due to the pretext of the definition list:

Values of many variables are treated as a simple string, but there
are variables that take values of specific types and there are rules
as to how to spell them.

After that I would NOT expect string values to be "specific". Also: If string 
values are explained here in the "Values" section, the line-breaking and escape 
sequences syntax should be here too.

So my (minimal) suggestion is:

   false;; Boolean false literals are `no`, `off`,
`false`, `0` and the empty string.

I'll adapt `true` in the same style and resend a patch.

Re: [PATCH v2 1/2] clang-format: outline the git project's coding style

2017-08-14 Thread Stefan Beller

On Mon, Aug 14, 2017 at 2:30 PM, Brandon Williams  wrote:
> Add a '.clang-format' file which outlines the git project's coding
> style.  This can be used with clang-format to auto-format .c and .h
> files to conform with git's style.
>
> Signed-off-by: Brandon Williams 

Applying this patch and running
clang-format -i -style file *.c *.h builtin/*.c
produces a diff, that I'd mostly agree with.
This style guide is close to our current style.

As noted in patch 2/2 we'd now need an easy way to
expose this for use in various situations, such as
* contributor wanting to format their patch
* reformatting code for readability

Thanks,
Stefan

[PATCH v2] commit: skip discarding the index if there is no pre-commit hook

2017-08-14 Thread Kevin Willford

If there is not a pre-commit hook, there is no reason to discard
the index and reread it.

This change checks to presence of a pre-commit hook and then only
discards the index if there was one.

Signed-off-by: Kevin Willford 
---
 builtin/commit.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index e7a2cb6285..ab71b93518 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -940,12 +940,15 @@ static int prepare_to_commit(const char *index_file, 
const char *prefix,
return 0;
}
 
-   /*
-* Re-read the index as pre-commit hook could have updated it,
-* and write it out as a tree.  We must do this before we invoke
-* the editor and after we invoke run_status above.
-*/
-   discard_cache();
+   if (!no_verify && find_hook("pre-commit")) {
+   /*
+* Re-read the index as pre-commit hook could have updated it,
+* and write it out as a tree.  We must do this before we invoke
+* the editor and after we invoke run_status above.
+*/
+   discard_cache();
+   }
+
read_cache_from(index_file);
if (update_main_cache_tree(0)) {
error(_("Error building trees"));
-- 
2.14.1.481.g785c1dc9ae

Re: [PATCH v2 2/2] Makefile: add style build rule

2017-08-14 Thread Stefan Beller

On Mon, Aug 14, 2017 at 2:30 PM, Brandon Williams  wrote:
> Add the 'style' build rule which will run git-clang-format on the diff
> between HEAD and the current worktree.  The result is a diff of
> suggested changes.

Notes from in-office discussion:

* 'git clang-format --style file -f --extensions c,h'
   to apply suggested changes. (Useful for contributors,
   maybe even as a precommit hook)
* you can also give a range of commits to git clang-format, as
   git clang-format --diff origin/master..HEAD to see if other
   local commits need tweaking.

>
> Signed-off-by: Brandon Williams 
> ---
>  Makefile | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/Makefile b/Makefile
> index ba4359ef8..acfd096b7 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -2414,6 +2414,10 @@ $(SP_OBJ): %.sp: %.c GIT-CFLAGS FORCE
>  .PHONY: sparse $(SP_OBJ)
>  sparse: $(SP_OBJ)
>
> +.PHONY: style
> +style:
> +   git clang-format --style file --diff --extensions c,h
> +
>  check: common-cmds.h
> @if sparse; \
> then \
> --
> 2.14.1.480.gb18f417b89-goog
>

Re: [PATCH 0/4] dropping support for older curl

2017-08-14 Thread Johannes Schindelin

Hi Paolo,

On Thu, 10 Aug 2017, Paolo Ciarrocchi wrote:

> Il 10 ago 2017 11:39 AM, "Johannes Schindelin" 
> ha scritto:
> 
> 
> 
> Footnote *1*: It is no secret that I find our patch submission less than
> inviting. Granted, *I* use it. *I* did not have problems entering the
> mailing list. But then, my mails were not swallowed silently, because my
> mail program does not send HTML by default. And prepared by the German
> school system (I learned the term "sugar coating" only when exposed to
> some US culture), I had little emotional problems with being criticized
> and not thanked for my contribution, I persisted nevertheless. The opinion
> that the Git contribution process is a lot less inviting than it could be
> is not only my view, by the way. I hear this a lot. I give you that we are
> not quite as textbook "keep out from here unless you look like us, smell
> like us, talk like us, have the same genital setup like us" as the Linux
> kernel mailing list, but we are in a different universe compared to, say,
> the Drupal community. And their universe is a lot nicer to live in.
> 
> 
> Isn't SumbitGit a possible answer to your doubts (I strongly agree with
> you) about the current development process?

No. I hate to say that SubmitGit neither integrates well with GitHub Pull
Requests (code comments on GitHub are approximately 1,523x easier to
write, read, associate with the actual code, and see the current state of,
compared to the mailing list, and SubmitGit does not even hint at
integrating with that user experience).

Also, the barrier to start using SubmitGit is rather high. If you open a
Pull Request on github.com/git/git, you get *no* indication that SubmitGit
is an option to *actually* get the code into Git. There are also concerns
about required permissions that Junio Hamano himself would not accept.

Now, let's assume that you submitted the code via SubmitGit. The
challenges of the patch submission process do not end there, yet SubmitGit
goes home and has a beer. But the hard part, the discussions on the
mailing list, the status updates in the completely separate What's cooking
mails, the missing links back to the original source code (let alone the
information in which worktree on your computer under which branch name
that topic was developed again?), the diverging mail threads, the
"rerolls" that should not forget to Cc: all reviewers of previous rounds,
all that jazz is still all very, very manual.

And even if there was an easier path from having a local branch that works
to finally getting it onto the list in the required form, your mail is an
eloquent example of one of the most preposterous hurdles along the way: we
pride ourselves with the openness we demonstrate by communicating via a
mailing list, everybody has a mail address, amirite? But of course, HTML
mails, like, about 130% of all mails on the internet (or so it feels),
including yours, are dropped. Silently. Not so open anymore, are we.

It is all so frustrating, really. I work in a team (Visual Studio Team
Services, you can think of it as kind of a Microsoft take on Git hosting
plus a lot of other tooling) where we get really, really positive feedback
regarding our user experience, in particular the frequent enhancements to
PRs. It is really powerful to open, review and merge PRs, interact with
other developers, see the state of PRs and their respective discussions,
open and resolve issues, automate workflows and build tasks [*1*]. And
then I try to convince people here on this mailing list that it really
makes a difference if you start using tools to lighten the load of
everybody, and... little changes.

At least thanks to Lars Schneider's incredible efforts we have Continuous
Testing (I know how much time he spent on this, it is another thing
deserving the label "preposterous", and I know how much time I spent on
top to add the Windows part which mostly works). If only we could push it
further to true Continuous Integration. Or at least to accepting PRs from
professionals who simply have no time to fight with our patch contribution
process and whose expertise we lose as a consequence.

Ciao,
Johannes

Footnote *1*: The funniest part about this is that I do get mails about
all of this all the time. When I am pulled in as a reviewer. When a build
failed. When a previously failing task was fixed by a new build. When
somebody responded to my comments. The difference to the mailing
list-centric approach is of course that those mails are only
notifications, and link back to the tool appropriately supporting what
I, the user, want to get done.

[PATCH v2] stash: prevent warning about null bytes in input

2017-08-14 Thread Kevin Daudt

The `no_changes` function calls the `untracked_files` function through
command substitution. `untracked_files` will return null bytes because it
runs ls-files with the '-z' option.

Bash since version 4.4 warns about these null bytes. As they are not
required for the test that is being done, make sure `untracked_files`
does not output null bytes when not required.

This is achieved by adding a parameter to the `untracked_files` function to
specify wither `-z` should be passed to ls-files or not.

This warning is triggered when running git stash save -u resulting in
two warnings:

git-stash: line 43: warning: command substitution: ignored null byte
in input

Signed-off-by: Kevin Daudt 
---
 git-stash.sh | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/git-stash.sh b/git-stash.sh
index 9b6c2da7b..5f09a47f0 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -43,9 +43,16 @@ no_changes () {
 }
 
 untracked_files () {
+   if test "$1" = "-z"
+   then
+   shift
+   z=-z
+   else
+   z=
+   fi
excl_opt=--exclude-standard
test "$untracked" = "all" && excl_opt=
-   git ls-files -o -z $excl_opt -- "$@"
+   git ls-files -o $z $excl_opt -- "$@"
 }
 
 clear_stash () {
@@ -114,7 +121,7 @@ create_stash () {
# Untracked files are stored by themselves in a parentless 
commit, for
# ease of unpacking later.
u_commit=$(
-   untracked_files "$@" | (
+   untracked_files -z "$@" | (
GIT_INDEX_FILE="$TMPindex" &&
export GIT_INDEX_FILE &&
rm -f "$TMPindex" &&
-- 
2.14.1.145.gb3622a4ee9

Re: [PATCH v2 0/2] clang-format

2017-08-14 Thread Brandon Williams

On 08/14, Brandon Williams wrote:
> Changes in v2:
>  * Changed a couple rules to be more inline with our coding style.
>  * Added a Makefile build rule to run git-clang-format on the diff of the
>working tree to suggest style changes.
> 
> I found that the llvm project also has the git-clang-format tool which will
> allow for doing formating changes based on diffs so that the parts of the code
> you didn't touch won't be formated.  It also has a nice '-p' option to only
> apply hunks of suggested changes.  I also saw what Ramsay did and attempted to
> add a Makefile rule to run clang-format.  This bit may need more tweaking to
> get it right.

Forgot to include the [RFC] bit because from the discussion it seems
like we still have a lot to talk about before we decide that this is the
path we are taking.

> 
> Brandon Williams (2):
>   clang-format: outline the git project's coding style
>   Makefile: add style build rule
> 
>  .clang-format | 165 
> ++
>  Makefile  |   4 ++
>  2 files changed, 169 insertions(+)
>  create mode 100644 .clang-format
> 
> -- 
> 2.14.1.480.gb18f417b89-goog
> 

-- 
Brandon Williams

[PATCH v2 1/2] clang-format: outline the git project's coding style

2017-08-14 Thread Brandon Williams

Add a '.clang-format' file which outlines the git project's coding
style.  This can be used with clang-format to auto-format .c and .h
files to conform with git's style.

Signed-off-by: Brandon Williams 
---
 .clang-format | 165 ++
 1 file changed, 165 insertions(+)
 create mode 100644 .clang-format

diff --git a/.clang-format b/.clang-format
new file mode 100644
index 0..3ede2628d
--- /dev/null
+++ b/.clang-format
@@ -0,0 +1,165 @@
+# Defaults
+
+# Use tabs whenever we need to fill whitespace that spans at least from one tab
+# stop to the next one.
+UseTab: Always
+TabWidth: 8
+IndentWidth: 8
+ContinuationIndentWidth: 8
+ColumnLimit: 80
+
+# C Language specifics
+Language: Cpp
+
+# Align parameters on the open bracket
+# someLongFunction(argument1,
+#  argument2);
+AlignAfterOpenBracket: Align
+
+# Don't align consecutive assignments
+# int  = 12;
+# int b = 14;
+AlignConsecutiveAssignments: false
+
+# Don't align consecutive declarations
+# int  = 12;
+# double b = 3.14;
+AlignConsecutiveDeclarations: false
+
+# Align escaped newlines as far left as possible
+# #define A   \
+#   int ; \
+#   int b;\
+#   int ;
+AlignEscapedNewlines: Left
+
+# Align operands of binary and ternary expressions
+# int aaa = bbb +
+#   cc;
+AlignOperands: true
+
+# Don't align trailing comments
+# int a; // Comment a
+# int b = 2; // Comment b
+AlignTrailingComments: false
+
+# By default don't allow putting parameters onto the next line
+# myFunction(foo, bar, baz);
+AllowAllParametersOfDeclarationOnNextLine: false
+
+# Don't allow short braced statements to be on a single line
+# if (a)   not   if (a) return;
+#   return;
+AllowShortBlocksOnASingleLine: false
+AllowShortCaseLabelsOnASingleLine: false
+AllowShortFunctionsOnASingleLine: false
+AllowShortIfStatementsOnASingleLine: false
+AllowShortLoopsOnASingleLine: false
+
+# By default don't add a line break after the return type of top-level 
functions
+# int foo();
+AlwaysBreakAfterReturnType: None
+
+# Pack as many parameters or arguments onto the same line as possible
+# int myFunction(int , int ,
+#int );
+BinPackArguments: true
+BinPackParameters: true
+
+# Attach braces to surrounding context except break before braces on function
+# definitions.
+# void foo()
+# {
+#if (true) {
+#} else {
+#}
+# };
+BreakBeforeBraces: Linux
+
+# Break after operators
+# int valuve = a +
+#  bb -
+#  ccc;
+BreakBeforeBinaryOperators: None
+BreakBeforeTernaryOperators: false
+
+# Don't break string literals
+BreakStringLiterals: false
+
+# Use the same indentation level as for the switch statement.
+# Switch statement body is always indented one level more than case labels.
+IndentCaseLabels: false
+
+# Don't indent a function definition or declaration if it is wrapped after the
+# type
+IndentWrappedFunctionNames: false
+
+# Align pointer to the right
+# int *a;
+PointerAlignment: Right
+
+# Don't insert a space after a cast
+# x = (int32)y;notx = (int32) y;
+SpaceAfterCStyleCast: false
+
+# Insert spaces before and after assignment operators
+# int a = 5;notint a=5;
+# a += 42; a+=42;
+SpaceBeforeAssignmentOperators: true
+
+# Put a space before opening parentheses only after control statement keywords.
+# void f() {
+#   if (true) {
+# f();
+#   }
+# }
+SpaceBeforeParens: ControlStatements
+
+# Don't insert spaces inside empty '()'
+SpaceInEmptyParentheses: false
+
+# The number of spaces before trailing line comments (// - comments).
+# This does not affect trailing block comments (/* - comments).
+SpacesBeforeTrailingComments: 1
+
+# Don't insert spaces in casts
+# x = (int32) y;notx = ( int32 ) y;
+SpacesInCStyleCastParentheses: false
+
+# Don't insert spaces inside container literals
+# var arr = [1, 2, 3];notvar arr = [ 1, 2, 3 ];
+SpacesInContainerLiterals: false
+
+# Don't insert spaces after '(' or before ')'
+# f(arg);notf( arg );
+SpacesInParentheses: false
+
+# Don't insert spaces after '[' or before ']'
+# int a[5];notint a[ 5 ];
+SpacesInSquareBrackets: false
+
+# Insert a space after '{' and before '}' in struct initializers
+Cpp11BracedListStyle: false
+
+# A list of macros that should be interpreted as foreach loops instead of as
+# function calls.
+ForEachMacros: ['for_each_string_list_item']
+
+# The maximum number of consecutive empty lines to keep.
+MaxEmptyLinesToKeep: 1
+
+# No empty line at the start of a block.
+KeepEmptyLinesAtTheStartOfBlocks: false
+
+# Penalties
+# This decides what order things should be done if a line is too long
+PenaltyBreakAssignment: 100
+PenaltyBreakBeforeFirstCallParameter: 100
+PenaltyBreakComment: 100
+PenaltyBreakFirstLessLess: 0
+PenaltyBreakString: 100
+PenaltyExcessCharacter: 5

[PATCH v2 3/3] diff: check MIN_BLOCK_LENGTH at start of new block

2017-08-14 Thread Jonathan Tan

The existing documentation states "If there are fewer than 3 adjacent
moved lines, they are not marked up as moved", which is ambiguous as to
whether "adjacent moved lines" must be adjacent both at the source and
at the destination, or be adjacent merely at the source or merely at the
destination. The behavior of the current code takes the latter
interpretation, but the behavior of blocks being conceptually painted as
blocks and then "unpainted" as lines is confusing to me.

Therefore, clarify the ambiguity in the documentation in the stricter
direction - a block is completely painted or not at all - and update the
code accordingly.

This requires a change in the test "detect malicious moved code, inside
file" in that the malicious change is now marked without the move
colors (because the blocks involved are too small), contrasting with
the subsequent test where the non-malicious change is marked with move
colors.

Signed-off-by: Jonathan Tan 
---
 Documentation/diff-options.txt |  6 ++--
 diff.c |  6 +++-
 t/t4015-diff-whitespace.sh | 71 +-
 3 files changed, 65 insertions(+), 18 deletions(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index bc52bd0b9..1ee3ca3f6 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -257,10 +257,10 @@ zebra::
Blocks of moved code are detected greedily. The detected blocks are
painted using either the 'color.diff.{old,new}Moved' color or
'color.diff.{old,new}MovedAlternative'. The change between
-   the two colors indicates that a new block was detected. If there
-   are fewer than 3 adjacent moved lines, they are not marked up
+   the two colors indicates that a new block was detected. If a block
+   has fewer than 3 adjacent moved lines, it is not marked up
as moved, but the regular colors 'color.diff.{old,new}' will be
-   used.
+   used instead.
 dimmed_zebra::
Similar to 'zebra', but additional dimming of uninteresting parts
of moved code is performed. The bordering lines of two adjacent
diff --git a/diff.c b/diff.c
index f598d8a3a..20b784736 100644
--- a/diff.c
+++ b/diff.c
@@ -923,7 +923,6 @@ static void mark_color_as_moved(struct diff_options *o,
}
 
l->flags |= DIFF_SYMBOL_MOVED_LINE;
-   block_length++;
 
if (o->color_moved == COLOR_MOVED_PLAIN)
continue;
@@ -953,8 +952,13 @@ static void mark_color_as_moved(struct diff_options *o,
}
 
flipped_block = (flipped_block + 1) % 2;
+
+   adjust_last_block(o, n, block_length);
+   block_length = 0;
}
 
+   block_length++;
+
if (flipped_block)
l->flags |= DIFF_SYMBOL_MOVED_LINE_ALT;
}
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 6f7758e5c..d0613a189 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -1097,13 +1097,13 @@ test_expect_success 'detect malicious moved code, 
inside file' '
 printf("World\n");
 }
 
-   -int secure_foo(struct user *u)
-   -{
-   -if (!u->is_allowed_foo)
-   -return;
-   -foo(u);
-   -}
-   -
+   -int secure_foo(struct user *u)
+   -{
+   -if (!u->is_allowed_foo)
+   -return;
+   -foo(u);
+   -}
+   -
 int main()
 {
 foo();
@@ -1115,13 +1115,13 @@ test_expect_success 'detect malicious moved code, 
inside file' '
 printf("Hello World, but different\n");
 }
 
-   +int secure_foo(struct user *u)
-   +{
-   +foo(u);
-   +if (!u->is_allowed_foo)
-   +return;
-   +}
-   +
+   +int secure_foo(struct user *u)
+   +{
+   +foo(u);
+   +if (!u->is_allowed_foo)
+   +return;
+   +}
+   +
 int another_function()
 {
 bar();
@@ -1417,6 +1417,49 @@ test_expect_success '--color-moved block at end of diff 
output respects MIN_BLOC
test_cmp expected actual
 '
 
+test_expect_success '--color-moved treats adjacent blocks as separate for 
MIN_BLOCK_LENGTH' '
+   git reset --hard &&
+   cat <<-\EOF >foo &&
+   line1
+   irrelevant_line
+   line2
+   line3
+   EOF
+   >bar &&
+   git add foo bar &&
+   git commit -m x &&
+
+   cat <<-\EOF >foo &&
+   irrelevant_line
+   EOF
+   cat <<-\EOF >bar &&
+   line2
+   line3
+   line1
+   EOF
+
+   git diff HEAD --color-moved=zebra --no-renames | grep -v "index" | 
test_decode_color >actual &&
+   cat >expected <<-\EOF &&
+   diff --git a/bar b/bar
+   --- a/bar
+   +++ b/bar
+   @@ -0,0 +1,3 @@
+   +line2
+   +line3
+   +line1
+

[PATCH v2 0/3] Fixes to "diff --color-moved" MIN_BLOCK_LENGTH handling

2017-08-14 Thread Jonathan Tan

These patches are on sb/diff-color-move.

Patches 1 and 2 are unchanged.

It was pointed out to me that the documentation for
"--color-moved=zebra" is ambiguous, but could plausibly describe the
current behavior. I still think that the current behavior is confusing,
so I have retained patch 3, but have updated the commit message and
documentation to describe this situation. I also proceeded with updating
the test mentioned in the previous cover letter to produce the expected
result (describing the modification in the commit message).

(If we do decide that the current behavior is correct, I can send
another patch to update the documentation to be less ambiguous and maybe
update the code to not use a name like MIN_BLOCK_LENGTH, as it is the
number of adjacent moved lines that we are checking, not the length of a
block.)

Jonathan Tan (3):
  diff: avoid redundantly clearing a flag
  diff: respect MIN_BLOCK_LENGTH for last block
  diff: check MIN_BLOCK_LENGTH at start of new block

 Documentation/diff-options.txt |   6 +--
 diff.c |  35 ++
 t/t4015-diff-whitespace.sh | 106 +++--
 3 files changed, 122 insertions(+), 25 deletions(-)

-- 
2.14.1.480.gb18f417b89-goog

[PATCH v2 1/3] diff: avoid redundantly clearing a flag

2017-08-14 Thread Jonathan Tan

No code in diff.c sets DIFF_SYMBOL_MOVED_LINE except in
mark_color_as_moved(), so it is redundant to clear it for the current
line. Therefore, clear it only for previous lines.

This makes a refactoring in a subsequent patch easier.

Signed-off-by: Jonathan Tan 
---
 diff.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index 4af73a7e0..23311f9c0 100644
--- a/diff.c
+++ b/diff.c
@@ -898,7 +898,7 @@ static void mark_color_as_moved(struct diff_options *o,
if (!match) {
if (block_length < COLOR_MOVED_MIN_BLOCK_LENGTH &&
o->color_moved != COLOR_MOVED_PLAIN) {
-   for (i = 0; i < block_length + 1; i++) {
+   for (i = 1; i < block_length + 1; i++) {
l = >emitted_symbols->buf[n - i];
l->flags &= ~DIFF_SYMBOL_MOVED_LINE;
}
-- 
2.14.1.480.gb18f417b89-goog

[PATCH v2 2/3] diff: respect MIN_BLOCK_LENGTH for last block

2017-08-14 Thread Jonathan Tan

Currently, MIN_BLOCK_LENGTH is only checked when diff encounters a line
that does not belong to the current block. In particular, this means
that MIN_BLOCK_LENGTH is not checked after all lines are encountered.

Perform that check.

Signed-off-by: Jonathan Tan 
---
 diff.c | 29 ++---
 t/t4015-diff-whitespace.sh | 35 +++
 2 files changed, 57 insertions(+), 7 deletions(-)

diff --git a/diff.c b/diff.c
index 23311f9c0..f598d8a3a 100644
--- a/diff.c
+++ b/diff.c
@@ -861,6 +861,26 @@ static int shrink_potential_moved_blocks(struct 
moved_entry **pmb,
return rp + 1;
 }
 
+/*
+ * If o->color_moved is COLOR_MOVED_PLAIN, this function does nothing.
+ *
+ * Otherwise, if the last block has fewer lines than
+ * COLOR_MOVED_MIN_BLOCK_LENGTH, unset DIFF_SYMBOL_MOVED_LINE on all lines in
+ * that block.
+ *
+ * The last block consists of the (n - block_length)'th line up to but not
+ * including the nth line.
+ */
+static void adjust_last_block(struct diff_options *o, int n, int block_length)
+{
+   int i;
+   if (block_length >= COLOR_MOVED_MIN_BLOCK_LENGTH ||
+   o->color_moved == COLOR_MOVED_PLAIN)
+   return;
+   for (i = 1; i < block_length + 1; i++)
+   o->emitted_symbols->buf[n - i].flags &= ~DIFF_SYMBOL_MOVED_LINE;
+}
+
 /* Find blocks of moved code, delegate actual coloring decision to helper */
 static void mark_color_as_moved(struct diff_options *o,
struct hashmap *add_lines,
@@ -896,13 +916,7 @@ static void mark_color_as_moved(struct diff_options *o,
}
 
if (!match) {
-   if (block_length < COLOR_MOVED_MIN_BLOCK_LENGTH &&
-   o->color_moved != COLOR_MOVED_PLAIN) {
-   for (i = 1; i < block_length + 1; i++) {
-   l = >emitted_symbols->buf[n - i];
-   l->flags &= ~DIFF_SYMBOL_MOVED_LINE;
-   }
-   }
+   adjust_last_block(o, n, block_length);
pmb_nr = 0;
block_length = 0;
continue;
@@ -944,6 +958,7 @@ static void mark_color_as_moved(struct diff_options *o,
if (flipped_block)
l->flags |= DIFF_SYMBOL_MOVED_LINE_ALT;
}
+   adjust_last_block(o, n, block_length);
 
free(pmb);
 }
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index c3b697411..6f7758e5c 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -1382,6 +1382,41 @@ EOF
test_cmp expected actual
 '
 
+test_expect_success '--color-moved block at end of diff output respects 
MIN_BLOCK_LENGTH' '
+   git reset --hard &&
+   >bar &&
+   cat <<-\EOF >foo &&
+   irrelevant_line
+   line1
+   EOF
+   git add foo bar &&
+   git commit -m x &&
+
+   cat <<-\EOF >bar &&
+   line1
+   EOF
+   cat <<-\EOF >foo &&
+   irrelevant_line
+   EOF
+
+   git diff HEAD --color-moved=zebra --no-renames | grep -v "index" | 
test_decode_color >actual &&
+   cat >expected <<-\EOF &&
+   diff --git a/bar b/bar
+   --- a/bar
+   +++ b/bar
+   @@ -0,0 +1 @@
+   +line1
+   diff --git a/foo b/foo
+   --- a/foo
+   +++ b/foo
+   @@ -1,2 +1 @@
+irrelevant_line
+   -line1
+   EOF
+
+   test_cmp expected actual
+'
+
 test_expect_success 'move detection with submodules' '
test_create_repo bananas &&
echo ripe >bananas/recipe &&
-- 
2.14.1.480.gb18f417b89-goog

[PATCH v2 0/2] clang-format

2017-08-14 Thread Brandon Williams

Changes in v2:
 * Changed a couple rules to be more inline with our coding style.
 * Added a Makefile build rule to run git-clang-format on the diff of the
   working tree to suggest style changes.

I found that the llvm project also has the git-clang-format tool which will
allow for doing formating changes based on diffs so that the parts of the code
you didn't touch won't be formated.  It also has a nice '-p' option to only
apply hunks of suggested changes.  I also saw what Ramsay did and attempted to
add a Makefile rule to run clang-format.  This bit may need more tweaking to
get it right.

Brandon Williams (2):
  clang-format: outline the git project's coding style
  Makefile: add style build rule

 .clang-format | 165 ++
 Makefile  |   4 ++
 2 files changed, 169 insertions(+)
 create mode 100644 .clang-format

-- 
2.14.1.480.gb18f417b89-goog

[PATCH v2 2/2] Makefile: add style build rule

2017-08-14 Thread Brandon Williams

Add the 'style' build rule which will run git-clang-format on the diff
between HEAD and the current worktree.  The result is a diff of
suggested changes.

Signed-off-by: Brandon Williams 
---
 Makefile | 4 
 1 file changed, 4 insertions(+)

diff --git a/Makefile b/Makefile
index ba4359ef8..acfd096b7 100644
--- a/Makefile
+++ b/Makefile
@@ -2414,6 +2414,10 @@ $(SP_OBJ): %.sp: %.c GIT-CFLAGS FORCE
 .PHONY: sparse $(SP_OBJ)
 sparse: $(SP_OBJ)
 
+.PHONY: style
+style:
+   git clang-format --style file --diff --extensions c,h
+
 check: common-cmds.h
@if sparse; \
then \
-- 
2.14.1.480.gb18f417b89-goog

Re: [PATCH v1 1/1] dir: teach status to show ignored directories

2017-08-14 Thread Stefan Beller

On Fri, Aug 11, 2017 at 10:48 AM, Jameson Miller
 wrote:
> The set of files listed by "--ignored" changes when different
> values are given to "--untracked-files".  If would be nice to
> be able to make the ignored output independent of the untracked
> settings.  This patch attempts to do that while maintaining
> backward compatibility with the existing behavior.

So after rereading this, maybe we could have
'--ignored[=as-untracked, new-mode]' with as-untracked as default?

This would satisfy your goal, while still maintaining backwards
compatibility.

> We want to see the paths that explicitly match an ignore pattern.
> This means that if a directory only contains ignored files, but the
> directory itself is not explicitly ignored, then we want to see the
> individual files. This is slightly different than the current behavior
> of "--ignored".
>
> I am open to suggestions on how to present the options to control
> this behavior.

I would strongly favor not introducing yet another flag to tweak the
behavior of another command line option. (The UX of Git is already
bloated, IMHO. But that is just my general stance on things)

So we'd have two proposals:

(1) The current patch proposes --show-ignored-directory that modifies
the behavior of other flags
(2) extend an already existing option to take another mode (which I
propose)

Strong arguments for (1) would be if the flag is orthogonal and apply
to the other flags in a uniform way, i.e.:
status --no-ignored --untracked=no --show-ignored-directory
would make sense and produce an output that a user doesn't
find confusing.

Arguments for (2), are
* less documentation for users (see below)
* if the two flags are truly orthogonal, test effort is decreased
* no exposure of an API that may yield strange results when the user
  gives strange input.

>>> This
>>> change exposes a flag to report all untracked files while not showing
>>> individual files in ignored directories.
>>
>> By the description up to here, it sounds as if you want to introduce
>> mode for --untracked, e.g. "normal-adjusted-for-ignored" (it's a bad
>> suggestion)? However the patch seems to add an orthogonal flag,
>> such that
>>
>>status --no-ignored --untracked=no --show-ignored-directory
>>
>> would also be possible. What is a reasonable expectation for
>> the output of such?
>
>
> The current patch does add another flag. This flag only has meaning if
> the "--ignored" and "--untracked=all" flags are also specified. Another
> option I had considered is to let the "--ignored" flag take an argument.

I would think that is a better solution, as then untracked and ignored
files (and their parameters) will be orthogonal in the long run.
Things like

  --untracked=all --ignored=none
  --untracked=none --ignored=only-dirs
  --untracked=reasonable-default --ignored=other-reasonable-default

will be possible and eventually the default for ignored files may be
independent of untracked settings.

> Then, we could express this new behavior through (for example) a
> "--ignored=exact" flag to reflect the fact that this new option
> returns paths that match the ignore pattern, and does not
> collapse directories that contain only ignored files.

Yes, and we would not need to worry about how the new
flag interacts with other flags. I would prefer that solution as it
seems the API will be cleaner that way.

Less things depend on each other, e.g. as a user I do not need to
know about the 'other' flag, be it ignored or untracked to solve my
problem. If we'd go with this third flag, we'd need to (a) document it,
but (b) also have to add a sentence to untracked and ignored flag
"In the corner case of also having the third flag, we change
behavior once again"
which I'd dislike from a users perspective.

>>> Commands:
>>>   1) status
>>>   2) status --ignored
>>>   3) status --ignored --untracked-files=all
>>>   4) status --ignored --untracked-files=all --show-ignored-directory
>>> (2) is --untracked-files=normal I'd presume, such that this flag
>>> can be understood as a tweak to "normal" based on the similar size
>>> between 2 and 4? (The timing improvement from 2 to 4 is huge though).
>
> (2) is --untracked-files=normal. Although the count of ignored
> files similar between 2 and 4, I consider this flag more of a
> tweak on 3, as we want the untracked files reported with
> the "--untracked=all" flag. The counts between 2 and 4 are
> similar in this case because most of the ignored files are
> contained in ignored directories.
>
> Our application calls status including the following flags:
>
> --porcelain=v2 --ignored --untracked-files=all --ignore-submodules=none
>
> This means we have bad performance compared to just "git status"
> when there is a large number of files in ignored directories
> With this new behavior, our application would move from case 3 to
> case 4 for this repository.
>
> You also point out the timing difference between case 2 and 4. I

Re: [PATCH] stash: prevent warning about null bytes in input

2017-08-14 Thread Junio C Hamano

Kevin Daudt  writes:

> On Mon, Aug 14, 2017 at 12:51:26PM -0700, Junio C Hamano wrote:
>> Kevin Daudt  writes:
>> 
>> > The no_changes function calls the untracked_files function through
>> > command substitution. untracked_files will return null bytes because it
>> > runs ls-files with the '-z' option.
>> >
>> > Bash since version 4.4 warns about these null bytes. As they are not
>> > required for the test that is being done, remove null bytes from the
>> > input.
>> 
>> That's an interesting one ;-)
>> 
>> I wonder if you considered giving an option to untracked_files
>> helper function, though.  After all, it has only two callers,
>> and it feels a bit suboptimal to ask the command to do a special
>> thing (i.e. "-z") only to clean it up with a pipe.
>
> As a matter of fact, I did not consider that option. I do agree that's a
> much better approach.
>
>> 
>> IOW, something along the lines of (totally untested)...
>> 
>
> How should I proceed with this? Resubmit it after testing with the
> appropriate attribution?

Sure.  

An appropriate attribution would be a "Helped-by: me" at most, but I
do not think for something this small it may not even bee needed.

>
>>  git-stash.sh | 11 +--
>>  1 file changed, 9 insertions(+), 2 deletions(-)
>> 
>> diff --git a/git-stash.sh b/git-stash.sh
>> index 9b6c2da7b4..5f09a47f0a 100755
>> --- a/git-stash.sh
>> +++ b/git-stash.sh
>> @@ -43,9 +43,16 @@ no_changes () {
>>  }
>>  
>>  untracked_files () {
>> +if test "$1" = "-z"
>> +then
>> +shift
>> +z=-z
>> +else
>> +z=
>> +fi
>>  excl_opt=--exclude-standard
>>  test "$untracked" = "all" && excl_opt=
>> -git ls-files -o -z $excl_opt -- "$@"
>> +git ls-files -o $z $excl_opt -- "$@"
>>  }
>>  
>>  clear_stash () {
>> @@ -114,7 +121,7 @@ create_stash () {
>>  # Untracked files are stored by themselves in a parentless 
>> commit, for
>>  # ease of unpacking later.
>>  u_commit=$(
>> -untracked_files "$@" | (
>> +untracked_files -z "$@" | (
>>  GIT_INDEX_FILE="$TMPindex" &&
>>  export GIT_INDEX_FILE &&
>>  rm -f "$TMPindex" &&
>> 
>>

Re: [PATCH] t1002: stop using sum(1)

2017-08-14 Thread Junio C Hamano

René Scharfe  writes:

> It's more convenient because it shows differences nicely, it's faster on
> MinGW because we have a special implementation there based only on
> shell-internal commands,...

This made me wonder why we are not using that "faster" one
everywhere, but it turns out that it depends on bash-ism "local",
which is perfectly fine when limited to MinGW but not safe to assume
in general.

> ...
> With all sum(1) calls gone, remove the MinGW-specific implementation
> from test-lib.sh as well.
>
> [1] http://minnie.tuhs.org/cgi-bin/utree.pl?file=V3/man/man1/sum.1
> [2] 
> http://minnie.tuhs.org/cgi-bin/utree.pl?file=4.4BSD/usr/share/man/cat1/cksum.0
> [3] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/cksum.html
> ---
>  t/t1002-read-tree-m-u-2way.sh | 67 
> ++-
>  t/test-lib.sh |  3 --
>  2 files changed, 35 insertions(+), 35 deletions(-)

Sounds like a sensible approach to clean things up; I didn't check
with fine toothed comb if the patch does follow that approach
correctly without breaking things, though.

Re: [PATCH 1/9] Convert pack-objects to size_t

2017-08-14 Thread Torsten Bögershausen

On Mon, Aug 14, 2017 at 08:31:50PM +0100, Ramsay Jones wrote:
> 
> 
> On 14/08/17 18:08, Junio C Hamano wrote:
> > Junio C Hamano  writes:
> > 
> >> One interesting question is which of these two types we should use
> >> for the size of objects Git uses.  
> >>
> >> Most of the "interesting" operations done by Git require that the
> >> thing is in core as a whole before we can do anything (e.g. compare
> >> two such things to produce delta, have one in core and apply patch),
> >> so it is tempting that we deal with size_t, but at the lowest level
> >> to serve as a SCM, i.e. recording the state of a file at each
> >> version, we actually should be able to exceed the in-core
> >> limit---both "git add" of a huge file whose contents would not fit
> >> in-core and "git checkout" of a huge blob whose inflated contents
> >> would not fit in-core should (in theory, modulo bugs) be able to
> >> exercise the streaming interface to handle such case without holding
> >> everything in-core at once.  So from that point of view, even size_t
> >> may not be the "correct" type to use.
> > 
> > A few additions to the above observations.
> > 
> >  - We have varint that encodes how far the location from a delta
> >representation of an object to its base object in the packfile.
> >Both encoding and decoding sides in the current code use off_t to
> >represent this offset, so we can already reference an object that
> >is far in the same packfile as a base.
> > 
> >  - I think it is OK in practice to limit the size of individual
> >objects to size_t (i.e. on 32-bit arch, you cannot interact with
> >a repository with an object whose size exceeds 4GB).  Using off_t
> >would allow occasional ultra-huge objects that can only be added
> >and checked in via the streaming API on such a platform, but I
> >suspect that it may become too much of a hassle to maintain.
> 
> In a previous comment, I said that (on 32-bit Linux) it was likely
> that an object of > 4GB could not be handled correctly anyway. (more
> likely > 2GB). This was based on the code from (quite some) years ago.
> In particular, before you added the "streaming API". So, maybe a 32-bit
> arch _should_ be able to handle objects as large as the LFS API allows.
> (Ignoring, for the moment, that I think anybody who puts files of that
> size into an SCM probably gets what they deserve. :-P ).

In general, yes.
I had a patch that allowed a 32 bit linux to commit a file >4GB.
There is however a restriction: The file must not yet be known to the
index. Otherwise the "diff" machinery kicks in, and tries to
compare the "old" and the "new" versions, which means that -both-
must fit into "memory" at the same time. Memory means here the available
address space rather then real memory.
So there may be room for improvements for 32 bit systems, but that is another
story, which can be developped independent of the ulong->size_t conversion.

> 
> The two patches I commented on, however, changed the type of some
> variables from off_t to size_t. In general, the patches did not
> seem to make anything worse, but these type changes could potentially
> do harm. Hence my comment. (I still haven't tried the patches on my
> 32-bit Linux system. I only boot it up about once a week, and I would
> rather wait until the patches are in the 'pu' branch before testing).
> 
> ATB,
> Ramsay Jones
> 

And here (in agreement with the answer from Junio):
We should not change any off_t into size_t, since that means a possible
downgrade.
Whenever an off_t is converted into size_t, a function in git-compat-util.h is
used:

static inline size_t xsize_t(off_t len)
{
if (len > (size_t) len)
die("Cannot handle files this big");
return (size_t)len;
}

Having written this, it may be a good idea to define a similar function
xulong() which is used when we do calls into zlib.

Thanks to Martin for working on this.
If there is a branch on a public repo, I can e.g. run the test suite on
different systems.

Re: [PATCH] stash: prevent warning about null bytes in input

2017-08-14 Thread Kevin Daudt

On Mon, Aug 14, 2017 at 12:51:26PM -0700, Junio C Hamano wrote:
> Kevin Daudt  writes:
> 
> > The no_changes function calls the untracked_files function through
> > command substitution. untracked_files will return null bytes because it
> > runs ls-files with the '-z' option.
> >
> > Bash since version 4.4 warns about these null bytes. As they are not
> > required for the test that is being done, remove null bytes from the
> > input.
> 
> That's an interesting one ;-)
> 
> I wonder if you considered giving an option to untracked_files
> helper function, though.  After all, it has only two callers,
> and it feels a bit suboptimal to ask the command to do a special
> thing (i.e. "-z") only to clean it up with a pipe.

As a matter of fact, I did not consider that option. I do agree that's a
much better approach.

> 
> IOW, something along the lines of (totally untested)...
> 

How should I proceed with this? Resubmit it after testing with the
appropriate attribution?


>  git-stash.sh | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/git-stash.sh b/git-stash.sh
> index 9b6c2da7b4..5f09a47f0a 100755
> --- a/git-stash.sh
> +++ b/git-stash.sh
> @@ -43,9 +43,16 @@ no_changes () {
>  }
>  
>  untracked_files () {
> + if test "$1" = "-z"
> + then
> + shift
> + z=-z
> + else
> + z=
> + fi
>   excl_opt=--exclude-standard
>   test "$untracked" = "all" && excl_opt=
> - git ls-files -o -z $excl_opt -- "$@"
> + git ls-files -o $z $excl_opt -- "$@"
>  }
>  
>  clear_stash () {
> @@ -114,7 +121,7 @@ create_stash () {
>   # Untracked files are stored by themselves in a parentless 
> commit, for
>   # ease of unpacking later.
>   u_commit=$(
> - untracked_files "$@" | (
> + untracked_files -z "$@" | (
>   GIT_INDEX_FILE="$TMPindex" &&
>   export GIT_INDEX_FILE &&
>   rm -f "$TMPindex" &&
> 
>

Re: [PATCH] win32: plug memory leak on realloc() failure in syslog()

2017-08-14 Thread Johannes Schindelin

Hi René,

On Thu, 10 Aug 2017, René Scharfe wrote:

> If realloc() fails then the original buffer is still valid.  Free it
> before exiting the function.
> 
> Signed-off-by: Rene Scharfe 

The subject had me worried for a second... The realloc() fails so rarely
that I, for one, have never encountered that problem.

Still, it is a correct fix for a real bug.

Thanks,
Dscho

Re: [PATCH v3 1/2 / RFC] builtin/branch: stop supporting the use of --set-upstream option

2017-08-14 Thread Junio C Hamano

Kaartic Sivaraam  writes:

> The '--set-upstream' option of branch was deprecated in,
>
> b347d06bf branch: deprecate --set-upstream and show help if we
> detect possible mistaken use (Thu, 30 Aug 2012 19:23:13 +0200)
>
> It was deprecated for the reasons specified in the commit message of the
> referenced commit.

I wonder if these two lines add any value here.  Those who know the
reason would not be helped, and those who don't know have to view
"git show b347d06bf" anyway.

> Make 'branch' die with an appropraite error message when the '--set-upstream'
> option is used.

OK.

> Note that there's a reason behind "dying with an error message" instead of
> "not accepting the option". 'git branch' would *accept* '--set-upstream'
> even after it's removal as a consequence of,
>
> Unique prefix can be abbrievated in option names
>
>   AND
>
> '--set-upstream' is a unique prefix of '--set-upstream-to'
>(when the '--set-upstream' option has been removed)
>
> In order to smooth the transition for users and to avoid them being affected
> by the "prefix issue" it was decided to make branch die when seeing the
> '--set-upstream' flag for a few years and let the users know that it would be
> removed some time in the future.

I somehow think the above wastes bits a bit too much.  Wouldn't it
be sufficient to say

In order to prevent "--set-upstream" on a command line from
being taken as an abbreviated form of "--set-upstream-to",
explicitly catch "--set-upstream" option and die, instead of
just removing it from the list of options.

> $ git branch --set-upstream origin/master
> The --set-upstream flag is deprecated and will be removed. Consider using 
> --track or --set-upstream-to
> Branch origin/master set up to track local branch master.

> After,
>
> $ git branch
> * master
>
> $ git branch --set-upstream origin/master
> fatal: the '--set-upstream' flag is no longer supported and will be 
> removed. Consider using '--track' or '--set-upstream-to'

Because from the end-user's point of view, it has already been
removed, I'd phrase it more like

The --set-upstream option has been removed.  Use --track or ...

and make sure we do not list "--set-upstream" in the list of
supported options in

git branch -h

output.

>  A query,
>
> I see the following code in the code path a little above the die statement
> added in this change,
>
> if (!strcmp(argv[0], "HEAD"))
>   die(_("it does not make sense to create 'HEAD' 
> manually"));
>
> It does seem to be doing quite a nice job of avoiding an ambiguity that 
> could
> have bad consequences but it's still possible to create a branch named 
> 'HEAD'
> using the '-b' option of 'checkout'. Should 'git checkout -b HEAD' 
> actually
> fail(it does not currently) for the same reason 'git branch HEAD' fails?
>
> My guess is that people would use 'git checkout -b  
> '
> more than it's 'git branch' counterpart.

Thanks for noticing.  I offhand see no reason not to do what you
suggest above.

> - OPT_SET_INT( 0, "set-upstream",  , N_("change upstream 
> info"),
> + OPT_SET_INT( 0, "set-upstream",  , N_("no longer 
> supported"),
>   BRANCH_TRACK_OVERRIDE),

Here we would want to use something like

{ OPTION_SET_INT, 0, "set-upstream", , NULL, N_("do not use"),
  PARSE_OPT_NOARG | PARSE_OPT_HIDDEN, NULL, BRANCH_TRACK_OVERRIDE },

in order to hide the option from "git branch -h" output.

All review comments from Martin were also good ones, and I won't
repeat them here.

Thanks.

[PATCH] t1002: stop using sum(1)

2017-08-14 Thread René Scharfe

sum(1) is a command for calculating checksums of the contents of files.
It was part of early editions of Unix ("Research Unix", 1972/1973, [1]).
cksum(1) appeared in 4.4BSD (1993) as a replacement [2], and became part
of POSIX.1-2008 [3].  OpenBSD 5.6 (2014) removed sum(1).

We only use sum(1) in t1002 to check for changes in three files.  On
MinGW we use md5sum(1) instead.  We could switch to the standard command
cksum(1) for all platforms; MinGW comes with GNU coreutils now, which
provides sum(1), cksum(1) and md5sum(1).  Use our standard method for
checking for file changes instead: test_cmp.

It's more convenient because it shows differences nicely, it's faster on
MinGW because we have a special implementation there based only on
shell-internal commands, it's simpler as it allows us to avoid stripping
out unnecessary entries from the checksum file using grep(1), and it's
more consistent with the rest of the test suite.

We already compare changed files with their expected new contents using
diff(1), so we don't need to check with "test_must_fail test_cmp" if
they differ from their original state.  A later patch could convert the
direct diff(1) calls to test_cmp as well.

With all sum(1) calls gone, remove the MinGW-specific implementation
from test-lib.sh as well.

[1] http://minnie.tuhs.org/cgi-bin/utree.pl?file=V3/man/man1/sum.1
[2] 
http://minnie.tuhs.org/cgi-bin/utree.pl?file=4.4BSD/usr/share/man/cat1/cksum.0
[3] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/cksum.html
---
 t/t1002-read-tree-m-u-2way.sh | 67 ++-
 t/test-lib.sh |  3 --
 2 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/t/t1002-read-tree-m-u-2way.sh b/t/t1002-read-tree-m-u-2way.sh
index e3bf821694..7ca2e65d10 100755
--- a/t/t1002-read-tree-m-u-2way.sh
+++ b/t/t1002-read-tree-m-u-2way.sh
@@ -51,7 +51,9 @@ test_expect_success \
  treeM=$(git write-tree) &&
  echo treeM $treeM &&
  git ls-tree $treeM &&
- sum bozbar frotz nitfol >M.sum &&
+ cp bozbar bozbar.M &&
+ cp frotz frotz.M &&
+ cp nitfol nitfol.M &&
  git diff-tree $treeH $treeM'
 
 test_expect_success \
@@ -61,8 +63,9 @@ test_expect_success \
  read_tree_u_must_succeed -m -u $treeH $treeM &&
  git ls-files --stage >1-3.out &&
  cmp M.out 1-3.out &&
- sum bozbar frotz nitfol >actual3.sum &&
- cmp M.sum actual3.sum &&
+ test_cmp bozbar.M bozbar &&
+ test_cmp frotz.M frotz &&
+ test_cmp nitfol.M nitfol &&
  check_cache_at bozbar clean &&
  check_cache_at frotz clean &&
  check_cache_at nitfol clean'
@@ -79,8 +82,9 @@ test_expect_success \
  test_might_fail git diff -U0 --no-index M.out 4.out >4diff.out &&
  compare_change 4diff.out expected &&
  check_cache_at yomin clean &&
- sum bozbar frotz nitfol >actual4.sum &&
- cmp M.sum actual4.sum &&
+ test_cmp bozbar.M bozbar &&
+ test_cmp frotz.M frotz &&
+ test_cmp nitfol.M nitfol &&
  echo yomin >yomin1 &&
  diff yomin yomin1 &&
  rm -f yomin1'
@@ -98,8 +102,9 @@ test_expect_success \
  test_might_fail git diff -U0 --no-index M.out 5.out >5diff.out &&
  compare_change 5diff.out expected &&
  check_cache_at yomin dirty &&
- sum bozbar frotz nitfol >actual5.sum &&
- cmp M.sum actual5.sum &&
+ test_cmp bozbar.M bozbar &&
+ test_cmp frotz.M frotz &&
+ test_cmp nitfol.M nitfol &&
  : dirty index should have prevented -u from checking it out. &&
  echo yomin yomin >yomin1 &&
  diff yomin yomin1 &&
@@ -115,8 +120,9 @@ test_expect_success \
  git ls-files --stage >6.out &&
  test_cmp M.out 6.out &&
  check_cache_at frotz clean &&
- sum bozbar frotz nitfol >actual3.sum &&
- cmp M.sum actual3.sum &&
+ test_cmp bozbar.M bozbar &&
+ test_cmp frotz.M frotz &&
+ test_cmp nitfol.M nitfol &&
  echo frotz >frotz1 &&
  diff frotz frotz1 &&
  rm -f frotz1'
@@ -132,8 +138,8 @@ test_expect_success \
  git ls-files --stage >7.out &&
  test_cmp M.out 7.out &&
  check_cache_at frotz dirty &&
- sum bozbar frotz nitfol >actual7.sum &&
- if cmp M.sum actual7.sum; then false; else :; fi &&
+ test_cmp bozbar.M bozbar &&
+ test_cmp nitfol.M nitfol &&
  : dirty index should have prevented -u from checking it out. &&
  echo frotz frotz >frotz1 &&
  diff frotz frotz1 &&
@@ -165,8 +171,10 @@ test_expect_success \
  read_tree_u_must_succeed -m -u $treeH $treeM &&
  git ls-files --stage >10.out &&
  cmp M.out 10.out &&
- sum bozbar frotz nitfol >actual10.sum &&
- cmp M.sum actual10.sum'
+ test_cmp bozbar.M bozbar &&
+ test_cmp frotz.M frotz &&
+ test_cmp nitfol.M nitfol
+'
 
 test_expect_success \
 '11 - dirty path removed.' \
@@ -209,11 +217,8 @@ test_expect_success \
  git ls-files --stage >14.out &&
  test_must_fail git diff -U0 --no-index M.out 14.out >14diff.out &&
  compare_change 14diff.out

Re: [PATCH 1/9] Convert pack-objects to size_t

2017-08-14 Thread Junio C Hamano

Ramsay Jones  writes:

> In a previous comment, I said that (on 32-bit Linux) it was likely
> that an object of > 4GB could not be handled correctly anyway. (more
> likely > 2GB). This was based on the code from (quite some) years ago.
> In particular, before you added the "streaming API". So, maybe a 32-bit
> arch _should_ be able to handle objects as large as the LFS API allows.
> (Ignoring, for the moment, that I think anybody who puts files of that
> size into an SCM probably gets what they deserve. :-P ).
>
> The two patches I commented on, however, changed the type of some
> variables from off_t to size_t. In general, the patches did not
> seem to make anything worse, but these type changes could potentially
> do harm. Hence my comment. (I still haven't tried the patches on my
> 32-bit Linux system. I only boot it up about once a week, and I would
> rather wait until the patches are in the 'pu' branch before testing).

We are in perfect agreement.

I didn't mean to say that it is OK to replace off_t with size_t
without a good reason, especially when the current code (at least
the part I looked at anyway, like the OFS_DELTA part) seems to use
off_t correctly, and your review comments are very much appreciated,
so is the effort started by Martin to take us in the direction of
using types more appropriate than "ulong".

Re: [RFC PATCH 0/3] Fixes to "diff --color-moved" MIN_BLOCK_LENGTH handling

2017-08-14 Thread Stefan Beller

On Mon, Aug 14, 2017 at 12:37 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> On Fri, Aug 11, 2017 at 5:39 PM, Junio C Hamano  wrote:
>> ...
>>> My preference however is to keep sb/diff-color-move topic as-is
>>> without replacing and fixing it with incremental updates like these
>>> patches.
>>
>> I would have hoped to not need to reroll that topic.
>> Though I do find patches 1&2 valuable either on top or squashed
>> into "[PATCH] diff.c: color moved lines differently" and
>> "[PATCH] diff.c: color moved lines differently, plain mode"
>> respectively.
>>
>> So I'd ask to pick at least patches 1&2 on top of that series, please?
>
> Yeah, that is exactly what I did before reading this message but
> after reading your comments on the patches ;-)
>
>> (I am missing the context for *why* you preference is to not do
>> exactly this).
>
> I see what I wrote can be misread, especially due to its lack of
> ",instead", that I want to keep the broken one as-is, with neither
> reroll nor fixup.  That is not what I meant.
>
>  - If you choose to squash so that the resulting history after the
>series graduates to 'master' will be simpler to read (due to lack
>of "oops, that was a mistake"), I do not mind a reroll.
>
>  - On the other hand, as the topic has been in 'next' for some time
>and presumably people tried it in their real daily work when
>needed, keeping what is queued as-is has a value---we have a
>fixed reference point that we can go back to to compare the code
>with and without the fix.
>
> I do not have a strong preference, but if I were asked to choose,
> I'd choose the latter.

We'll go with the latter then. Thanks!

Other reasons for the latter that I want to add:
* The patches are written 2 month apart, which may indicate that
  there was real usage and hence fixes with a more substantiated
  understanding of the new feature.
* We should not strive for "perfect" history IMHO. That is because
  commit messages provide a lot of reasoning and add a lot of value
  for understanding the code. If I were to squash and reroll, I would
  need to make sure these points are addressed in the commit
  message to have a result that is equally good.
  The history only needs to be "good-enough", which we defined to
  "bisectable on all platforms that we care about", fixups/bugfixes
  are like the cherry on the cake, it draw attention on its own.
  Not a bad thing IMHO.

Re: [PATCH] stash: prevent warning about null bytes in input

2017-08-14 Thread Junio C Hamano

Kevin Daudt  writes:

> The no_changes function calls the untracked_files function through
> command substitution. untracked_files will return null bytes because it
> runs ls-files with the '-z' option.
>
> Bash since version 4.4 warns about these null bytes. As they are not
> required for the test that is being done, remove null bytes from the
> input.

That's an interesting one ;-)

I wonder if you considered giving an option to untracked_files
helper function, though.  After all, it has only two callers,
and it feels a bit suboptimal to ask the command to do a special
thing (i.e. "-z") only to clean it up with a pipe.

IOW, something along the lines of (totally untested)...

 git-stash.sh | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/git-stash.sh b/git-stash.sh
index 9b6c2da7b4..5f09a47f0a 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -43,9 +43,16 @@ no_changes () {
 }
 
 untracked_files () {
+   if test "$1" = "-z"
+   then
+   shift
+   z=-z
+   else
+   z=
+   fi
excl_opt=--exclude-standard
test "$untracked" = "all" && excl_opt=
-   git ls-files -o -z $excl_opt -- "$@"
+   git ls-files -o $z $excl_opt -- "$@"
 }
 
 clear_stash () {
@@ -114,7 +121,7 @@ create_stash () {
# Untracked files are stored by themselves in a parentless 
commit, for
# ease of unpacking later.
u_commit=$(
-   untracked_files "$@" | (
+   untracked_files -z "$@" | (
GIT_INDEX_FILE="$TMPindex" &&
export GIT_INDEX_FILE &&
rm -f "$TMPindex" &&

Re: [RFC PATCH 0/3] Fixes to "diff --color-moved" MIN_BLOCK_LENGTH handling

2017-08-14 Thread Junio C Hamano

Stefan Beller  writes:

> On Fri, Aug 11, 2017 at 5:39 PM, Junio C Hamano  wrote:
> ...
>> My preference however is to keep sb/diff-color-move topic as-is
>> without replacing and fixing it with incremental updates like these
>> patches.
>
> I would have hoped to not need to reroll that topic.
> Though I do find patches 1&2 valuable either on top or squashed
> into "[PATCH] diff.c: color moved lines differently" and
> "[PATCH] diff.c: color moved lines differently, plain mode"
> respectively.
>
> So I'd ask to pick at least patches 1&2 on top of that series, please?

Yeah, that is exactly what I did before reading this message but
after reading your comments on the patches ;-)

> (I am missing the context for *why* you preference is to not do
> exactly this).

I see what I wrote can be misread, especially due to its lack of
",instead", that I want to keep the broken one as-is, with neither
reroll nor fixup.  That is not what I meant.

 - If you choose to squash so that the resulting history after the
   series graduates to 'master' will be simpler to read (due to lack
   of "oops, that was a mistake"), I do not mind a reroll.  

 - On the other hand, as the topic has been in 'next' for some time
   and presumably people tried it in their real daily work when
   needed, keeping what is queued as-is has a value---we have a
   fixed reference point that we can go back to to compare the code
   with and without the fix.

I do not have a strong preference, but if I were asked to choose,
I'd choose the latter.

Thanks.

Re: [PATCH 1/9] Convert pack-objects to size_t

2017-08-14 Thread Ramsay Jones

On 14/08/17 18:08, Junio C Hamano wrote:
> Junio C Hamano  writes:
> 
>> One interesting question is which of these two types we should use
>> for the size of objects Git uses.  
>>
>> Most of the "interesting" operations done by Git require that the
>> thing is in core as a whole before we can do anything (e.g. compare
>> two such things to produce delta, have one in core and apply patch),
>> so it is tempting that we deal with size_t, but at the lowest level
>> to serve as a SCM, i.e. recording the state of a file at each
>> version, we actually should be able to exceed the in-core
>> limit---both "git add" of a huge file whose contents would not fit
>> in-core and "git checkout" of a huge blob whose inflated contents
>> would not fit in-core should (in theory, modulo bugs) be able to
>> exercise the streaming interface to handle such case without holding
>> everything in-core at once.  So from that point of view, even size_t
>> may not be the "correct" type to use.
> 
> A few additions to the above observations.
> 
>  - We have varint that encodes how far the location from a delta
>representation of an object to its base object in the packfile.
>Both encoding and decoding sides in the current code use off_t to
>represent this offset, so we can already reference an object that
>is far in the same packfile as a base.
> 
>  - I think it is OK in practice to limit the size of individual
>objects to size_t (i.e. on 32-bit arch, you cannot interact with
>a repository with an object whose size exceeds 4GB).  Using off_t
>would allow occasional ultra-huge objects that can only be added
>and checked in via the streaming API on such a platform, but I
>suspect that it may become too much of a hassle to maintain.

In a previous comment, I said that (on 32-bit Linux) it was likely
that an object of > 4GB could not be handled correctly anyway. (more
likely > 2GB). This was based on the code from (quite some) years ago.
In particular, before you added the "streaming API". So, maybe a 32-bit
arch _should_ be able to handle objects as large as the LFS API allows.
(Ignoring, for the moment, that I think anybody who puts files of that
size into an SCM probably gets what they deserve. :-P ).

The two patches I commented on, however, changed the type of some
variables from off_t to size_t. In general, the patches did not
seem to make anything worse, but these type changes could potentially
do harm. Hence my comment. (I still haven't tried the patches on my
32-bit Linux system. I only boot it up about once a week, and I would
rather wait until the patches are in the 'pu' branch before testing).

ATB,
Ramsay Jones

Re: Not understanding with git wants to copy one file to another

2017-08-14 Thread Junio C Hamano

Stefan Beller  writes:

> On Fri, Aug 11, 2017 at 1:41 PM, Harry Putnam  wrote:
>> Stefan Beller  writes:
>>
>>
>> [...]
>>
>>> Ah. Sorry for confusing even more.
>>> By pointing out the options for git-diff, I just wanted to point out that
>>> such a mechanism ("rename/copy detection") exists.
>>
>>
>> [...]
>>
 What am I missing?

>>>
>>> https://www.reddit.com/r/git/comments/3ogkk1/beginner_disable_rename_detection/
>>>
>>> "Rename detection is just GUI sugar".
>>
>> Thanks there is a nice full explanation at the cited url.
>>
>> What is still a bit puzzling is that in that same commit, there are
>> files that are true copies of each other, just in different locations,
>> But nothing pops up about them in a git commit.
>>
>
> The heuristic to find the renames/copies only looks at modified files
> to be fast(, the assumption is that each commit only touches few
> files, but the project consists of a lot of files).
>
> For that git-diff knows about '--find-copies-harder' that looks at
> all files even those not modified. This would point out the true
> copies, I would assume.
>
> I don't think we'd want to include the '--find-copies-harder' flag
> to status or commit, as it may take some time in large projects.

Yeah, thanks for helping in this discussion.

Re: [PATCH v3 1/2 / RFC] builtin/branch: stop supporting the use of --set-upstream option

2017-08-14 Thread Martin Ågren

On 14 August 2017 at 10:54, Kaartic Sivaraam
 wrote:
> The '--set-upstream' option of branch was deprecated in,
>
> b347d06bf branch: deprecate --set-upstream and show help if we
> detect possible mistaken use (Thu, 30 Aug 2012 19:23:13 +0200)
>
> It was deprecated for the reasons specified in the commit message of the
> referenced commit.
>
> Make 'branch' die with an appropraite error message when the '--set-upstream'
> option is used.
>
> Note that there's a reason behind "dying with an error message" instead of
> "not accepting the option". 'git branch' would *accept* '--set-upstream'
> even after it's removal as a consequence of,
>
> Unique prefix can be abbrievated in option names
>
>   AND
>
> '--set-upstream' is a unique prefix of '--set-upstream-to'
>(when the '--set-upstream' option has been removed)
>
> In order to smooth the transition for users and to avoid them being affected
> by the "prefix issue" it was decided to make branch die when seeing the
> '--set-upstream' flag for a few years and let the users know that it would be
> removed some time in the future.
>
> The before/after behaviour for a simple case follows,
>
> $ git remote
> origin
>
> Before,
>
> $ git branch
> * master
>
> $ git branch --set-upstream origin/master
> The --set-upstream flag is deprecated and will be removed. Consider using 
> --track or --set-upstream-to
> Branch origin/master set up to track local branch master.
>
> $ echo $?
> 0
>
> $ git branch
> * master
>   origin/master
>
> After,
>
> $ git branch
> * master
>
> $ git branch --set-upstream origin/master
> fatal: the '--set-upstream' flag is no longer supported and will be 
> removed. Consider using '--track' or '--set-upstream-to'
>
> $ echo $?
> 128
>
> $ git branch
> * master
>
> Signed-off-by: Kaartic Sivaraam 
> ---
>  Changes in v3:
>
> A few tweaks to the following:
>  * Commit message
>  * Error message (the one shown when '--set-upstream' is seen)
>  * Updated the corresponding message in the options structure
>  * Documentation
>
>  A query,
>
> I see the following code in the code path a little above the die statement
> added in this change,
>
> if (!strcmp(argv[0], "HEAD"))
> die(_("it does not make sense to create 'HEAD' 
> manually"));
>
> It does seem to be doing quite a nice job of avoiding an ambiguity that 
> could
> have bad consequences but it's still possible to create a branch named 
> 'HEAD'
> using the '-b' option of 'checkout'. Should 'git checkout -b HEAD' 
> actually
> fail(it does not currently) for the same reason 'git branch HEAD' fails?
>
> My guess is that people would use 'git checkout -b  
> '
> more than it's 'git branch' counterpart.
>
>
>  Documentation/git-branch.txt |  8 +++
>  builtin/branch.c | 23 ++--
>  t/t3200-branch.sh| 50 
> 
>  t/t6040-tracking-info.sh | 16 +++---
>  4 files changed, 18 insertions(+), 79 deletions(-)
>
> diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt
> index 81bd0a7b7..93ee05c55 100644
> --- a/Documentation/git-branch.txt
> +++ b/Documentation/git-branch.txt
> @@ -195,10 +195,10 @@ start-point is either a local or remote-tracking branch.
> branch.autoSetupMerge configuration variable is true.
>
>  --set-upstream::
> -   If specified branch does not exist yet or if `--force` has been
> -   given, acts exactly like `--track`. Otherwise sets up configuration
> -   like `--track` would when creating the branch, except that where
> -   branch points to is not changed.
> +   As this option has confusing syntax it's no longer supported. Please 
> use

"has" or "had"? (I guess when someone reads this, it "has" no syntax at
all. ;) )

> +  --track or --set-upstream-to  instead.

Maybe indent with a tab instead of two spaces for consistency with the
rest of the file.

> ++
> +Note: This could possibly become an alias of --set-upstream-to in the future.

(But not here.)

>
>  -u ::
>  --set-upstream-to=::
> diff --git a/builtin/branch.c b/builtin/branch.c
> index a3bd2262b..b92070393 100644
> --- a/builtin/branch.c
> +++ b/builtin/branch.c
> @@ -557,7 +557,7 @@ int cmd_branch(int argc, const char **argv, const char 
> *prefix)
> OPT__QUIET(, N_("suppress informational messages")),
> OPT_SET_INT('t', "track",  , N_("set up tracking mode 
> (see git-pull(1))"),
> BRANCH_TRACK_EXPLICIT),
> -   OPT_SET_INT( 0, "set-upstream",  , N_("change upstream 
> info"),
> +   OPT_SET_INT( 0, "set-upstream",  , N_("no longer 
> supported"),
> BRANCH_TRACK_OVERRIDE),
>

Re: Bug with corruption on clone/fsck/... with large packs + 64-bit Windows, problem with usage of "long" datatype for sizes/offsets?

2017-08-14 Thread Junio C Hamano

Stefan Beller  writes:

> On Mon, Aug 14, 2017 at 9:05 AM, Johannes Schindelin
>  wrote:
>> Hi Christoph,
>>
>> On Fri, 11 Aug 2017, Dr.-Ing. Christoph Cullmann wrote:
>>
>>> on Windows 64-bit, for a repository having a .pack file > 4GB I get
>>> during cloning:
>>
>> The reason is Git's source code that over-uses the `unsigned long`
>> datatype.
>>
>> In a nearby-thread, an underappreciated effort by Martin Koegler is
>> underway to get the ball rolling in getting it fixed. Maybe you can help
>> making Martin a lot more welcome on the Git mailing list, and maybe even
>> help getting his patches reviewed and integrated?
>
> 'nearby' as in [1] ;-)
>
> [1] 
> https://public-inbox.org/git/1502527643-21944-1-git-send-email-martin@mail.zuhause/
>
> I had the impression the review is going well there?

I do not know if it is "going well", but I do not agree with the
"underappreciated" bit at all.  I find such a blanket statement
toxic and detrimental to the community.

It is true that many topics in flight are broken with a tree-wide
change that is presented as a single ball of wax.  Unlike the
ongoing "struct object_id" effort, which also is tree-wide but tries
to find a step-wise refinement to reduce its impact on other topics,
it is harder to integrate such a change.  But that does not have any
relation to how much the effort is appreciated.

GOOD NEWS

2017-08-14 Thread REV .JIMMY


Attention My Dear

Your first payment of $5000.00 will be sent today via Western Union.if you 
provide the needed information 
You are advise to Contact Western Union with your full information to enable 
them give 
you  the Senders Name, Question and Answer to pick up your First $5000  
 For more information contact Rev. Jimmy 
Wilson, text me (229) 231-6473)  he'll keep sending you payment until your 
total fund of $500.000usd is Completed 

Receiver's Name___
Address: 
Country: 
Phone Number: _

Best Regards,
Mr emeka  Oba or
( 254)6724705)

Re: [PATCH] cache-tree: remove use of strbuf_addf in update_one

2017-08-14 Thread Junio C Hamano

Jeff King  writes:

> On Thu, Aug 10, 2017 at 11:58:34AM -0700, Stefan Beller wrote:
>
>> On Thu, Aug 10, 2017 at 11:47 AM, Kevin Willford  
>> wrote:
>> > String formatting can be a performance issue when there are
>> > hundreds of thousands of trees.
>> 
>> When changing this for the sake of performance, could you give
>> an example (which kind of repository you need for this to become
>> a bottleneck? I presume the large Windows repo? Or can I
>> reproduce it with a small repo such as linux.git or even git.git?)
>> and some numbers how this improves the performance?
>
> I was about to say the same thing. Normally I don't mind a small
> optimization without numbers if the result is obviously an improvement.
>
> But in this case the result is a lot less readable, and it's not
> entirely clear to me that it would always be an improvement (we now
> always run 3 strbuf calls instead of one, and have to check the length
> for each one).
>
> What I'm wondering specifically is if vsnprintf() on Kevin's platform
> (which I'll assume is Windows) is slow, and we would do better to
> replace it with a faster compat/ routine.

Yeah, I had the same reaction.

Re: [PATCH v2 1/2] format-patch: have progress option while generating patches

2017-08-14 Thread Junio C Hamano

Junio C Hamano  writes:

> Perhaps we may want to replace the calls to progress_delay() with a
> call to a simpler wrapper that does not let the callers give their
> own delay threashold to simplify the API.

... which does not look too bad, but because it makes me wonder if
we even need to make the delay-threshold customizable per callers,
I'll wait further discussions before committing to the approach.

For example, I do not quite understand why 95 is a good value for
prune-packed while 0 is a good value for prune.  

The rename detection in diffcore-rename, delay in blame, and
checkout (via unpack-trees) all of which use 50-percent threshold
with 1 second delay, sort of make sense to me in that if we
completed 50 percent within 1 second, it is likely we will finish
all in 2 seconds (which is the norm for everybody else), perhaps as
a better version of 0-percent 2 seconds rule.

 builtin/blame.c|  3 +--
 builtin/fsck.c |  2 +-
 builtin/prune-packed.c |  4 ++--
 builtin/prune.c|  2 +-
 builtin/rev-list.c |  2 +-
 diffcore-rename.c  |  4 ++--
 progress.c | 10 --
 progress.h |  4 ++--
 unpack-trees.c |  3 +--
 9 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index bda1a78726..05115900f3 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -925,8 +925,7 @@ int cmd_blame(int argc, const char **argv, const char 
*prefix)
sb.found_guilty_entry = _guilty_entry;
sb.found_guilty_entry_data = 
if (show_progress)
-   pi.progress = start_progress_delay(_("Blaming lines"),
-  sb.num_lines, 50, 1);
+   pi.progress = start_delayed_progress(_("Blaming lines"), 
sb.num_lines, 50);
 
assign_blame(, opt);
 
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 99dea7adf6..6a26079ebc 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -188,7 +188,7 @@ static int traverse_reachable(void)
unsigned int nr = 0;
int result = 0;
if (show_progress)
-   progress = start_progress_delay(_("Checking connectivity"), 0, 
0, 2);
+   progress = start_delayed_progress(_("Checking connectivity"), 
0, 0);
while (pending.nr) {
struct object_array_entry *entry;
struct object *obj;
diff --git a/builtin/prune-packed.c b/builtin/prune-packed.c
index ac978ad401..4fc6b175de 100644
--- a/builtin/prune-packed.c
+++ b/builtin/prune-packed.c
@@ -37,8 +37,8 @@ static int prune_object(const struct object_id *oid, const 
char *path,
 void prune_packed_objects(int opts)
 {
if (opts & PRUNE_PACKED_VERBOSE)
-   progress = start_progress_delay(_("Removing duplicate objects"),
-   256, 95, 2);
+   progress = start_delayed_progress(_("Removing duplicate 
objects"),
+ 256, 95);
 
for_each_loose_file_in_objdir(get_object_directory(),
  prune_object, NULL, prune_subdir, );
diff --git a/builtin/prune.c b/builtin/prune.c
index c378690545..f32adaa0e9 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -138,7 +138,7 @@ int cmd_prune(int argc, const char **argv, const char 
*prefix)
if (show_progress == -1)
show_progress = isatty(2);
if (show_progress)
-   progress = start_progress_delay(_("Checking connectivity"), 0, 
0, 2);
+   progress = start_delayed_progress(_("Checking connectivity"), 
0, 0);
 
mark_reachable_objects(, 1, expire, progress);
stop_progress();
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 95d84d5cda..5cfc4b35f4 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -364,7 +364,7 @@ int cmd_rev_list(int argc, const char **argv, const char 
*prefix)
revs.limited = 1;
 
if (show_progress)
-   progress = start_progress_delay(show_progress, 0, 0, 2);
+   progress = start_delayed_progress(show_progress, 0, 0);
 
if (use_bitmap_index && !revs.prune) {
if (revs.count && !revs.left_right && !revs.cherry_mark) {
diff --git a/diffcore-rename.c b/diffcore-rename.c
index 786f389498..0eafa43618 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -532,9 +532,9 @@ void diffcore_rename(struct diff_options *options)
}
 
if (options->show_rename_progress) {
-   progress = start_progress_delay(
+   progress = start_delayed_progress(
_("Performing inexact rename detection"),
-   rename_dst_nr * rename_src_nr, 50, 1);
+   rename_dst_nr * rename_src_nr, 50);
}
 
mx = xcalloc(st_mult(NUM_CANDIDATE_PER_DST, num_create), sizeof(*mx));
diff --git a/progress.c b/progress.c
index 73e36d4a42..eeebe15b6f

Re: [External] Re: gitk -m ?

2017-08-14 Thread Uxío Prego

Maybe there is a chance that combining `gitk `git log -m --follow
--pretty=format:%h PATHSPEC`` with typing in _Enter files and directories to
include, one per line:_ any (or maybe all) of the paths the doc had through
history, would do; but /me more pessimistic now.
Uxío Prego



Madiva Soluciones
CL / SERRANO GALVACHE 56
BLOQUE ABEDUL PLANTA 4
28033 MADRID
+34 917 56 84 94
www.madiva.com
www.bbva.com

The activity of email inboxes can be systematically tracked by
colleagues, business partners and third parties. Turn off automatic
loading of images to hamper it.


2017-08-14 19:54 GMT+02:00 Burkhardt, Glenn BUTAS
:
> Neither of those two work for me.  They don't limit the view to the single 
> file of interest.
>
> Also, I tried "additional arguments to git log", using "-m --follow".  I 
> filled in the single file of interest in the 'Enter files' section.  The 
> error message was:
>
> Can't parse git log output:  {commit 9cc8be... faab99... }
>
> -Original Message-
> From: Uxío Prego [mailto:upr...@madiva.com]
> Sent: Monday, August 14, 2017 12:47
> To: Burkhardt, Glenn B UTAS
> Cc: git@vger.kernel.org
> Subject: Re: [External] Re: gitk -m ?
>
> I do not know if you can do what you want, mostly if you can do what you want 
> as simply as you might be wanting that you want it, but I guess you could use 
> this gitk boot command as a _simple_ work around somehow aliased within your 
> command line configuration:
>
> $ gitk (--all)? $(git log -m --follow --pretty=format:%h PATHSPEC)
>
> Alternatively, there is a _view configuration_ menu (_new view_, _edit view_) 
> where there is a text box labeled _Command to generate more commits to 
> include_. If you type here:
>
> git log -m --follow --pretty=format:%h PATHSPEC
>
> I do not know what will happen and I can not test that now (I eventually 
> will), but chances are it could do what you wanted. Maybe you can even use 
> custom aliases in there that text box.
>
> I guess you are receiving a more authorized answer soonish, in the meanwhile, 
> hope that helped.
>
> Regards,
>
>
> On 14 Aug 2017, at 15:20, Burkhardt, Glenn B UTAS 
>  wrote:
>
> They don't.  In particular, information about commits that are parts of 
> merges is missing.
>
> Here's an example.  There are only two entries listed in 'gitk --all'
> for a particular file (sorry, I'd prefer to include a screen sho, but the 
> mailing list doesn't allow HTML messages).
>
> gitk --all MANIFEST.MF
>
> Parent: f7462684ae78720aac05c929256d770118cf01fa (initial clone from 
> Clearcase integ3 branch)
> Branches: master, remotes/origin/master, remotes/origin/ww, ww
> Follows:
> Precedes:
>
>require java 1.8
>
> Child:  240f151d61fd4fd06f377bc52970b3574e5f9031 (require java 1.8)
> Branches: master, remotes/origin/master, remotes/origin/ww, ww
> Follows:
> Precedes:
>
>initial clone from Clearcase integ3 branch
>
>
> git log with '-m' and '-follow' shows:
>
> $ git log -m --follow --oneline MANIFEST.MF
> 9cc8be4 (from 1222d7c) Merge branch 'master' into ww; strategy "ours"
> a423f2d (from f869950) merge from ww branch; remove Bundle-NativeCode
> 51f0628 (from 2c6478c) Merge branch 'ww' of coverity:rmps into ww
> 240f151 require java 1.8
> f746268 initial clone from Clearcase integ3 branch
>
>
> -Original Message-
> From: Uxío Prego [mailto:upr...@madiva.com]
> Sent: Monday, August 14, 2017 01:12
> To: Burkhardt, Glenn B UTAS
> Cc: git@vger.kernel.org
> Subject: [External] Re: gitk -m ?
>
> Not sure what you are wanting to achieve, but please make sure neither `gitk 
> PATHSPEC` nor `gitk --all PATHSPEC` are presenting you enough information.
>
> On 3 Aug 2017, at 19:37, Burkhardt, Glenn B UTAS 
>  wrote:
>
> I've been looking in 'gitk' for an option that does what 'git log -m'
> does.  Did I miss something?  In particular, I'd like to get information 
> about a file that's currently available with "git log -m --all --follow", but 
> presented in 'gitk'.  If it's not there, please consider this a feature 
> request.
>
> Thanks.

Re: [PATCH] hook: use correct logical variable

2017-08-14 Thread Junio C Hamano

Kaartic Sivaraam  writes:

> Sign-off added should be that of the "committer" not that of the
> "commit's author".
>
> Use the correct logical variable that identifies the committer.
>
> Signed-off-by: Kaartic Sivaraam 
> ---
>  This fixes a small issue when trying to do the following with the script 
> enabled,
>
> $ git commit --amend -s
>
>  If the commit being amended was signed off by the commit's author
>  then the above command would *append* the sign-off of the
>  committer followed by that of the commit's author.
>  That' because the script is invoked only after the sign-off is
>  added by the '-s' option AND the default of 'trailer.ifexists'
>  for interpret-trailers currently defaults to the
>  'addIfDifferentNeighbor' thus interpret-trailer fails to identify
>  the existing sign-off of the commit's author and adds it.
>
>  Anyways, it doesn't make sense for a script to add the sign-off
>  of the commit's author. So, fixing it seemed correct to me.

I tend to agree with "Anyways" above ;-) simply because I found the
long paragraph more confusing than enlightening, leaving me wonder
"why is the user using different settings for author and committer
name" at the end, which _is_ an irrelevant point in the issue being
addressed.

The real issue is that this commented-out sample hook uses "author"
ident for the sign-off, while all the rest of Git (i.e. all callers
of append_signoff() in sequencer.c) use committer ident.

If anything to be changed to this patch, I would say its "should be"
in the log message can be clarified with "why", perhaps like:

Sign-off added should be that of the "committer", not that of
the "commit's author"; that is how the rest of Git adds sign-off
using sequencer.c::append_signoff().

Thanks.

>  templates/hooks--prepare-commit-msg.sample | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/templates/hooks--prepare-commit-msg.sample 
> b/templates/hooks--prepare-commit-msg.sample
> index a84c3e5a8..12dd8fd88 100755
> --- a/templates/hooks--prepare-commit-msg.sample
> +++ b/templates/hooks--prepare-commit-msg.sample
> @@ -34,7 +34,7 @@ SHA1=$3
>  #  *) ;;
>  # esac
>  
> -# SOB=$(git var GIT_AUTHOR_IDENT | sed -n 's/^\(.*>\).*$/Signed-off-by: 
> \1/p')
> +# SOB=$(git var GIT_COMMITTER_IDENT | sed -n 's/^\(.*>\).*$/Signed-off-by: 
> \1/p')
>  # git interpret-trailers --in-place --trailer "$SOB" "$COMMIT_MSG_FILE"
>  # if test -z "$COMMIT_SOURCE"
>  # then

Re: Not understanding with git wants to copy one file to another

2017-08-14 Thread Stefan Beller

On Fri, Aug 11, 2017 at 1:41 PM, Harry Putnam  wrote:
> Stefan Beller  writes:
>
>
> [...]
>
>> Ah. Sorry for confusing even more.
>> By pointing out the options for git-diff, I just wanted to point out that
>> such a mechanism ("rename/copy detection") exists.
>
>
> [...]
>
>>> What am I missing?
>>>
>>
>> https://www.reddit.com/r/git/comments/3ogkk1/beginner_disable_rename_detection/
>>
>> "Rename detection is just GUI sugar".
>
> Thanks there is a nice full explanation at the cited url.
>
> What is still a bit puzzling is that in that same commit, there are
> files that are true copies of each other, just in different locations,
> But nothing pops up about them in a git commit.
>

The heuristic to find the renames/copies only looks at modified files
to be fast(, the assumption is that each commit only touches few
files, but the project consists of a lot of files).

For that git-diff knows about '--find-copies-harder' that looks at
all files even those not modified. This would point out the true
copies, I would assume.

I don't think we'd want to include the '--find-copies-harder' flag
to status or commit, as it may take some time in large projects.

Re: [PATCH v4 0/4] interpret-trailers: add --where, --if-exists, --if-missing

2017-08-14 Thread Junio C Hamano

Paolo Bonzini  writes:

> On 01/08/2017 11:03, Paolo Bonzini wrote:
>> From: Paolo Bonzini 
>> 
>> These options are useful to experiment with "git interpret-trailers"
>> without having to tinker with .gitconfig (Junio said git should ahve
>> done this first and only added configuration afterwards).  It can
>> be useful in the case where you want a different placement for the trailer,
>> or for scripts/aliases that don't want to rely on specific .gitconfig
>> settings.
>> 
>> Compared to v2, the main change is that option order on the command-line
>> is respected.  That is,
>> 
>>  --trailer 'acked-by: foo' --where end --trailer 'signed-off-by: me'
>> 
>> will only apply where=end to the second trailer.  Likewise,
>> 
>>  --where end --trailer 'signed-off-by: me' --no-where \
>>  --trailer 'acked-by: foo'
>> 
>> will only apply it to the first, reverting to trailer.*.where for the
>> "acked-by" trailer.
>
> Junio, I see you haven't yet applied this v4 to origin/pu, did you miss it?

Thanks for pinging.  Either it was not noticed by mistake or was
deliberately ignored during the pre-release freeze, I do not
remember.

Re: [PATCH] hook: use correct logical variable

2017-08-14 Thread Stefan Beller

On Mon, Aug 14, 2017 at 1:46 AM, Kaartic Sivaraam
 wrote:
> Sign-off added should be that of the "committer" not that of the
> "commit's author".
>
> Use the correct logical variable that identifies the committer.
>
> Signed-off-by: Kaartic Sivaraam 
> ---
>  This fixes a small issue when trying to do the following with the script 
> enabled,
>
> $ git commit --amend -s
>
>  If the commit being amended was signed off by the commit's author then the 
> above command
>  would *append* the sign-off of the committer followed by that of the 
> commit's author.
>  That' because the script is invoked only after the sign-off is added by the 
> '-s' option AND
>  the default of 'trailer.ifexists' for interpret-trailers currently defaults 
> to the 'addIfDifferentNeighbor'
>  thus interpret-trailer fails to identify the existing sign-off of the 
> commit's author and adds it.

The background knowledge provided up to here seems like
a valuable information that we'd want to preserve in the commit
history, i.e. make it part of the commit message?

Code looks good.

Thanks,
Stefan

>
>  Anyways, it doesn't make sense for a script to add the sign-off of the 
> commit's author. So,
>  fixing it seemed correct to me.
>
>  templates/hooks--prepare-commit-msg.sample | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/templates/hooks--prepare-commit-msg.sample 
> b/templates/hooks--prepare-commit-msg.sample
> index a84c3e5a8..12dd8fd88 100755
> --- a/templates/hooks--prepare-commit-msg.sample
> +++ b/templates/hooks--prepare-commit-msg.sample
> @@ -34,7 +34,7 @@ SHA1=$3
>  #  *) ;;
>  # esac
>
> -# SOB=$(git var GIT_AUTHOR_IDENT | sed -n 's/^\(.*>\).*$/Signed-off-by: 
> \1/p')
> +# SOB=$(git var GIT_COMMITTER_IDENT | sed -n 's/^\(.*>\).*$/Signed-off-by: 
> \1/p')
>  # git interpret-trailers --in-place --trailer "$SOB" "$COMMIT_MSG_FILE"
>  # if test -z "$COMMIT_SOURCE"
>  # then
> --
> 2.14.1.534.g641031ecb
>

RE: [External] Re: gitk -m ?

2017-08-14 Thread Burkhardt, Glenn B UTAS

Neither of those two work for me.  They don't limit the view to the single file 
of interest.  

Also, I tried "additional arguments to git log", using "-m --follow".  I filled 
in the single file of interest in the 'Enter files' section.  The error message 
was:

Can't parse git log output:  {commit 9cc8be... faab99... }

-Original Message-
From: Uxío Prego [mailto:upr...@madiva.com] 
Sent: Monday, August 14, 2017 12:47
To: Burkhardt, Glenn B UTAS
Cc: git@vger.kernel.org
Subject: Re: [External] Re: gitk -m ?

I do not know if you can do what you want, mostly if you can do what you want 
as simply as you might be wanting that you want it, but I guess you could use 
this gitk boot command as a _simple_ work around somehow aliased within your 
command line configuration:

$ gitk (--all)? $(git log -m --follow --pretty=format:%h PATHSPEC)

Alternatively, there is a _view configuration_ menu (_new view_, _edit view_) 
where there is a text box labeled _Command to generate more commits to 
include_. If you type here:

git log -m --follow --pretty=format:%h PATHSPEC

I do not know what will happen and I can not test that now (I eventually will), 
but chances are it could do what you wanted. Maybe you can even use custom 
aliases in there that text box.

I guess you are receiving a more authorized answer soonish, in the meanwhile, 
hope that helped.

Regards,


On 14 Aug 2017, at 15:20, Burkhardt, Glenn B UTAS 
 wrote:

They don't.  In particular, information about commits that are parts of merges 
is missing.

Here's an example.  There are only two entries listed in 'gitk --all'
for a particular file (sorry, I'd prefer to include a screen sho, but the 
mailing list doesn't allow HTML messages).

gitk --all MANIFEST.MF

Parent: f7462684ae78720aac05c929256d770118cf01fa (initial clone from Clearcase 
integ3 branch)
Branches: master, remotes/origin/master, remotes/origin/ww, ww
Follows:
Precedes:

   require java 1.8

Child:  240f151d61fd4fd06f377bc52970b3574e5f9031 (require java 1.8)
Branches: master, remotes/origin/master, remotes/origin/ww, ww
Follows:
Precedes:

   initial clone from Clearcase integ3 branch


git log with '-m' and '-follow' shows:

$ git log -m --follow --oneline MANIFEST.MF
9cc8be4 (from 1222d7c) Merge branch 'master' into ww; strategy "ours"
a423f2d (from f869950) merge from ww branch; remove Bundle-NativeCode
51f0628 (from 2c6478c) Merge branch 'ww' of coverity:rmps into ww
240f151 require java 1.8
f746268 initial clone from Clearcase integ3 branch


-Original Message-
From: Uxío Prego [mailto:upr...@madiva.com]
Sent: Monday, August 14, 2017 01:12
To: Burkhardt, Glenn B UTAS
Cc: git@vger.kernel.org
Subject: [External] Re: gitk -m ?

Not sure what you are wanting to achieve, but please make sure neither `gitk 
PATHSPEC` nor `gitk --all PATHSPEC` are presenting you enough information.

On 3 Aug 2017, at 19:37, Burkhardt, Glenn B UTAS  
wrote:

I've been looking in 'gitk' for an option that does what 'git log -m'
does.  Did I miss something?  In particular, I'd like to get information about 
a file that's currently available with "git log -m --all --follow", but 
presented in 'gitk'.  If it's not there, please consider this a feature request.

Thanks.

Re: [PATCH] doc: clarify "config --bool" behaviour with empty values

2017-08-14 Thread Junio C Hamano

Andreas Heiduk  writes:

> `git config --bool xxx.yyy` returns `true` for `[xxx]yyy` but
> `false` for `[xxx]yyy=` or `[xxx]yyy=""`.  This is tested in
> t1300-repo-config.sh since 09bc098c2.
>
> Signed-off-by: Andreas Heiduk 
> ---
>  Documentation/config.txt | 3 ++-
>  Documentation/git.txt| 3 ++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index d5c9c4cab..d3261006b 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -221,7 +221,8 @@ boolean::
>   is taken as true.
>  
> false;; Boolean false can be spelled as `no`, `off`,
> - `false`, or `0`.
> + `false`, `0`, no value (but still with `=`) or the
> + empty string.

Thanks for noticing that it was a problem not spelling out that an
empty string means false.  We do need to spell it out.

However, I think this "no value (but still with '=')" is making it
more confusing than necessary for two reasons.

(1) The notation

[section] var =

is a perfectly valid way to spell an empty string, and is *not*
a way to say "section.var has no value".

(2) In fact there is no way to say "section.var has no value", which
is an often lamented inconvenience, because sometimes people may
have their own setting in ~/.gitconfig and want to override it
in the repository specific .git/config but do the overriding not
with a specific value but by saying "pretend as if there were no
setting in lower precedence configuration files like
~/.gitconfig".  If we ever fix this and introduce some syntax to
mean "the variable has no value", it will become necessary to
update the above description, but I am sure nobody will remember
it.

I notice that in this Values section (where the boolean:: is the
first entry) there is no mention on how to spell a string value.

Perhaps something like this?

diff --git a/Documentation/config.txt b/Documentation/config.txt
index d5c9c4cab6..7580088bec 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -210,6 +210,14 @@ Values of many variables are treated as a simple string, 
but there
 are variables that take values of specific types and there are rules
 as to how to spell them.

+string::
+
+   A variable can take a string value, which can be quoted with
+   double quotes and backslashes as outlined in the Syntax
+   section above.  Note that it is sufficient to say 'name ='
+   without anything after the equal sign to spell an empty
+   string.
+
 boolean::

When a variable is said to take a boolean value, many
@@ -221,7 +229,7 @@ boolean::
is taken as true.

false;; Boolean false can be spelled as `no`, `off`,
-   `false`, or `0`.
+   `false`, or `0`.  An empty string can also be used.
 +
 When converting value to the canonical form using `--bool` type
 specifier; 'git config' will ensure that the output is "true" or

Re: [PATCH/RFC] File commited with CRLF should roundtrip diff and apply

2017-08-14 Thread Junio C Hamano

tbo...@web.de writes:

> From: Torsten Bögershausen 
>
> When a file had been commited with CRLF and core.autocrlf is true,
> the following does not roundtrip, `git apply` fails:
>
> printf "Added line\r\n" >>file &&
> git diff >patch &&
> git checkout -- . &&
> git apply patch

Should this tweak be in effect when we are paying attention to the
contents of the index?  Or does it matter only when we are doing
"git apply" without either "--index" or "--cache"?  

The fact that the new flag that is passed from load_preimage() down
to read_old_data(), the latter of which is only about the "behave as
a better 'patch -p1', ignoring the index" mode, hints that this
should not affect anything when we are paying attention to the
index, but then calling the load_patch_target() helper with a very
generic CR_AT_EOL flag looks a bit misleading, by making it appear
as if the caller is telling all three cases to do the funny CR
business.  I wonder if we instead want to pass the "patch" structure
down the callchain and have _only_ read_old_data() pay attention to
the has_crlf bit in it---that way, it becomes more obvious what is
going on.

I also notice that read_old_data() still passes _index to
convert_to_git().  Can we fix convert_to_git() further to allow NULL
as the istate parameter when we tell it not to look at the index ( I
am reading your passing SAFE_CRLF_KEEP and SAFE_CRLF_FALSE as a way
for the caller to tell that the caller knows what it wants already
and does not want covnert_to_git() to look at the index)?

With or without CR in the patch, I do not see any reason for the
codepath from read_old_data() down to the convert API to look at the
index, ever, as read_old_data() is called only when (!state->cached
&& !state->check_index), i.e. when we are working as a better 'patch
-p1' without even having to know that we are in a Git repository, by
load_patch_target().

> diff --git a/apply.c b/apply.c
> index f2d599141d..63455cd65f 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -220,6 +220,7 @@ struct patch {
>   unsigned int recount:1;
>   unsigned int conflicted_threeway:1;
>   unsigned int direct_to_threeway:1;
> + unsigned int has_crlf:1;
>   struct fragment *fragments;
>   char *result;
>   size_t resultsize;
> @@ -1662,6 +1663,17 @@ static void check_whitespace(struct apply_state *state,
>   record_ws_error(state, result, line + 1, len - 2, state->linenr);
>  }
>  
> +/* Check if the patch has context lines with CRLF or
> +   the patch wants to remove lines with CRLF */

Style.

> +static void check_old_for_crlf(struct patch *patch, const char *line, int 
> len)
> +{
> + if (len >= 2 && line[len-1] == '\n' && line[len-2] == '\r') {
> + patch->ws_rule |= WS_CR_AT_EOL;
> + patch->has_crlf = 1;
> + }
> +}

I am wondering where the discrepancy between the names (old crlf
here, as opposed to "struct patch" that says "has_crlf" implying it
does not care which side between old and new has one) comes from.

Is the intention that you only care about what is expected of the
preimage?  

> @@ -1712,11 +1724,13 @@ static int parse_fragment(struct apply_state *state,
>   if (!deleted && !added)
>   leading++;
>   trailing++;
> + check_old_for_crlf(patch, line, len);
>   if (!state->apply_in_reverse &&
>   state->ws_error_action == correct_ws_error)
>   check_whitespace(state, line, len, 
> patch->ws_rule);
>   break;

If check_old is about "we care about the lines that are supposed to
match what currently exist in the target file to be patched", then
the call to it also must pay attention to apply_in_reverse.  What
appears on '+' lines in a patch will become the "expected old
contents the patched target is supposed to have" when we are running
"apply -R".

>   case '-':
> + check_old_for_crlf(patch, line, len);
>   if (state->apply_in_reverse &&
>   state->ws_error_action != nowarn_ws_error)
>   check_whitespace(state, line, len, 
> patch->ws_rule);

Likewise.

Thanks for working on this; unlike the previous one I commented, I
think this one I can sort of see how this is an approach to fix it.

Re: [RFC PATCH 0/3] Fixes to "diff --color-moved" MIN_BLOCK_LENGTH handling

2017-08-14 Thread Stefan Beller

On Fri, Aug 11, 2017 at 5:39 PM, Junio C Hamano  wrote:
> Jonathan Tan  writes:
>
>> Note that these patches are for "next", depending on the "--color-moved"
>> patches.
>
> As we have finished Git 2.14 cycle, in preparation for the next one,
> the 'next' branch will be rewound and rebuilt early next week.  I do
> not mind tentatively ejecting some topics that needs fix-ups out of
> 'next' to give them a clean restart.

These would cleanly apply on sb/diff-color-move.

>
> My preference however is to keep sb/diff-color-move topic as-is
> without replacing and fixing it with incremental updates like these
> patches.

I would have hoped to not need to reroll that topic.
Though I do find patches 1&2 valuable either on top or squashed
into "[PATCH] diff.c: color moved lines differently" and
"[PATCH] diff.c: color moved lines differently, plain mode"
respectively.

So I'd ask to pick at least patches 1&2 on top of that series, please?
(I am missing the context for *why* you preference is to not do
exactly this).

Thanks.

Re: [RFC PATCH 3/3] diff: check MIN_BLOCK_LENGTH at start of new block

2017-08-14 Thread Stefan Beller

On Fri, Aug 11, 2017 at 3:49 PM, Jonathan Tan  wrote:
> When noticing that the current line is not the continuation of the
> current block, but the start of a new one, mark_color_as_moved() does
> not check the length of the current block. Perform that check.

As far as I remember that behavior is intentional, as indicated by
the succeeding test.

The whole MIN_BLOCK_LENGTH thing is a hack IMHO as we did not have
a better heuristic for suppressing uninteresting "moved" lines such as closing
braces in C.

The information that a thing is moved in between two blocks is more
valuable than pointing out it is just 'new' or 'old'.

As this is changing behavior in a way that seems controversial, can you
give your motivation/example for why this behavior is better?
(Do we want to put it into an option/mode?)

Re: [RFC PATCH 2/3] diff: respect MIN_BLOCK_LENGTH for last block

2017-08-14 Thread Stefan Beller

On Fri, Aug 11, 2017 at 3:49 PM, Jonathan Tan  wrote:
> Currently, MIN_BLOCK_LENGTH is only checked when diff encounters a line
> that does not belong to the current block. In particular, this means
> that MIN_BLOCK_LENGTH is not checked after all lines are encountered.
>
> Perform that check.

Thanks for spotting! This fix looks straightforward correct.
(Also thanks for factoring out the adjustment, I am tempted to
start a bike shedding discussion about its name, though. :P)

> Signed-off-by: Jonathan Tan 
> ---
>  diff.c | 29 ++---
>  t/t4015-diff-whitespace.sh | 35 +++
>  2 files changed, 57 insertions(+), 7 deletions(-)
>
> diff --git a/diff.c b/diff.c
> index 4965ffbc4..95620b130 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -858,6 +858,26 @@ static int shrink_potential_moved_blocks(struct 
> moved_entry **pmb,
> return rp + 1;
>  }
>
> +/*
> + * If o->color_moved is COLOR_MOVED_PLAIN, this function does nothing.
> + *
> + * Otherwise, if the last block has fewer lines than
> + * COLOR_MOVED_MIN_BLOCK_LENGTH, unset DIFF_SYMBOL_MOVED_LINE on all lines in
> + * that block.
> + *
> + * The last block consists of the (n - block_length)'th line up to but not
> + * including the nth line.
> + */
> +static void adjust_last_block(struct diff_options *o, int n, int 
> block_length)
> +{
> +   int i;
> +   if (block_length >= COLOR_MOVED_MIN_BLOCK_LENGTH ||
> +   o->color_moved == COLOR_MOVED_PLAIN)
> +   return;
> +   for (i = 1; i < block_length + 1; i++)
> +   o->emitted_symbols->buf[n - i].flags &= 
> ~DIFF_SYMBOL_MOVED_LINE;
> +}
> +
>  /* Find blocks of moved code, delegate actual coloring decision to helper */
>  static void mark_color_as_moved(struct diff_options *o,
> struct hashmap *add_lines,
> @@ -893,13 +913,7 @@ static void mark_color_as_moved(struct diff_options *o,
> }
>
> if (!match) {
> -   if (block_length < COLOR_MOVED_MIN_BLOCK_LENGTH &&
> -   o->color_moved != COLOR_MOVED_PLAIN) {
> -   for (i = 1; i < block_length + 1; i++) {
> -   l = >emitted_symbols->buf[n - i];
> -   l->flags &= ~DIFF_SYMBOL_MOVED_LINE;
> -   }
> -   }
> +   adjust_last_block(o, n, block_length);
> pmb_nr = 0;
> block_length = 0;
> continue;
> @@ -941,6 +955,7 @@ static void mark_color_as_moved(struct diff_options *o,
> if (flipped_block)
> l->flags |= DIFF_SYMBOL_MOVED_LINE_ALT;
> }
> +   adjust_last_block(o, n, block_length);
>
> free(pmb);
>  }
> diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
> index c3b697411..6f7758e5c 100755
> --- a/t/t4015-diff-whitespace.sh
> +++ b/t/t4015-diff-whitespace.sh
> @@ -1382,6 +1382,41 @@ EOF
> test_cmp expected actual
>  '
>
> +test_expect_success '--color-moved block at end of diff output respects 
> MIN_BLOCK_LENGTH' '
> +   git reset --hard &&
> +   >bar &&
> +   cat <<-\EOF >foo &&
> +   irrelevant_line
> +   line1
> +   EOF
> +   git add foo bar &&
> +   git commit -m x &&
> +
> +   cat <<-\EOF >bar &&
> +   line1
> +   EOF
> +   cat <<-\EOF >foo &&
> +   irrelevant_line
> +   EOF
> +
> +   git diff HEAD --color-moved=zebra --no-renames | grep -v "index" | 
> test_decode_color >actual &&
> +   cat >expected <<-\EOF &&
> +   diff --git a/bar b/bar
> +   --- a/bar
> +   +++ b/bar
> +   @@ -0,0 +1 @@
> +   +line1
> +   diff --git a/foo b/foo
> +   --- a/foo
> +   +++ b/foo
> +   @@ -1,2 +1 @@
> +irrelevant_line
> +   -line1
> +   EOF
> +
> +   test_cmp expected actual
> +'
> +
>  test_expect_success 'move detection with submodules' '
> test_create_repo bananas &&
> echo ripe >bananas/recipe &&
> --
> 2.14.0.434.g98096fd7a8-goog
>

Re: [RFC PATCH 1/3] diff: avoid redundantly clearing a flag

2017-08-14 Thread Stefan Beller

On Fri, Aug 11, 2017 at 3:49 PM, Jonathan Tan  wrote:
> No code in diff.c sets DIFF_SYMBOL_MOVED_LINE except in
> mark_color_as_moved(), so it is redundant to clear it for the current
> line. Therefore, clear it only for previous lines.

Oh that part. I remember debating with myself if I rather want to have
the upper bound adjusted by one less (block_length instead of
'block_length + 1'), and then add a constant to 'buf[n - i];'

The patch as implemented is fine, too.

> This makes a refactoring in a subsequent patch easier.
>
> Signed-off-by: Jonathan Tan 
> ---
>  diff.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/diff.c b/diff.c
> index f84346b47..4965ffbc4 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -895,7 +895,7 @@ static void mark_color_as_moved(struct diff_options *o,
> if (!match) {
> if (block_length < COLOR_MOVED_MIN_BLOCK_LENGTH &&
> o->color_moved != COLOR_MOVED_PLAIN) {
> -   for (i = 0; i < block_length + 1; i++) {
> +   for (i = 1; i < block_length + 1; i++) {
> l = >emitted_symbols->buf[n - i];
> l->flags &= ~DIFF_SYMBOL_MOVED_LINE;
> }
> --
> 2.14.0.434.g98096fd7a8-goog
>

Re: [PATCH 1/9] Convert pack-objects to size_t

2017-08-14 Thread Junio C Hamano

Junio C Hamano  writes:

> One interesting question is which of these two types we should use
> for the size of objects Git uses.  
>
> Most of the "interesting" operations done by Git require that the
> thing is in core as a whole before we can do anything (e.g. compare
> two such things to produce delta, have one in core and apply patch),
> so it is tempting that we deal with size_t, but at the lowest level
> to serve as a SCM, i.e. recording the state of a file at each
> version, we actually should be able to exceed the in-core
> limit---both "git add" of a huge file whose contents would not fit
> in-core and "git checkout" of a huge blob whose inflated contents
> would not fit in-core should (in theory, modulo bugs) be able to
> exercise the streaming interface to handle such case without holding
> everything in-core at once.  So from that point of view, even size_t
> may not be the "correct" type to use.

A few additions to the above observations.

 - We have varint that encodes how far the location from a delta
   representation of an object to its base object in the packfile.
   Both encoding and decoding sides in the current code use off_t to
   represent this offset, so we can already reference an object that
   is far in the same packfile as a base.

 - I think it is OK in practice to limit the size of individual
   objects to size_t (i.e. on 32-bit arch, you cannot interact with
   a repository with an object whose size exceeds 4GB).  Using off_t
   would allow occasional ultra-huge objects that can only be added
   and checked in via the streaming API on such a platform, but I
   suspect that it may become too much of a hassle to maintain.

   It may help reducing the maintenance if we introduced obj_size_t
   that is defined to be size_t for now, so that we can later swap
   it to ofs_t or some larger type when we know we do need to
   support objects whose size cannot be expressed in size_t, but I
   do not offhand know what the pros-and-cons with such an approach
   would look like.

Thanks.

Re: Bug with corruption on clone/fsck/... with large packs + 64-bit Windows, problem with usage of "long" datatype for sizes/offsets?

2017-08-14 Thread Stefan Beller

On Mon, Aug 14, 2017 at 9:05 AM, Johannes Schindelin
 wrote:
> Hi Christoph,
>
> On Fri, 11 Aug 2017, Dr.-Ing. Christoph Cullmann wrote:
>
>> on Windows 64-bit, for a repository having a .pack file > 4GB I get
>> during cloning:
>
> The reason is Git's source code that over-uses the `unsigned long`
> datatype.
>
> In a nearby-thread, an underappreciated effort by Martin Koegler is
> underway to get the ball rolling in getting it fixed. Maybe you can help
> making Martin a lot more welcome on the Git mailing list, and maybe even
> help getting his patches reviewed and integrated?

'nearby' as in [1] ;-)

[1] 
https://public-inbox.org/git/1502527643-21944-1-git-send-email-martin@mail.zuhause/

I had the impression the review is going well there?

>
> Ciao,
> Johannes

Re: [RFC PATCH 0/7] Implement ref namespaces as a ref storage backend

2017-08-14 Thread Stefan Beller

> Technical description
> =
>
> This patch series adds a new refs backend, stacking on top of the files 
> backend,
> based on whether `core.namespace` is set in git config.

Currently there is another Big Thing getting started in in the refs backend.
https://public-inbox.org/git/CAJo=hJtg0PAVHT1phbArdra8+4LfnEEuaj3fBid==bxkzgh...@mail.gmail.com/
Maybe it is worth looking into that as well? Reftables solve
the problem of scaling (i.e. a repo containing a million refs works
just fine, and fast(!) for reading/writing refs, which is not the case
for packed refs)

Re: [PATCH] Add overflow check to get_delta_hdr_size

2017-08-14 Thread Junio C Hamano

Martin Koegler  writes:

> From: Martin Koegler 
>
> Signed-off-by: Martin Koegler 
> ---
> Applies on top of my size_t series
>
> I'm not sure, if die or error is better.

As there is no fallback when we hit it, die would be sufficient; the
only thing the callers of this helper, or their callers, could do is
as a reaction to an error return from here would be to die themselves,
I would think.




>  delta.h | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/delta.h b/delta.h
> index 2df0f55..18a4983 100644
> --- a/delta.h
> +++ b/delta.h
> @@ -96,6 +96,11 @@ static inline size_t get_delta_hdr_size(const unsigned 
> char **datap,
>   cmd = *data++;
>   size |= (cmd & 0x7f) << i;
>   i += 7;
> + if (bitsizeof(size_t) <= i) {
> + error("too large object size");
> + size = 0;
> + break;
> + }
>   } while (cmd & 0x80 && data < top);
>   *datap = data;
>   return size;

Re: [PATCH] Add overflow check to get_delta_hdr_size

2017-08-14 Thread Junio C Hamano

Martin Koegler  writes:

> From: Martin Koegler 
>
> Signed-off-by: Martin Koegler 
> ---
> Applies on top of my size_t series
>
> I'm not sure, if die or error is better.

As there is no fallback when we hit it, die would be sufficient; the
only thing the callers of this helper, or their callers, could do is
as a reaction to an error return from here would be to die themselves,
I would think.

>
>  delta.h | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/delta.h b/delta.h
> index 2df0f55..18a4983 100644
> --- a/delta.h
> +++ b/delta.h
> @@ -96,6 +96,11 @@ static inline size_t get_delta_hdr_size(const unsigned 
> char **datap,
>   cmd = *data++;
>   size |= (cmd & 0x7f) << i;
>   i += 7;
> + if (bitsizeof(size_t) <= i) {
> + error("too large object size");
> + size = 0;
> + break;
> + }
>   } while (cmd & 0x80 && data < top);
>   *datap = data;
>   return size;

Re: [External] Re: gitk -m ?

2017-08-14 Thread Uxío Prego

I do not know if you can do what you want, mostly if you can do what you want
as simply as you might be wanting that you want it, but I guess you could use
this gitk boot command as a _simple_ work around somehow aliased within your
command line configuration:

$ gitk (--all)? $(git log -m --follow --pretty=format:%h PATHSPEC)

Alternatively, there is a _view configuration_ menu (_new view_, _edit view_)
where there is a text box labeled _Command to generate more commits to
include_. If you type here:

git log -m --follow --pretty=format:%h PATHSPEC

I do not know what will happen and I can not test that now (I eventually will),
but chances are it could do what you wanted. Maybe you can even use custom
aliases in there that text box.

I guess you are receiving a more authorized answer soonish, in the meanwhile,
hope that helped.

Regards,


On 14 Aug 2017, at 15:20, Burkhardt, Glenn B UTAS
 wrote:

They don't.  In particular, information about commits that are parts
of merges is missing.

Here's an example.  There are only two entries listed in 'gitk --all'
for a particular file (sorry, I'd prefer to include a screen sho, but
the mailing list doesn't allow HTML messages).

gitk --all MANIFEST.MF

Parent: f7462684ae78720aac05c929256d770118cf01fa (initial clone from
Clearcase integ3 branch)
Branches: master, remotes/origin/master, remotes/origin/ww, ww
Follows:
Precedes:

   require java 1.8

Child:  240f151d61fd4fd06f377bc52970b3574e5f9031 (require java 1.8)
Branches: master, remotes/origin/master, remotes/origin/ww, ww
Follows:
Precedes:

   initial clone from Clearcase integ3 branch


git log with '-m' and '-follow' shows:

$ git log -m --follow --oneline MANIFEST.MF
9cc8be4 (from 1222d7c) Merge branch 'master' into ww; strategy "ours"
a423f2d (from f869950) merge from ww branch; remove Bundle-NativeCode
51f0628 (from 2c6478c) Merge branch 'ww' of coverity:rmps into ww
240f151 require java 1.8
f746268 initial clone from Clearcase integ3 branch


-Original Message-
From: Uxío Prego [mailto:upr...@madiva.com]
Sent: Monday, August 14, 2017 01:12
To: Burkhardt, Glenn B UTAS
Cc: git@vger.kernel.org
Subject: [External] Re: gitk -m ?

Not sure what you are wanting to achieve, but please make sure neither
`gitk PATHSPEC` nor `gitk --all PATHSPEC` are presenting you enough
information.

On 3 Aug 2017, at 19:37, Burkhardt, Glenn B UTAS
 wrote:

I've been looking in 'gitk' for an option that does what 'git log -m'
does.  Did I miss something?  In particular, I'd like to get
information about a file that's currently available with "git log -m
--all --follow", but presented in 'gitk'.  If it's not there, please
consider this a feature request.

Thanks.

Re: [PATCH v2 1/2] format-patch: have progress option while generating patches

2017-08-14 Thread Junio C Hamano

Jeff King  writes:

> On Sat, Aug 12, 2017 at 09:06:18AM +0100, Philip Oakley wrote:
>
>> > > > + progress = start_progress_delay(_("Generating patches"), total, 0, 
>> > > > 1);
>> > > 
>> > > I don't really have an opinion on a 1 second delay versus 2. I thought
>> > > we used 2 pretty consistently, though grepping around I do see a couple
>> > > of 1's. It probably doesn't matter, but just a curiosity.
>> > 
> ...
> Here we're just talking about calls to start_progress_delay(), and how
> long it waits before deciding that the operation is slow enough to show
> progress. Blame, rename detection, and checkout use 1 second. Prune,
> prune-packed, and connectivity checks all use 2 seconds. I doubt it
> matters all that much, but presumably the purpose in all is "how long
> before a user starts to wonder if things are actually happening", which
> is probably command-independent.

I feel comfortable moving forward, basing our decisions on the
assumption that the "delay before the user wonders is independent of
the command" without anybody actually proving it.

Even though I think that it is natural for people to expect longer
delay from some operation than others (due to some chicken-and-egg
reasons), trying to scientifically measure and to come up with
different delay value that suited for each and every operation is
waste of our brain cycles.

Perhaps we may want to replace the calls to progress_delay() with a
call to a simpler wrapper that does not let the callers give their
own delay threashold to simplify the API.

Re: Bug?: git archive exclude pathspec and gitattributes export-ignore

2017-08-14 Thread René Scharfe

Am 13.08.2017 um 06:53 schrieb David Adam:
> Hi all,
> 
> I think I have a bug in git (tested 2.11.0 on Debian 8, 2.14.1 on OS X and
> 2.14.1.145.gb3622a4 on OS X).
> 
> Given a repository with an export-ignore directive for a subdirectory in
> .gitattributes, `git archive` with a pathspec that excludes a different
> subdirectory produces no output file and git exits with -1 as the return
> status.
> 
> As shown:
> 
> > git init foo && cd foo
> Initialized empty Git repository in /Users/david/src/foo/.git/
> > mkdir a b
> > touch {a,b}/somefile
> > echo "/a export-ignore" >> .gitattributes
> > git add .
> > git commit -m "Initial commit"
> [master (root-commit) 53527a7] Initial commit
>  3 files changed, 1 insertion(+)
>  create mode 100644 .gitattributes
>  create mode 100644 a/somefile
>  create mode 100644 b/somefile
> > git archive --verbose master ':(top)' ':(exclude)b*'
> .gitattributes
> > echo $?
> 255
> 
> If this is intended behaviour, is there any way of achieving the goal of
> excluding a subdirectory not listed as export-ignore? Using the exclude
> pathspec ":(exclude)b" produces an empty subdirectory b in the output,
> which I would like to avoid.
> 
> This is a reduced testcase; my goal is to end up with two archives, one
> containing directory b only, and one containing everything except for
> directory b - so I can't just add 'b export-ignore' to gitattributes.

Thanks for the thoughtful bug report!

The problem seems to be that archive.c::write_archive_entry() returns 0
instead of READ_TREE_RECURSIVE for directories with the attribute
"export-ignore", and archive.c::write_directory() gets caught by
surprise by that and returns -1, which ends up causing git archive to
exit with return code 255 without actually writing anything.

This should only happen if you use wildcards like "*", i.e. git archive
should behave as expected if you spell out the full name of the
directory.  Can you confirm that?

The real solution is probably to teach tree-walk.c::do_match() how to
handle attributes and then inject ":(attr:-export-ignore)" as a default
internal pathspec in archive.c::parse_pathspec_arg() instead of handling
it in archive.c::write_archive_entry().

@Duy: What do you think?

Thanks,
René

Re: Bug with corruption on clone/fsck/... with large packs + 64-bit Windows, problem with usage of "long" datatype for sizes/offsets?

2017-08-14 Thread Johannes Schindelin

Hi Christoph,

On Fri, 11 Aug 2017, Dr.-Ing. Christoph Cullmann wrote:

> on Windows 64-bit, for a repository having a .pack file > 4GB I get
> during cloning:

The reason is Git's source code that over-uses the `unsigned long`
datatype.

In a nearby-thread, an underappreciated effort by Martin Koegler is
underway to get the ball rolling in getting it fixed. Maybe you can help
making Martin a lot more welcome on the Git mailing list, and maybe even
help getting his patches reviewed and integrated?

Ciao,
Johannes

RE: reftable [v5]: new ref storage format

2017-08-14 Thread David Turner

> -Original Message-
> From: Howard Chu [mailto:h...@symas.com]
> Sent: Monday, August 14, 2017 8:31 AM
> To: spea...@spearce.org
> Cc: David Turner ; ava...@gmail.com;
> ben.a...@acegi.com.au; dborow...@google.com; git@vger.kernel.org;
> gits...@pobox.com; mhag...@alum.mit.edu; p...@peff.net;
> sbel...@google.com; sto...@gmail.com
> Subject: Re: reftable [v5]: new ref storage format
> 
> Howard Chu wrote:
> > The primary issue with using LMDB over NFS is with performance. All
> > reads are performed thru accesses of mapped memory, and in general,
> > NFS implementations don't cache mmap'd pages. I believe this is a
> > consequence of the fact that they also can't guarantee cache
> > coherence, so the only way for an NFS client to see a write from
> > another NFS client is by always refetching pages whenever they're accessed.
> 
> > LMDB's read lock management also wouldn't perform well over NFS; it
> > also uses an mmap'd file. On a local filesystem LMDB read locks are
> > zero cost since they just atomically update a word in the mmap. Over
> > NFS, each update to the mmap would also require an msync() to
> > propagate the change back to the server. This would seriously limit
> > the speed with which read transactions may be opened and closed.
> > (Ordinarily opening and closing a read txn can be done with zero
> > system calls.)
> 
> All that aside, we could simply add an EXCLUSIVE open-flag to LMDB, and
> prevent multiple processes from using the DB concurrently. In that case,
> maintaining coherence with other NFS clients is a non-issue. It strikes me 
> that git
> doesn't require concurrent multi-process access anyway, and any particular
> process would only use the DB for a short time before closing it and going 
> away.

Git, in general, does require concurrent multi-process access, depending on 
what 
that means.

For example, a post-receive hook might call some git command which opens the 
ref database.  This means that git receive-pack would have to close and 
re-open the ref database.  More generally, a fair number of git commands are
implemented in terms of other git commands, and might need the same treatment.
We could, in general, close and re-open the database around fork/exec, but I am
not sure that this solves the general problem -- by mere happenstance, one might
be e.g. pushing in one terminal while running git checkout in another.  This is 
especially true with git worktrees, which share one ref database across multiple
working directories.

Re: [RFC] clang-format: outline the git project's coding style

2017-08-14 Thread Johannes Schindelin

Hi Junio,

On Thu, 10 Aug 2017, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > On Wed, 9 Aug 2017, Stefan Beller wrote:
> >
> >> > I am sure that something even better will be possible: a Continuous
> >> > "Integration" that fixes the coding style automatically by using
> >> > `filter-branch` (avoiding the merge conflicts that would arise if
> >> > `rebase -i` was used).
> >> 
> >> I do not quite follow. Is that to be used by Junio while integrating
> >> branches?
> >
> > I was more thinking about a bot on GitHub. "Code cleanup as a service".
> 
> I vaguely recall that there was a discussion to have SubmitGit wait
> for success from Travis CI; if that is already in place, then I can
> sort of see how it would help individual contributors to have the
> style checker in that pipeline as well.  
> 
> I have a mixed feelings about "fixing" styles automatically, though.

That's too bad. I would much rather focus on quality of code than
conformance of style, even if the latter is a lot easier than the former.

Ciao,
Dscho

RE: [External] Re: gitk -m ?

2017-08-14 Thread Burkhardt, Glenn B UTAS

They don't.  In particular, information about commits that are parts of merges 
is missing.

Here's an example.  There are only two entries listed in 'gitk --all' for a 
particular file (sorry, I'd prefer to include a screen sho, but the mailing 
list doesn't allow HTML messages).

gitk --all MANIFEST.MF

Parent: f7462684ae78720aac05c929256d770118cf01fa (initial clone from Clearcase 
integ3 branch)
Branches: master, remotes/origin/master, remotes/origin/ww, ww
Follows: 
Precedes: 

require java 1.8

Child:  240f151d61fd4fd06f377bc52970b3574e5f9031 (require java 1.8)
Branches: master, remotes/origin/master, remotes/origin/ww, ww
Follows: 
Precedes: 

initial clone from Clearcase integ3 branch


git log with '-m' and '-follow' shows:

$ git log -m --follow --oneline MANIFEST.MF
9cc8be4 (from 1222d7c) Merge branch 'master' into ww; strategy "ours"
a423f2d (from f869950) merge from ww branch; remove Bundle-NativeCode
51f0628 (from 2c6478c) Merge branch 'ww' of coverity:rmps into ww
240f151 require java 1.8
f746268 initial clone from Clearcase integ3 branch


-Original Message-
From: Uxío Prego [mailto:upr...@madiva.com] 
Sent: Monday, August 14, 2017 01:12
To: Burkhardt, Glenn B UTAS
Cc: git@vger.kernel.org
Subject: [External] Re: gitk -m ?

Not sure what you are wanting to achieve, but please make sure neither `gitk 
PATHSPEC` nor `gitk --all PATHSPEC` are presenting you enough information.

> On 3 Aug 2017, at 19:37, Burkhardt, Glenn B UTAS 
>  wrote:
> 
> I've been looking in 'gitk' for an option that does what 'git log -m' does.  
> Did I miss something?  In particular, I'd like to get information about a 
> file that's currently available with "git log -m --all --follow", but 
> presented in 'gitk'.  If it's not there, please consider this a feature 
> request.
> 
> Thanks.

Re: reftable [v5]: new ref storage format

2017-08-14 Thread Howard Chu


Howard Chu wrote:
The primary issue with using LMDB over NFS is with performance. All reads are 
performed thru accesses of mapped memory, and in general, NFS implementations 
don't cache mmap'd pages. I believe this is a consequence of the fact that 
they also can't guarantee cache coherence, so the only way for an NFS client 
to see a write from another NFS client is by always refetching pages whenever 
they're accessed.


LMDB's read lock management also wouldn't perform well over NFS; it also uses an mmap'd file. On a local filesystem LMDB read locks are zero cost since they just atomically update a word in the mmap. Over NFS, each update to the mmap would also require an msync() to propagate the change back to the server. This would seriously limit the speed with which read transactions may be opened and closed. (Ordinarily opening and closing a read txn can be done with zero system calls.) 


All that aside, we could simply add an EXCLUSIVE open-flag to LMDB, and 
prevent multiple processes from using the DB concurrently. In that case, 
maintaining coherence with other NFS clients is a non-issue. It strikes me 
that git doesn't require concurrent multi-process access anyway, and any 
particular process would only use the DB for a short time before closing it 
and going away.


--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/

Re: reftable [v6]: new ref storage format

2017-08-14 Thread Michael Haggerty

On 08/07/2017 03:47 AM, Shawn Pearce wrote:
> 6th iteration of the reftable storage format.

Thanks!

> Changes from v5:
> - extensions.refStorage = reftable is used to select this format.
> 
> - Log records can be explicitly deleted (for refs/stash).
> - Log records may use Michael Haggerty's chained idea to compress before zlib.
>   This saved ~5.8% on one of my example repositories.

Meh. Do you think that's worth the complexity? The percentage savings
will presumably be even lower for repositories that store significant
information in their reflog messages.

> [...]
>  ref record
> 
> A `ref_record` describes a single reference, storing both the name and
> its value(s). Records are formatted as:
> 
> varint( prefix_length )
> varint( (suffix_length << 3) | value_type )
> suffix
> value?
> 
> [...]
> - `0x0`: deletion; no value data (see transactions, below)
> - `0x1`: one 20-byte object id; value of the ref
> - `0x2`: two 20-byte object ids; value of the ref, peeled target
> - `0x3`: symref and text: `varint( text_len ) text`
> 
> Symbolic references use `0x3` with a `text` string starting with `"ref: "`,
> followed by the complete name of the reference target.  No
> compression is applied to the target name.  Other types of contents
> that are also reference like, such as `FETCH_HEAD` and `MERGE_HEAD`,
> may also be stored using type `0x3`.

I'm still relatively negative on storing "other" references (except
`HEAD`) in reftable. Here are my thoughts:

* "Other" references are not considered for reachability, so there
  is no need for their modification to be done atomically.

* "Other" references don't have or need reflogs.

* The refs API would have to provide a way for other Git code to
  read and write "other" references including their extra
  information, and the users of that information would have to
  be rewritten to use the new API.

* Presumably there are other programs in the wild (e.g., scripts)
  that want to read that information. They wouldn't be able to
  extract it from reftable files themselves, so we would also have
  to provide a command-line tool to read (and write?) such files.

> Types `0x4..0x7` are reserved for future use.

Regardless, I suggest allocating separate `value_type`s for generic
symrefs (which then wouldn't need a `ref: ` prefix) vs. for "other"
references.

> [...]
> ### Ref index

It wasn't clear to me whether (in the case of a multi-level index) ref
index blocks have to be aligned in `block_size` blocks (both their
maximum size and their alignment). I don't see a reason for that to be
required, though of course a compactor implementation might choose to
block-align these blocks based on the filesystem that is in use.

For that matter, I don't see an intrinsic reason that object blocks or
object index blocks need to be block aligned.

In fact, the only way I can see that the current reftable proposal makes
use of `block_size` is so that `obj_record`s can record `block_delta` in
units of `block_size` rather than in units of bytes. (And given that I'm
skeptical about the value of the object index, that justification seems
thin.)

I totally accept that *you* want to align your blocks, and I'm totally
supportive of a format that permits a reftable compactor to write
reftables that are block-aligned. It just still seems to me that it
imposes more complexity than necessary on *other* reftable compactor
implementations that don't care about block alignment.

Aside from the object index, I think it would be straightforward to
write a reftable reader that is totally ignorant of `block_size`.

So unless I've overlooked something, I think the following plan wouldn't
cause you any extra trouble, but would make it easier to implement a
compactor that doesn't care about block sizes or object indexes:

If a reftable has an object index, then `block_size` must be specified,
and ref blocks *must* be aligned to start at multiples of `block_size`.

However, if a reftable has no object index, then its `block_size` is
only a hint about the typical block size; e.g., "if you want to read a
full block, then try reading `block_size` and you'll probably get the
whole thing". And if `block_size` is zero, then readers get no guidance
about the typical block size (which would be just fine for an mmap-based
reader).

Essentially, choices about block alignment would become a
quality-of-implementation issue for reftable compactors, and readers
would hardly need to care.

> [...]
>  index record
> 
> An index record describes the last entry in another block.
> Index records are written as:
> 
> varint( prefix_length )
> varint( (suffix_length << 3) | 0 )
> suffix
> varint( block_position )
> 
> Index records use prefix compression exactly like `ref_record`.
> 
> Index records store `block_position` after the suffix, specifying the
> absolute position in bytes (from the start of the file) of the block
> that ends with this reference.

Is there a reason

[no subject]

2017-08-14 Thread Adrian Gillian Bayford

£1.5 Million Has Been Granted To You As A Donation Visit 
www.bbc.co.uk/news/uk-england-19254228 Sendname Address Phone for more info

1 2 >

1 - 100 of 106 matches

Mail list logo