Re: What's cooking in git.git (Oct 2016, #06; Mon, 24)

2016-10-24 Thread Stefan Beller
On Mon, Oct 24, 2016 at 6:09 PM, Junio C Hamano  wrote:
>
> * sb/submodule-ignore-trailing-slash (2016-10-18) 3 commits
>  . submodule--helper: normalize funny urls
>   (merged to 'next' on 2016-10-11 at e37425ed17)
>  + submodule: ignore trailing slash in relative url
>  + submodule: ignore trailing slash on superproject URL
>
>  A minor regression fix for "git submodule".
>
>  It seems that POSIX emulation layer of Windows is not cooperating;
>  this may have to wait (or tentatively reverted in Windows port) for
>  the resolution of the issue.
>
>  cf. 
>  cf. 
>
>  What's the current state of this topic?

The first 2 patches actually fix a bug users run into, and I these are
fine for general consumption IMHO.

The third patch only breaks tests as our test suite is holding it wrong.
I was bike shedding on the list and yak shaving here to come up with
the correct fix for the test suite.

One of the initial ways to work around the bugfix was to

git clone . root # <- add in this step and it works again.
git clone root super

but instead I will do the preparation for the 'super' project not
in '.' but in 'root', just called differently ("super_remote" ?)

An additional new test for cloning from '.' will be introduced, too.

I plan on working on that with highest priority for git after finishing
some attr stuff that I currently have open. So expect a patch (or two)
this week.

Thanks,
Stefan


What's cooking in git.git (Oct 2016, #06; Mon, 24)

2016-10-24 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

Originally I planed to start concluding this cycle today, but
waiting for the conclusion of a few test breakages on Windows, I
didn't tag -rc0 today.

Here are my current thinking on the notable topics.

 - the "off-by-one fix" part of sb/submodule-ignore-trailing-slash
   needs to be in the upcoming release but the "trailing /. in base
   should not affect the resolution of ../relative/path" part that
   is still under discussion can wait.  Which means we'd need a few
   more !MINGW prerequisites in the tests by -rc0.

 - js/prepare-sequencer topic is not yet in 'next' but it would be a
   nice-to-have in the upcoming release if we can.  It does not yet
   touch "rebase -i", but does touch the sequencer code that is used
   in cherry-pick and revert, so I'd prefer to cook it for at least
   a week and half in 'next' before merging.

 - ls/filter-process topic has been in 'next' with one known test
   breakage on Windows, whose resolution ls/git-open-cloexec is
   close to its final shape.  Perhaps we can cook them for at least
   a week and half in 'next' and have it in the upcoming release.

 - ex/deprecate-empty-pathspec-as-match-all topic that makes it
   illegal to say 'git add ""' when you mean 'git add .' has been in
   'next' for more than a cycle.  I am inclined to merge it in the
   upcoming release.

 - jc/merge-drop-old-syntax is relatively new in 'next' after all
   known in-tree dependents have been updated.  I am planning to
   keep it cooking in 'next' but add a backward incompatibility
   notice to the release notes to the upcoming release.

 - lt/abbrev-auto and its follow-up jk/abbrev-auto are about auto
   scaling the default abbreviation length when Git produces a short
   object name to adjust to the modern times.  Peff noticed one
   fallout from it recently and its fix jc/abbrev-auto is not yet in
   'next'.  I would not be surprised if there are other uncovered
   fallouts remaining in the code, but at the same time, I expect
   they are all cosmetic kind that do not affect correctness, so I
   am inclined to include all of them in the upcoming release.

I plan to merge other smallish topics that have been in 'next' to
'master' soonish, and relabel the remainder that have been labeled
as "Will merge to 'master'" to "Will hold" and keep cooking them in
'next'.  For this reason, please do not take the "Will merge to
'master'" label too literally in this issue of "What's cooking"
report.  It is always true that the label only means "the topic will
be in 'master' eventually", not "the topic will be in the upcoming
release", but in this issue that is even more true than usual.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[New Topics]

* jk/tap-verbose-fix (2016-10-24) 4 commits
  (merged to 'next' on 2016-10-24 at 5073a4de2d)
 + test-lib: bail out when "-v" used under "prove"
  (merged to 'next' on 2016-10-21 at 592679411c)
 + travis: use --verbose-log test option
 + test-lib: add --verbose-log option
 + test-lib: handle TEST_OUTPUT_DIRECTORY with spaces

 The Travis CI configuration we ship ran the tests with --verbose
 option but this risks non-TAP output that happens to be "ok" to be
 misinterpreted as TAP signalling a test that passed.  This resulted
 in unnecessary failure.  This has been corrected by introducing a
 new mode to run our tests in the test harness to send the verbose
 output separately to the log file.

 Will merge to 'master'.


* po/fix-doc-merge-base-illustration (2016-10-21) 1 commit
  (merged to 'next' on 2016-10-21 at ac6f04a6c5)
 + doc: fix merge-base ASCII art tab spacing

 Some AsciiDoc formatter mishandles a displayed illustration with
 tabs in it.  Adjust a few of them in merge-base documentation to
 work around them.

 Will merge to 'master'.


* jc/abbrev-auto (2016-10-22) 4 commits
 - transport: compute summary-width dynamically
 - transport: allow summary-width to be computed dynamically
 - fetch: pass summary_width down the callchain
 - transport: pass summary_width down the callchain
 (this branch uses jk/abbrev-auto and lt/abbrev-auto.)

 "git push" and "git fetch" reports from what old object to what new
 object each ref was updated, using abbreviated refnames, and they
 attempt to align the columns for this and other pieces of
 information.  The way these codepaths compute how many display
 columns to allocate for the object names portion of this output has
 been updated to match the recent "auto scale the default
 abbreviation length" change.

 Will merge to 'next'.


* jc/reset-unmerge (2016-10-24) 1 commit
 - reset: 

Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-24 Thread Jeff King
On Mon, Oct 24, 2016 at 04:53:50PM -0700, Junio C Hamano wrote:

> > So how about this?  It gets rid of magic number 3 and works for array
> > size that's not a power of two.  And as a nice side effect it can't
> > trigger a signed overflow anymore.
> 
> Looks good to me.  Peff?

Any of the variants discussed in this thread is fine by me.

-Peff


Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-24 Thread Junio C Hamano
René Scharfe  writes:

> Am 24.10.2016 um 19:27 schrieb Junio C Hamano:
>> Junio C Hamano  writes:
>> 
 I think it would be preferable to just fix it inline in each place.
>>>
>>> Yeah, probably.
>>>
>>> My initial reaction to this was
>>>
>>>  char *sha1_to_hex(const unsigned char *sha1)
>>>  {
>>> -   static int bufno;
>>> +   static unsigned int bufno;
>>> static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
>>> return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
>>>
>>> "ah, we do not even need bufno as uint; it could be ushort or even
>>> uchar".  If this were a 256 element ring buffer and the index were
>>> uchar, we wouldn't even be having this discussion, and "3 &" is a
>>> way to get a fake type that is a 2-bit unsigned integer that wraps
>>> around when incremented.
>>>
>>> But being explicit, especially when we know that we can rely on the
>>> fact that the compilers are usually intelligent enough, is a good
>>> idea, I would think.
>>>
>>> Isn't size_t often wider than uint, by the way?  It somehow makes me
>>> feel dirty to use it when we know we only care about the bottom two
>>> bit, especially with the explicit "bufno %= ARRAY_SIZE(hexbuffer)",
>>> but I may be simply superstitious in this case.  I dunno.
>> 
>> If we are doing the wrap-around ourselves, I suspect that the index
>> should stay "int" (not even unsigned), as that is supposed to be the
>> most natural and performant type on the architecture.  Would it
>> still result in better code to use size_t instead?
>
> Right.
>
> Gcc emits an extra cltq instruction for x86-64 (Convert Long To Quad,
> i.e. 32-bit to 64-bit) if we wrap explicitly and still use an int in
> sha1_to_hex().  It doesn't if we use an unsigned int or size_t.  It
> also doesn't for get_pathname().  I guess updating the index variable
> only after use makes the difference there.
>
> But I think we can ignore that; it's just an extra cycle.  I only
> even noticed it when verifying that "% 4" is turned into "& 3"..
> Clang and icc don't add the cltq, by the way.
>
> So how about this?  It gets rid of magic number 3 and works for array
> size that's not a power of two.  And as a nice side effect it can't
> trigger a signed overflow anymore.

Looks good to me.  Peff?

> Just adding "unsigned" looks more attractive to me for some reason.
> Perhaps I stared enough at the code to get used to the magic values
> there..

I somehow share that feeling, too.


[PATCH] Allow stashes to be referenced by index only

2016-10-24 Thread Aaron M Watson
Instead of referencing "stash@{n}" explicitly, it can simply be
referenced as "n".  Most users only reference stashes by their position
in the stash stask (what I refer to as the "index"). The syntax for the
typical stash (stash@{n}) is slightly annoying and easy to forget, and
sometimes difficult to escape properly in a script. Because of this the
capability to do things with the stash by simply referencing the index
is desirable.

This patch includes the superior implementation provided by Øsse Walle
(thanks for that), with a slight change to fix a broken test in the test
suite. I also merged the test scripts as suggested by Jeff King, and
un-wrapped the documentation as suggested by Junio Hamano.

Signed-off-by: Aaron M Watson 
---
 Documentation/git-stash.txt |  3 ++-
 git-stash.sh| 15 +--
 t/t3903-stash.sh| 35 +++
 3 files changed, 50 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-stash.txt b/Documentation/git-stash.txt
index 92df596..2e9cef0 100644
--- a/Documentation/git-stash.txt
+++ b/Documentation/git-stash.txt
@@ -39,7 +39,8 @@ The latest stash you created is stored in `refs/stash`; older
 stashes are found in the reflog of this reference and can be named using
 the usual reflog syntax (e.g. `stash@{0}` is the most recently
 created stash, `stash@{1}` is the one before it, `stash@{2.hours.ago}`
-is also possible).
+is also possible). Stashes may also be referenced by specifying just the
+stash index (e.g. the integer `n` is equivalent to `stash@{n}`).
 
 OPTIONS
 ---
diff --git a/git-stash.sh b/git-stash.sh
index 826af18..d7072c8 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -384,9 +384,8 @@ parse_flags_and_rev()
i_tree=
u_tree=
 
-   REV=$(git rev-parse --no-flags --symbolic --sq "$@") || exit 1
-
FLAGS=
+   REV=
for opt
do
case "$opt" in
@@ -404,6 +403,9 @@ parse_flags_and_rev()
die "$(eval_gettext "unknown option: 
\$opt")"
FLAGS="${FLAGS}${FLAGS:+ }$opt"
;;
+   *)
+   REV="${REV}${REV:+ }'$opt'"
+   ;;
esac
done
 
@@ -422,6 +424,15 @@ parse_flags_and_rev()
;;
esac
 
+   case "$1" in
+   *[!0-9]*)
+   :
+   ;;
+   *)
+   set -- "${ref_stash}@{$1}"
+   ;;
+   esac
+
REV=$(git rev-parse --symbolic --verify --quiet "$1") || {
reference="$1"
die "$(eval_gettext "\$reference is not a valid reference")"
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index 2142c1f..f82a8c4 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -131,6 +131,26 @@ test_expect_success 'drop middle stash' '
test 1 = $(git show HEAD:file)
 '
 
+test_expect_success 'drop middle stash by index' '
+   git reset --hard &&
+   echo 8 >file &&
+   git stash &&
+   echo 9 >file &&
+   git stash &&
+   git stash drop 1 &&
+   test 2 = $(git stash list | wc -l) &&
+   git stash apply &&
+   test 9 = $(cat file) &&
+   test 1 = $(git show :file) &&
+   test 1 = $(git show HEAD:file) &&
+   git reset --hard &&
+   git stash drop &&
+   git stash apply &&
+   test 3 = $(cat file) &&
+   test 1 = $(git show :file) &&
+   test 1 = $(git show HEAD:file)
+'
+
 test_expect_success 'stash pop' '
git reset --hard &&
git stash pop &&
@@ -604,7 +624,21 @@ test_expect_success 'invalid ref of the form stash@{n}, n 
>= N' '
git stash drop
 '
 
+test_expect_success 'invalid ref of the form "n", n >= N' '
+   git stash clear &&
+   test_must_fail git stash drop 0 &&
+   echo bar5 >file &&
+   echo bar6 >file2 &&
+   git add file2 &&
+   git stash &&
+   test_must_fail git stash drop 1 &&
+   test_must_fail git stash pop 1 &&
+   test_must_fail git stash apply 1 &&
+   test_must_fail git stash show 1 &&
+   test_must_fail git stash branch tmp 1 &&
+   git stash drop
+'
+
 test_expect_success 'stash branch should not drop the stash if the branch 
exists' '
git stash clear &&
echo foo >file &&
-- 
2.7.4



[PATCH] reset: --unmerge

2016-10-24 Thread Junio C Hamano
The procedure to resolve a merge conflict typically goes like this:

 - first open the file in the editor, and with the help of conflict
   markers come up with a resolution.

 - save the file.

 - look at the output from "git diff" to see the combined diff to
   double check if the resolution makes sense.

 - perform other tests, like trying to build the result with "make".

 - finally "git add file" to mark that you are done.

and repeating the above until you are done with all the conflicted
paths.  If you, for whatever reason, accidentally "git add file" by
mistake until you are convinced that you resolved it correctly (e.g.
doing "git add file" immediately after saving, without a chance to
peruse the output from "git diff"), there is no good way to recover.
There is "git checkout -m file" but that overwrites the working tree
file to reproduce the conflicted state, which is not exactly what
you want.  You only want to reproduce the conflicted state in the
index, so that you can inspect the (proposed) merge resolution you
already have in your working tree.

Add "git reset --unmerge " command that does just that.

Signed-off-by: Junio C Hamano 
---

 * Not thought through thoroughly yet about details, such as "should
   it always require at least one pathspec?".

 builtin/reset.c   |  8 +++-
 t/t7107-reset-unmerged.sh | 36 
 2 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/builtin/reset.c b/builtin/reset.c
index 9020ec66c8..3aa9e0b34a 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -21,6 +21,7 @@
 #include "parse-options.h"
 #include "unpack-trees.h"
 #include "cache-tree.h"
+#include "resolve-undo.h"
 
 static const char * const git_reset_usage[] = {
N_("git reset [--mixed | --soft | --hard | --merge | --keep] [-q] 
[]"),
@@ -272,6 +273,7 @@ int cmd_reset(int argc, const char **argv, const char 
*prefix)
struct object_id oid;
struct pathspec pathspec;
int intent_to_add = 0;
+   int unmerge = 0;
const struct option options[] = {
OPT__QUIET(, N_("be quiet, only report errors")),
OPT_SET_INT(0, "mixed", _type,
@@ -286,6 +288,8 @@ int cmd_reset(int argc, const char **argv, const char 
*prefix)
OPT_BOOL('p', "patch", _mode, N_("select hunks 
interactively")),
OPT_BOOL('N', "intent-to-add", _to_add,
N_("record only the fact that removed paths 
will be added later")),
+   OPT_BOOL(0, "unmerge", ,
+N_("recover conflicted stages from an earlier 'git 
add'")),
OPT_END()
};
 
@@ -357,7 +361,9 @@ int cmd_reset(int argc, const char **argv, const char 
*prefix)
hold_locked_index(lock, 1);
if (reset_type == MIXED) {
int flags = quiet ? REFRESH_QUIET : 
REFRESH_IN_PORCELAIN;
-   if (read_from_tree(, oid.hash, intent_to_add))
+   if (unmerge)
+   unmerge_cache();
+   else if (read_from_tree(, oid.hash, 
intent_to_add))
return 1;
if (get_git_work_tree())
refresh_index(_index, flags, NULL, NULL,
diff --git a/t/t7107-reset-unmerged.sh b/t/t7107-reset-unmerged.sh
new file mode 100755
index 00..57b2e27150
--- /dev/null
+++ b/t/t7107-reset-unmerged.sh
@@ -0,0 +1,36 @@
+#!/bin/sh
+
+test_description='git reset with paths'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+   echo one >file &&
+   git add file &&
+   git commit -m "one" &&
+   git tag initial &&
+
+   echo two >file &&
+   git commit -a -m "two" &&
+
+   git checkout -b side initial &&
+   echo three >file &&
+   git commit -a -m "three"
+'
+
+test_expect_success "cause conflict, resolve, and unresolve" '
+   git reset --hard &&
+   git checkout master &&
+   test_must_fail git merge side &&
+
+   git ls-files -u >expect &&
+
+   echo four >file &&
+   git add file &&
+
+   git reset --unmerge -- file &&
+   git ls-files -u >actual &&
+   test_cmp expect actual
+'
+
+test_done


Re: An anomaly, not a bug

2016-10-24 Thread Alexei Lozovsky
> Actually, git reads
>
> # comment
>
> as 'ignore files with name "" (4 spaces)', and then a comment.
> It does not ignore the leading whitespace.

Even not as a comment, it treats it as literally a filename with
a hash and that comment in it. However, one (usually) does not
name their files like that, so for all purposes this can be thought
of as a sort of comment.


Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-24 Thread René Scharfe
Am 24.10.2016 um 19:27 schrieb Junio C Hamano:
> Junio C Hamano  writes:
> 
>>> I think it would be preferable to just fix it inline in each place.
>>
>> Yeah, probably.
>>
>> My initial reaction to this was
>>
>>  char *sha1_to_hex(const unsigned char *sha1)
>>  {
>> -static int bufno;
>> +static unsigned int bufno;
>>  static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
>>  return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
>>
>> "ah, we do not even need bufno as uint; it could be ushort or even
>> uchar".  If this were a 256 element ring buffer and the index were
>> uchar, we wouldn't even be having this discussion, and "3 &" is a
>> way to get a fake type that is a 2-bit unsigned integer that wraps
>> around when incremented.
>>
>> But being explicit, especially when we know that we can rely on the
>> fact that the compilers are usually intelligent enough, is a good
>> idea, I would think.
>>
>> Isn't size_t often wider than uint, by the way?  It somehow makes me
>> feel dirty to use it when we know we only care about the bottom two
>> bit, especially with the explicit "bufno %= ARRAY_SIZE(hexbuffer)",
>> but I may be simply superstitious in this case.  I dunno.
> 
> If we are doing the wrap-around ourselves, I suspect that the index
> should stay "int" (not even unsigned), as that is supposed to be the
> most natural and performant type on the architecture.  Would it
> still result in better code to use size_t instead?

Right.

Gcc emits an extra cltq instruction for x86-64 (Convert Long To Quad,
i.e. 32-bit to 64-bit) if we wrap explicitly and still use an int in
sha1_to_hex().  It doesn't if we use an unsigned int or size_t.  It
also doesn't for get_pathname().  I guess updating the index variable
only after use makes the difference there.

But I think we can ignore that; it's just an extra cycle.  I only
even noticed it when verifying that "% 4" is turned into "& 3"..
Clang and icc don't add the cltq, by the way.

So how about this?  It gets rid of magic number 3 and works for array
size that's not a power of two.  And as a nice side effect it can't
trigger a signed overflow anymore.

Just adding "unsigned" looks more attractive to me for some reason.
Perhaps I stared enough at the code to get used to the magic values
there..

René

---
 hex.c  | 3 ++-
 path.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/hex.c b/hex.c
index ab2610e..845b01a 100644
--- a/hex.c
+++ b/hex.c
@@ -78,7 +78,8 @@ char *sha1_to_hex(const unsigned char *sha1)
 {
static int bufno;
static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
-   return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
+   bufno = (bufno + 1) % ARRAY_SIZE(hexbuffer);
+   return sha1_to_hex_r(hexbuffer[bufno], sha1);
 }
 
 char *oid_to_hex(const struct object_id *oid)
diff --git a/path.c b/path.c
index a8e7295..52d889c 100644
--- a/path.c
+++ b/path.c
@@ -25,7 +25,8 @@ static struct strbuf *get_pathname(void)
STRBUF_INIT, STRBUF_INIT, STRBUF_INIT, STRBUF_INIT
};
static int index;
-   struct strbuf *sb = _array[3 & ++index];
+   struct strbuf *sb = _array[index];
+   index = (index + 1) % ARRAY_SIZE(pathname_array);
strbuf_reset(sb);
return sb;
 }
-- 
2.10.1


Re: An anomaly, not a bug

2016-10-24 Thread Alexei Lozovsky
> Note: the comments can be started away from the left margin,
> as normal in all unix-linux configuration files we know of.
> Git follows this behaviour fine.

Actually, git reads

# comment

as 'ignore files with name "" (4 spaces)', and then a comment.
It does not ignore the leading whitespace.


[PATCH] doc: fix the 'revert a faulty merge' ASCII art tab spacing

2016-10-24 Thread Philip Oakley
The asciidoctor doc-tool stack does not always respect the 'tab = 8 spaces' rule
expectation, particularly for the Git-for-Windows generated html pages. This
follows on from the 'doc: fix merge-base ASCII art tab spacing' fix.

Use just spaces within the block of the ascii art.

All other *.txt ascii art containing three dashes has been checked.
Asciidoctor correctly formats the other art blocks that do contain tabs.

Signed-off-by: Philip Oakley 

An anomaly, not a bug

2016-10-24 Thread m.

Concerning ".gitignore", experienced using git 2.10.0

Starting code using one or more spaces or tabs from the left
margin will have git reading .gitignore and ignoring(or
un-ignoring) the command-instruction.

Example: Starting .gitignore

/*
   # above line is duly read.  Then un-ignoring
   # something but starting the command further to the
   # right will have git not reading that line

 !/nottobeignored.file

End .gitignore

Note: the comments can be started away from the left margin,
as normal in all unix-linux configuration files we know of.
Git follows this behaviour fine.

The lines containing commands, on the contrary of regular
convention cannot be indented by spaces or tabs.  Quite
unusual, confusing, not in the sense of conventional and
easily adverted in coding git.

Could be we are missing out on something,

Git command line tool is a functional tool now in our setup
for two years, first noticed this alien behaviour in this
version of git, on osx, the fink(osx package mananger)
binary.




Re: revisiting zstd timings

2016-10-24 Thread Junio C Hamano
Jeff King  writes:

> If we were designing git today, it seems like a no-brainer to use zstd
> over zlib. But given backwards-compatibility issues, I'm not sure.
> 10-20% speedup on reading is awfully nice, but I don't think there's a
> good way to gracefully transition, because zlib is part of the
> on-the-wire format for serving objects. We could re-compress on the fly,
> but that gets expensive (in existing cases, we can quite often serve the
> zlib content straight from disk, but this would require an extra
> inflate/deflate. At least we wouldn't have to reconstitute objects from
> deltas, though).
>
> A transition would probably look something like:
>
>   0. The patch below, or something like it, to teach git to read both
>  zlib and zstd, and optionally write zstd. We'd probably want to
>  make this an unconditional requirement like zlib, because the point
>  is for it to be available everywhere (I assume the zstd code is
>  pretty portable, but I haven't put it to the test).
>
>   1. Another patch to add a "zstd" capability to the protocol. This
>  would require teaching pack-objects an option to convert zstd back
>  to zlib on the fly.
>
>  Servers which handle a limited number of updated clients can switch
>  to zstd immediately to get the benefit, and their clients can
>  handle it directly. Likewise, clients of older servers may wish to
>  repack locally using zstd to get the benefit. They'll have to
>  recompress on the fly during push, but pushes are rare than other
>  operations (and often limited by bandwidth anyway).
>
>   2. After a while, eventually flip the default to zstd=5.
>
>   3. If "a while" is long enough, perhaps add a patch to let servers
>  tell clients "go upgrade" rather than recompressing on the fly.
>
> I don't have immediate plans for any of that, but maybe something to
> think about.

Thanks for a write-up.  This is quite interesting.

Thanks to d98b46f8d, this does not have to impact the object naming
;-)



Re: Reporting Bug in Git Version Control System

2016-10-24 Thread Junio C Hamano
Junio C Hamano  writes:

> Stefan Beller  writes:
>
>> On Mon, Oct 24, 2016 at 7:28 AM, Yash Jain  wrote:
>>> Hello,
>>> I have two accounts on github("yj291197" and "yaki29").
>>> Both the accounts have different gmail IDs("yj291...@gmail.com" and
>>> "yashjain@gmail.com" respectively) but same passwords.
>>> I used to use git for "yj291197" account and a few days earlier I made
>>> this new account and used git commit to commit on "yaki29" but it
>>> appeared as "yj291197" committed on "yaki29's" repo.
>>> Then I pulled a request of that commit then it appeared "yaki29"
>>> pulled a request with a commit of "yj291197".
>>>
>>> And during this whole session I was signed in as "yaki29" on github.com .
>>>
>>
>> This is a Github issue, so ask Github support.
>>
>> Or read up on .mailmap files.
>
> I am (obviously) not a GitHub support, but I think the confusion is
> coming from not understanding who the committer and the author of a
> commit are and where they are coming from.  They are both recorded
> locally, taken from user.name and user.email configuration variables
> when the commits are made.  "git push" to propagate them to GitHub
> will NOT change these values of a commit, once a commit is created.
>
> The story described looks quite consistent if the user has
> yj291...@gmail.com configured as user.email and kept making commits
> in the local repository, and pushed them to either yj291197 or yaki29
> accounts at GitHub, without ever changing the local configuration to
> use the other e-mail address.  All commits would record the user and
> e-mail address yj291197, and the only one that may be attributed to
> the new one yaki29 would be the automerge created at GitHub when a
> pull request is responded on-site without first fetching and making
> a merge locally.

IOW, this sounds like Pebkac to me.  There is no a thing that needs
fixing in Git, and I do not immediately see there is anything GitHub
needs to fix, either.  The user may need fixing, though ;-).




Re: [PATCH 17/36] attr: expose validity check for attribute names

2016-10-24 Thread Stefan Beller
On Sun, Oct 23, 2016 at 8:07 AM, Ramsay Jones
 wrote:
>
>
> On 23/10/16 00:32, Stefan Beller wrote:
>> From: Junio C Hamano 
>>
>> Export attr_name_valid() function, and a helper function that
>> returns the message to be given when a given  pair
>> is not a good name for an attribute.
>>
>> We could later update the message to exactly spell out what the
>> rules for a good attribute name are, etc.
>>
>> Signed-off-by: Junio C Hamano 
>> Signed-off-by: Stefan Beller 
>> ---
>
> [snip]
>
>> +extern int attr_name_valid(const char *name, size_t namelen);
>> +extern void invalid_attr_name_message(struct strbuf *, const char *, int);
>> +
>
> The symbol 'attr_name_valid()' is not used outside of attr.c, even
> by the end of this series. Do you expect this function to be used
> in any future series? (The export is deliberate and it certainly
> seems like it should be part of the public interface, but ...)
>
> In contrast, the 'invalid_attr_name_message()' function is called
> from code in pathspec.c, which relies on 'git_attr_counted()' to
> call 'attr_name_valid()' internally to check for validity. :-D

Yeah, I am taking over Junios patches and do not quite implement
what Junio thought I would. ;) So I guess it is a communication mismatch.

git_attr_counted is a wrapper around attr_name_valid in the way that
it either returns NULL when the attr name is invalid or it does extra work
and returns a pointer to an attr.

So I think for API completeness we'd want to keep attr_name_valid around,
as otherwise the API looks strange. But that doesn't seem like a compelling
reason, so I'll drop it from the header file and make it static.

Thanks,
Stefan


Re: [PATCH 28/36] attr: keep attr stack for each check

2016-10-24 Thread Junio C Hamano
Stefan Beller  writes:

> I looked for a platform independent way to get a thread id as a natural
> number, i.e. I want to get 1,2,3,... such that I could have just added
> list/array of attr stacks to each check, which would be the
>  tuple you envision.
>
> However I think we do not really need it to be per check.  If we had
> an easy portable way of getting such a thread id, I would have implemented
> a list of stacks per thread first. (Because each thread only looks at one
> check at a time.)

It seems that by "list of stacks per thread", you mean "there is a
list of stacks, each thread uses one and only element of that list",
but I do not think it would be desirable.

"Each thread only looks at one check at a time" is false.  For
example, "write_archive_entry()" would use one check that is
specific to "git archive" to learn about "export-ignore" and
"export-subst" attributes, while letting convert_to_write_tree()
called via sha1_file_to_archive() called via write_entry() method
(i.e. write_tar_entry() or write_zip_entry()) to use a separate
check that is specific to the convert.c API.

>> With manipulation of attr stack protected with a single Big
>> Attributes Lock, this should be safe.  It may not perform very well
>> when used by multiple threads, though ;-)
>
> I agree. So maybe it is not really a good fit for general consumption yet.

I would still think it is a good first step.  It may already be
thread-safe, but may not be thread-ready from performance point of
view.  IOW, this would not yet help an attempt to make the callers
faster by making them multi-threaded.



Re: [PATCH v5 00/27] Prepare the sequencer for the upcoming rebase -i patches

2016-10-24 Thread Junio C Hamano
Stefan Beller  writes:

>> Speaking of what to and not to include in the upcoming release, we
>> do want to include Stefan's off-by-one fix to the submodule-helper,
>> but that is blocked on Windows end due to the test.
>
> I'd be happy either way, i.e. we could revert that fix and make a release?
> AFAICT, Windows only has broken tests, not broken functionality with that
> submodule bug fix.

If you are referring the "trailing /. should not make difference
when resolving ../relative/path" change with "rever that fix", I
think that may be a reasonable way to proceed.  Even though that
change is a bugfix (at least from the point of view by me and j6t in
the recent discussion), it is a behaviour change that we would want
to see feedback from existing submodule users and deserves a longer
gestation period.  And that part is not yet in 'next' yet ;-)

> If we want a longer gestation period, we'd ideally merge it to master
> just after a release, such that we "cook" it in master without having
> it in any release (we had a similar discussion for the diff heuristics IIRC).

Yes.  

It would mean that we would need a separate patch that adds the
!MINGW prerequisite to some tests to what is on 'next', as the early
patches on sb/submodule-ignore-trailing-slash~ that fixes off-by-one
is the right thing to do either way.  It of course needs help from
Windows folks to validate the results.


Re: [PATCH v5 00/27] Prepare the sequencer for the upcoming rebase -i patches

2016-10-24 Thread Junio C Hamano
Johannes Schindelin  writes:

>> I prefer to cook it in 'next' sufficiently long to ensure that we hear
>> feedbacks from non-Windows users if there is any unexpected breakage.
>
> FWIW I am using the same patches not only on Windows but also in my Linux
> VM.

Thanks for a datapoint, but when I said "non-Windows users", I was
not referring you as "the" Windows user.  I am expecting that you
would hear from Windows users who got exposure to your series by its
inclusion in Git for Windows.  They are the ones that I had in mind
as "Windows users"---and not hearing breakage reported by them would
be a good sign.

The primary reason why we want to cook a new topic in 'next' is to
expose it to people with different workflows using it on different
things, and that is especially more important for a change that
affects features that are flexible and can be used in different
ways---the set of options and commands used by the original author
of the series are often different from other people's.

Any change, when it hits 'next', is expected to be sufficiently
tested by the original author [*1*], but that is only true in the
context of the original author's daily use.  Both reviews and
author's tests are not sufficient to find bugs [*2*].

Topics that touch parts of the system that are more important to
users' daily Git life deserve extra time to find any unexpected
breakage in them.  Windows users are participating in that test by
inclusion of the topic in the released version of Git for Windows.
I want to see the the test for the rest of the world done by early
adopters who run 'next' (as 'pu' is too scary for daily use).


[Footnote]

*1* And me, as topics geting ready to be in 'next' are first merged
to my private edition branch that is slightly ahead of 'next' to
be used in my everyday use, but just like the original author is
merely one user, I am also merely one user with a specific set
of workflows that is different from others'.

*2* Bug finding is not the primary purpose of the review in the
first place.  It is to find design mistakes both at the external
and internal level, and bug finding "here you have off-by-one"
is merely a side effect.  End user tests may expose the former
(e.g. the design based on a wrong assumption may not accomodate
certain workflow the original author and the reviewers failed to
consider while writing and reviewing), but no amount of test
will uncover the latter (e.g. internal API that is misdesigned
will make future enhancement unnecessarily harder).

I think it was one of the achievements of the review cycle of
this particular series that we got rid of the _entrust() thing,
for example.  That had no visible external effect that would
have been caught by cooking on 'next' or releasing it to the
public, but was the kind of thing the code review was expected
to find and fix.


Re: [PATCH v5 00/27] Prepare the sequencer for the upcoming rebase -i patches

2016-10-24 Thread Junio C Hamano
Johannes Schindelin  writes:

>> I still do not understand (note that I am not saying "I do not
>> accept"--acceptance or rejection happens after an understandable
>> explanation is given, and "do not understand" means no such
>> explanation has been given yet) your justification behind adding a
>> technical debt to reimplement the author-script parser and not
>> sharing it with "git am" in 13/27.
>
> At this point, I am most of all reluctant to introduce such a huge change,
> which surely would introduce a regression.

That is a perfectly sensible and honest answer that I can
understand, accept and even support.

You've been working on the series for several months, running with
these dozen or so lines of code, and replacing it with another dozen
or so lines of code would require you to make sure the result is
actually doing the same thing for the remainder of your series.  And
I agree that is an unnecessary risk in order to ship a working code.
The code being battle tested counts.

I cared on this point mostly because I wanted to make sure that
people later can find out why there are two functions that ought to
be doing the same thing.  

If there were a technical reason why these two must stay to be
different implementations that are potentially doing different
things, I want to see that reason described, so that those who come
later and want to refactor the helper functions into one later will
know why they are separate in the first place.

If on the other hand there isn't any technical reason why they must
stay to be different, and they are different mostly because you
happened to have written a different one in 13/27 and have been
running with it, that is also a good thing for them to know.



Re: [PATCH v2 2/2] read-cache: make sure file handles are not inherited by child processes

2016-10-24 Thread Lars Schneider

> On 24 Oct 2016, at 21:22, Johannes Sixt  wrote:
> 
> Am 24.10.2016 um 20:03 schrieb larsxschnei...@gmail.com:
>> From: Lars Schneider 
>> 
>> This fixes "convert: add filter..process option" (edcc8581) on
>> Windows.
> 
> Today's next falls flat on its face on Windows in t0021.15 "required process 
> filter should filter data"; might it be the failure meant here? (I haven't 
> dug deeper, yet.)

Yes, this is the failure meant here :-)

Cheers,
Lars




Re: [PATCH v2 2/2] read-cache: make sure file handles are not inherited by child processes

2016-10-24 Thread Junio C Hamano
Eric Wong  writes:

> larsxschnei...@gmail.com wrote:
>> +++ b/read-cache.c
>> @@ -156,7 +156,11 @@ void fill_stat_cache_info(struct cache_entry *ce, 
>> struct stat *st)
>>  static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
>>  {
>>  int match = -1;
>> -int fd = open(ce->name, O_RDONLY);
>> +int fd = open(ce->name, O_RDONLY | O_CLOEXEC);
>> +
>> +if (O_CLOEXEC && fd < 0 && errno == EINVAL)
>> +/* Try again w/o O_CLOEXEC: the kernel might not support it */
>> +fd = open(ce->name, O_RDONLY);
>
> In the case of O_CLOEXEC != 0 and repeated EINVALs,
> it'd be good to use something like sha1_file_open_flag as in 1/2
> so we don't repeatedly hit EINVAL.  Thanks.

Sounds sane.  

It's just only once, so perhaps we do not mind a recursion like
this?

 read-cache.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index b594865d89..a6978b9321 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -156,11 +156,14 @@ void fill_stat_cache_info(struct cache_entry *ce, struct 
stat *st)
 static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
 {
int match = -1;
-   int fd = open(ce->name, O_RDONLY | O_CLOEXEC);
+   static int cloexec = O_CLOEXEC;
+   int fd = open(ce->name, O_RDONLY | cloexec);
 
-   if (O_CLOEXEC && fd < 0 && errno == EINVAL)
+   if ((cloexec & O_CLOEXEC) && fd < 0 && errno == EINVAL) {
/* Try again w/o O_CLOEXEC: the kernel might not support it */
-   fd = open(ce->name, O_RDONLY);
+   cloexec &= ~O_CLOEXEC;
+   return ce_compare_data(ce, st);
+   }
 
if (fd >= 0) {
unsigned char sha1[20];


Re: [PATCH 2/3] submodule tests: replace cloning from . by "$(pwd)"

2016-10-24 Thread Johannes Sixt

Am 24.10.2016 um 21:10 schrieb Stefan Beller:

The ease of use in our own testing suite is the feature I was talking about.


Ok, thanks for the clarification. Next steps would then be, IMO, to fix 
the tests to match the future world order and mark tests that fail due 
to the bug with test_expect_failure, and then switch them back to 
test_expect_success in the patch with the bug fix.


-- Hannes



Re: [PATCH v5 00/27] Prepare the sequencer for the upcoming rebase -i patches

2016-10-24 Thread Stefan Beller
On Sat, Oct 22, 2016 at 10:11 AM, Junio C Hamano  wrote:

>
> There isn't enough time to include this topic in the upcoming
> release within the current https://tinyurl.com/gitCal calendar,
> however, which places the final on Nov 11th.
>
> I am wondering if it makes sense to delay 2.11 by moving the final
> by 4 weeks to Dec 9th.
>
> Thoughts?
>
> Speaking of what to and not to include in the upcoming release, we
> do want to include Stefan's off-by-one fix to the submodule-helper,
> but that is blocked on Windows end due to the test.

I'd be happy either way, i.e. we could revert that fix and make a release?
AFAICT, Windows only has broken tests, not broken functionality with that
submodule bug fix.

> I think
> everybody agreed that a longer time "right thing to do" fix is to
> address the "when base is /path/to/dir/., where is ../sub relative
> to it?" issue, but if we are to do so, it would need a longer
> gestation period once it hits 'next', as it can affect the current
> users and we may even need B/C notes in the release notes for the
> change.  Giving ourselves a few more weeks of breathing room would
> help us to make sure the fix to relative URL issue is sound, too.

If we want a longer gestation period, we'd ideally merge it to master
just after a release, such that we "cook" it in master without having
it in any release (we had a similar discussion for the diff heuristics IIRC).

So please don't let the release schedule depend on my ability to deliver a
proper patch for the submodule path issue.

Thanks,
Stefan


Re: [PATCH 28/36] attr: keep attr stack for each check

2016-10-24 Thread Stefan Beller
On Mon, Oct 24, 2016 at 12:07 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> Instead of having a global attr stack, attach the stack to each check.
>
> Two threads may be working on "git checkout", one "git_attr_check"
> in convert.c may be used by them to learn the EOL conversion for
> each path, and these threads are working in different parts of
> worktree in parallel.  The set of .gitattributes files each of these
> threads wants to be cached at one time is tied to where in the
> directory hierarchy the thread is working in.
>
> The view by API users would not have to change from the point on
> since 27/36 or so, but I think attr_stack needs to become per
>  tuple when we are fully thread-ready for the above
> reason.

I looked for a platform independent way to get a thread id as a natural
number, i.e. I want to get 1,2,3,... such that I could have just added
list/array of attr stacks to each check, which would be the
 tuple you envision.

However I think we do not really need it to be per check.  If we had
an easy portable way of getting such a thread id, I would have implemented
a list of stacks per thread first. (Because each thread only looks at one
check at a time.)

So this is not a baby step because I did not want to do it all at once, but
because I did not find a suitable API to use.

>
> But we need to start somewhere to move away from the current "one
> single attr stack" to "there are multiple attr stacks", and this
> "two checks may and do use different attr stacks" is probably a
> reasonable first step.  It may give a single-threaded API users
> immediate benefit if the "read and keep only the entries relevant
> to the query" optimization is done with this step alone, without
> making the cache per  pair.
>
>> This allows to use the attr in a multithreaded way.
>
> With manipulation of attr stack protected with a single Big
> Attributes Lock, this should be safe.  It may not perform very well
> when used by multiple threads, though ;-)

I agree. So maybe it is not really a good fit for general consumption yet.


Re: [PATCH v2 2/2] read-cache: make sure file handles are not inherited by child processes

2016-10-24 Thread Johannes Sixt

Am 24.10.2016 um 20:03 schrieb larsxschnei...@gmail.com:

From: Lars Schneider 

This fixes "convert: add filter..process option" (edcc8581) on
Windows.


Today's next falls flat on its face on Windows in t0021.15 "required 
process filter should filter data"; might it be the failure meant here? 
(I haven't dug deeper, yet.)


++ test_config_global filter.protocol.process 
'/d/Src/mingw-git/t/t0021/rot13-filter.pl clean smudge'
++ test_when_finished 'test_unconfig --global 
'\''filter.protocol.process'\'''

++ test 0 = 0
++ test_cleanup='{ test_unconfig --global '\''filter.protocol.process'\''
} && (exit "$eval_ret"); eval_ret=$?; :'
++ git config --global filter.protocol.process 
'/d/Src/mingw-git/t/t0021/rot13-filter.pl clean smudge'

++ test_config_global filter.protocol.required true
++ test_when_finished 'test_unconfig --global 
'\''filter.protocol.required'\'''

++ test 0 = 0
++ test_cleanup='{ test_unconfig --global '\''filter.protocol.required'\''
} && (exit "$eval_ret"); eval_ret=$?; { test_unconfig 
--global '\''filter.protocol.process'\''

} && (exit "$eval_ret"); eval_ret=$?; :'
++ git config --global filter.protocol.required true
++ rm -rf repo
++ mkdir repo
++ cd repo
++ git init
Initialized empty Git repository in d:/Src/mingw-git/t/trash 
directory.t0021-conversion/repo/.git/

++ echo git-stderr.log
++ echo '*.r filter=protocol'
++ git add .
++ git commit . -m 'test commit 1'
[master (root-commit) aa5dd37] test commit 1
 Author: A U Thor 
 2 files changed, 2 insertions(+)
 create mode 100644 .gitattributes
 create mode 100644 .gitignore
++ git branch empty-branch
++ cp 'd:/Src/mingw-git/t/trash directory.t0021-conversion/test.o' test.r
++ cp 'd:/Src/mingw-git/t/trash directory.t0021-conversion/test2.o' test2.r
++ mkdir testsubdir
++ cp 'd:/Src/mingw-git/t/trash directory.t0021-conversion/test3 
'\''sq'\'',$x.o' 'testsubdir/test3 '\''sq'\'',$x.r'

+++ file_size test.r
+++ cat test.r
+++ wc -c
+++ sed 's/^[ ]*//'
++ S=57
+++ file_size test2.r
+++ cat test2.r
+++ wc -c
+++ sed 's/^[ ]*//'
++ S2=14
+++ file_size 'testsubdir/test3 '\''sq'\'',$x.r'
+++ cat 'testsubdir/test3 '\''sq'\'',$x.r'
+++ wc -c
+++ sed 's/^[ ]*//'
++ S3=49
++ filter_git add .
++ rm -f rot13-filter.log
++ git add .
error: last command exited with $?=128
not ok 15 - required process filter should filter data



Re: [PATCH 27/36] attr: convert to new threadsafe API

2016-10-24 Thread Stefan Beller
On Mon, Oct 24, 2016 at 11:55 AM, Junio C Hamano  wrote:

>
> Make that a double-asterisk.  The same problem appears in an updated
> example in technical/api-gitattributes.txt doc, but the example in
> the commit log message (below) is correct.

The implementation is actually using a double pointer, see below,
I forgot commit message and documentation

>> GIT_ATTR_RESULT_INIT_FOR(myresult, 1);
>
> Are you sure about this?  We've called attr_check_initl() already so
> if this is declaring myresult, it would be decl-after-stmt.

I forgot to update the commit message and Documentation.
GIT_ATTR_RESULT_INIT_FOR is gone in the header
and in the implementation.  I'll update that patch
to be consistent throughout all of {Documentation,
commit message, implementation}.

>
> The latter half is questionable.  If it is "static" it wouldn't be
> thread safe, no?  I think the diff in this patch for archive.c shows
> that we only expect
>
> struct git_attr_result result[2];
>
> upfront without RESULT_INIT_FOR(), and the reason why there is no
> need to free the result[] is because it is on the stack.  And each
> element in result[] may point at a string, but the string belongs to
> the attr subsystem and must not be freed.
>

Same as above, it's bogus.

Thanks,
Stefan


Re: [PATCH 2/3] submodule tests: replace cloning from . by "$(pwd)"

2016-10-24 Thread Stefan Beller
On Sun, Oct 23, 2016 at 3:14 AM, Johannes Sixt  wrote:
> Am 22.10.2016 um 22:46 schrieb Stefan Beller:
>>
>> I have looked into it again, and by now I think the bug is a feature,
>> actually.
>>
>> Consider this:
>>
>> git clone . super
>> git -C super submodule add ../submodule
>> # we thought the previous line is buggy
>> git clone super super-clone
>
>
> At this point, we *should* have this if there were no bugs (at least that is
> my assumption):
>
>   /tmp
>   !
>   + submodule <- submodule's remote repo
>   !
>   + foo   <- we are here (.), super's remote repo
> !
> + super   <- remote.origin.url=/tmp/foo/.
>   !
>   + submodule <- remote.origin.url=/tmp/foo/./../submodule
>  submodule.submodule.url=../submodule
>
> When I test this, 'git submodule add' fails:

Yes this setup currently fails because the  /tmp/foo/./../submodule
is computed to be /tmp/foo/submodule.

In the test suite the directory "foo" is usually called
"trash\ directory.t-description" and the remote of
the submodule is created inside of it, such that:

/tmp
   !
   + foo<- we are here (.), super's remote repo
 !
 + submodule<- submodule's remote repo
 !
 + super<- remote.origin.url=/tmp/foo/.
   !
   + submodule  <- remote.origin.url=/tmp/foo/./../submodule
   submodule.submodule.url=../submodule
   current result=/tmp/foo/submodule

works.

>
> foo@master> git -C super submodule add ../submodule
> fatal: repository '/tmp/foo/submodule' does not exist
> fatal: clone of '/tmp/foo/submodule' into submodule path
> '/tmp/foo/super/submodule' failed
>
>> Now in the super-clone the ../submodule is the correct
>> relative url, because the url where we cloned from doesn't
>> end in /.
>
>
> I do not understand why this would be relevant. The question is not how the
> submodule's remote URL ends, but how the submodule's remote URL is
> constructed from the super-project's URL and the relative path specified for
> 'git submodule add'.

I was not precise here. If you have the working setup as above, you can clone
super to super-clone and it keeps working, because of the current "bug".

The difference between "super" that is cloned from . and "super-clone" that
is cloned from "super" is only the remote url, and currently
/tmp/foo/super
/tmp/foo/.

+ relative path ../submodule resolve to the same submodule remote.

>
> Whether ../submodule or ./submodule is the correct relative URL depends on
> where the origin of the submodule is located relative to the origin of the
> super-project. In the above example, it is ../submodule. However, the error
> message tells us that git looked in /tmp/foo/submodule, which looks like the
> /. bug!

Correct.

At the time I was trying to fix the urls in the test suite with the
bug fix and I then
realized that this bugfix introduces a lot of pain to our test suite,
because we do
these secondary clones quite a few times. The pattern usually is:
* setup super (cloned from .)
* clone super to super-clone
* trash super-clone for testing purposes and delete it.

>
> I do not understand where you see a feature here. What am I missing?

The ease of use in our own testing suite is the feature I was talking about.

When we would fix the bug, we would not just need to fix
s|../submodule|./submodule| when setting up super, but we would need to
fix it every time again in reverse when creating these short lived
"super-clones"
that get tested on and deleted.

So maybe the correct fix for the testing suite is a double clone, i.e. instead
of


# prepare some stuff
git clone . super

we rather do:

# mkdir upstream && prepare stuff in upstream
git clone upstream super

However that way we would end up not exercising the
code path to be fixed with the actual bug fix i.e. we'd never clone from /.
because it is silly conceptually. We could add a new test for that though
that only tests that cloning from . constructs the correct URL.
However that is already tested in t0060 resolving the relative URLs
via the git submodule helper.

Thanks for bearing this discussion,
I feel like I am missing the obvious point here,

Stefan


Re: [PATCH 28/36] attr: keep attr stack for each check

2016-10-24 Thread Junio C Hamano
Stefan Beller  writes:

> Instead of having a global attr stack, attach the stack to each check.

Two threads may be working on "git checkout", one "git_attr_check"
in convert.c may be used by them to learn the EOL conversion for
each path, and these threads are working in different parts of
worktree in parallel.  The set of .gitattributes files each of these
threads wants to be cached at one time is tied to where in the
directory hierarchy the thread is working in.

The view by API users would not have to change from the point on
since 27/36 or so, but I think attr_stack needs to become per
 tuple when we are fully thread-ready for the above
reason.

But we need to start somewhere to move away from the current "one
single attr stack" to "there are multiple attr stacks", and this
"two checks may and do use different attr stacks" is probably a
reasonable first step.  It may give a single-threaded API users
immediate benefit if the "read and keep only the entries relevant
to the query" optimization is done with this step alone, without
making the cache per  pair.

> This allows to use the attr in a multithreaded way.

With manipulation of attr stack protected with a single Big
Attributes Lock, this should be safe.  It may not perform very well
when used by multiple threads, though ;-)



Re: [PATCH 27/36] attr: convert to new threadsafe API

2016-10-24 Thread Junio C Hamano
Stefan Beller  writes:

> This revamps the API of the attr subsystem to be thread safe.
> Before we had the question and its results in one struct type.
> The typical usage of the API was
>
> static struct git_attr_check *check;
>
> if (!check)
> check = git_attr_check_initl("text", NULL);
>
> git_check_attr(path, check);
> act_on(check->value[0]);
>
> This has a couple of issues when it comes to thread safety:
>
> * the initialization is racy in this implementation. To make it
>   thread safe, we need to acquire a mutex, such that only one
>   thread is executing the code in git_attr_check_initl.
>   As we do not want to introduce a mutex at each call site,
>   this is best done in the attr code. However to do so, we need
>   to have access to the `check` variable, i.e. the API has to
>   look like
> git_attr_check_initl(struct git_attr_check*, ...);

Make that a double-asterisk.  The same problem appears in an updated
example in technical/api-gitattributes.txt doc, but the example in
the commit log message (below) is correct.

> The usage of the new API will be:
>
> /*
>  * The initl call will thread-safely check whether the
>  * struct git_attr_check has been initialized. We only
>  * want to do the initialization work once, hence we do
>  * that work inside a thread safe environment.
>  */
> static struct git_attr_check *check;
> git_attr_check_initl(, "text", NULL);
>
> /*
>  * Obtain a pointer to a correctly sized result
>  * statically allocated on the stack; this macro:
>  */
> GIT_ATTR_RESULT_INIT_FOR(myresult, 1);

Are you sure about this?  We've called attr_check_initl() already so
if this is declaring myresult, it would be decl-after-stmt.

> /* Perform the check and act on it: */
> git_check_attr(path, check, myresult);
> act_on(myresult->value[0]);
>
> /*
>  * No need to free the check as it is static, hence doesn't leak
>  * memory. The result is also static, so no need to free there either.
>  */

The latter half is questionable.  If it is "static" it wouldn't be
thread safe, no?  I think the diff in this patch for archive.c shows
that we only expect

struct git_attr_result result[2];

upfront without RESULT_INIT_FOR(), and the reason why there is no
need to free the result[] is because it is on the stack.  And each
element in result[] may point at a string, but the string belongs to
the attr subsystem and must not be freed.



Re: [PATCH v2 2/2] read-cache: make sure file handles are not inherited by child processes

2016-10-24 Thread Eric Wong
larsxschnei...@gmail.com wrote:
> +++ b/read-cache.c
> @@ -156,7 +156,11 @@ void fill_stat_cache_info(struct cache_entry *ce, struct 
> stat *st)
>  static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
>  {
>   int match = -1;
> - int fd = open(ce->name, O_RDONLY);
> + int fd = open(ce->name, O_RDONLY | O_CLOEXEC);
> +
> + if (O_CLOEXEC && fd < 0 && errno == EINVAL)
> + /* Try again w/o O_CLOEXEC: the kernel might not support it */
> + fd = open(ce->name, O_RDONLY);

In the case of O_CLOEXEC != 0 and repeated EINVALs,
it'd be good to use something like sha1_file_open_flag as in 1/2
so we don't repeatedly hit EINVAL.  Thanks.


Re: Reporting Bug in Git Version Control System

2016-10-24 Thread Junio C Hamano
Stefan Beller  writes:

> On Mon, Oct 24, 2016 at 7:28 AM, Yash Jain  wrote:
>> Hello,
>> I have two accounts on github("yj291197" and "yaki29").
>> Both the accounts have different gmail IDs("yj291...@gmail.com" and
>> "yashjain@gmail.com" respectively) but same passwords.
>> I used to use git for "yj291197" account and a few days earlier I made
>> this new account and used git commit to commit on "yaki29" but it
>> appeared as "yj291197" committed on "yaki29's" repo.
>> Then I pulled a request of that commit then it appeared "yaki29"
>> pulled a request with a commit of "yj291197".
>>
>> And during this whole session I was signed in as "yaki29" on github.com .
>>
>
> This is a Github issue, so ask Github support.
>
> Or read up on .mailmap files.

I am (obviously) not a GitHub support, but I think the confusion is
coming from not understanding who the committer and the author of a
commit are and where they are coming from.  They are both recorded
locally, taken from user.name and user.email configuration variables
when the commits are made.  "git push" to propagate them to GitHub
will NOT change these values of a commit, once a commit is created.

The story described looks quite consistent if the user has
yj291...@gmail.com configured as user.email and kept making commits
in the local repository, and pushed them to either yj291197 or yaki29
accounts at GitHub, without ever changing the local configuration to
use the other e-mail address.  All commits would record the user and
e-mail address yj291197, and the only one that may be attributed to
the new one yaki29 would be the automerge created at GitHub when a
pull request is responded on-site without first fetching and making
a merge locally.



Re: [PATCH v1 00/19] Add configuration options for split-index

2016-10-24 Thread Junio C Hamano
Christian Couder  writes:

> The design is similar as the previous work that introduced
> "core.untrackedCache". 
>
> The new "core.splitIndex" configuration option can be either true,
> false or undefined which is the default.
>
> When it is true, the split index is created, if it does not already
> exists, when the index is read. When it is false, the split index is
> removed if it exists, when the index is read. Otherwise it is left as
> is.

I admit I haven't thought it through, but this sounds OK.

> Along with this new configuration variable, the two following options
> are also introduced:
>
> - splitIndex.maxPercentChange
>
> This is to avoid having too many changes accumulating in the split
> index while in split index mode. The git-update-index
> documentation says:
>
>   If split-index mode is already enabled and `--split-index` is
>   given again, all changes in $GIT_DIR/index are pushed back to
>   the shared index file.
>
> but it is probably better to not expect the user to think about it
> and to have a mechanism that pushes back all changes to the shared
> index file automatically when some threshold is reached.
>
> The default threshold is when the number of entries in the split
> index file reaches 20% (by default) of the number of entries in
> the shared index file. The new "splitIndex.maxPercentChange"
> config option lets people tweak this value.

OK.

> - splitIndex.sharedIndexExpire
>
> To make sure that old sharedindex files are eventually removed
> when a new one has been created, we "touch" the shared index file
> every time it is used by a new split index file. Then we can
> delete shared indexes with an mtime older than one week (by
> default), when we create a new shared index file. The new
> "splitIndex.sharedIndexExpire" config option lets people tweak
> this grace period.

I do not quite understand this justification.  Doesn't each of the
"this hold only changes since the base index file" files have a
backpointer that names the base index file it is a delta against?

Is it safe to remove a base index file when there is still a split
index file that points at it?  

IOW, I do not see why it can be safe for the expiration decision to
be based on timestamp (I would understand it if it were based on a
refcnt, though).




Re: [PATCH v2 0/2] Use CLOEXEC to avoid fd leaks

2016-10-24 Thread Junio C Hamano
larsxschnei...@gmail.com writes:

> ## Changes since v1
>  * add fallbacks in case O_CLOEXEC is not available

That is a good idea.

>  * rename 'git_open_noatime_cloexec' to 'git_open' (Eric, Dscho)

OK.  This is the old git_open_noatime() that is meant to be used
ONLY for the files Git uses for its internal implementation, and
never for end-user files.  I think it is a good idea to open them
with O_CLOEXEC.

And the separate patch to use O_CLOEXEC in ce_compare_data() that
opens a working tree file for reading does not use git_open(), which
is also correct.  I like it.

>  * rebased the topic on `next` to fix a merge conflict

I think this applies cleanly to 'master', so that is where I'd fork
my copy at.

Thanks.



Re: Reporting Bug in Git Version Control System

2016-10-24 Thread Stefan Beller
This is a Github issue, so ask Github support.

Or read up on .mailmap files.

On Mon, Oct 24, 2016 at 7:28 AM, Yash Jain  wrote:
> Hello,
> I have two accounts on github("yj291197" and "yaki29").
> Both the accounts have different gmail IDs("yj291...@gmail.com" and
> "yashjain@gmail.com" respectively) but same passwords.
> I used to use git for "yj291197" account and a few days earlier I made
> this new account and used git commit to commit on "yaki29" but it
> appeared as "yj291197" committed on "yaki29's" repo.
> Then I pulled a request of that commit then it appeared "yaki29"
> pulled a request with a commit of "yj291197".
>
>
>
> And during this whole session I was signed in as "yaki29" on github.com .
>
>
> Please reply 


Re: [PATCH 0/3] fix travis TAP/--verbose conflict

2016-10-24 Thread Lars Schneider

> On 21 Oct 2016, at 12:41, Jeff King  wrote:
> 
> On Fri, Oct 21, 2016 at 04:43:48AM -0400, Jeff King wrote:
> 
>> The obvious fix would be to send "--verbose" output to stderr, but I
>> suspect that would end up annoying for people who do:
>> 
>>  ./t5547-push-quarantine.sh -v | less
>> 
>> to read long output. Probably we need some option like "--log" which
>> logs in the same way that "--tee" does, but _without_ sending the data
>> to stdout. Naively, that just means replacing the "tee" invocation with
>> "cat", but I suspect it will be a lot more complicated than that,
>> because we still need to let the TAP output go to stdout.
> 
> Yeah, it was definitely a lot more complicated. This patch series fixes
> it.

Thanks a lot for this detailed and quick fix :-)

Cheers,
Lars



[PATCH v2 2/2] read-cache: make sure file handles are not inherited by child processes

2016-10-24 Thread larsxschneider
From: Lars Schneider 

This fixes "convert: add filter..process option" (edcc8581) on
Windows.

Consider the case of a file that requires filtering and is present in
branch A but not in branch B. If A is the current HEAD and we checkout B
then the following happens:

1. ce_compare_data() opens the file
2.   index_fd() detects that the file requires to run a clean filter and
 calls index_stream_convert_blob()
4. index_stream_convert_blob() calls convert_to_git_filter_fd()
5.   convert_to_git_filter_fd() calls apply_filter() which creates a
 new long running filter process (in case it is the first file
 of this kind to be filtered)
6.   The new filter process inherits all file handles. This is the
 default on Linux/OSX and is explicitly defined in the
 `CreateProcessW` call in `mingw.c` on Windows.
7. ce_compare_data() closes the file
8. Git unlinks the file as it is not present in B

The unlink operation does not work on Windows because the filter process
has still an open handle to the file. On Linux/OSX the unlink operation
succeeds but the file descriptors still leak into the child process.

Fix this problem by opening files in read-cache with the CLOEXEC flag to
ensure that the file descriptor does not remain open in a newly spawned
process similar to 05d1ed61.

Signed-off-by: Lars Schneider 
---
 read-cache.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index 38d67fa..b594865 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -156,7 +156,11 @@ void fill_stat_cache_info(struct cache_entry *ce, struct 
stat *st)
 static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
 {
int match = -1;
-   int fd = open(ce->name, O_RDONLY);
+   int fd = open(ce->name, O_RDONLY | O_CLOEXEC);
+
+   if (O_CLOEXEC && fd < 0 && errno == EINVAL)
+   /* Try again w/o O_CLOEXEC: the kernel might not support it */
+   fd = open(ce->name, O_RDONLY);
 
if (fd >= 0) {
unsigned char sha1[20];
-- 
2.10.0



[PATCH v2 0/2] Use CLOEXEC to avoid fd leaks

2016-10-24 Thread larsxschneider
From: Lars Schneider 

Use the CLOEXEC flag similar to 05d1ed61 to fix leaked file descriptors.

This mini patch series is necessary to make the "ls/filter-process" topic
work properly for Windows. Right now the topic generates test failures
on Windows as reported by Dscho:
http://public-inbox.org/git/alpine.DEB.2.20.1610211457030.3264@virtualbox/

Patch 1/2 was contemplated by Junio here:
http://public-inbox.org/git/xmqq37lnbbpk@gitster.mtv.corp.google.com/

Thanks to
Eric, Jakub, Dscho, and Junio for the review of v1,
Lars



## Changes since v1
 * add fallbacks in case O_CLOEXEC is not available
 * rename 'git_open_noatime_cloexec' to 'git_open' (Eric, Dscho)
 * rebased the topic on `next` to fix a merge conflict


Lars Schneider (2):
  sha1_file: open window into packfiles with CLOEXEC
  read-cache: make sure file handles are not inherited by child
processes

 builtin/pack-objects.c |  2 +-
 cache.h|  2 +-
 pack-bitmap.c  |  2 +-
 read-cache.c   |  6 +-
 sha1_file.c| 26 --
 5 files changed, 24 insertions(+), 14 deletions(-)



## Interdiff (v1..v2)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 0fdd331..0fd52bd 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -720,7 +720,7 @@ static off_t write_reused_pack(struct sha1file *f)
if (!is_pack_valid(reuse_packfile))
die("packfile is invalid: %s", reuse_packfile->pack_name);

-   fd = git_open_noatime_cloexec(reuse_packfile->pack_name);
+   fd = git_open(reuse_packfile->pack_name);
if (fd < 0)
die_errno("unable to open packfile for reuse: %s",
  reuse_packfile->pack_name);
diff --git a/cache.h b/cache.h
index 0dea548..419b7a0 100644
--- a/cache.h
+++ b/cache.h
@@ -1125,7 +1125,7 @@ extern int write_sha1_file(const void *buf, unsigned long 
len, const char *type,
 extern int hash_sha1_file_literally(const void *buf, unsigned long len, const 
char *type, unsigned char *sha1, unsigned flags);
 extern int pretend_sha1_file(void *, unsigned long, enum object_type, unsigned 
char *);
 extern int force_object_loose(const unsigned char *sha1, time_t mtime);
-extern int git_open_noatime_cloexec(const char *name);
+extern int git_open(const char *name);
 extern void *map_sha1_file(const unsigned char *sha1, unsigned long *size);
 extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, 
unsigned long mapsize, void *buffer, unsigned long bufsiz);
 extern int parse_sha1_header(const char *hdr, unsigned long *sizep);
diff --git a/pack-bitmap.c b/pack-bitmap.c
index 1b39e5d..39bcc16 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -266,7 +266,7 @@ static int open_pack_bitmap_1(struct packed_git *packfile)
return -1;

idx_name = pack_bitmap_filename(packfile);
-   fd = git_open_noatime_cloexec(idx_name);
+   fd = git_open(idx_name);
free(idx_name);

if (fd < 0)
diff --git a/read-cache.c b/read-cache.c
index 200d4fa..b594865 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -158,6 +158,10 @@ static int ce_compare_data(const struct cache_entry *ce, 
struct stat *st)
int match = -1;
int fd = open(ce->name, O_RDONLY | O_CLOEXEC);

+   if (O_CLOEXEC && fd < 0 && errno == EINVAL)
+   /* Try again w/o O_CLOEXEC: the kernel might not support it */
+   fd = open(ce->name, O_RDONLY);
+
if (fd >= 0) {
unsigned char sha1[20];
if (!index_fd(sha1, fd, st, OBJ_BLOB, ce->name, 0))
diff --git a/sha1_file.c b/sha1_file.c
index dbe027b..93b836b 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -370,7 +370,7 @@ void read_info_alternates(const char * relative_base, int 
depth)
int fd;

path = xstrfmt("%s/info/alternates", relative_base);
-   fd = git_open_noatime_cloexec(path);
+   fd = git_open(path);
free(path);
if (fd < 0)
return;
@@ -663,7 +663,7 @@ static int check_packed_git_idx(const char *path, struct 
packed_git *p)
struct pack_idx_header *hdr;
size_t idx_size;
uint32_t version, nr, i, *index;
-   int fd = git_open_noatime_cloexec(path);
+   int fd = git_open(path);
struct stat st;

if (fd < 0)
@@ -1069,7 +1069,7 @@ static int open_packed_git_1(struct packed_git *p)
while (pack_max_fds <= pack_open_fds && close_one_pack())
; /* nothing */

-   p->pack_fd = git_open_noatime_cloexec(p->pack_name);
+   p->pack_fd = git_open(p->pack_name);
if (p->pack_fd < 0 || fstat(p->pack_fd, ))
return -1;
pack_open_fds++;
@@ -1586,7 +1586,7 @@ int check_sha1_signature(const unsigned char *sha1, void 
*map,
return hashcmp(sha1, real_sha1) ? -1 : 0;
 }

-int git_open_noatime_cloexec(const char *name)
+int git_open(const char *name)
 {
static int 

[PATCH v2 1/2] sha1_file: open window into packfiles with CLOEXEC

2016-10-24 Thread larsxschneider
From: Lars Schneider 

All processes that the Git main process spawns inherit the open file
descriptors of the main process. These leaked file descriptors can
cause problems.

Use the CLOEXEC flag similar to 05d1ed61 to fix the leaked file
descriptors. Since `git_open_noatime` does not describe the function
properly anymore rename it to `git_open`.

Signed-off-by: Lars Schneider 
---
 builtin/pack-objects.c |  2 +-
 cache.h|  2 +-
 pack-bitmap.c  |  2 +-
 sha1_file.c| 26 --
 4 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 1e7c2a9..0fd52bd 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -720,7 +720,7 @@ static off_t write_reused_pack(struct sha1file *f)
if (!is_pack_valid(reuse_packfile))
die("packfile is invalid: %s", reuse_packfile->pack_name);
 
-   fd = git_open_noatime(reuse_packfile->pack_name);
+   fd = git_open(reuse_packfile->pack_name);
if (fd < 0)
die_errno("unable to open packfile for reuse: %s",
  reuse_packfile->pack_name);
diff --git a/cache.h b/cache.h
index b7f34b4..419b7a0 100644
--- a/cache.h
+++ b/cache.h
@@ -1125,7 +1125,7 @@ extern int write_sha1_file(const void *buf, unsigned long 
len, const char *type,
 extern int hash_sha1_file_literally(const void *buf, unsigned long len, const 
char *type, unsigned char *sha1, unsigned flags);
 extern int pretend_sha1_file(void *, unsigned long, enum object_type, unsigned 
char *);
 extern int force_object_loose(const unsigned char *sha1, time_t mtime);
-extern int git_open_noatime(const char *name);
+extern int git_open(const char *name);
 extern void *map_sha1_file(const unsigned char *sha1, unsigned long *size);
 extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, 
unsigned long mapsize, void *buffer, unsigned long bufsiz);
 extern int parse_sha1_header(const char *hdr, unsigned long *sizep);
diff --git a/pack-bitmap.c b/pack-bitmap.c
index b949e51..39bcc16 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -266,7 +266,7 @@ static int open_pack_bitmap_1(struct packed_git *packfile)
return -1;
 
idx_name = pack_bitmap_filename(packfile);
-   fd = git_open_noatime(idx_name);
+   fd = git_open(idx_name);
free(idx_name);
 
if (fd < 0)
diff --git a/sha1_file.c b/sha1_file.c
index 1e41954..93b836b 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -370,7 +370,7 @@ void read_info_alternates(const char * relative_base, int 
depth)
int fd;
 
path = xstrfmt("%s/info/alternates", relative_base);
-   fd = git_open_noatime(path);
+   fd = git_open(path);
free(path);
if (fd < 0)
return;
@@ -663,7 +663,7 @@ static int check_packed_git_idx(const char *path, struct 
packed_git *p)
struct pack_idx_header *hdr;
size_t idx_size;
uint32_t version, nr, i, *index;
-   int fd = git_open_noatime(path);
+   int fd = git_open(path);
struct stat st;
 
if (fd < 0)
@@ -1069,7 +1069,7 @@ static int open_packed_git_1(struct packed_git *p)
while (pack_max_fds <= pack_open_fds && close_one_pack())
; /* nothing */
 
-   p->pack_fd = git_open_noatime(p->pack_name);
+   p->pack_fd = git_open(p->pack_name);
if (p->pack_fd < 0 || fstat(p->pack_fd, ))
return -1;
pack_open_fds++;
@@ -1586,9 +1586,9 @@ int check_sha1_signature(const unsigned char *sha1, void 
*map,
return hashcmp(sha1, real_sha1) ? -1 : 0;
 }
 
-int git_open_noatime(const char *name)
+int git_open(const char *name)
 {
-   static int sha1_file_open_flag = O_NOATIME;
+   static int sha1_file_open_flag = O_NOATIME | O_CLOEXEC;
 
for (;;) {
int fd;
@@ -1598,12 +1598,18 @@ int git_open_noatime(const char *name)
if (fd >= 0)
return fd;
 
-   /* Might the failure be due to O_NOATIME? */
-   if (errno != ENOENT && sha1_file_open_flag) {
-   sha1_file_open_flag = 0;
+   /* Try again w/o O_CLOEXEC: the kernel might not support it */
+   if (O_CLOEXEC && errno == EINVAL &&
+   (sha1_file_open_flag & O_CLOEXEC)) {
+   sha1_file_open_flag &= ~O_CLOEXEC;
continue;
}
 
+   /* Might the failure be due to O_NOATIME? */
+   if (errno != ENOENT && (sha1_file_open_flag & O_NOATIME)) {
+   sha1_file_open_flag &= ~O_NOATIME;
+   continue;
+   }
return -1;
}
 }
@@ -1632,7 +1638,7 @@ static int open_sha1_file(const unsigned char *sha1)
struct alternate_object_database *alt;
int most_interesting_errno;

Re: [PATCH 0/4] nd/ita-empty-commit update

2016-10-24 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> The index_differs_from() also takes a flag to set/clear this new
> flag instead of relying on has_ita_entries like the old 2/3.

I think that probably is a good move.

> The name ita-invisible-in-index is not perfect but I could not think
> of any better. Another name could be diff-cached-ignores-ita, but
> that's just half of what it does. The other half is diff-files-includes-ita...

I can't either, and it is one of the reasons why I am reluctant.
Not being able to be named with a short-and-sweet name often is a
sign that the thing to be named is conceptually not well thought
out.

But as we need to give it some name to the flat to ease
experimenting, let's take that name as-is.


Re: [PATCH 2/3] submodule tests: replace cloning from . by "$(pwd)"

2016-10-24 Thread Junio C Hamano
Johannes Sixt  writes:

> Am 22.10.2016 um 22:46 schrieb Stefan Beller:
>> I have looked into it again, and by now I think the bug is a feature,
>> actually.
>>
>> Consider this:
>>
>> git clone . super
>> git -C super submodule add ../submodule
>> # we thought the previous line is buggy
>> git clone super super-clone
>
> At this point, we *should* have this if there were no bugs (at least
> that is my assumption):
>
>   /tmp
>   !
>   + submodule <- submodule's remote repo
>   !
>   + foo   <- we are here (.), super's remote repo
> !
> + super   <- remote.origin.url=/tmp/foo/.
>   !
>   + submodule <- remote.origin.url=/tmp/foo/./../submodule
>  submodule.submodule.url=../submodule
>
> When I test this, 'git submodule add' fails:
>
> foo@master> git -C super submodule add ../submodule
> fatal: repository '/tmp/foo/submodule' does not exist
> fatal: clone of '/tmp/foo/submodule' into submodule path
> '/tmp/foo/super/submodule' failed
>
>> Now in the super-clone the ../submodule is the correct
>> relative url, because the url where we cloned from doesn't
>> end in /.
>
> I do not understand why this would be relevant. The question is not
> how the submodule's remote URL ends, but how the submodule's remote
> URL is constructed from the super-project's URL and the relative path
> specified for 'git submodule add'.

FWIW, that matches my understanding.

> Whether ../submodule or ./submodule is the correct relative URL
> depends on where the origin of the submodule is located relative to
> the origin of the super-project. In the above example, it is
> ../submodule. However, the error message tells us that git looked in
> /tmp/foo/submodule, which looks like the /. bug!
>
> I do not understand where you see a feature here. What am I missing?
>
> -- Hannes


Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-24 Thread Junio C Hamano
Junio C Hamano  writes:

>> I think it would be preferable to just fix it inline in each place.
>
> Yeah, probably.
>
> My initial reaction to this was
>
>  char *sha1_to_hex(const unsigned char *sha1)
>  {
> - static int bufno;
> + static unsigned int bufno;
>   static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
>   return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
>
> "ah, we do not even need bufno as uint; it could be ushort or even
> uchar".  If this were a 256 element ring buffer and the index were
> uchar, we wouldn't even be having this discussion, and "3 &" is a
> way to get a fake type that is a 2-bit unsigned integer that wraps
> around when incremented.
>
> But being explicit, especially when we know that we can rely on the
> fact that the compilers are usually intelligent enough, is a good
> idea, I would think.
>
> Isn't size_t often wider than uint, by the way?  It somehow makes me
> feel dirty to use it when we know we only care about the bottom two
> bit, especially with the explicit "bufno %= ARRAY_SIZE(hexbuffer)",
> but I may be simply superstitious in this case.  I dunno.

If we are doing the wrap-around ourselves, I suspect that the index
should stay "int" (not even unsigned), as that is supposed to be the
most natural and performant type on the architecture.  Would it
still result in better code to use size_t instead?



Author: René Scharfe 
Date:   Sun Oct 23 19:57:30 2016 +0200

hex: make wraparound of the index into ring-buffer explicit

Overflow is defined for unsigned integers, but not for signed ones.

We could make the ring-buffer index in sha1_to_hex() and
get_pathname() unsigned to be on the safe side to resolve this, but
let's make it explicit that we are wrapping around at whatever the
number of elements the ring-buffer has.  The compiler is smart enough
to turn modulus into bitmask for these codepaths that use
ring-buffers of a size that is a power of 2.
---
 cache.h | 3 +++
 hex.c   | 4 ++--
 path.c  | 4 ++--
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/cache.h b/cache.h
index 4cba08ecb1..5429df0f92 100644
--- a/cache.h
+++ b/cache.h
@@ -547,6 +547,9 @@ extern int daemonize(void);
} \
} while (0)
 
+#define NEXT_RING_ITEM(array, index) \
+   (array)[(index) = ((index) + 1) % ARRAY_SIZE(array)]
+
 /* Initialize and use the cache information */
 struct lock_file;
 extern int read_index(struct index_state *);
diff --git a/hex.c b/hex.c
index ab2610e498..5e711b9e32 100644
--- a/hex.c
+++ b/hex.c
@@ -76,9 +76,9 @@ char *oid_to_hex_r(char *buffer, const struct object_id *oid)
 
 char *sha1_to_hex(const unsigned char *sha1)
 {
-   static int bufno;
+   static size_t bufno;
static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
-   return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
+   return sha1_to_hex_r(NEXT_RING_ITEM(hexbuffer, bufno), sha1);
 }
 
 char *oid_to_hex(const struct object_id *oid)
diff --git a/path.c b/path.c
index fe3c4d96c6..5b2ab2271f 100644
--- a/path.c
+++ b/path.c
@@ -23,8 +23,8 @@ static struct strbuf *get_pathname(void)
static struct strbuf pathname_array[4] = {
STRBUF_INIT, STRBUF_INIT, STRBUF_INIT, STRBUF_INIT
};
-   static int index;
-   struct strbuf *sb = _array[3 & ++index];
+   static size_t index;
+   struct strbuf *sb = _RING_ITEM(pathname_array, index);
strbuf_reset(sb);
return sb;
 }


Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-24 Thread Junio C Hamano
Jeff King  writes:

>> > You could also write the second line like:
>> > 
>> >   bufno %= ARRAY_SIZE(hexbuffer);
>> > 
>> > which is less magical (right now the set of buffers must be a power of
>> > 2). I expect the compiler could turn that into a bitmask itself.
>> ...
>
> I think it would be preferable to just fix it inline in each place.

Yeah, probably.

My initial reaction to this was

 char *sha1_to_hex(const unsigned char *sha1)
 {
-   static int bufno;
+   static unsigned int bufno;
static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);

"ah, we do not even need bufno as uint; it could be ushort or even
uchar".  If this were a 256 element ring buffer and the index were
uchar, we wouldn't even be having this discussion, and "3 &" is a
way to get a fake type that is a 2-bit unsigned integer that wraps
around when incremented.

But being explicit, especially when we know that we can rely on the
fact that the compilers are usually intelligent enough, is a good
idea, I would think.

Isn't size_t often wider than uint, by the way?  It somehow makes me
feel dirty to use it when we know we only care about the bottom two
bit, especially with the explicit "bufno %= ARRAY_SIZE(hexbuffer)",
but I may be simply superstitious in this case.  I dunno.


Re: RFC Failover url for fetches?

2016-10-24 Thread Junio C Hamano
Jakub Narębski  writes:

>> As to fetching from two or more places as "fallback", I am
>> moderately negative to add it as a dumb feature that does nothing
>> more than "My fetch from A failed, so let's blindly try it from B".
>> I'd prefer to keep the "My fetch from A is failing" knowledge near
>> the surface of end user's consciousness as a mechanism to pressure A
>> to fix it--that way everybody who is fetching from A benefits.
>> After all, doing "git remote add B" once (you'd need to tell the URL
>> for B anyway to Git) and issuing "git fetch B" after seeing your
>> regular "git fetch" fails once in a blue moon is not all that
>> cumbersome, I would think.
>
> One would need to configure fallback B remote to use the same
> remote-branch namespace as remote A, if it is to be used as fallback,
> I would think.

Yeah, I left it out because I thought that was obvious, but spelling
it out explicitly may have helped those who weren't reading carefully.

Thanks


[PATCH] gitk: Fix Japanese translation for "marked commit"

2016-10-24 Thread Satoshi Yasushima
Signed-off-by: Satoshi Yasushima 
---
 po/ja.po | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/po/ja.po b/po/ja.po
index f143753..510306b 100644
--- a/po/ja.po
+++ b/po/ja.po
@@ -2,16 +2,17 @@
 # Copyright (C) 2005-2015 Paul Mackerras
 # This file is distributed under the same license as the gitk package.
 #
-# YOKOTA Hiroshi , 2015.
 # Mizar , 2009.
 # Junio C Hamano , 2009.
+# YOKOTA Hiroshi , 2015.
+# Satoshi Yasushima , 2016.
 msgid ""
 msgstr ""
 "Project-Id-Version: gitk\n"
 "Report-Msgid-Bugs-To: \n"
 "POT-Creation-Date: 2015-05-17 14:32+1000\n"
 "PO-Revision-Date: 2015-11-12 13:00+0900\n"
-"Last-Translator: YOKOTA Hiroshi \n"
+"Last-Translator: Satoshi Yasushima \n"
 "Language-Team: Japanese\n"
 "Language: ja\n"
 "MIME-Version: 1.0\n"
@@ -314,11 +315,11 @@ msgstr "マークを付けたコミットと比較する"
 
 #: gitk:2630 gitk:2641
 msgid "Diff this -> marked commit"
-msgstr "これと選択したコミットのdiffを見る"
+msgstr "これとマークを付けたコミットのdiffを見る"
 
 #: gitk:2631 gitk:2642
 msgid "Diff marked commit -> this"
-msgstr "選択したコミットとこれのdiffを見る"
+msgstr "マークを付けたコミットとこれのdiffを見る"
 
 #: gitk:2632
 msgid "Revert this commit"
-- 
2.10.1.windows.1



Reporting Bug in Git Version Control System

2016-10-24 Thread Yash Jain
Hello,
I have two accounts on github("yj291197" and "yaki29").
Both the accounts have different gmail IDs("yj291...@gmail.com" and
"yashjain@gmail.com" respectively) but same passwords.
I used to use git for "yj291197" account and a few days earlier I made
this new account and used git commit to commit on "yaki29" but it
appeared as "yj291197" committed on "yaki29's" repo.
Then I pulled a request of that commit then it appeared "yaki29"
pulled a request with a commit of "yj291197".



And during this whole session I was signed in as "yaki29" on github.com .


Please reply 


Re: [PATCH v5 00/27] Prepare the sequencer for the upcoming rebase -i patches

2016-10-24 Thread Johannes Schindelin
Hi Max,

On Mon, 24 Oct 2016, Max Horn wrote:

> > On 23 Oct 2016, at 11:54, Johannes Schindelin  
> > wrote:
> > 
> > On Sat, 22 Oct 2016, Junio C Hamano wrote:
> > 
> [...]
> 
> >> There isn't enough time to include this topic in the upcoming release
> >> within the current https://tinyurl.com/gitCal calendar, however,
> >> which places the final on Nov 11th.
> > 
> > More is the pity.
> > 
> > Thank you, though, for being upfront with me. I will shift my focus to
> > tasks that require my attention more urgently, then.
> 
> Junio did go on, though:
> 
> >> I am wondering if it makes sense to delay 2.11 by moving the final
> >> by 4 weeks to Dec 9th.
> 
> I was reading this as an offer to delay things to accommodate the
> integration your work into 2.11. I.e. "within the current plan, there is
> no time for this, but we could adjust the plan". But maybe I am
> misinterpreting?

There is no indication that the rebase--helper patches would make it into
2.11 even with four more weeks.

I will now focus on other things that I postponed in favor of the
interactive rebase patches. In fact, I *have* to focus on some quite
pressing tasks that I neglected over those patches.

It's not like the process would magically improve just because a release
date is pushed. To the contrary, pushing the release date to allow for the
rebase--helper to be included may very well have the counterintuitive
effect of delaying things beyond even that pushed date "because there is
now so much time left" (until there isn't). It's a variation of
[Parkinson's Law](https://en.wikipedia.org/wiki/Parkinson%27s_law) ;-)

Anyway, back to work,
Dscho


[RFH] limiting ref advertisements

2016-10-24 Thread Jeff King
I'm looking into the oft-discussed idea of reducing the size of ref
advertisements by having the client say "these are the refs I'm
interested in". Let's set aside the protocol complexities for a
moment and imagine we magically have some way to communicate a set of
patterns to the server.

What should those patterns look like?

I had hoped that we could keep most of the pattern logic on the
client-side. Otherwise we risk incompatibilities between how the client
and server interpret a pattern. I had also hoped we could do some kind
of prefix-matching, which would let the server look only at the
interesting bits of the ref tree (so if you don't care about
refs/changes, and the server has some ref storage that is hierarchical,
they can literally get away without opening that sub-tree).

The patch at the end of this email is what I came up with in that
direction. It obviously won't compile without the twenty other patches
implementing transport->advertise_prefixes, but it gives you a sense of
what I'm talking about.

Unfortunately it doesn't work in all cases, because refspec sources may
be unqualified. If I ask for:

  git fetch $remote master:foo

then we have to actually dwim-resolve "master" from the complete list of
refs we get from the remote.  It could be "refs/heads/master",
"refs/tags/master", etc. Worse, it could be "refs/master". In that case,
at least, I think we are OK because we avoid advertising refs directly
below "refs/" in the first place. But if you have a slash, like:

  git fetch $remote jk/foo

then that _could_ be "refs/jk/foo". Likewise, we cannot even optimize
the common case of a fully-qualified ref, like "refs/heads/foo". If it
exists, we obviously want to use that. But if it doesn't, then it
could be refs/something-else/refs/heads/foo. That's unlikely, but it
_does_ work now, and optimizing the advertisement would break it.

So it seems like left-anchoring the refspecs can never be fully correct.
We can communicate "master" to the server, who can then look at every
ref it would advertise and ask "could this be called master"? But it
will be setting in stone the set of "could this be" patterns. Granted,
those haven't changed much over the history of git, but it seems awfully
fragile.

In an ideal world the client and server would negotiate to come to some
agreement on the patterns being used. But as we are bolting this onto
the existing protocol, I was really trying to do it without introducing
an extra capabilities phase or extra round-trips. I.e., something like
David Turner's "stick the refspec in the HTTP query parameters" trick,
but working everywhere[1].

Clever ideas?

-Peff

[1] I do have working patches to pass these "early capabilities"
everywhere, but they're still somewhat rough. I got it to the point
where I could flip the default to "on" to see what breaks. That's
not something we'd want to do for real, but is good for running the
test suite to uncover issues like this one.

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 7c10d70092..3a2585ffd7 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -302,6 +302,33 @@ static void find_non_local_tags(struct transport 
*transport,
string_list_clear(_refs, 0);
 }
 
+static void add_advertise_prefixes(struct transport *transport,
+  const struct refspec *refs, int nr)
+{
+   struct argv_array *list = >advertise_prefixes;
+   int i;
+
+   for (i = 0; i < nr; i++) {
+   const struct refspec *rs = [i];
+   size_t len;
+
+   if (!rs->pattern)
+   argv_array_push(list, rs->src);
+   else if (strip_suffix(rs->src, "/*", ))
+   argv_array_pushf(list, "%.*s", (int)len, rs->src);
+   else {
+   /*
+* This refspec is too complex for us to communicate;
+* not only do we skip it, but we must avoid
+* communicating any prefixes, since we need to see
+* all refs.
+*/
+   transport->ignore_advertise_prefixes = 1;
+   return;
+   }
+   }
+}
+
 static struct ref *get_ref_map(struct transport *transport,
   struct refspec *refspecs, int refspec_count,
   int tags, int *autotags)
@@ -314,12 +341,18 @@ static struct ref *get_ref_map(struct transport 
*transport,
/* opportunistically-updated references: */
struct ref *orefs = NULL, **oref_tail = 
 
-   const struct ref *remote_refs = transport_get_remote_refs(transport);
+   const struct ref *remote_refs;
+
+   if (tags == TAGS_SET || (tags == TAGS_DEFAULT && *autotags))
+   add_advertise_prefixes(transport, tag_refspec, 1);
 
if (refspec_count) {
struct refspec *fetch_refspec;
int fetch_refspec_nr;
 

Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-24 Thread Jeff King
On Sun, Oct 23, 2016 at 07:57:30PM +0200, René Scharfe wrote:

> > > Hard to trigger, but probably even harder to diagnose once someone
> > > somehow manages to do it on some uncommon architecture.
> > 
> > Indeed. If we are worried about overflow, we would also want to assume
> > that it wraps at a multiple of 4, but that is probably a sane
> > assumption.
> 
> Hmm, I can't think of a way to violate this assumption except with unsigned
> integers that are only a single bit wide.  That would be a weird machine.
> Are there other possibilities?

No, I don't think so. I don't recall offhand whether the C standard
allows integers that are not powers of 2. But if it does, and somebody
develops such a platform, I have very little sympathy.

My comment was mostly "this is the only other restriction I can think
of, and it is crazy".

> > You could also write the second line like:
> > 
> >   bufno %= ARRAY_SIZE(hexbuffer);
> > 
> > which is less magical (right now the set of buffers must be a power of
> > 2). I expect the compiler could turn that into a bitmask itself.
> 
> Expelling magic is a good idea.  And indeed, at least gcc, clang and icc on
> x86-64 are smart enough to use an AND instead of dividing
> (https://godbolt.org/g/rFPpzF).
> 
> But gcc also adds a sign extension (cltq/cdqe) if we store the truncated
> value, unlike the other two compilers.  I don't see why -- the bit mask
> operation enforces a value between 0 and 3 (inclusive) and the upper bits of
> eax are zeroed automatically, so the cltq is effectively a noop.
> 
> Using size_t gets us rid of the extra instruction and is the right type
> anyway.  It would suffice on its own, hmm..

Yeah, I had assumed you would also switch to some form of unsigned type
either way.

> > I'm fine with any of the options. I guess you'd want a similar patch for
> > find_unique_abbrev on top of jk/no-looking-at-dotgit-outside-repo.
> 
> Actually I'd want you to want to amend your series yourself. ;)  Maybe I can
> convince Coccinelle to handle that issue for us.

I thought that series was in "next" already, but I see it isn't yet. I'd
still wait until the sha1_to_hex() solution settles, and then copy it.

> And there's also path.c::get_pathname().  That's enough cases to justify
> adding a macro, I'd say:
> [...]
> +#define NEXT_RING_ITEM(array, index) \
> + (array)[(index) = ((index) + 1) % ARRAY_SIZE(array)]
> +

I dunno. It hides a lot of magic without saving a lot of lines in the
caller, and the callers have to make sure "array" is an array and that
"index" is unsigned.

E.g., in this code:

> @@ -24,8 +24,8 @@ static struct strbuf *get_pathname(void)
>   static struct strbuf pathname_array[4] = {
>   STRBUF_INIT, STRBUF_INIT, STRBUF_INIT, STRBUF_INIT
>   };
> - static int index;
> - struct strbuf *sb = _array[3 & ++index];
> + static size_t index;
> + struct strbuf *sb = _RING_ITEM(pathname_array, index);
>   strbuf_reset(sb);
>   return sb;
>  }

The truly ugly part is the repeated STRBUF_INIT. :)

I think it would be preferable to just fix it inline in each place.

-Peff


Re: [PATCH] Allow stashes to be referenced by index only

2016-10-24 Thread Jeff King
On Sun, Oct 23, 2016 at 01:41:25PM -0400, Aaron and Ashley Watson wrote:

> > But what's going on here? Why did we bother running rev-parse earlier if
> > we don't actually use the value of REV?
> >
> > You mentioned tweaking it to fix a broken test, and indeed, just using
> > $REV here breaks a few tests in t3903.
> >
> > Offhand, I do not see anything wrong with pulling the non-option values
> > out in the loop. But in that case I think the assignment of REV can just
> > go away completely.
> >
> 
> The only reason for REV to remain is to preserve the error message seen with
> the previous behavior. Perhaps it would be better to instead move the
> assignment
> of REV to the only place it is still used: the error message when multiple
> arguments were detected.

Ah, thanks, I missed that use. We suppress stderr, so we're literally
just getting the set of revs there. But that should match what we have
in ARGV anyway (after all, $ARGV is where we decided we had too many
revs, and what we'll feed to rev-parse to get the sha1).

So I wonder if:

  Too many revisions specified: $ARGV

would be more appropriate (at which point you can probably just continue
to call it $REV).

> I'm not sure of the next steps in the process of submitting a patch.
> Should I submit a new patch by replying to this email, or is using git
> send-email to create a new mail thread better?

Generally re-rolls of a patch are done as replies to the original. Gmail
is bad about corrupting whitespace, though, so you can't just reply and
paste the patch in there. You can use ask git-send-email to continue the
thread, though:

  mid=1473378397-22453-1-git-send-email-watso...@gmail.com
  git send-email --in-reply-to=$mid ...other options...

You might also want to cc the people involved in the earlier discussion.
The public-inbox archive gives a customized full send-email command for
each message to make it easy.

  
http://public-inbox.org/git/1473378397-22453-1-git-send-email-watso...@gmail.com/

-Peff


Re: [PATCH v5 00/27] Prepare the sequencer for the upcoming rebase -i patches

2016-10-24 Thread Max Horn
Hi Dscho,

> On 23 Oct 2016, at 11:54, Johannes Schindelin  
> wrote:
> 
> Hi Junio,
> 
> On Sat, 22 Oct 2016, Junio C Hamano wrote:
> 
[...]

>> There isn't enough time to include this topic in the upcoming
>> release within the current https://tinyurl.com/gitCal calendar,
>> however, which places the final on Nov 11th.
> 
> More is the pity.
> 
> Thank you, though, for being upfront with me. I will shift my focus to
> tasks that require my attention more urgently, then.

Junio did go on, though:

>> I am wondering if it makes sense to delay 2.11 by moving the final
>> by 4 weeks to Dec 9th.

I was reading this as an offer to delay things to accommodate the integration 
your work into 2.11. I.e. "within the current plan, there is no time for this, 
but we could adjust the plan". But maybe I am misinterpreting?


Cheers,
Max


[PATCH 1/4] diff-lib: allow ita entries treated as "not yet exist in index"

2016-10-24 Thread Nguyễn Thái Ngọc Duy
When comparing the index and the working tree to show which paths are
new, and comparing the tree recorded in the HEAD and the index to see if
committing the contents recorded in the index would result in an empty
commit, we would want the former comparison to say "these are new paths"
and the latter to say "there is no change" for paths that are marked as
intent-to-add.

We made a similar attempt at d95d728a ("diff-lib.c: adjust position of
i-t-a entries in diff", 2015-03-16), which redefined the semantics of
these two comparison modes globally, which was a disaster and had to be
reverted at 78cc1a54 ("Revert "diff-lib.c: adjust position of i-t-a
entries in diff"", 2015-06-23).

To make sure we do not repeat the same mistake, introduce a new internal
diffopt option so that this different semantics can be asked for only by
callers that ask it, while making sure other unaudited callers will get
the same comparison result.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 diff-lib.c  | 14 ++
 diff.h  |  1 +
 t/t2203-add-intent.sh   | 16 +++-
 t/t7064-wtstatus-pv2.sh |  4 ++--
 wt-status.c |  7 ++-
 5 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/diff-lib.c b/diff-lib.c
index 3007c85..27f1228 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -214,6 +214,12 @@ int run_diff_files(struct rev_info *revs, unsigned int 
option)
   !is_null_oid(>oid),
   ce->name, 0);
continue;
+   } else if (revs->diffopt.ita_invisible_in_index &&
+  ce_intent_to_add(ce)) {
+   diff_addremove(>diffopt, '+', ce->ce_mode,
+  EMPTY_BLOB_SHA1_BIN, 0,
+  ce->name, 0);
+   continue;
}
 
changed = match_stat_with_submodule(>diffopt, ce, 
,
@@ -379,6 +385,14 @@ static void do_oneway_diff(struct unpack_trees_options *o,
struct rev_info *revs = o->unpack_data;
int match_missing, cached;
 
+   /* i-t-a entries do not actually exist in the index */
+   if (revs->diffopt.ita_invisible_in_index &&
+   idx && ce_intent_to_add(idx)) {
+   idx = NULL;
+   if (!tree)
+   return; /* nothing to diff.. */
+   }
+
/* if the entry is not checked out, don't examine work tree */
cached = o->index_only ||
(idx && ((idx->ce_flags & CE_VALID) || ce_skip_worktree(idx)));
diff --git a/diff.h b/diff.h
index ec76a90..68a6618 100644
--- a/diff.h
+++ b/diff.h
@@ -146,6 +146,7 @@ struct diff_options {
int dirstat_permille;
int setup;
int abbrev;
+   int ita_invisible_in_index;
 /* white-space error highlighting */
 #define WSEH_NEW 1
 #define WSEH_CONTEXT 2
diff --git a/t/t2203-add-intent.sh b/t/t2203-add-intent.sh
index 8f22c43..2276e4e 100755
--- a/t/t2203-add-intent.sh
+++ b/t/t2203-add-intent.sh
@@ -5,10 +5,24 @@ test_description='Intent to add'
 . ./test-lib.sh
 
 test_expect_success 'intent to add' '
+   test_commit 1 &&
+   git rm 1.t &&
+   echo hello >1.t &&
echo hello >file &&
echo hello >elif &&
git add -N file &&
-   git add elif
+   git add elif &&
+   git add -N 1.t
+'
+
+test_expect_success 'git status' '
+   git status --porcelain | grep -v actual >actual &&
+   cat >expect <<-\EOF &&
+   DA 1.t
+   A  elif
+A file
+   EOF
+   test_cmp expect actual
 '
 
 test_expect_success 'check result of "add -N"' '
diff --git a/t/t7064-wtstatus-pv2.sh b/t/t7064-wtstatus-pv2.sh
index 3012a4d..e319fa2 100755
--- a/t/t7064-wtstatus-pv2.sh
+++ b/t/t7064-wtstatus-pv2.sh
@@ -246,8 +246,8 @@ test_expect_success 'verify --intent-to-add output' '
git add --intent-to-add intent1.add intent2.add &&
 
cat >expect <<-EOF &&
-   1 AM N... 00 100644 100644 $_z40 $EMPTY_BLOB intent1.add
-   1 AM N... 00 100644 100644 $_z40 $EMPTY_BLOB intent2.add
+   1 .A N... 00 00 100644 $_z40 $_z40 intent1.add
+   1 .A N... 00 00 100644 $_z40 $_z40 intent2.add
EOF
 
git status --porcelain=v2 >actual &&
diff --git a/wt-status.c b/wt-status.c
index 9a14658..05a7dcb 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -437,7 +437,7 @@ static void wt_status_collect_changed_cb(struct 
diff_queue_struct *q,
 
switch (p->status) {
case DIFF_STATUS_ADDED:
-   die("BUG: worktree status add???");
+   d->mode_worktree = p->two->mode;
break;
 
case DIFF_STATUS_DELETED:
@@ -547,6 +547,7 @@ static void wt_status_collect_changes_worktree(struct 
wt_status *s)

[PATCH 2/4] diff: add --ita-[in]visible-in-index

2016-10-24 Thread Nguyễn Thái Ngọc Duy
The option --ita-invisible-in-index exposes the "ita_invisible_in_index"
diff flag to outside to allow easier experimentation with this new mode.
The "plan" is to make --ita-invisible-in-index default to keep consistent
behavior with 'status' and 'commit', but a bunch other commands like
'apply', 'merge', 'reset' need to be taken into consideration as well.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/diff-options.txt | 8 
 diff.c | 4 
 t/t2203-add-intent.sh  | 4 +++-
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 7805a0c..0fdd53f 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -575,5 +575,13 @@ endif::git-format-patch[]
 --line-prefix=::
Prepend an additional prefix to every line of output.
 
+--ita-invisible-in-index::
+   By default entries added by "git add -N" appear as an existing
+   empty file in "git diff" and a new file in "git diff --cached".
+   This option makes the entry appear as a new file in "git diff"
+   and non-existent in "git diff --cached". This option could be
+   reverted with `--ita-visible-in-index`. Both options are
+   experimental and could be removed in future.
+
 For more detailed explanation on these common options, see also
 linkgit:gitdiffcore[7].
diff --git a/diff.c b/diff.c
index c6da383..e8e73f8 100644
--- a/diff.c
+++ b/diff.c
@@ -3923,6 +3923,10 @@ int diff_opt_parse(struct diff_options *options,
return parse_submodule_opt(options, arg);
else if (skip_prefix(arg, "--ws-error-highlight=", ))
return parse_ws_error_highlight(options, arg);
+   else if (!strcmp(arg, "--ita-invisible-in-index"))
+   options->ita_invisible_in_index = 1;
+   else if (!strcmp(arg, "--ita-visible-in-index"))
+   options->ita_invisible_in_index = 0;
 
/* misc options */
else if (!strcmp(arg, "-z"))
diff --git a/t/t2203-add-intent.sh b/t/t2203-add-intent.sh
index 2276e4e..0e54f63 100755
--- a/t/t2203-add-intent.sh
+++ b/t/t2203-add-intent.sh
@@ -57,7 +57,9 @@ test_expect_success 'i-t-a entry is simply ignored' '
git add -N nitfol &&
git commit -m second &&
test $(git ls-tree HEAD -- nitfol | wc -l) = 0 &&
-   test $(git diff --name-only HEAD -- nitfol | wc -l) = 1
+   test $(git diff --name-only HEAD -- nitfol | wc -l) = 1 &&
+   test $(git diff --name-only --ita-invisible-in-index HEAD -- nitfol | 
wc -l) = 0 &&
+   test $(git diff --name-only --ita-invisible-in-index -- nitfol | wc -l) 
= 1
 '
 
 test_expect_success 'can commit with an unrelated i-t-a entry in index' '
-- 
2.8.2.524.g6ff3d78



[PATCH 0/4] nd/ita-empty-commit update

2016-10-24 Thread Nguyễn Thái Ngọc Duy
This version splits the old 1/3 into two, with better description in
1/4. The index_differs_from() also takes a flag to set/clear this new
flag instead of relying on has_ita_entries like the old 2/3.

The name ita-invisible-in-index is not perfect but I could not think
of any better. Another name could be diff-cached-ignores-ita, but
that's just half of what it does. The other half is diff-files-includes-ita...

Nguyễn Thái Ngọc Duy (4):
  Subject: diff-lib: allow ita entries treated as "not yet exist in index"
  diff: add --ita-[in]visible-in-index
  commit: fix empty commit creation when there's no changes but ita entries
  commit: don't be fooled by ita entries when creating initial commit

 Documentation/diff-options.txt |  8 
 builtin/commit.c   | 13 +
 diff-lib.c | 18 +-
 diff.c |  4 
 diff.h |  3 ++-
 sequencer.c|  4 ++--
 t/t2203-add-intent.sh  | 41 +++--
 t/t7064-wtstatus-pv2.sh|  4 ++--
 wt-status.c|  7 ++-
 9 files changed, 89 insertions(+), 13 deletions(-)

-- 
2.8.2.524.g6ff3d78



[PATCH 4/4] commit: don't be fooled by ita entries when creating initial commit

2016-10-24 Thread Nguyễn Thái Ngọc Duy
ita entries are dropped at tree generation phase. If the entire index
consists of just ita entries, the result would be a a commit with no
entries, which should be caught unless --allow-empty is specified. The
test "!!active_nr" is not sufficient to catch this.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/commit.c  | 11 ---
 t/t2203-add-intent.sh | 10 ++
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index fe8694d..42732ba 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -894,9 +894,14 @@ static int prepare_to_commit(const char *index_file, const 
char *prefix,
if (amend)
parent = "HEAD^1";
 
-   if (get_sha1(parent, sha1))
-   commitable = !!active_nr;
-   else {
+   if (get_sha1(parent, sha1)) {
+   int i, ita_nr = 0;
+
+   for (i = 0; i < active_nr; i++)
+   if (ce_intent_to_add(active_cache[i]))
+   ita_nr++;
+   commitable = active_nr - ita_nr > 0;
+   } else {
/*
 * Unless the user did explicitly request a submodule
 * ignore mode by passing a command line option we do
diff --git a/t/t2203-add-intent.sh b/t/t2203-add-intent.sh
index 8652a96..84a9028 100755
--- a/t/t2203-add-intent.sh
+++ b/t/t2203-add-intent.sh
@@ -129,6 +129,16 @@ test_expect_success 'cache-tree does skip dir that becomes 
empty' '
)
 '
 
+test_expect_success 'commit: ita entries ignored in empty intial commit check' 
'
+   git init empty-intial-commit &&
+   (
+   cd empty-intial-commit &&
+   : >one &&
+   git add -N one &&
+   test_must_fail git commit -m nothing-new-here
+   )
+'
+
 test_expect_success 'commit: ita entries ignored in empty commit check' '
git init empty-subsequent-commit &&
(
-- 
2.8.2.524.g6ff3d78



[PATCH 3/4] commit: fix empty commit creation when there's no changes but ita entries

2016-10-24 Thread Nguyễn Thái Ngọc Duy
If i-t-a entries are present and there is no change between the index
and HEAD i-t-a entries, index_differs_from() still returns "dirty, new
entries" (aka, the resulting commit is not empty), but cache-tree will
skip i-t-a entries and produce the exact same tree of current
commit.

index_differs_from() is supposed to catch this so we can abort
git-commit (unless --no-empty is specified). Update it to optionally
ignore i-t-a entries when doing a diff between the index and HEAD so
that it would return "no change" in this case and abort commit.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/commit.c  |  2 +-
 diff-lib.c|  4 +++-
 diff.h|  2 +-
 sequencer.c   |  4 ++--
 t/t2203-add-intent.sh | 11 +++
 5 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index bb9f79b..fe8694d 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -910,7 +910,7 @@ static int prepare_to_commit(const char *index_file, const 
char *prefix,
if (ignore_submodule_arg &&
!strcmp(ignore_submodule_arg, "all"))
diff_flags |= DIFF_OPT_IGNORE_SUBMODULES;
-   commitable = index_differs_from(parent, diff_flags);
+   commitable = index_differs_from(parent, diff_flags, 1);
}
}
strbuf_release(_ident);
diff --git a/diff-lib.c b/diff-lib.c
index 27f1228..5244746 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -535,7 +535,8 @@ int do_diff_cache(const unsigned char *tree_sha1, struct 
diff_options *opt)
return 0;
 }
 
-int index_differs_from(const char *def, int diff_flags)
+int index_differs_from(const char *def, int diff_flags,
+  int ita_invisible_in_index)
 {
struct rev_info rev;
struct setup_revision_opt opt;
@@ -547,6 +548,7 @@ int index_differs_from(const char *def, int diff_flags)
DIFF_OPT_SET(, QUICK);
DIFF_OPT_SET(, EXIT_WITH_STATUS);
rev.diffopt.flags |= diff_flags;
+   rev.diffopt.ita_invisible_in_index = ita_invisible_in_index;
run_diff_index(, 1);
if (rev.pending.alloc)
free(rev.pending.objects);
diff --git a/diff.h b/diff.h
index 68a6618..b171172 100644
--- a/diff.h
+++ b/diff.h
@@ -356,7 +356,7 @@ extern int diff_result_code(struct diff_options *, int);
 
 extern void diff_no_index(struct rev_info *, int, const char **);
 
-extern int index_differs_from(const char *def, int diff_flags);
+extern int index_differs_from(const char *def, int diff_flags, int 
ita_invisible_in_index);
 
 /*
  * Fill the contents of the filespec "df", respecting any textconv defined by
diff --git a/sequencer.c b/sequencer.c
index eec8a60..b082635 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -469,7 +469,7 @@ static int do_pick_commit(struct commit *commit, struct 
replay_opts *opts)
unborn = get_sha1("HEAD", head);
if (unborn)
hashcpy(head, EMPTY_TREE_SHA1_BIN);
-   if (index_differs_from(unborn ? EMPTY_TREE_SHA1_HEX : "HEAD", 
0))
+   if (index_differs_from(unborn ? EMPTY_TREE_SHA1_HEX : "HEAD", 
0, 0))
return error_dirty_index(opts);
}
discard_cache();
@@ -1064,7 +1064,7 @@ static int sequencer_continue(struct replay_opts *opts)
if (ret)
return ret;
}
-   if (index_differs_from("HEAD", 0))
+   if (index_differs_from("HEAD", 0, 0))
return error_dirty_index(opts);
todo_list = todo_list->next;
return pick_commits(todo_list, opts);
diff --git a/t/t2203-add-intent.sh b/t/t2203-add-intent.sh
index 0e54f63..8652a96 100755
--- a/t/t2203-add-intent.sh
+++ b/t/t2203-add-intent.sh
@@ -129,5 +129,16 @@ test_expect_success 'cache-tree does skip dir that becomes 
empty' '
)
 '
 
+test_expect_success 'commit: ita entries ignored in empty commit check' '
+   git init empty-subsequent-commit &&
+   (
+   cd empty-subsequent-commit &&
+   test_commit one &&
+   : >two &&
+   git add -N two &&
+   test_must_fail git commit -m nothing-new-here
+   )
+'
+
 test_done
 
-- 
2.8.2.524.g6ff3d78