Re: Git 2.13.0 segfaults on Solaris SPARC due to DC_SHA1=YesPlease being on by default

2017-05-15 Thread Michael Kebe
Hi Marc,

works like a charm!

Michael


2017-05-15 16:33 GMT+02:00 Marc Stevens :
> Hi Michael,
>
>
>
> See the latest commit to the SHA1DC repo:
>
> https://github.com/cr-marcstevens/sha1collisiondetection/commit/33a694a9ee1b79c24be45f9eab5ac0e1aeeaf271
>
> Just apply this same patch to SHA1DC in Git.
>
>
>
> Best regards,
>
> Marc Stevens
>
>
>
> Van: Michael Kebe [mailto:michael.k...@gmail.com]
> Verzonden: maandag 15 mei 2017 16:22
> Aan: Ævar Arnfjörð Bjarmason 
> CC: Jeff King ; Git Mailing List ; Marc
> Stevens 
> Onderwerp: Re: Git 2.13.0 segfaults on Solaris SPARC due to
> DC_SHA1=YesPlease being on by default
>
>
>
> Just give me patch or repo, I will test it tomorrow.
>
>
>
> Michael
>
>
>
>
>
> Am 15.05.2017 4:14 nachm. schrieb "Ævar Arnfjörð Bjarmason"
> :
>
> On Mon, May 15, 2017 at 3:58 PM, Marc Stevens  wrote:
>> Hi Aevar,
>>
>> Thank you for notifying us of this issue.
>> Big endianness is a tricky issue, also since I don't have access or
>> accurate knowledge about all big endian systems.
>> Our github repo does check correct functioning, including an endianness
>> mistake, with 'make test'.
>> But I guess this is not included for SHA1DC in Git.
>>
>> Anyway, we can easily add the _BIG_ENDIAN macrotest to the git repo and
>> will do so soon.
>>
>> I don't think the segfault is caused by buffer overflow, inproper access,
>> or the endianness issue.
>> But I did notice an unexpected issue: the message block pointer m=0x398ad5
>> is odd.
>> Can you confirm whether loading an uint32_t from an odd address triggers a
>> hardware interrupt on your platform?
>> This is not problem for x86, but maybe for your platform it is?
>> If it is then we should always copy buffer contents to the sha1context to
>> avoid this issue.
>
> I don't have access to the box in question, Michael was testing this
> code for me. But unaligned access is probably the cause, although
> according to some info I found online that should give a SIGBUS not a
> SIGSEGV, but that may have changed:
>
> https://bugs.python.org/issue12181
> https://github.com/magnumripper/JohnTheRipper/issues/2187
>
>> Best regards,
>> Marc Stevens
>>
>> -Oorspronkelijk bericht-
>> Van: Ævar Arnfjörð Bjarmason [mailto:ava...@gmail.com]
>> Verzonden: maandag 15 mei 2017 14:49
>> Aan: Git Mailing List 
>> CC: michael.k...@gmail.com; Jeff King ; Marc Stevens
>> 
>> Onderwerp: Git 2.13.0 segfaults on Solaris SPARC due to DC_SHA1=YesPlease
>> being on by default
>>
>> Since 2.13.0 just running "git status" on a newly init'd repo on Solaris
>> SPARC[1] segfaults. Michael (CC'd) reported this issue on #git and I helped
>> him debug it.
>>
>> Just compiling with BLK_SHA1=YesPlease solves the issue.
>>
>> There are at least two different issues with DC_SHA1 here:
>>
>>  * We don't properly detect that this platform is big endian. The check at
>> the top of sha1dc/sha1.c needs to test for _BIG_ENDIAN. This comes from
>> sys/isa_defs.h which (I'm told by #solaris) is included on Solaris by
>> default, at least by stdio.h.
>>
>> Hacking the endian detection makes t0013-sha1dc.sh pass.
>>
>>   * Even with that & the test passing just a plain "git init x && cd x &&
>> touch A && git add A && git commit" will segfault.
>>
>> This is some bug in the sha1dc code, presumably some big endian issue
>> that's not resolved by the change above. Backtrace for that (censored actual
>> author info):
>>
>>  Program received signal SIGSEGV, Segmentation fault.
>>  [Switching to Thread 1 (LWP 1)]
>>  0x002f8c84 in sha1_compression_states (ihv=0xffbf8268, m=0x398ad5,
>> W=0xffbf82fc, states=0xffbf857c) at sha1dc/sha1.c:291
>>  291 SHA1COMPRESS_FULL_ROUND1_STEP_LOAD(a, b, c, d, e, m,
>> W, 0, temp);
>>  (gdb) bt
>>  #0  0x002f8c84 in sha1_compression_states (ihv=0xffbf8268, m=0x398ad5,
>> W=0xffbf82fc, states=0xffbf857c) at sha1dc/sha1.c:291
>>  #1  0x00300b60 in sha1_process (ctx=0xffbf8260, block=0x398ad5) at
>> sha1dc/sha1.c:1616
>>  #2  0x00301188 in SHA1DCUpdate (ctx=0xffbf8260,
>>  buf=0x398ad5 "deadbeefdeadbeefdeadbeefdeadbeefdeadbeef\nauthor Au
>> Thor  123456789 +\ncommitter Au Thor
>>  123456789
>>   +\n\nBlah Blah"..., len=220)
>>  at sha1dc/sha1.c:1731
>>  #3  0x0030168c in git_SHA1DCUpdate (ctx=0xffbf8260, vdata=0x398aa0,
>> len=273) at sha1dc/sha1.c:1808
>>  #4  0x002a6f7c in write_sha1_file_prepare (buf=0x398aa0, len=273,
>> type=0x959c8 "commit", sha1=0xffbfd630 "",
>>  hdr=0xffbf8c28 "commit 273", hdrlen=0xffbf8c24) at sha1_file.c:3207
>>  #5  0x002a71ac in hash_sha1_file (buf=0x398aa0, len=273, type=0x959c8
>> "commit", sha1=0xffbfd630 "") at sha1_file.c:3266
>>  #6  0x002a25f8 in check_sha1_signature (sha1=0xffbfdbb8
>> "\375\067\356\337\002", map=0x398aa0, size=273, 

Re: [PATCH v1 1/5] dir: make lookup_untracked() available outside of dir.c

2017-05-15 Thread Junio C Hamano
Ben Peart  writes:

> Remove the static qualifier from lookup_untracked() and make it
> available to other modules by exporting it from dir.h.

Surely that is what you did in this patch, but leaves readers in
suspense wondering why this helper needs to be available to others
in the first place ;-)  Let's read on.

>
> Signed-off-by: Ben Peart 
> ---
>  dir.c | 2 +-
>  dir.h | 3 +++
>  2 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/dir.c b/dir.c
> index f451bfa48c..1b5558fdf9 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -660,7 +660,7 @@ static void trim_trailing_spaces(char *buf)
>   *
>   * If "name" has the trailing slash, it'll be excluded in the search.
>   */
> -static struct untracked_cache_dir *lookup_untracked(struct untracked_cache 
> *uc,
> +struct untracked_cache_dir *lookup_untracked(struct untracked_cache *uc,
>   struct untracked_cache_dir 
> *dir,
>   const char *name, int len)
>  {
> diff --git a/dir.h b/dir.h
> index bf23a470af..9e387551bd 100644
> --- a/dir.h
> +++ b/dir.h
> @@ -339,4 +339,7 @@ extern void connect_work_tree_and_git_dir(const char 
> *work_tree, const char *git
>  extern void relocate_gitdir(const char *path,
>   const char *old_git_dir,
>   const char *new_git_dir);
> +struct untracked_cache_dir *lookup_untracked(struct untracked_cache *uc,
> +  struct untracked_cache_dir *dir,
> +  const char *name, int len);
>  #endif


Re: [PATCH v1 0/5] Fast git status via a file system watcher

2017-05-15 Thread Junio C Hamano
Ben Peart  writes:

>   Add documentation for the fsmonitor extension.  This includes the
> core.fsmonitor setting, the query-fsmonitor hook, and the fsmonitor
> index extension.
>   Add a sample query-fsmonitor hook script that integrates with the
> cross platform Watchman file watching service.

These two have looong titles ;-)  Accident?


Re: [PATCH v1 3/5] fsmonitor: add test cases for fsmonitor extension

2017-05-15 Thread Junio C Hamano
Ben Peart  writes:

> Add test cases that ensure status results are correct when using the new
> fsmonitor extension.  Test untracked, modified, and new files by
> ensuring the results are identical to when not using the extension.
>
> Add a test to ensure updates to the index properly mark corresponding
> entries in the index extension as dirty so that the status is correct
> after commands that modify the index but don't trigger changes in the
> working directory.
>
> Add a test that verifies that if the fsmonitor extension doesn't tell
> git about a change, it doesn't discover it on its own.  This ensures
> git is honoring the extension and that we get the performance benefits
> desired.
>
> Signed-off-by: Ben Peart 
> ---
>  t/t7519-status-fsmonitor.sh | 134 
> 
>  1 file changed, 134 insertions(+)
>  create mode 100644 t/t7519-status-fsmonitor.sh

Please make this executable.

> diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh
> new file mode 100644
> index 00..2d63efc27b
> --- /dev/null
> +++ b/t/t7519-status-fsmonitor.sh
> @@ -0,0 +1,134 @@
> ...
> +# Ensure commands that call refresh_index() to move the index back in time
> +# properly invalidate the fsmonitor cache
> +...
> + git status >output &&
> + git -c core.fsmonitor=false status >expect &&
> + test_i18ncmp expect output
> +'

Hmm. I wonder if we can somehow detect the case where we got the
correct and expected result only because fsmonitor was not in
effect, even though the test requested it to be used?  Not limited
to this particular test piece, but applies to all of them in this
file.


What's cooking in git.git (May 2017, #05; Tue, 16)

2017-05-15 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

The tip of 'next' has been rewound, and the first batch of topics
for the next release have been merged to 'master'.  I tentatively
wrote doen that this cycle will last for 11 weeks, completing at the
end of July.

Many topics are marked for 'next', but it is very possible that I
missed some recent discussions that should stop their current
incarnation to be merged and instead a replacement should be
queued.  Please holler if you see such entries.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* ab/aix-needs-compat-regex (2017-05-04) 1 commit
  (merged to 'next' on 2017-05-09 at 881202b6bb)
 + config.mak.uname: set NO_REGEX=NeedsStartEnd on AIX

 Build fix.


* ab/clone-no-tags (2017-05-01) 3 commits
  (merged to 'next' on 2017-04-30 at 601649896a)
 + tests: rename a test having to do with shallow submodules
 + clone: add a --no-tags option to clone without tags
 + tests: change "cd ... && git fetch" to "cd &&\n\tgit fetch"

 "git clone" learned the "--no-tags" option not to fetch all tags
 initially, and also set up the tagopt not to follow any tags in
 subsequent fetches.


* bw/submodule-has-commits-update (2017-05-02) 6 commits
  (merged to 'next' on 2017-05-08 at 57267f2277)
 + submodule: refactor logic to determine changed submodules
 + submodule: improve submodule_has_commits()
 + submodule: change string_list changed_submodule_paths
 + submodule: remove add_oid_to_argv()
 + submodule: rename free_submodules_sha1s()
 + submodule: rename add_sha1_to_array()

 Code clean-up and duplicate removal.


* dt/gc-ignore-old-gc-logs (2017-04-24) 3 commits
  (merged to 'next' on 2017-04-26 at 4f4cab8368)
 + test-lib: retire $remove_trash variable
 + test-lib.sh: do not barf under --debug at the end of the test
 + test-lib: abort when can't remove trash directory

 Attempt to allow us notice "fishy" situation where we fail to
 remove the temporary directory used during the test.


* dt/raise-core-packed-git-limit (2017-04-20) 1 commit
  (merged to 'next' on 2017-04-26 at c72dd8c62f)
 + Increase core.packedGitLimit

 The default packed-git limit value has been raised on larger
 platforms to save "git fetch" from a (recoverable) failure while
 "gc" is running in parallel.


* jc/apply-fix-mismerge (2017-05-08) 1 commit
  (merged to 'next' on 2017-05-09 at e0b89532d0)
 + apply.c: fix whitespace-only mismerge

 Mismerge fix.


* jk/am-leakfix (2017-04-27) 3 commits
  (merged to 'next' on 2017-04-30 at 78becd7a96)
 + am: shorten ident_split variable name in get_commit_info()
 + am: simplify allocations in get_commit_info()
 + am: fix commit buffer leak in get_commit_info()

 The codepath in "git am" that is used when running "git rebase"
 leaked memory held for the log message of the commits being rebased.


* jk/no-null-sha1-in-cache-tree (2017-04-23) 1 commit
  (merged to 'next' on 2017-04-26 at 45fbe9d57d)
 + cache-tree: reject entries with null sha1

 Code to update the cache-tree has been tightened so that we won't
 accidentally write out any 0{40} entry in the tree object.


* jn/clone-add-empty-config-from-command-line (2017-05-02) 1 commit
  (merged to 'next' on 2017-05-09 at c56ac3f782)
 + clone: handle empty config values in -c

 "git clone --config var=val" is a way to populate the
 per-repository configuration file of the new repository, but it did
 not work well when val is an empty string.  This has been fixed.


* jn/credential-doc-on-clear (2017-05-02) 1 commit
  (merged to 'next' on 2017-05-09 at 96fa65d122)
 + credential doc: make multiple-helper behavior more prominent

 Doc update.


* js/larger-timestamps (2017-05-09) 9 commits
  (merged to 'next' on 2017-05-09 at ae0603fd3e)
 + archive-tar: fix a sparse 'constant too large' warning
  (merged to 'next' on 2017-04-28 at b56a0d38cd)
 + use uintmax_t for timestamps
 + date.c: abort if the system time cannot handle one of our timestamps
 + timestamp_t: a new data type for timestamps
 + PRItime: introduce a new "printf format" for timestamps
 + parse_timestamp(): specify explicitly where we parse timestamps
 + t0006 & t5000: skip "far in the future" test when time_t is too limited
 + t0006 & t5000: prepare for 64-bit timestamps
 + ref-filter: avoid using `unsigned long` for catch-all data type

 Some platforms have ulong that is smaller than time_t, and our
 historical use of ulong for timestamp would mean they cannot
 represent some timestamp that the platform allows.  Invent a
 separate and dedicated timestamp_t (so that we can distingiuish
 timestamps and a vanilla ulongs, which along is already 

Re: [PATCH v2 3/4] add--helper: implement interactive status command

2017-05-15 Thread Junio C Hamano
Daniel Ferreira  writes:

> + if (!q->nr)
> + return;
> +
> + for (i = 0; i < q->nr; i++) {
> + struct diff_filepair *p;
> + p = q->queue[i];
> + diff_flush_stat(p, options, );
> + }

Commenting just on the part that interacts with the diff machinery,
without/before carefully reading the remainder of the patches.

I suspect that refactoring this part out of diff_flush() into a
helper function "compute_diffstat()", like this:

diff --git a/diff.c b/diff.c
index 74283d9001..a42ff42e92 100644
--- a/diff.c
+++ b/diff.c
@@ -4770,12 +4770,7 @@ void diff_flush(struct diff_options *options)
dirstat_by_line) {
struct diffstat_t diffstat;
 
-   memset(, 0, sizeof(struct diffstat_t));
-   for (i = 0; i < q->nr; i++) {
-   struct diff_filepair *p = q->queue[i];
-   if (check_pair_status(p))
-   diff_flush_stat(p, options, );
-   }
+   compute_diffstat(, );
if (output_format & DIFF_FORMAT_NUMSTAT)
show_numstat(, options);
if (output_format & DIFF_FORMAT_DIFFSTAT)

and then exporting that function and "struct diffstat_t" to your
helper, may make it a better API, rather than having the callers to
call diff_flush_stat() for each and every path.



Re: [PATCH 19/19] diff.c: color moved lines differently

2017-05-15 Thread Jonathan Tan
I expected there to be one main function that takes in a diff options
and returns the appropriate output without much (if any) changes in
other functions...but (as with the previous patch) maybe there are
some complications that I didn't foresee.

On Sat, May 13, 2017 at 9:01 PM, Stefan Beller  wrote:
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 475e874d51..90403c06e3 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1051,14 +1051,22 @@ This does not affect linkgit:git-format-patch[1] or 
> the
>  'git-diff-{asterisk}' plumbing commands.  Can be overridden on the
>  command line with the `--color[=]` option.
>
> +color.moved::
> +   A boolean value, whether a diff should color moved lines
> +   differently. The moved lines are searched for in the diff only.
> +   Duplicated lines from somewhere in the project that are not
> +   part of the diff are not colored as moved.
> +   Defaults to false.
> +
>  color.diff.::
> Use customized color for diff colorization.  `` specifies
> which part of the patch to use the specified color, and is one
> of `context` (context text - `plain` is a historical synonym),
> `meta` (metainformation), `frag`
> (hunk header), 'func' (function in hunk header), `old` (removed 
> lines),
> -   `new` (added lines), `commit` (commit headers), or `whitespace`
> -   (highlighting whitespace errors).
> +   `new` (added lines), `commit` (commit headers), `whitespace`
> +   (highlighting whitespace errors), `movedFrom` (removed lines that
> +   reappear), `movedTo` (added lines that were removed elsewhere).

There should be 4 "moved" colors. I think the code below is correct
(oldmoved, oldmovedalternate, etc.) but the documentation above is
wrong.

>
>  color.decorate.::
> Use customized color for 'git log --decorate' output.  `` is one
> diff --git a/diff.c b/diff.c
> index dbab7fb44e..6372e0eb25 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -31,6 +31,7 @@ static int diff_indent_heuristic; /* experimental */
>  static int diff_rename_limit_default = 400;
>  static int diff_suppress_blank_empty;
>  static int diff_use_color_default = -1;
> +static int diff_color_moved_default;
>  static int diff_context_default = 3;
>  static int diff_interhunk_context_default;
>  static const char *diff_word_regex_cfg;
> @@ -55,6 +56,10 @@ static char diff_colors[][COLOR_MAXLEN] = {
> GIT_COLOR_YELLOW,   /* COMMIT */
> GIT_COLOR_BG_RED,   /* WHITESPACE */
> GIT_COLOR_NORMAL,   /* FUNCINFO */
> +   GIT_COLOR_BOLD_RED, /* OLD_MOVED_A */
> +   GIT_COLOR_BG_RED,   /* OLD_MOVED_B */
> +   GIT_COLOR_BOLD_GREEN,   /* NEW_MOVED_A */
> +   GIT_COLOR_BG_GREEN, /* NEW_MOVED_B */
>  };
>
>  static NORETURN void die_want_option(const char *option_name)
> @@ -80,6 +85,14 @@ static int parse_diff_color_slot(const char *var)
> return DIFF_WHITESPACE;
> if (!strcasecmp(var, "func"))
> return DIFF_FUNCINFO;
> +   if (!strcasecmp(var, "oldmoved"))
> +   return DIFF_FILE_OLD_MOVED;
> +   if (!strcasecmp(var, "oldmovedalternative"))
> +   return DIFF_FILE_OLD_MOVED_ALT;
> +   if (!strcasecmp(var, "newmoved"))
> +   return DIFF_FILE_NEW_MOVED;
> +   if (!strcasecmp(var, "newmovedalternative"))
> +   return DIFF_FILE_NEW_MOVED_ALT;
> return -1;
>  }
>
> @@ -234,6 +247,10 @@ int git_diff_ui_config(const char *var, const char 
> *value, void *cb)
> diff_use_color_default = git_config_colorbool(var, value);
> return 0;
> }
> +   if (!strcmp(var, "color.moved")) {
> +   diff_color_moved_default = git_config_bool(var, value);
> +   return 0;
> +   }
> if (!strcmp(var, "diff.context")) {
> diff_context_default = git_config_int(var, value);
> if (diff_context_default < 0)
> @@ -354,6 +371,81 @@ int git_diff_basic_config(const char *var, const char 
> *value, void *cb)
> return git_default_config(var, value, cb);
>  }
>
> +struct moved_entry {
> +   struct hashmap_entry ent;
> +   const struct buffered_patch_line *line;
> +   struct moved_entry *next_line;
> +};
> +
> +static void get_ws_cleaned_string(const struct buffered_patch_line *l,
> + struct strbuf *out)
> +{
> +   int i;
> +   for (i = 0; i < l->len; i++) {
> +   if (isspace(l->line[i]))
> +   continue;
> +   strbuf_addch(out, l->line[i]);
> +   }
> +}
> +
> +static int buffered_patch_line_cmp_no_ws(const struct buffered_patch_line *a,
> +const struct buffered_patch_line *b,
> +const void *keydata)
> +{
> +   struct strbuf sba = STRBUF_INIT;
> + 

Re: [PATCH 18/19] diff: buffer all output if asked to

2017-05-15 Thread Jonathan Tan
Overall, this patch seems larger than it should to me, although there
might be good reasons for that that I don't know. I'll remark on what
I find unexpected.

On Sat, May 13, 2017 at 9:01 PM, Stefan Beller  wrote:
> diff --git a/diff.c b/diff.c
> index 08dcc56bb9..dbab7fb44e 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -516,29 +516,29 @@ static void check_blank_at_eof(mmfile_t *mf1, mmfile_t 
> *mf2,
> ecbdata->blank_at_eof_in_postimage = (at - l2) + 1;
>  }
>
> -static void emit_line_0(struct diff_options *o, const char *set, const char 
> *reset,
> -   int sign, const char *line, int len)
> +static void emit_buffered_patch_line(struct diff_options *o,
> +struct buffered_patch_line *e)
>  {
> -   int has_trailing_newline, has_trailing_carriage_return;
> +   int has_trailing_newline, has_trailing_carriage_return, len = e->len;
> FILE *file = o->file;
>
> fputs(diff_line_prefix(o), file);
>
> -   has_trailing_newline = (len > 0 && line[len-1] == '\n');
> +   has_trailing_newline = (len > 0 && e->line[len-1] == '\n');
> if (has_trailing_newline)
> len--;
> -   has_trailing_carriage_return = (len > 0 && line[len-1] == '\r');
> +   has_trailing_carriage_return = (len > 0 && e->line[len-1] == '\r');
> if (has_trailing_carriage_return)
> len--;
>
> -   if (len || sign) {
> -   if (set)
> -   fputs(set, file);
> -   if (sign)
> -   fputc(sign, file);
> -   fwrite(line, len, 1, file);
> -   if (reset)
> -   fputs(reset, file);
> +   if (len || e->sign) {
> +   if (e->set)
> +   fputs(e->set, file);
> +   if (e->sign)
> +   fputc(e->sign, file);
> +   fwrite(e->line, len, 1, file);
> +   if (e->reset)
> +   fputs(e->reset, file);
> }
> if (has_trailing_carriage_return)
> fputc('\r', file);
> @@ -546,6 +546,65 @@ static void emit_line_0(struct diff_options *o, const 
> char *set, const char *res
> fputc('\n', file);
>  }
>
> +static void emit_buffered_patch_line_ws(struct diff_options *o,
> +   struct buffered_patch_line *e,
> +   const char *ws, unsigned ws_rule)

This introduces a new _ws emission function - how is this used and how
is this different from the non-ws one? I see BPL_EMIT_LINE_WS, but I
don't see the caller that introduces that constant in this patch.

> +{
> +   struct buffered_patch_line s = {e->set, e->reset, "", 0, e->sign};
> +
> +   emit_buffered_patch_line(o, );
> +   ws_check_emit(e->line, e->len, ws_rule,
> + o->file, e->set, e->reset, ws);
> +}
> +
> +static void process_next_buffered_patch_line(struct diff_options *o, int 
> line_no)
> +{
> +   struct buffered_patch_line *e = >line_buffer[line_no];
> +
> +   const char *ws = o->current_filepair->ws;
> +   unsigned ws_rule = o->current_filepair->ws_rule;
> +
> +   switch (e->state) {
> +   case BPL_EMIT_LINE_ASIS:
> +   emit_buffered_patch_line(o, e);
> +   break;
> +   case BPL_EMIT_LINE_WS:
> +   emit_buffered_patch_line_ws(o, e, ws, ws_rule);
> +   break;
> +   case BPL_HANDOVER:
> +   o->current_filepair++;

If we're just buffering the diff output, do we need to store
per-file-pair metadata? (I assume that's why you need a special
handover constant.) Clients can already read what they need from the
diff output.

> +   break;
> +   default:
> +   die("BUG: malformatted buffered patch line: '%d'", 
> e->state);
> +   }
> +}
> +
> +static void append_buffered_patch_line(struct diff_options *o,
> +  struct buffered_patch_line *e)
> +{
> +   struct buffered_patch_line *f;
> +   ALLOC_GROW(o->line_buffer,
> +  o->line_buffer_nr + 1,
> +  o->line_buffer_alloc);
> +   f = >line_buffer[o->line_buffer_nr];
> +   o->line_buffer_nr++;
> +
> +   memcpy(f, e, sizeof(struct buffered_patch_line));
> +   f->line = e->line ? xmemdupz(e->line, e->len) : NULL;
> +}
> +
> +static void emit_line_0(struct diff_options *o,
> +   const char *set, const char *reset,
> +   int sign, const char *line, int len)
> +{
> +   struct buffered_patch_line e = {set, reset, line, len, sign, 
> BPL_EMIT_LINE_ASIS};
> +
> +   if (o->use_buffer)
> +   append_buffered_patch_line(o, );
> +   else
> +   emit_buffered_patch_line(o, );
> +}
> +
>  void emit_line(struct 

Re: git rebase regression: cannot pass a shell expression directly to --exec

2017-05-15 Thread Jeff King
On Mon, May 15, 2017 at 11:53:57PM -0400, Jeff King wrote:

> My /bin/sh isn't bash, but I should be able to build with
> SHELL_PATH=/bin/bash to reproduce. But I can't:
> 
>$ bash
>$ foo() { echo >&2 "in foo"; }
>$ export -f foo
>$ bash -c foo
>in foo
> 
>$ strace -f 2>&1 git.compile rebase -x 'foo;' HEAD^ | grep foo
>Executing: foo;
>[pid  1788] execve("/bin/bash", ["/bin/bash", "-c", "foo;", "foo;"], [/* 
> 60 vars */] 
>foo;: foo: command not found
> 
> So I'm not sure why the direct "bash -c" can find it, but somehow the
> variable doesn't make it through to the "bash -c" at the lower level.
> Replacing "foo;" with "env" shows the environment, but BASH_FUNC_foo
> isn't set in it. I'm not sure where it's getting eaten, though.

Hmph. It looks like "dash" eats it. My "git.compile" above is a symlink
to /path/to/git/bin-wrappers/git. But it looks like our Makefile isn't
smart enough to rebuild bin-wrappers when you switch SHELL_PATH, so it
will had "/bin/sh" in it. Which points to dash on my machine. And
indeed, dash seems to wipe the environment:

  $ foo() { echo >&2 "in foo"; }
  $ export -f foo
  $ bash -c foo
  in foo
  $ dash -c "bash -c foo"
  bash: foo: command not found

I wonder if it's related to ShellShock protections, or if dash just
rejects variable names with the "%" in them or something. Anyway. That
explains why I had trouble reproducing. Using bash consistently as my
shell lets me reproduce Eric's results. And doing the semicolon trick
does make it work again.

So I feel confident that I've analyzed the problem correctly, at least.

-Peff


Implementation of sorted hashmap iteration

2017-05-15 Thread Daniel Ferreira
---
Hi there! When implementing a patch series to port git-add--interactive
from Perl to C[1] I ended up implementing this quirk to iterate through
a hashmap in the order elements were added to it. In the end, it did
not make it into the series but I figured I'd share it anyway if it
interests anyone.

[1]: 
https://public-inbox.org/git/1494907234-28903-1-git-send-email-bnm...@gmail.com/T/#t

-- Daniel.

 hashmap.c | 25 +
 hashmap.h | 12 
 2 files changed, 37 insertions(+)

diff --git a/hashmap.c b/hashmap.c
index 7d1044e..948576c 100644
--- a/hashmap.c
+++ b/hashmap.c
@@ -196,6 +196,7 @@ void hashmap_add(struct hashmap *map, void *entry)
unsigned int b = bucket(map, entry);

/* add entry */
+   ((struct hashmap_entry *) entry)->index = map->size;
((struct hashmap_entry *) entry)->next = map->table[b];
map->table[b] = entry;

@@ -254,6 +255,30 @@ void *hashmap_iter_next(struct hashmap_iter *iter)
}
 }

+void hashmap_iter_init_sorted(struct hashmap *map,
+   struct hashmap_iter_sorted *iter)
+{
+   hashmap_iter_init(map, (struct hashmap_iter *)iter);
+   iter->pos = 0;
+   iter->sorted_entries = xcalloc(map->size,
+   sizeof(struct hashmap_entry *));
+
+   struct hashmap_entry *ent;
+   while ((ent = hashmap_iter_next((struct hashmap_iter *)iter))) {
+   iter->sorted_entries[ent->index] = ent;
+   }
+}
+
+void *hashmap_iter_next_sorted(struct hashmap_iter_sorted *iter)
+{
+   return iter->sorted_entries[iter->pos++];
+}
+
+void hashmap_iter_free_sorted(struct hashmap_iter_sorted *iter)
+{
+   free(iter->sorted_entries);
+}
+
 struct pool_entry {
struct hashmap_entry ent;
size_t len;
diff --git a/hashmap.h b/hashmap.h
index de6022a..f2a5d36 100644
--- a/hashmap.h
+++ b/hashmap.h
@@ -30,6 +30,7 @@ static inline unsigned int sha1hash(const unsigned char *sha1)
 struct hashmap_entry {
struct hashmap_entry *next;
unsigned int hash;
+   unsigned int index;
 };

 typedef int (*hashmap_cmp_fn)(const void *entry, const void *entry_or_key,
@@ -48,6 +49,12 @@ struct hashmap_iter {
unsigned int tablepos;
 };

+struct hashmap_iter_sorted {
+   struct hashmap_iter base;
+   struct hashmap_entry **sorted_entries;
+   unsigned int pos;
+};
+
 /* hashmap functions */

 extern void hashmap_init(struct hashmap *map, hashmap_cmp_fn equals_function,
@@ -112,6 +119,11 @@ static inline void *hashmap_iter_first(struct hashmap *map,
return hashmap_iter_next(iter);
 }

+extern void hashmap_iter_init_sorted(struct hashmap *map,
+   struct hashmap_iter_sorted *iter);
+extern void *hashmap_iter_next_sorted(struct hashmap_iter_sorted *iter);
+extern void hashmap_iter_free_sorted(struct hashmap_iter_sorted *iter);
+
 /* string interning */

 extern const void *memintern(const void *data, size_t len);
--
2.7.4 (Apple Git-66)



[PATCH v2 3/4] add--helper: implement interactive status command

2017-05-15 Thread Daniel Ferreira
Implement add--interactive's status command in the add--helper builtin.
It prints a numstat comparing changed files between a) the worktree and
the index; b) the index and the HEAD.

To do so, we use run_diff_index() and run_diff_files() to get changed
files, use the diffstat API on them to get the numstat and use a
combination of a hashmap and qsort() to print the result in
O(n) + O(n lg n) complexity.

This is the first add--interactive command implemented in C of those
anticipated by 6cbc52d ("add--helper: create builtin helper for
interactive add", 2017-05-15).

Signed-off-by: Daniel Ferreira 
---
 builtin/add--helper.c | 279 ++
 1 file changed, 279 insertions(+)

diff --git a/builtin/add--helper.c b/builtin/add--helper.c
index 6a97f0e..f6d35bf 100644
--- a/builtin/add--helper.c
+++ b/builtin/add--helper.c
@@ -1,6 +1,285 @@
 #include "builtin.h"
+#include "color.h"
+#include "diff.h"
+#include "diffcore.h"
+#include "hashmap.h"
+#include "revision.h"
+
+#define HEADER_INDENT "  "
+
+enum collection_phase {
+   WORKTREE,
+   INDEX
+};
+
+struct file_stat {
+   struct hashmap_entry ent;
+   struct {
+   uintmax_t added, deleted;
+   } index, worktree;
+   char name[FLEX_ARRAY];
+};
+
+struct collection_status {
+   enum collection_phase phase;
+
+   const char *reference;
+   struct pathspec pathspec;
+
+   struct hashmap file_map;
+};
+
+static int use_color = -1;
+enum color_add_i {
+   COLOR_PROMPT,
+   COLOR_HEADER,
+   COLOR_HELP,
+   COLOR_ERROR
+};
+
+static char colors[][COLOR_MAXLEN] = {
+   GIT_COLOR_BOLD_BLUE, /* Prompt */
+   GIT_COLOR_BOLD,  /* Header */
+   GIT_COLOR_BOLD_RED,  /* Help */
+   GIT_COLOR_BOLD_RED   /* Error */
+};
+
+static const char *get_color(enum color_add_i ix)
+{
+   if (want_color(use_color))
+   return colors[ix];
+   return "";
+}
+
+static int parse_color_slot(const char *slot)
+{
+   if (!strcasecmp(slot, "prompt"))
+   return COLOR_PROMPT;
+   if (!strcasecmp(slot, "header"))
+   return COLOR_HEADER;
+   if (!strcasecmp(slot, "help"))
+   return COLOR_HELP;
+   if (!strcasecmp(slot, "error"))
+   return COLOR_ERROR;
+
+   return -1;
+}
+
+static int add_i_config(const char *var,
+   const char *value, void *cbdata)
+{
+   const char *name;
+
+   if (!strcmp(var, "color.interactive")) {
+   use_color = git_config_colorbool(var, value);
+   return 0;
+   }
+
+   if (skip_prefix(var, "color.interactive", )) {
+   int slot = parse_color_slot(name);
+   if (slot < 0)
+   return 0;
+   if (!value)
+   return config_error_nonbool(var);
+   return color_parse(value, colors[slot]);
+   }
+
+   return git_default_config(var, value, cbdata);
+}
+
+static int hash_cmp(const void *entry, const void *entry_or_key,
+ const void *keydata)
+{
+   const struct file_stat *e1 = entry, *e2 = entry_or_key;
+   const char *name = keydata ? keydata : e2->name;
+
+   return strcmp(e1->name, name);
+}
+
+static int alphabetical_cmp(const void *a, const void *b)
+{
+   struct file_stat *f1 = *((struct file_stat **)a);
+   struct file_stat *f2 = *((struct file_stat **)b);
+
+   return strcmp(f1->name, f2->name);
+}
+
+static void collect_changes_cb(struct diff_queue_struct *q,
+struct diff_options *options,
+void *data)
+{
+   struct collection_status *s = data;
+   struct diffstat_t stat = { 0 };
+   int i;
+
+   if (!q->nr)
+   return;
+
+   for (i = 0; i < q->nr; i++) {
+   struct diff_filepair *p;
+   p = q->queue[i];
+   diff_flush_stat(p, options, );
+   }
+
+   for (i = 0; i < stat.nr; i++) {
+   struct file_stat *entry;
+   const char *name = stat.files[i]->name;
+   unsigned int hash = strhash(name);
+
+   entry = hashmap_get_from_hash(>file_map, hash, name);
+   if (!entry) {
+   FLEX_ALLOC_STR(entry, name, name);
+   hashmap_entry_init(entry, hash);
+   hashmap_add(>file_map, entry);
+   }
+
+   if (s->phase == WORKTREE) {
+   entry->worktree.added = stat.files[i]->added;
+   entry->worktree.deleted = stat.files[i]->deleted;
+   } else if (s->phase == INDEX) {
+   entry->index.added = stat.files[i]->added;
+   entry->index.deleted = stat.files[i]->deleted;
+   }
+   }
+}
+
+static void collect_changes_worktree(struct collection_status *s)
+{

[PATCH v2 4/4] add--interactive: use add-interactive--helper for status_cmd

2017-05-15 Thread Daniel Ferreira
Call the newly introduced git-add-interactive--helper builtin on
status_cmd() instead of relying on git-add--interactive's Perl
functions to build print the numstat.

Signed-off-by: Daniel Ferreira 
---
 git-add--interactive.perl | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index 709a5f6..8c192bf 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -632,9 +632,7 @@ EOF
 }
 
 sub status_cmd {
-   list_and_choose({ LIST_ONLY => 1, HEADER => $status_head },
-   list_modified());
-   print "\n";
+   system(qw(git add--helper --status));
 }
 
 sub say_n_paths {
-- 
2.7.4 (Apple Git-66)



[PATCH v2 1/4] diff: export diffstat interface

2017-05-15 Thread Daniel Ferreira
Make the diffstat interface (namely, the diffstat_t struct and
diff_flush_stat) no longer be internal to diff.c and allow it to be used
by other parts of git.

This is helpful for code that may want to easily extract information
from files using the diff machinery, while flushing it differently from
how the show_* functions used by diff_flush() do it. One example is the
builtin implementation of git-add--interactive's status.

Signed-off-by: Daniel Ferreira 
---
 diff.c | 17 +
 diff.h | 19 ++-
 2 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/diff.c b/diff.c
index 74283d9..7c073c2 100644
--- a/diff.c
+++ b/diff.c
@@ -1460,21 +1460,6 @@ static char *pprint_rename(const char *a, const char *b)
return strbuf_detach(, NULL);
 }
 
-struct diffstat_t {
-   int nr;
-   int alloc;
-   struct diffstat_file {
-   char *from_name;
-   char *name;
-   char *print_name;
-   unsigned is_unmerged:1;
-   unsigned is_binary:1;
-   unsigned is_renamed:1;
-   unsigned is_interesting:1;
-   uintmax_t added, deleted;
-   } **files;
-};
-
 static struct diffstat_file *diffstat_add(struct diffstat_t *diffstat,
  const char *name_a,
  const char *name_b)
@@ -4310,7 +4295,7 @@ static void diff_flush_patch(struct diff_filepair *p, 
struct diff_options *o)
run_diff(p, o);
 }
 
-static void diff_flush_stat(struct diff_filepair *p, struct diff_options *o,
+void diff_flush_stat(struct diff_filepair *p, struct diff_options *o,
struct diffstat_t *diffstat)
 {
if (diff_unmodified_pair(p))
diff --git a/diff.h b/diff.h
index 5be1ee7..4cdc72d 100644
--- a/diff.h
+++ b/diff.h
@@ -188,6 +188,21 @@ struct diff_options {
int diff_path_counter;
 };
 
+struct diffstat_t {
+   int nr;
+   int alloc;
+   struct diffstat_file {
+   char *from_name;
+   char *name;
+   char *print_name;
+   unsigned is_unmerged:1;
+   unsigned is_binary:1;
+   unsigned is_renamed:1;
+   unsigned is_interesting:1;
+   uintmax_t added, deleted;
+   } **files;
+};
+
 enum color_diff {
DIFF_RESET = 0,
DIFF_CONTEXT = 1,
@@ -206,7 +221,6 @@ const char *diff_get_color(int diff_use_color, enum 
color_diff ix);
 
 const char *diff_line_prefix(struct diff_options *);
 
-
 extern const char mime_boundary_leader[];
 
 extern struct combine_diff_path *diff_tree_paths(
@@ -262,6 +276,9 @@ extern void diff_change(struct diff_options *,
 
 extern struct diff_filepair *diff_unmerge(struct diff_options *, const char 
*path);
 
+void diff_flush_stat(struct diff_filepair *p, struct diff_options *o,
+   struct diffstat_t *diffstat);
+
 #define DIFF_SETUP_REVERSE 1
 #define DIFF_SETUP_USE_CACHE   2
 #define DIFF_SETUP_USE_SIZE_CACHE  4
-- 
2.7.4 (Apple Git-66)



[PATCH v2 2/4] add--helper: create builtin helper for interactive add

2017-05-15 Thread Daniel Ferreira
Create a builtin helper for git-add--interactive, which right now is not
able to do anything.

This is the first step in an effort to convert git-add--interactive.perl
to a C builtin, in search for better portability, expressibility and
performance (specially on non-POSIX systems like Windows).

Additionally, an eventual complete port of git-add--interactive would
remove the last "big" Git script to have Perl as a dependency, allowing
most Git users to have a NOPERL build running without big losses.

Signed-off-by: Daniel Ferreira 
---
 .gitignore| 1 +
 Makefile  | 1 +
 builtin.h | 1 +
 builtin/add--helper.c | 6 ++
 git.c | 1 +
 5 files changed, 10 insertions(+)
 create mode 100644 builtin/add--helper.c

diff --git a/.gitignore b/.gitignore
index 833ef3b..11cec05 100644
--- a/.gitignore
+++ b/.gitignore
@@ -11,6 +11,7 @@
 /git
 /git-add
 /git-add--interactive
+/git-add--helper
 /git-am
 /git-annotate
 /git-apply
diff --git a/Makefile b/Makefile
index e35542e..b7f49b6 100644
--- a/Makefile
+++ b/Makefile
@@ -873,6 +873,7 @@ LIB_OBJS += xdiff-interface.o
 LIB_OBJS += zlib.o
 
 BUILTIN_OBJS += builtin/add.o
+BUILTIN_OBJS += builtin/add--helper.o
 BUILTIN_OBJS += builtin/am.o
 BUILTIN_OBJS += builtin/annotate.o
 BUILTIN_OBJS += builtin/apply.o
diff --git a/builtin.h b/builtin.h
index 9e4a898..85b4c55 100644
--- a/builtin.h
+++ b/builtin.h
@@ -30,6 +30,7 @@ extern int textconv_object(const char *path, unsigned mode, 
const struct object_
 extern int is_builtin(const char *s);
 
 extern int cmd_add(int argc, const char **argv, const char *prefix);
+extern int cmd_add__helper(int argc, const char **argv, const char *prefix);
 extern int cmd_am(int argc, const char **argv, const char *prefix);
 extern int cmd_annotate(int argc, const char **argv, const char *prefix);
 extern int cmd_apply(int argc, const char **argv, const char *prefix);
diff --git a/builtin/add--helper.c b/builtin/add--helper.c
new file mode 100644
index 000..6a97f0e
--- /dev/null
+++ b/builtin/add--helper.c
@@ -0,0 +1,6 @@
+#include "builtin.h"
+
+int cmd_add__helper(int argc, const char **argv, const char *prefix)
+{
+   return 0;
+}
diff --git a/git.c b/git.c
index 8ff44f0..47ee257 100644
--- a/git.c
+++ b/git.c
@@ -391,6 +391,7 @@ static int run_builtin(struct cmd_struct *p, int argc, 
const char **argv)
 
 static struct cmd_struct commands[] = {
{ "add", cmd_add, RUN_SETUP | NEED_WORK_TREE },
+   { "add--helper", cmd_add__helper, RUN_SETUP | NEED_WORK_TREE },
{ "am", cmd_am, RUN_SETUP | NEED_WORK_TREE },
{ "annotate", cmd_annotate, RUN_SETUP },
{ "apply", cmd_apply, RUN_SETUP_GENTLY },
-- 
2.7.4 (Apple Git-66)



[PATCH v2 0/4] Port git-add--interactive.perl:status_cmd to C

2017-05-15 Thread Daniel Ferreira
This is the second version of a patch series to start porting
git-add--interactive from Perl to C.

Series:
v1: 
https://public-inbox.org/git/1494009820-2090-1-git-send-email-bnm...@gmail.com/

Travis CI build: https://travis-ci.org/theiostream/git/builds/232669894

I have applied most of the changes suggested by Johannes and Ævar (thank
you!). The one exception was Johannes' request to make this a WIP patch
series with a sort-of "design" of what's next to come. I did not do
this because I still have no clue...! I'll begin experimenting on
update_cmd() to see if I gain some insight on this issue that I could
bring to this series. I do think this would be a good idea, though! :)

-- Daniel.

Daniel Ferreira (4):
  diff: export diffstat interface
  add--helper: create builtin helper for interactive add
  add--helper: implement interactive status command
  add--interactive: use add-interactive--helper for status_cmd

 .gitignore|   1 +
 Makefile  |   1 +
 builtin.h |   1 +
 builtin/add--helper.c | 285 ++
 diff.c|  17 +--
 diff.h|  19 +++-
 git-add--interactive.perl |   4 +-
 git.c |   1 +
 8 files changed, 309 insertions(+), 20 deletions(-)
 create mode 100644 builtin/add--helper.c

--
2.7.4 (Apple Git-66)



Re: git rebase regression: cannot pass a shell expression directly to --exec

2017-05-15 Thread Jeff King
On Tue, May 16, 2017 at 12:40:51PM +0900, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > Interesting. The "exec" string is still run with a shell. E.g.:
> > ...
> > I wonder if this is falling afoul of the optimization in run-command's
> > prepare_shell_cmd() to skip shell invocations for "simple" commands.
> > ...
> > So I suspect if you added an extraneous semi-colon, your case would work
> > again (and that would confirm for us that this is indeed the problem).
> 
> Wow, that's a brilliant analysis.

If it's right. :) It's all theory at this point.

My /bin/sh isn't bash, but I should be able to build with
SHELL_PATH=/bin/bash to reproduce. But I can't:

   $ bash
   $ foo() { echo >&2 "in foo"; }
   $ export -f foo
   $ bash -c foo
   in foo

   $ strace -f 2>&1 git.compile rebase -x 'foo;' HEAD^ | grep foo
   Executing: foo;
   [pid  1788] execve("/bin/bash", ["/bin/bash", "-c", "foo;", "foo;"], [/* 60 
vars */] 
   foo;: foo: command not found

So I'm not sure why the direct "bash -c" can find it, but somehow the
variable doesn't make it through to the "bash -c" at the lower level.
Replacing "foo;" with "env" shows the environment, but BASH_FUNC_foo
isn't set in it. I'm not sure where it's getting eaten, though.

> The "semicolon" trick is way too obscure, but perhaps can be
> documented as an escape hatch?

Yeah, I agree this should be documented if it can't be fixed. I wasn't
sure if we were giving up just yet.

Either way, though, it wouldn't hurt to mention optimizing out "maybe
shell" optimization, because it can occasionally produce user-visible
effects. Where would be a good place? In git(1), I guess?

It's almost something that could go in gitcli(7), but it's not really
about the CLI in particular. In most cases the shell-exec'd commands are
from config, but not always (as this case shows). So git-config(1)
probably isn't the right place.

AFAIK, we don't talk about this behavior at all in the existing
documentation. And I really don't know where we'd put it that somebody
would find it without having to read the documentation exhaustively.

-Peff


Re: How to force a pull to succeed?

2017-05-15 Thread Jeffrey Walton
On Mon, May 15, 2017 at 11:42 PM, Junio C Hamano  wrote:
> Jeffrey Walton  writes:
>
>> On Mon, May 15, 2017 at 11:27 PM, Junio C Hamano  wrote:
>>> Jeffrey Walton  writes:
>>>
 I scp'd a file to another machine for testing. The change tested OK,
 so I checked it in on the original machine.
 ...
 How do I force the pull to succeed?
>>>
>>> Git doesn't know (or care) if you "scp"ed a file from a known to be
>>> good place, or if you modified it in the editor.  When it notices
>>> that there are differences you may rather not to lose in these files
>>> (because they are different from HEAD), it refrains from touching
>>> them.
>>>
>>> So the way to go forward is for you to make sure that you do not
>>> have such local changes in the repository that your "pull" is trying
>>> to touch.  An easiest way would be to do
>>>
>>> git checkout HEAD -- ..
>>
>> Thanks. That's an extra command. Is there any way to roll it up into
>> one command?
>
> git checkout HEAD -- .. && git pull
>
> ;-)
>
>>> before doing a "git pull" to clear the damage you caused manually
>>> with your "scp".
>>
>> There's no damage. Its expected.
>
> The fact that you think it is expected is immaterial. Git doesn't
> know (or care) how you made the files different from HEAD, so it
> looks like a damage to it.

'git pull' fails and its expected, but 'git pull -f' is supposed to
succeed. That's what -f is supposed to do.

Is there a way to add intelligence to Git so that it sees they are the
_exact_ same file, and it stops bothering me with details of problems
that don't exist?

It seems like adding the intelligence is a good enhancement. A version
control tool has to do three things: check-out, check-in, and
determine differences. Its not doing a good job of determining
differences considering they are the exact same file.

Jeff


Re: How to force a pull to succeed?

2017-05-15 Thread Junio C Hamano
Jeffrey Walton  writes:

> On Mon, May 15, 2017 at 11:27 PM, Junio C Hamano  wrote:
>> Jeffrey Walton  writes:
>>
>>> I scp'd a file to another machine for testing. The change tested OK,
>>> so I checked it in on the original machine.
>>> ...
>>> How do I force the pull to succeed?
>>
>> Git doesn't know (or care) if you "scp"ed a file from a known to be
>> good place, or if you modified it in the editor.  When it notices
>> that there are differences you may rather not to lose in these files
>> (because they are different from HEAD), it refrains from touching
>> them.
>>
>> So the way to go forward is for you to make sure that you do not
>> have such local changes in the repository that your "pull" is trying
>> to touch.  An easiest way would be to do
>>
>> git checkout HEAD -- ..
>
> Thanks. That's an extra command. Is there any way to roll it up into
> one command?

git checkout HEAD -- .. && git pull

;-)

>> before doing a "git pull" to clear the damage you caused manually
>> with your "scp".
>
> There's no damage. Its expected.

The fact that you think it is expected is immaterial. Git doesn't
know (or care) how you made the files different from HEAD, so it
looks like a damage to it.



Re: git rebase regression: cannot pass a shell expression directly to --exec

2017-05-15 Thread Junio C Hamano
Jeff King  writes:

> Interesting. The "exec" string is still run with a shell. E.g.:
> ...
> I wonder if this is falling afoul of the optimization in run-command's
> prepare_shell_cmd() to skip shell invocations for "simple" commands.
> ...
> So I suspect if you added an extraneous semi-colon, your case would work
> again (and that would confirm for us that this is indeed the problem).

Wow, that's a brilliant analysis.

> The optimization in run-command is very old, but the switch to
> rebase--helper presumably moved us from doing that exec in the actual
> shell script to doing it via the C code.
>
> Which means your exported-function technique has been broken for _most_
> of Git all along, but it just now affected this particular spot.
>
> I'm not sure how to feel about it. In the face of exported functions, we
> can never do the shell-skipping optimization, because we don't know how
> the shell is going to interpret even a simple-looking command. And it is
> kind of a neat trick. But I also hate to add extra useless shell
> invocations for the predominantly common case that people aren't using
> this trick (or aren't even using a shell that supports function
> exports).

I was about to write this off as "an unfortunate regression, a
fallout that will likely left unfixed, due to lack of a good
practical workaround."

The point of rewriting things in C and using run_command() interface
was to avoid shell overhead.  We are doing an exec already, but
adding a shell invocation in the middle will double the number of
exec (and probably add an extra fork as well), which probably is
measurable on systems with slow fork/exec.

The "semicolon" trick is way too obscure, but perhaps can be
documented as an escape hatch?



Re: git rebase regression: cannot pass a shell expression directly to --exec

2017-05-15 Thread Jeff King
On Mon, May 15, 2017 at 11:25:03PM -0400, Jeff King wrote:

> One hack would be to look for BASH_FUNC_* in the environment and disable
> the optimization in that case. I think that would make your case Just
> Work. It doesn't help other oddball cases, like:
> 
>   - you're trying to run a shell builtin that behaves differently than
> its exec-able counterpart
> 
>   - your shell has some other mechanism for defining commands that we
> would not find via exec. I don't know of one offhand. Obviously $ENV
> could point to a file which defines some, but for most shells would
> not read any startup files for a non-interactive "sh -c" invocation.

So I was thinking something like the patch below, though I guess
technically you could look for BASH_FUNC_$argv[0]%%, which seems to be
bash's magic variable name. I hate to get too intimate with those
details, though.

Another option is to speculatively run "foo" without the shell, and if
execve fails to find it, then fall back to running the shell. That would
catch any number of cases where the shell "somehow" finds a command that
we can't.

You'd still have confusing behavior if your shell builtin behaved
differently than the exec-able version, though (because we'd quietly use
the exec-able one), but I would imagine that's exceedingly rare.

I dunno. Maybe the whole thing is a fool's errand.

diff --git a/run-command.c b/run-command.c
index 1c02bfb2e..8328d27fb 100644
--- a/run-command.c
+++ b/run-command.c
@@ -240,12 +240,24 @@ int sane_execvp(const char *file, char * const argv[])
return -1;
 }
 
+static int env_has_bash_functions(void)
+{
+   extern char **environ;
+   char **key;
+
+   for (key = environ; *key; key++)
+   if (starts_with(*key, "BASH_FUNC_"))
+   return 1;
+   return 0;
+}
+
 static const char **prepare_shell_cmd(struct argv_array *out, const char 
**argv)
 {
if (!argv[0])
die("BUG: shell command is empty");
 
-   if (strcspn(argv[0], "|&;<>()$`\\\"' \t\n*?[#~=%") != strlen(argv[0])) {
+   if (strcspn(argv[0], "|&;<>()$`\\\"' \t\n*?[#~=%") != strlen(argv[0]) ||
+   env_has_bash_functions()) {
 #ifndef GIT_WINDOWS_NATIVE
argv_array_push(out, SHELL_PATH);
 #else


Re: How to force a pull to succeed?

2017-05-15 Thread Jeffrey Walton
On Mon, May 15, 2017 at 11:27 PM, Junio C Hamano  wrote:
> Jeffrey Walton  writes:
>
>> I scp'd a file to another machine for testing. The change tested OK,
>> so I checked it in on the original machine.
>> ...
>> How do I force the pull to succeed?
>
> Git doesn't know (or care) if you "scp"ed a file from a known to be
> good place, or if you modified it in the editor.  When it notices
> that there are differences you may rather not to lose in these files
> (because they are different from HEAD), it refrains from touching
> them.
>
> So the way to go forward is for you to make sure that you do not
> have such local changes in the repository that your "pull" is trying
> to touch.  An easiest way would be to do
>
> git checkout HEAD -- ..

Thanks. That's an extra command. Is there any way to roll it up into
one command?

> before doing a "git pull" to clear the damage you caused manually
> with your "scp".

There's no damage. Its expected.

Jeff


Re: How to force a pull to succeed?

2017-05-15 Thread Junio C Hamano
Jeffrey Walton  writes:

> I scp'd a file to another machine for testing. The change tested OK,
> so I checked it in on the original machine.
> ...
> How do I force the pull to succeed?

Git doesn't know (or care) if you "scp"ed a file from a known to be
good place, or if you modified it in the editor.  When it notices
that there are differences you may rather not to lose in these files
(because they are different from HEAD), it refrains from touching
them.

So the way to go forward is for you to make sure that you do not
have such local changes in the repository that your "pull" is trying
to touch.  An easiest way would be to do

git checkout HEAD -- ..

before doing a "git pull" to clear the damage you caused manually
with your "scp".  


Re: git rebase regression: cannot pass a shell expression directly to --exec

2017-05-15 Thread Jeff King
On Mon, May 15, 2017 at 11:08:39AM -0700, Eric Rannaud wrote:

> It used to be possible to run a sequence like:
> 
>   foo() { echo X; }
>   export -f foo
>   git rebase --exec foo HEAD~10
> 
> Since upgrading to 2.13.0, I had to update my scripts to run:
> 
>   git rebase --exec "bash -c foo" HEAD~10

Interesting. The "exec" string is still run with a shell. E.g.:

  $ git rebase --exec 'for i in 1 2 3; do echo >&2 $i; done' HEAD~5
  Executing: for i in 1 2 3; do echo >&2 $i; done
  1
  2
  3
  Executing: for i in 1 2 3; do echo >&2 $i; done
  1
  2
  3
  [...and so on...]

I wonder if this is falling afoul of the optimization in run-command's
prepare_shell_cmd() to skip shell invocations for "simple" commands.

E.g., if I do:

   strace -fe execve git rebase -x foo HEAD^ 2>&1 | grep foo

I see:

   ...
   Executing: foo
   [pid 21820] execve("foo", ["foo"], [/* 57 vars */]) = -1 ENOENT (No such 
file or directory)
   fatal: cannot run foo: No such file or directory

But if I were to add a shell meta-character, like this:

   strace -fe execve git rebase -x 'foo;' HEAD^ 2>&1 | grep foo

then I see:

  ...
  Executing: foo;
  [pid 22569] execve("/bin/sh", ["/bin/sh", "-c", "foo;", "foo;"], [/* 57 vars 
*/]) = 0
  foo;: 1: foo;: foo: not found

So I suspect if you added an extraneous semi-colon, your case would work
again (and that would confirm for us that this is indeed the problem).

> I'm not sure if this was an intended change. Bisecting with the
> following script:
> [...]
> It points to this commit:
>
> commit 18633e1a22a68bbe8e6311a1039d13ebbf6fd041 (refs/bisect/bad)
> Author: Johannes Schindelin 
> Date:   Thu Feb 9 23:23:11 2017 +0100
>
> rebase -i: use the rebase--helper builtin

The optimization in run-command is very old, but the switch to
rebase--helper presumably moved us from doing that exec in the actual
shell script to doing it via the C code.

Which means your exported-function technique has been broken for _most_
of Git all along, but it just now affected this particular spot.

I'm not sure how to feel about it. In the face of exported functions, we
can never do the shell-skipping optimization, because we don't know how
the shell is going to interpret even a simple-looking command. And it is
kind of a neat trick. But I also hate to add extra useless shell
invocations for the predominantly common case that people aren't using
this trick (or aren't even using a shell that supports function
exports).

One hack would be to look for BASH_FUNC_* in the environment and disable
the optimization in that case. I think that would make your case Just
Work. It doesn't help other oddball cases, like:

  - you're trying to run a shell builtin that behaves differently than
its exec-able counterpart

  - your shell has some other mechanism for defining commands that we
would not find via exec. I don't know of one offhand. Obviously $ENV
could point to a file which defines some, but for most shells would
not read any startup files for a non-interactive "sh -c" invocation.

-Peff


Re: [RFC PATCH v2 00/22] Add blame to libgit

2017-05-15 Thread Junio C Hamano
Jeffrey Smith  writes:

> On Mon, May 15, 2017 at 7:23 PM, Junio C Hamano  wrote:
>> Jeffrey Smith  writes:
>>
>>> I did try to keep any unnecessary changes out of the big movement
>>> patches (17-19).  Would you prefer I break down 19 further?  I had
>>> been holding back because of the added churn that would introduce from
>>> lots of changing function visibility in blame.h and blame.c.
>>
>> To be honest, the "rename symbols while moving" done starting around
>> 04/22 made the series too noisy to read and I had hard time
>> reviewing the later ones like 17-19.  I think they share exactly the
>> same issue as 04/22, e.g. 17/22 renames origin_decref to
>> blame_origin_decref while moving the function and some of its
>> callers across files.
>
> Hm, I really thought I had split some of those out.  Patches 4, 5, 17, and 19
> could all be reasonably split into rename vs move patches.  Patch 18 only
> does moving.
>
> I do not see much benefit to splitting up the 'move xyz to scoreboard'
> subset (6-12) that way as moving an item to scoreboard is already causing
> each use to change (from xyz to sb->xyz).

Perhaps it would be better to repeat and clarify what I suggested
once more.

 * Keep mechanical changes separate from real changes,
 * Keep each real change separate from each other,
 * Keep the same kind of mechnical changes together, and
 * Keep different kinds of mechanical changes separate.

So, if you are moving some globals into a structure, we want a patch
that does that for a set of related globals and adjust their uses,
and do nothing else.  You'll probably have a handful of them in the
end.  For each of these real changes, it is usually better to have
more and smaller patches than to have fewer and larger patches.

Then if you are remaing a bunch of structure types and/or variable
names, we want a patch that does that mechanical change (i.e. rename
"struct foo" and "struct bar" to "struct blame_foo" and "struct
blame_bar") that does not do anything else (i.e. do not move lines
across files in the same patch).  As I said already, this does not
have to be one per struct type.  We can say "renaming symbols" is
just one kind of mechanical changes, which is different from "moving
lines across files".

Finally, move the lines across files (and again, do nothing else).
Having to turn "static something" into "extern something" is part of
moving the lines across files, and we do want to see that in the
same patch.  But renaming something to blame_something should have
already be done while these lines were in builtin/blame.c in an
earlier step.

Thanks.


Re: [PATCH] usage: fix a sparse 'redeclared with different type' error

2017-05-15 Thread Jeff King
On Tue, May 16, 2017 at 02:11:40AM +0100, Ramsay Jones wrote:

> 
> If you need to re-roll your 'jk/bug-to-abort' branch, could you please
> squash this into the relevant patch (commit d8193743e0 "usage.c: add
> BUG() function", 12-05-2017).
> 
> [Just FYI, sparse complains thus:
>   usage.c:212:6: error: symbol 'BUG_fl' redeclared with different type
>   (originally declared at git-compat-util.h:1076) - different modifiers
> ]

Hmm. Our model here is the die() function, which gets noreturn and
format attributes in the declaration, but only noreturn at the
definition.

Your patch here adds both attributes to the definition:

> +__attribute__((format (printf, 3, 4))) NORETURN
>  void BUG_fl(const char *file, int line, const char *fmt, ...)

Another possible model is trace_printf_key_fl(), which just has a format
attribute at the declaration, and nothing at all in the definition.

So it seems like this doesn't matter for the format attribute, but does
for NORETURN. Weird. I wonder if it's specific to the attribute, or
something about the way we hide it behind a macro.

There probably isn't a downside to repeating the format attribute, I
guess. Except that I'm not sure what happens if the two ever got out of
sync (gcc doesn't seem to complain, though in practice you'd probably
change the order or number of arguments at the same time, which is
likely to cause a mismatch).

-Peff


Re: [PATCH v1 2/5] Teach git to optionally utilize a file system monitor to speed up detecting new or changed files.

2017-05-15 Thread Jeff King
On Mon, May 15, 2017 at 09:55:12PM -0400, Ben Peart wrote:

> > > > +   istate->last_update = (time_t)ntohll(*(uint64_t *)index);
> [...]
> > (I note also that time_t is not necessarily 64-bits in the first place,
> > but David said something about this not really being a time_t).
> 
> The in memory representation is a time_t as that is the return value of
> time(NULL) but it is converted to/from a 64 bit value when written/read to
> the index extension so that the index format is the same no matter the
> native size of time_t.

OK. I guess your cast here will truncate on 32-bit systems, but
presumably not until 2038, so we can perhaps ignore it for now (and
anyway, time(NULL) will be broken on such a system at that point).

-Peff


Re: [RFC PATCH v2 00/22] Add blame to libgit

2017-05-15 Thread Jeffrey Smith
Hm, I really thought I had split some of those out.  Patches 4, 5, 17, and 19
could all be reasonably split into rename vs move patches.  Patch 18 only
does moving.

I do not see much benefit to splitting up the 'move xyz to scoreboard'
subset (6-12) that way as moving an item to scoreboard is already causing
each use to change (from xyz to sb->xyz).

On Mon, May 15, 2017 at 7:23 PM, Junio C Hamano  wrote:
> Jeffrey Smith  writes:
>
>> On Mon, May 15, 2017 at 4:24 AM, Junio C Hamano  wrote:
>>> Jeff Smith  writes:
>>>
 Rather than duplicate large portions of builtin/blame.c in cgit, it
 would be better to shift its core functionality into libgit.a.  The
 functionality left in builtin/blame.c mostly relates to terminal
 presentation.
>>>
>>> As I said in my review of 04/22, it is a bit hard/tedious to sift
>>> the changes to refactoring that actually changes code and pure
>>> renaming and movement of lines across files with the current
>>> sequence of the series, so it is very possible that I may have
>>> missed something, but from a cursory read through the series, from
>>> the comparison between master and the result of applying all
>>> patches, and inspection of the resulting builtin/blame.c file, I
>>> think what this series does is very sensible in general.  blame.h
>>> does not seem to expose anything that is not needed, which is a good
>>> sign.
>>>
>> I did try to keep any unnecessary changes out of the big movement
>> patches (17-19).  Would you prefer I break down 19 further?  I had
>> been holding back because of the added churn that would introduce from
>> lots of changing function visibility in blame.h and blame.c.
>
> To be honest, the "rename symbols while moving" done starting around
> 04/22 made the series too noisy to read and I had hard time
> reviewing the later ones like 17-19.  I think they share exactly the
> same issue as 04/22, e.g. 17/22 renames origin_decref to
> blame_origin_decref while moving the function and some of its
> callers across files.
>
>>
>> (And if gmail would quit trying to make my messages 'rich text' --
>> a.k.a. HTML -- that would be great...)
>>
>>


How to force a pull to succeed?

2017-05-15 Thread Jeffrey Walton
I scp'd a file to another machine for testing. The change tested OK,
so I checked it in on the original machine.

I'm now on the remote machine, and I'm trying to pull the same exact
file that exists on both local and remote. Git won't allow me to do
it, even with -f.

I'd really like -f (or another switch) to work as expected. I don't
like issuing extra command or extra typing.

How do I force the pull to succeed?

Thanks in adavnce.

**

qotom:cryptopp$ git pull
Updating 30ac06d..30ac53f
error: Your local changes to the following files would be overwritten by merge:
datatest.cpp
Please commit your changes or stash them before you merge.
Aborting
qotom:cryptopp$ git pull -f
Updating 30ac06d..30ac53f
error: Your local changes to the following files would be overwritten by merge:
datatest.cpp
Please commit your changes or stash them before you merge.


Re: [PATCH v1 2/5] Teach git to optionally utilize a file system monitor to speed up detecting new or changed files.

2017-05-15 Thread Ben Peart



On 5/15/2017 8:34 PM, Jeff King wrote:

On Tue, May 16, 2017 at 12:22:14AM +, brian m. carlson wrote:


On Mon, May 15, 2017 at 03:13:44PM -0400, Ben Peart wrote:

+   istate->last_update = (time_t)ntohll(*(uint64_t *)index);
+   index += sizeof(uint64_t);
+
+   ewah_size = ntohl(*(uint32_t *)index);
+   index += sizeof(uint32_t);


To answer the question you asked in your cover letter, you cannot write
this unless you can guarantee (((uintptr_t)index & 7) == 0) is true.
Otherwise, this will produce a SIGBUS on SPARC, Alpha, MIPS, and some
ARM systems, and it will perform poorly on PowerPC and other ARM
systems[0].

If you got that pointer from malloc and have only indexed multiples of 8
on it, you're good.  But if you're not sure, you probably want to use
memcpy.  If the compiler can determine that it's not necessary, it will
omit the copy and perform a direct load.


I think get_be32() does exactly what we want for the ewah_size read. For
the last_update one, we don't have a get_be64() yet, but it should be
easy to make based on the 16/32 versions.


Thanks for the pointers.  I'll update this to use the existing get_be32 
and have created a get_be64 and will use that for the last_update.




(I note also that time_t is not necessarily 64-bits in the first place,
but David said something about this not really being a time_t).



The in memory representation is a time_t as that is the return value of 
time(NULL) but it is converted to/from a 64 bit value when written/read 
to the index extension so that the index format is the same no matter 
the native size of time_t.



-Peff



Re: [PATCH] config: match both symlink & realpath versions in IncludeIf.gitdir:*

2017-05-15 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> Change the conditional inclusion mechanism to support
> e.g. gitdir:~/git_tree/repo where ~/git_tree is a symlink to to

s/to to/to/;

> /mnt/stuff/repo.
>
> This worked in the initial version of this facility[1], but regressed
> later in the series while solving a related bug[2].
>
> Now gitdir: will match against the symlinked
> path (e.g. gitdir:~/git_tree/repo) in addition to the current
> /mnt/stuff/repo path.
>
> Since this is already in a release version note in the documentation
> that this behavior changed, so users who expect their configuration to
> work on both v2.13.0 and some future version of git with this fix
> aren't utterly confused.
>
> 1. commit 3efd0bedc6 ("config: add conditional include", 2017-03-01)
> 2. commit 86f9515708 ("config: resolve symlinks in conditional
>include's patterns", 2017-04-05)
>
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>
> Here's a non-RFC patch to fix this bug.
>
>  Documentation/config.txt  | 11 +++
>  config.c  | 16 
>  t/t1305-config-include.sh | 23 +++
>  3 files changed, 50 insertions(+)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 475e874d51..137502a289 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -140,6 +140,17 @@ A few more notes on matching via `gitdir` and `gitdir/i`:
>  
>   * Symlinks in `$GIT_DIR` are not resolved before matching.
>  
> + * Both the symlink & realpath versions of paths will be matched
> +   outside of `$GIT_DIR`. E.g. if ~/git is a symlink to
> +   /mnt/storage/git, both `gitdir:~/git` and `gitdir:/mnt/storage/git`
> +   will match.
> +
> +   This was not the case in the initial release of this feature in
> +   v2.13.0, which only matched the realpath version. Configuration
> +   that wants to be compatible with the initial release of this
> +   feature needs to either specify only the realpath version, or both
> +   versions.
> +

Does the second paragraph format correctly, or do we need the usual
(and ugly) "+ by itself on a line and then dedent the next
paragraph" trick?

> diff --git a/config.c b/config.c
> index b4a3205da3..0498746112 100644
> --- a/config.c
> +++ b/config.c
> @@ -214,6 +214,7 @@ static int include_by_gitdir(const struct config_options 
> *opts,
>   struct strbuf pattern = STRBUF_INIT;
>   int ret = 0, prefix;
>   const char *git_dir;
> + int retry = 1;

I think your earlier RFC used a variable with a better name than
this; it is not like we are preparing ourselves for adding a
mechanism here to retry for many random reasons.

The logic would be clearer With an "already_tried_absolute" variable
that is initially 0, retrying unless it is already set, and set it
when we retry, like you did in your RFC.

Thanks.


Re: [PATCH v1 2/5] Teach git to optionally utilize a file system monitor to speed up detecting new or changed files.

2017-05-15 Thread Ben Peart


On 5/15/2017 5:21 PM, David Turner wrote:



-Original Message-
From: Ben Peart [mailto:peart...@gmail.com]
Sent: Monday, May 15, 2017 3:14 PM
To: git@vger.kernel.org
Cc: gits...@pobox.com; benpe...@microsoft.com; pclo...@gmail.com;
johannes.schinde...@gmx.de; David Turner ;
p...@peff.net
Subject: [PATCH v1 2/5] Teach git to optionally utilize a file system monitor to
speed up detecting new or changed files.



@@ -342,6 +344,8 @@ struct index_state {
struct hashmap dir_hash;
unsigned char sha1[20];
struct untracked_cache *untracked;
+   time_t last_update;
+   struct ewah_bitmap *bitmap;


The name 'bitmap' doesn't tell the reader much about what it used for.


+static int update_istate(const char *name, void *is) {


Rename to mark_file_dirty?  Also why does it take a void pointer?  Or return 
int (rather than void)?



Thanks for the feedback.  I'll do some renaming and change the types passed.



+void refresh_by_fsmonitor(struct index_state *istate) {
+   static has_run_once = FALSE;
+   struct strbuf buffer = STRBUF_INIT;


Rename to query_result? Also I think you're leaking it.



Good catch!  I missed the leak there.  Fixed for the next roll.


[PATCH] usage: fix a sparse 'redeclared with different type' error

2017-05-15 Thread Ramsay Jones

Signed-off-by: Ramsay Jones 
---

Hi Jeff,

If you need to re-roll your 'jk/bug-to-abort' branch, could you please
squash this into the relevant patch (commit d8193743e0 "usage.c: add
BUG() function", 12-05-2017).

[Just FYI, sparse complains thus:
  usage.c:212:6: error: symbol 'BUG_fl' redeclared with different type
  (originally declared at git-compat-util.h:1076) - different modifiers
]

Thanks!

ATB,
Ramsay Jones

 usage.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/usage.c b/usage.c
index 545136734..918b539bc 100644
--- a/usage.c
+++ b/usage.c
@@ -209,6 +209,7 @@ static NORETURN void BUG_vfl(const char *file, int line, 
const char *fmt, va_lis
 }
 
 #ifdef HAVE_VARIADIC_MACROS
+__attribute__((format (printf, 3, 4))) NORETURN
 void BUG_fl(const char *file, int line, const char *fmt, ...)
 {
va_list ap;
@@ -217,6 +218,7 @@ void BUG_fl(const char *file, int line, const char *fmt, 
...)
va_end(ap);
 }
 #else
+__attribute__((format (printf, 1, 2))) NORETURN
 void BUG(const char *fmt, ...)
 {
va_list ap;
-- 
2.13.0


Re: [PATCH 07/19] diff.c: convert fn_out_consume to use emit_line_*

2017-05-15 Thread Junio C Hamano
Junio C Hamano  writes:

>> -fprintf(o->file, "%s%s--- %s%s%s\n",
>> -line_prefix, meta, ecbdata->label_path[0], reset, 
>> name_a_tab);
>> -fprintf(o->file, "%s%s+++ %s%s%s\n",
>> -line_prefix, meta, ecbdata->label_path[1], reset, 
>> name_b_tab);
>> +emit_line_fmt(o, meta, reset, "--- %s%s\n",
>> +  ecbdata->label_path[0], name_a_tab);
>> +emit_line_fmt(o, meta, reset, "+++ %s%s\n",
>> +  ecbdata->label_path[1], name_b_tab);
>
> How is the loss of line_prefix from this call site compensated?

OK, emit_line_0() has already been aware of line_prefix, so that is
how the loss of line_prefix in the above is accounted for.  We are
good here.

>>  ecbdata->label_path[0] = ecbdata->label_path[1] = NULL;
>>  }
>>  
>> @@ -1349,7 +1346,7 @@ static void fn_out_consume(void *priv, char *line, 
>> unsigned long len)
>>  diff_words_flush(ecbdata);
>>  if (ecbdata->diff_words->type == DIFF_WORDS_PORCELAIN) {
>>  emit_line(o, context, reset, line, len);
>> -fputs("~\n", o->file);
>> +emit_line(o, NULL, NULL, "~\n", 2);

So unless we have some magic here, we would see an extra line-prefix
before that "~\n" thing, no?


>>  } else {
>>  /*
>>   * Skip the prefix character, if any.  With


Re: [PATCH 07/19] diff.c: convert fn_out_consume to use emit_line_*

2017-05-15 Thread Junio C Hamano
Stefan Beller  writes:

> In a later patch, I want to propose an option to detect
> moved lines in a diff, which cannot be done in a one-pass over
> the diff. Instead we need to go over the whole diff twice,
> because we cannot detect the first line of the two corresponding
> lines (+ and -) that got moved.
>
> So to prepare the diff machinery for two pass algorithms
> (i.e. buffer it all up and then operate on the result),
> move all emissions to places, such that the only emitting
> function is emit_line_0.
>
> This covers the parts of fn_out_consume.
>
> Signed-off-by: Stefan Beller 
> ---
>  diff.c | 13 +
>  1 file changed, 5 insertions(+), 8 deletions(-)
>
> diff --git a/diff.c b/diff.c
> index aef159a919..93343a9ccc 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -1289,7 +1289,6 @@ static void fn_out_consume(void *priv, char *line, 
> unsigned long len)
>   const char *context = diff_get_color(ecbdata->color_diff, DIFF_CONTEXT);
>   const char *reset = diff_get_color(ecbdata->color_diff, DIFF_RESET);
>   struct diff_options *o = ecbdata->opt;
> - const char *line_prefix = diff_line_prefix(o);
>  
>   o->found_changes = 1;
>  
> @@ -1301,14 +1300,12 @@ static void fn_out_consume(void *priv, char *line, 
> unsigned long len)
>  
>   if (ecbdata->label_path[0]) {
>   const char *name_a_tab, *name_b_tab;
> -
>   name_a_tab = strchr(ecbdata->label_path[0], ' ') ? "\t" : "";
>   name_b_tab = strchr(ecbdata->label_path[1], ' ') ? "\t" : "";
> -
> - fprintf(o->file, "%s%s--- %s%s%s\n",
> - line_prefix, meta, ecbdata->label_path[0], reset, 
> name_a_tab);
> - fprintf(o->file, "%s%s+++ %s%s%s\n",
> - line_prefix, meta, ecbdata->label_path[1], reset, 
> name_b_tab);
> + emit_line_fmt(o, meta, reset, "--- %s%s\n",
> +   ecbdata->label_path[0], name_a_tab);
> + emit_line_fmt(o, meta, reset, "+++ %s%s\n",
> +   ecbdata->label_path[1], name_b_tab);

How is the loss of line_prefix from this call site compensated?
emit_line_fmt() receives o so it is possible diff_line_prefix(o)
may be called there and prepended to the output over there, but I
somehow do not think that is the case---in fact 06/19 does not seem
to teach emit_line_fmt() to do something like that.

Unless emit_line() is now taught to do the line_prefix thing, that
is.

But then the hunk we see below, which didn't add line_prefix to the
output, would add an unwanted prefix, so that is not likely.

Hmph...

>   ecbdata->label_path[0] = ecbdata->label_path[1] = NULL;
>   }
>  
> @@ -1349,7 +1346,7 @@ static void fn_out_consume(void *priv, char *line, 
> unsigned long len)
>   diff_words_flush(ecbdata);
>   if (ecbdata->diff_words->type == DIFF_WORDS_PORCELAIN) {
>   emit_line(o, context, reset, line, len);
> - fputs("~\n", o->file);
> + emit_line(o, NULL, NULL, "~\n", 2);
>   } else {
>   /*
>* Skip the prefix character, if any.  With


Re: [PATCH v2 04/29] log: add exhaustive tests for pattern style options & config

2017-05-15 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> On Mon, May 15, 2017 at 6:57 AM, Junio C Hamano  wrote:
>> Ævar Arnfjörð Bjarmason   writes:
>>> + if test_have_prereq PCRE
>>> + then
>>> + # Only PCRE would match [\d]\| with only
>>> + # "(1|2)" due to [\d]. POSIX basic would match
>>> + # both it and "1", and POSIX extended would
>>> + # match neither.
>>
>> OK.  BRE would match because the other side of "\|" is empty to
>> match anything?
>
> Yes. I'll clarify this. It's not just a POSIX basic feature. The same
> is true of extended and perl. E.g.:

Yes, but "\|" won't be taken as alternative in ERE or PCRE, and that
is why "[\d]\|" would match everything with BRE (as opposed to
others---PCRE matches "(1|2)" not because "\|" is an alternative but
because the pattern looks for a digit followed by a literal vert-bar,
and ERE does not match any because there is no 'd' followed by a
literal vert-bar).

I was mostly reacting to "BRE would match both it and '1'", which
singled out "1" as if "1" is special and gives a false impression
that it wouldn't have matched if it were "7".

>
> git grep [-E|-P] 'foo|bar'
>
> Both match the same as:
>
> git grep [-E|-P] '(foo|bar)'
>
>>> + git -c grep.patternType=perl log --pretty=tformat:%s \
>>> + --grep="[\d]\|" >actual.perl &&
>>> + test_cmp expect.perl actual.perl
>>> + fi &&
>>


Re: [PATCH] fixup! log: add exhaustive tests for pattern style options & config

2017-05-15 Thread Junio C Hamano
Johannes Schindelin  writes:

> On Mon, 15 May 2017, Junio C Hamano wrote:
>
>> My knee-jerk reaction matched Dscho's, but grep is about contents,
>> and we should be able to test this if we used a sensible tagnames or
>> didn't use any.  Glad to see somebody can step back and think ;-)
>
> Maybe somebody should step back even further and think even more, as we
> could adjust test_commit to mangle the argument into a tag name that is
> legal even with a refs backend relying on NTFS.

Perhaps, but I am not sure if that is needed.  

The point of the helper is to serve as a simple "we are building a
toy sample history by only adding a one-liner new file" convenience
helper, and I think it is sensible to keep its definition simple.
The callers (like the ones being added in the rerolled patch under
discussion) with special needs can supply tagname when the default
one is not suitable.

In hindsight, perhaps it would have been better if the default for
the helper were _not_ to create any tag (and callers who care about
tags can optionally tell it to add tag, or tag the resulting commit
themselves), but that is lamenting water under the bridge.


Re: [PATCH 00/11] Start retiring .git/remotes/ and .git/branches/ for good

2017-05-15 Thread Junio C Hamano
Johannes Schindelin  writes:

> On Fri, 12 May 2017, Junio C Hamano wrote:
>
>> Is it really hurting us having to support these old information
>> sources we treat as read-only?
>
> Well, you frequently complain about my patches, claiming that they place
> unnecessary maintenance burden on you.
>
> I would say that the .git/remotes/ and .git/branches/ code is a lot more
> maintenance burden than most of my patches.

I wasn't going to respond to this thread anymore, because I didn't
feel like the discussion was going anywhere, and you already said
you won't listen to me.

You seem to be confused between "maintenance burden" and "burden on
the maintainer". I felt that it needs to be corrected for other
people reading this exchange from the sideline.

When we worked to add feature X in the remotes subsystem, we
were slowed down because we had to adjust the code also for
.git/branches.  The same story for feature Y.  The same for
feature Z.  This is getting ridiculous/cumbersome, especially
given that we know .git/branches is not used by anybody.

That's a maintenance burden, and the "we" refers to the Git
development community as a whole, not the maintainer.  It is not a
burden on _me_.

Also important to notice is I do not know what X, Y and Z are with
respect to .git/branches feature.  That is where "Is it really
hurting?" question comes from, but it hasn't been answered so far.

What's burden on the maintainer is having to engage in a discussion
like this one, to reject an attempt to remove something that is not
demonstratably a maintenance burden, because the maintainer has to
act as the last-resort champion for the end-users, when others on
the list do not speak up X-<.

Yes, I know that proving that something we currently support is not
used by anybody is HARD [*1*].  That is why removal is costly.  And
that in turn is why we need to be careful when adding new things and
making changes in general.


[Footnotes]

*1* Removal of "rsync" transport we did recently was a happy but
rare case.  It has been broken for a few years without anybody
complaining.


Re: [PATCH v1 2/5] Teach git to optionally utilize a file system monitor to speed up detecting new or changed files.

2017-05-15 Thread Jeff King
On Tue, May 16, 2017 at 12:22:14AM +, brian m. carlson wrote:

> On Mon, May 15, 2017 at 03:13:44PM -0400, Ben Peart wrote:
> > +   istate->last_update = (time_t)ntohll(*(uint64_t *)index);
> > +   index += sizeof(uint64_t);
> > +
> > +   ewah_size = ntohl(*(uint32_t *)index);
> > +   index += sizeof(uint32_t);
> 
> To answer the question you asked in your cover letter, you cannot write
> this unless you can guarantee (((uintptr_t)index & 7) == 0) is true.
> Otherwise, this will produce a SIGBUS on SPARC, Alpha, MIPS, and some
> ARM systems, and it will perform poorly on PowerPC and other ARM
> systems[0].
> 
> If you got that pointer from malloc and have only indexed multiples of 8
> on it, you're good.  But if you're not sure, you probably want to use
> memcpy.  If the compiler can determine that it's not necessary, it will
> omit the copy and perform a direct load.

I think get_be32() does exactly what we want for the ewah_size read. For
the last_update one, we don't have a get_be64() yet, but it should be
easy to make based on the 16/32 versions.

(I note also that time_t is not necessarily 64-bits in the first place,
but David said something about this not really being a time_t).

-Peff


Re: [RFC PATCH v2 00/22] Add blame to libgit

2017-05-15 Thread Junio C Hamano
Jeffrey Smith  writes:

> On Mon, May 15, 2017 at 4:24 AM, Junio C Hamano  wrote:
>> Jeff Smith  writes:
>>
>>> Rather than duplicate large portions of builtin/blame.c in cgit, it
>>> would be better to shift its core functionality into libgit.a.  The
>>> functionality left in builtin/blame.c mostly relates to terminal
>>> presentation.
>>
>> As I said in my review of 04/22, it is a bit hard/tedious to sift
>> the changes to refactoring that actually changes code and pure
>> renaming and movement of lines across files with the current
>> sequence of the series, so it is very possible that I may have
>> missed something, but from a cursory read through the series, from
>> the comparison between master and the result of applying all
>> patches, and inspection of the resulting builtin/blame.c file, I
>> think what this series does is very sensible in general.  blame.h
>> does not seem to expose anything that is not needed, which is a good
>> sign.
>>
> I did try to keep any unnecessary changes out of the big movement
> patches (17-19).  Would you prefer I break down 19 further?  I had
> been holding back because of the added churn that would introduce from
> lots of changing function visibility in blame.h and blame.c.

To be honest, the "rename symbols while moving" done starting around
04/22 made the series too noisy to read and I had hard time
reviewing the later ones like 17-19.  I think they share exactly the
same issue as 04/22, e.g. 17/22 renames origin_decref to
blame_origin_decref while moving the function and some of its
callers across files.

>
> (And if gmail would quit trying to make my messages 'rich text' --
> a.k.a. HTML -- that would be great...)
>
>


Re: [PATCH v1 2/5] Teach git to optionally utilize a file system monitor to speed up detecting new or changed files.

2017-05-15 Thread brian m. carlson
On Mon, May 15, 2017 at 03:13:44PM -0400, Ben Peart wrote:
> + istate->last_update = (time_t)ntohll(*(uint64_t *)index);
> + index += sizeof(uint64_t);
> +
> + ewah_size = ntohl(*(uint32_t *)index);
> + index += sizeof(uint32_t);

To answer the question you asked in your cover letter, you cannot write
this unless you can guarantee (((uintptr_t)index & 7) == 0) is true.
Otherwise, this will produce a SIGBUS on SPARC, Alpha, MIPS, and some
ARM systems, and it will perform poorly on PowerPC and other ARM
systems[0].

If you got that pointer from malloc and have only indexed multiples of 8
on it, you're good.  But if you're not sure, you probably want to use
memcpy.  If the compiler can determine that it's not necessary, it will
omit the copy and perform a direct load.

[0] To be technically correct, all of those systems except SPARC can
have unaligned access fixed up automatically, depending on the kernel
settings.  But such a fixup involves taking a trap into the kernel,
performing two aligned loads and bit shifting, and returning to
userspace, which performs about as well as you'd expect.  For that
reason, Debian build machines have such fixups turned off and will just
SIGBUS.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH 14/19] diff.c: convert word diffing to use emit_line_*

2017-05-15 Thread Stefan Beller
On Mon, May 15, 2017 at 3:40 PM, Jonathan Tan  wrote:
>
> I suspect that this will need to be refactored more thoroughly. Here, for
> example, emit_line (which prints the prefix) is printed nearly
> unconditionally, whereas in the original version, "fputs(line_prefix, fp)"
> is only printed when "print" is true.

Yes, manual testing confirms this is the case. Maybe I should to add a test
for "git diff --word-diff --line-prefix"

Thanks,
Stefan


Re: [PATCHv3 4/4] clone: use free_refspec() to free refspec list

2017-05-15 Thread Jeff King
On Mon, May 15, 2017 at 01:29:07PM +0200, SZEDER Gábor wrote:

> On Mon, May 15, 2017 at 1:05 PM, SZEDER Gábor  wrote:
> > From: Jeff King 
> >
> > Using free() on a refspec was always leaky, as its string
> > fields also need freed. But it became more so when ad00f128d
> > (clone: respect configured fetch respecs during initial
> > fetch, 2016-03-30) taught clone to create a list of
> > refspecs, each of which need to be freed.
> >
> > [sg: adjusted the function parameters.]
> >
> > Signed-off-by: Jeff King 
> > Signed-off-by: SZEDER Gábor 
> > ---
> >  builtin/clone.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/builtin/clone.c b/builtin/clone.c
> > index 4144190da..4bf28d7f5 100644
> > --- a/builtin/clone.c
> > +++ b/builtin/clone.c
> > @@ -1120,6 +1120,6 @@ int cmd_clone(int argc, const char **argv, const char 
> > *prefix)
> > strbuf_release();
> > junk_mode = JUNK_LEAVE_ALL;
> >
> > -   free(refspec);
> > +   free_refspec(remote->fetch_refspec_nr + 1, remote->fetch);
> > return err;
> >  }
> 
> Erm...  I should have given a bit more thought to this last patch,
> shouldn't I?
> 
> First, the unchanged commit message is now (i.e. by using the parsed
> refspecs returned by remote_get()) completely outdated.
> Second, while it properly frees those refspecs, i.e. the array and all
> its string fields, it will now leak the memory pointed by the
> 'refspec' variable.  However, why free just that one field of the
> 'struct *remote'?  Alas, we don't seem to have a free_remote()
> function...

Right, the "struct remote" fields are meant to last the program
lifetime, and just go away when the program does. So they are never
freed. We could just lump this in with them, and remove the free()
entirely.

It's a little funny with your patch as-is, because "struct remote"
doesn't actually know about the refspec we stuck on the end. But since
it doesn't free its refspecs anyway, it's not any more leaky than all
the other entries.

-Peff


Re: [PATCH] pull: optionally rebase submodules

2017-05-15 Thread Brandon Williams
On 05/11, Stefan Beller wrote:
> On Thu, May 11, 2017 at 10:24 AM, Brandon Williams  wrote:
> > Teach pull to optionally update submodules when '--recurse-submodules'
> > is provided.  This will teach pull to run 'submodule update --rebase'
> > when the '--recurse-submodules' and '--rebase' flags are given.
> >
> > Signed-off-by: Brandon Williams 
> > ---
> >
> > Pull is already a shortcut for running fetch followed by a merge/rebase, so 
> > why
> > not have it be a shortcut for running 'submodule update --rebase' when the
> > recurse flag is given!
> 
> I have not thought about the implications of this shortcut, as opposed to
> actually implementing it in C (which presumably would contain more checks).
> Will do.

Well this would be a short up until we actually implement recursion in
merge and rebase.  For rebase we may want to wait until its completely
ported to C since that effort is still underway.  Alternatively we can avoid
this shortcut and wait until rebase is finished being ported.

> 
> >
> >  builtin/pull.c| 30 ++-
> >  t/t5572-pull-submodule.sh | 97 
> > +++
> >  2 files changed, 125 insertions(+), 2 deletions(-)
> >
> > diff --git a/builtin/pull.c b/builtin/pull.c
> > index dd1a4a94e..d73d654e6 100644
> > --- a/builtin/pull.c
> > +++ b/builtin/pull.c
> > @@ -77,6 +77,7 @@ static const char * const pull_usage[] = {
> >  /* Shared options */
> >  static int opt_verbosity;
> >  static char *opt_progress;
> > +static int recurse_submodules;
> >
> >  /* Options passed to git-merge or git-rebase */
> >  static enum rebase_type opt_rebase = -1;
> > @@ -532,6 +533,17 @@ static int pull_into_void(const struct object_id 
> > *merge_head,
> > return 0;
> >  }
> >
> > +static int  update_submodules(void)
> 
> Maybe s/update_submodules/rebase_submodules/ ?
> 
> > +{
> > +   struct child_process cp = CHILD_PROCESS_INIT;
> > +   cp.git_cmd = 1;
> > +
> > +   argv_array_pushl(, "submodule", "update", "--recursive", 
> > NULL);
> > +   argv_array_push(, "--rebase");
> 
> The --rebase could be part of the _pushl ?
> Also we could set
> no_stdin = 1
> we do need stdout/err though.

can do.

> 
> 
> > +
> > +   return run_command();
> > +}
> > +
> >  /**
> >   * Runs git-merge, returning its exit status.
> >   */
> > @@ -816,6 +828,14 @@ int cmd_pull(int argc, const char **argv, const char 
> > *prefix)
> > oidclr(_fork_point);
> > }
> >
> > +   if (opt_recurse_submodules &&
> > +   !strcmp(opt_recurse_submodules, "--recurse-submodules")) {
> 
> So this checks if we pass --recurse-submodules to fetch and if it is not
> a on-demand/no.
> Maybe we'd want to use the same infrastructure as fetch does, such that
> parse_fetch_recurse makes the decision. (Then "--recurse-submodules=TrUe"
> would work as well, IIUC)

Agreed, it may be better to actually parse the switch properly.

> 
> 
> > +   recurse_submodules = 1;
> > +
> > +   if (!opt_rebase)
> > +   die(_("--recurse-submodules is only valid with 
> > --rebase"));
> 
> I wonder if there are existing users of "git pull --recurse --merge";
> as of now this would fetch the submodules (on-demand) and merge
> in the local commits of the superprojects. It sounds like a useful workflow
> which we'd be blocking here? Maybe just do nothing in case of !opt_rebase,
> i.e. make it part of the first condition added in this hunk?
> 
> > +   ret = run_rebase(_head, merge_heads.oid, 
> > _fork_point);
> > +
> > +   if (!ret && recurse_submodules)
> > +   ret = update_submodules();
> 
> Instead of doing the rebase of submodules here, we may just want to pass
> 'recurse_submodules' into run_rebase, which can do it, too. (It already has
> a 'ret' value, so the main cmd is not as cluttered.
> 
> ---
> Before reviewing the tests, let's step a bit back and talk about the design
> and what is useful to the user. From reading the code, we
>   1) perform a fetch in the superproject
>   2) rebase the superproject (not rewriting any submodule pointers,
> but that may be ok for now)
>   3) sequentially:
>   3a) fetch each submodule on demand
>   3b) run rebase inside of it.
> 
> 
> (A) On the sequence:
> If in a normal pull --rebase we have a failure, we can fixup the failure
> and then continue via "git rebase --continue"; now if we have a failure
> in 3), we would need to fixup the submodule ourselves and then as
> we lost all progress in the superproject, rerun "pull --rebase --recurse"?

Yeah this may not have the best workflow...

> 
> (B)
> As discussed offline this produces bad results if we have a non-ff
> operation in the superproject, that also has submodule pointer updates.
> So additionally we would need to walk the superprojects local commits
> and check if any submodule is touched.
> 
> 
> >
> > 

Re: [PATCHv3 1/4] clone: respect additional configured fetch refspecs during initial fetch

2017-05-15 Thread Jeff King
On Mon, May 15, 2017 at 01:05:54PM +0200, SZEDER Gábor wrote:

> The initial fetch during a clone doesn't transfer refs matching
> additional fetch refspecs given on the command line as configuration
> variables.  This contradicts to the documentation stating that

Minor gramm-o: s/to the/the/

> @@ -989,6 +994,10 @@ int cmd_clone(int argc, const char **argv, const char 
> *prefix)
>   strbuf_reset();
>  
>   remote = remote_get(option_origin);
> + REALLOC_ARRAY(remote->fetch, remote->fetch_refspec_nr + 1);
> + memcpy(remote->fetch+remote->fetch_refspec_nr, refspec,
> +sizeof(*refspec));

Here we append to remote->fetch. We are assuming then that
remote->fetch_refspec has already been parsed into remote->fetch. Which
I think it always is by remote_get(), but given that it lazy-parses in
some cases, it feels a little dangerous.

I also notice that you don't touch remote->fetch_refspec_nr, nor
fetch_refspec_alloc. So the remote struct doesn't actually know about
this entry.  It would probably be wrong if you _did_ update them,
because remote->fetch_refspec (the list of refspec strings) would not
have a matching entry, and would potentially access uninitialized
memory.

I think the whole thing would be a lot less messy if "struct remote" let
you add a new refspec (as a string) after the initial parse, and it
would handle the details. Just making the existing add_fetch_refspec()
public isn't quite enough, because you'd need to invalidate and re-parse
the matching "fetch" array, too. Something like:

diff --git a/remote.c b/remote.c
index 9c8912ab1..0881ed32c 100644
--- a/remote.c
+++ b/remote.c
@@ -2319,3 +2319,17 @@ void apply_push_cas(struct push_cas_option *cas,
for (ref = remote_refs; ref; ref = ref->next)
apply_cas(cas, remote, ref);
 }
+
+void remote_add_fetch_refspec(struct remote *remote, const char *refspec)
+{
+   add_fetch_refspec(remote, refspec);
+   if (remote->fetch) {
+   struct refspec *parsed;
+
+   parsed = parse_refspec_internal(1, , 1, 0);
+   REALLOC_ARRAY(remote->fetch, remote->fetch_refspec_nr);
+   remote->fetch[remote->fetch_refspec_nr - 1] = *parsed;
+   /* Not free_refspec, as we copied its pointers above */
+   free(parsed);
+   }
+}

That feels a bit dirty, too, but at least it's not reaching across
module boundaries. I think the cleanest thing would be to actually add
it to the config before calling remote_get().

I think in the earlier discussion you mentioned there are some ordering
problems with writing out the new on-disk config. But could we add it to
the temporary environment, like:

  strbuf_addf(, "remote.%s.fetch=%s", option_origin, refspec_pattern);
  git_config_push_parameter(key.buf);

?

Come to think of it, though, I thought the reason we weren't using
remote_get() in the first place is that some code paths (like
single-branch) needed to actually get the remote ref list before we knew
the refspec? So how does this approach work at all? :)

I guess that doesn't apply here. We always feed the transport code with
a broad refspec, and then narrow it down later. It's only that we can't
write the final config to disk until we've computed the correct branch
based on the remote refs. gotten the branch.

If all that's correct, then I think the push_parameter() thing would
work. It does feel like a round-a-bout way to solve the problem, but
it's at least manipulating solid, public APIs.

-Peff


Re: [PATCHv3 0/4] clone: respect configured fetch respecs during initial fetch

2017-05-15 Thread Jeff King
On Mon, May 15, 2017 at 01:05:53PM +0200, SZEDER Gábor wrote:

> This is a reroll of sg/clone-refspec-from-command-line-config.
> Sorry for the delay, family visit.

No problem. Thanks for letting us know before it went to 'next'. ;)

> The first patch is the updated version of what is now the first commit
> of that topic.  The changes are those mentioned in [1]:
> 
>  - updated commit message,
>  - renamed 'refspec_count' to 'refspec_nr',

Good.

>  - use the parsed fetch refspecs returned by remote.c:remote_get()
>instead of parsing them ourselves (look at the third hunk of the
>diff of builtin/clone.c, how much shorter it looks),

Yeah, that is much nicer. It does feel a little dirty modifying
remote->fetch, though. I'll comment on the specific patch.

>  - modified tests to check that refs matching the default refspecs are
>transferred as well, and
>  - added a test for the combination of '-c
>remote..fetch= --origin='.

Sounds good.

> The second patch is a doc update to warn users that not all
> configuration variables are supported via 'git clone -c ...' at the
> moment.

Good idea.

> Patches 3 and 4 are the last two patches from Peff from this morning
> [2].  I picked those up, because his last patch required a bit of
> variable name adjustments.  I didn't pick up his first patch, because
> using remote_get() already factors out refspec parsing.

Makes sense. Thanks for including them.

-Peff


Re: git worktrees must exist even if locked

2017-05-15 Thread Junio C Hamano
"taylor, david"  writes:

> The original report was against Git v2.12.2.  I have since tried v2.12.3, 
> v2.13.0,
> and the next branch.  All exhibit the same symptoms.
>
> Even if you ignore the original scenario for creating the problem, if I do a 
> 'rm -rf' or 'mv'
> of a tree that contains within it worktrees, that should not break the use of 
> Git with
> worktrees that live elsewhere nor commands that don't require a repository.

Duy, any ideas?

Thanks.

>
>> -Original Message-
>> From: taylor, david
>> Sent: Wednesday, May 10, 2017 4:25 PM
>> To: git@vger.kernel.org
>> Subject: git worktrees must exist even if locked
>> 
>> The Git documentation in describing worktrees says that one reason
>> why you might want to lock a worktree is to prevent it from being pruned
>> if it is on a removable media that isn't currently mounted.
>> 
>> So, my expectation was that if the worktree is inaccessible (and locked), Git
>> would pretend that there is no worktree by that name.
>> 
>> In reality, if you have such a worktree, Git gets an error.
>> 
>>  On local systems, /home is local to a machine; home directories are
>> elsewhere.
>> Home directories are NFS mounted; /home is not.
>> 
>> . create a repository in /my/home/dir/my-repo.git with
>> 
>> git clone --bare 
>> 
>> . create an empty directory /home/somedir/worktree-tests
>> 
>> . use 'git worktree add' to add /home/somedir/worktree-tests/> name>
>>   as a worktree on branch .  It gets populated with the correct
>>   content.
>> 
>> . lock it using'git worktree lock'
>> 
>> So far, so good.  Now, go to a different computer -- one on which
>> /home/somedir/worktree-tests does not exist (and therefore
>> /home/somedir/worktree-tests/ does not exist).
>> 
>> . cd /my/home/dir/my-repo.git
>> 
>> Now, try issuing Git commands.  Many will fail.
>> 
>>   git fetch ==> fails:
>> 
>>   fatal: Invalid path '/home/somedir/worktree-tests': No such file or 
>> directory
>> 
>>   git status ==> fails -- same error as above
>>   git help worktree ==> fails -- same error as above


Re: [PATCH 19/19] diff.c: color moved lines differently

2017-05-15 Thread Brandon Williams
On 05/13, Stefan Beller wrote:
> When there is a lot of code moved around such as in 11979b9 (2005-11-18,
> "http.c: reorder to avoid compilation failure.") for example, the review
> process is quite hard, as it is not mentally challenging.  It is a rather
> tedious process, that gets boring quickly. However you still need to read
> through all of the code to make sure the moved lines are there as supposed.
> 
> While it is trivial to color up a patch like the following
> 
> $ git diff
> diff --git a/file2.c b/file2.c
> index 9163a0f..8e66dc0 100644
> --- a/file2.c
> +++ b/file2.c
> @@ -3,13 +3,6 @@ void *xmemdupz(const void *data, size_t len)
> return memcpy(xmallocz(len), data, len);
>  }
> 
> -int secure_foo(struct user *u)
> -{
> -   if (!u->is_allowed_foo)
> -   return;
> -   foo(u);
> -}
> -
>  char *xstrndup(const char *str, size_t len)
>  {
> char *p = memchr(str, '\0', len);
> diff --git a/test.c b/test.c
> index a95e6fe..81eb0eb 100644
> --- a/test.c
> +++ b/test.c
> @@ -18,6 +18,13 @@ ssize_t pread_in_full(int fd, void *buf, size_t count, 
> off_t offset)
> return total;
>  }
> 
> +int secure_foo(struct user *u)
> +{
> +   if (!u->is_allowed_foo)
> +   return;
> +   foo(u);
> +}
> +
>  int xdup(int fd)
>  {
> int ret = dup(fd);
> 
> as in this patch all lines that add or remove lines
> should be colored in the new color that indicates moved
> lines.
> 
> However the intention of this patch is to aid reviewers
> to spotting permutations in the moved code. So consider the
> following malicious move:
> 
> diff --git a/file2.c b/file2.c
> index 9163a0f..8e66dc0 100644
> --- a/file2.c
> +++ b/file2.c
> @@ -3,13 +3,6 @@ void *xmemdupz(const void *data, size_t len)
> return memcpy(xmallocz(len), data, len);
>  }
> 
> -int secure_foo(struct user *u)
> -{
> -   if (!u->is_allowed_foo)
> -   return;
> -   foo(u);
> -}
> -
>  char *xstrndup(const char *str, size_t len)
>  {
> char *p = memchr(str, '\0', len);
> diff --git a/test.c b/test.c
> index a95e6fe..a679c40 100644
> --- a/test.c
> +++ b/test.c
> @@ -18,6 +18,13 @@ ssize_t pread_in_full(int fd, void *buf, size_t count, 
> off_t offset)
> return total;
>  }
> 
> +int secure_foo(struct user *u)
> +{
> +   foo(u);
> +   if (!u->is_allowed_foo)
> +   return;
> +}
> +
>  int xdup(int fd)
>  {
> int ret = dup(fd);
> 
> If the moved code is larger, it is easier to hide some permutation in the
> code, which is why we would not want to color all lines as "moved" in this
> case. So we do not just need to color lines differently that are added and
> removed in the same diff, we need to tweak the algorithm a bit more.
> 
> As the reviewers attention should be brought to the places, where the
> difference is introduced to the moved code, we cannot just have one new
> color for all of moved code.
> 
> First I implemented an alternative design, which would show a moved hunk
> in one color, and its boundaries in another color. This idea was error
> prone as it inspected each line and its neighboring lines to determine
> if the line was (a) moved and (b) if was deep inside a hunk by having
> matching neighboring lines. This is unreliable as the we can construct
> hunks which have equal neighbors that just exceed the number of lines
> inspected. See one of the tests as an example.
> 
> Instead this provides a dynamic programming greedy algorithm that finds
> the largest moved hunk and then switches color to the alternative color
> for the next hunk. By doing this any permutation is recognized and
> displayed. That implies that there is no dedicated boundary or
> inside-hunk color, but instead we'll have just two colors alternating
> for hunks.
> 
> It would be a bit more UX friendly if the two corresponding hunks
> (of added and deleted lines) for one move would get the same color id.
> (Both get "regular moved" or "alternative moved"). This problem is
> deferred to a later patch for now.
> 
> Algorithm-by: Jonathan Tan 
> Signed-off-by: Stefan Beller 
> ---
>  Documentation/config.txt   |  12 +-
>  diff.c | 265 
> ++---
>  diff.h |  21 +++-
>  t/t4015-diff-whitespace.sh | 229 +++
>  4 files changed, 506 insertions(+), 21 deletions(-)
> 
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 475e874d51..90403c06e3 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1051,14 +1051,22 @@ This does not affect linkgit:git-format-patch[1] or 
> the
>  

Re: [PATCH 14/19] diff.c: convert word diffing to use emit_line_*

2017-05-15 Thread Jonathan Tan

On 05/13/2017 09:01 PM, Stefan Beller wrote:

In a later patch, I want to propose an option to detect
moved lines in a diff, which cannot be done in a one-pass over
the diff. Instead we need to go over the whole diff twice,
because we cannot detect the first line of the two corresponding
lines (+ and -) that got moved.

So to prepare the diff machinery for two pass algorithms
(i.e. buffer it all up and then operate on the result),
move all emissions to places, such that the only emitting
function is emit_line_0.

This covers all code related to diffing words.

Signed-off-by: Stefan Beller 
---
 diff.c | 56 
 1 file changed, 28 insertions(+), 28 deletions(-)

diff --git a/diff.c b/diff.c
index 91dc045a45..07041a35ad 100644
--- a/diff.c
+++ b/diff.c
@@ -886,37 +886,38 @@ struct diff_words_data {
struct diff_words_style *style;
 };

-static int fn_out_diff_words_write_helper(FILE *fp,
+static int fn_out_diff_words_write_helper(struct diff_options *o,
  struct diff_words_style_elem *st_el,
  const char *newline,
  size_t count, const char *buf,
  const char *line_prefix)
 {
-   int print = 0;
+   struct strbuf sb = STRBUF_INIT;

while (count) {
char *p = memchr(buf, '\n', count);
-   if (print)
-   fputs(line_prefix, fp);
+
if (p != buf) {
-   if (st_el->color && fputs(st_el->color, fp) < 0)
-   return -1;
-   if (fputs(st_el->prefix, fp) < 0 ||
-   fwrite(buf, p ? p - buf : count, 1, fp) != 1 ||
-   fputs(st_el->suffix, fp) < 0)
-   return -1;
-   if (st_el->color && *st_el->color
-   && fputs(GIT_COLOR_RESET, fp) < 0)
-   return -1;
+   if (st_el->color)
+   strbuf_addstr(, st_el->color);
+   strbuf_addstr(, st_el->prefix);
+   strbuf_add(, buf, p ? p - buf : count);
+   strbuf_addstr(, st_el->suffix);
+   if (st_el->color && *st_el->color)
+   strbuf_addstr(, GIT_COLOR_RESET);
}
if (!p)
-   return 0;
-   if (fputs(newline, fp) < 0)
-   return -1;
+   goto out;
+   strbuf_addstr(, newline);
+   emit_line(o, NULL, NULL, sb.buf, sb.len);


I suspect that this will need to be refactored more thoroughly. Here, 
for example, emit_line (which prints the prefix) is printed nearly 
unconditionally, whereas in the original version, "fputs(line_prefix, 
fp)" is only printed when "print" is true.



+   strbuf_reset();
count -= p + 1 - buf;
buf = p + 1;
-   print = 1;
}
+out:
+   if (sb.len)
+   emit_line(o, NULL, NULL, sb.buf, sb.len);
+   strbuf_release();
return 0;
 }

@@ -994,25 +995,25 @@ static void fn_out_diff_words_aux(void *priv, char *line, 
unsigned long len)
} else
plus_begin = plus_end = diff_words->plus.orig[plus_first].end;

-   if (color_words_output_graph_prefix(diff_words)) {
-   fputs(line_prefix, diff_words->opt->file);
-   }
+   if (color_words_output_graph_prefix(diff_words))
+   emit_line(diff_words->opt, NULL, NULL, "", 0);
+
if (diff_words->current_plus != plus_begin) {
-   fn_out_diff_words_write_helper(diff_words->opt->file,
+   fn_out_diff_words_write_helper(diff_words->opt,
>ctx, style->newline,
plus_begin - diff_words->current_plus,
diff_words->current_plus, line_prefix);
if (*(plus_begin - 1) == '\n')
-   fputs(line_prefix, diff_words->opt->file);
+   emit_line(diff_words->opt, NULL, NULL, "", 0);
}
if (minus_begin != minus_end) {
-   fn_out_diff_words_write_helper(diff_words->opt->file,
+   fn_out_diff_words_write_helper(diff_words->opt,
>old, style->newline,
minus_end - minus_begin, minus_begin,
line_prefix);
}
if (plus_begin != plus_end) {
-   fn_out_diff_words_write_helper(diff_words->opt->file,
+   fn_out_diff_words_write_helper(diff_words->opt,
>new, style->newline,
plus_end - plus_begin, plus_begin,

Re: t5400 failure on Windows

2017-05-15 Thread Jeff King
On Mon, May 15, 2017 at 10:05:29PM +0200, Johannes Sixt wrote:

> I observe the following failure on Windws with origin/next, in
> t5400-send-pack.sh
> 
> +++ diff -u expect refs
> --- expect  Mon May 15 06:54:59 2017
> +++ refsMon May 15 06:54:59 2017
> @@ -1,4 +1,3 @@
>  5285e1710002a06a379056b0d21357a999f3c642 refs/heads/master
> -5285e1710002a06a379056b0d21357a999f3c642 refs/remotes/origin/HEAD
>  5285e1710002a06a379056b0d21357a999f3c642 refs/remotes/origin/master
>  53d9066ca10f2e103b33caf3a16a723553c33b00 .have
> error: last command exited with $?=1
> not ok 16 - receive-pack de-dupes .have lines

Interesting that it's the symref. I wonder if that is important (though
they should always be implemented as a real symref file and not a
symlink, so I don't think FS limitations would come into play).

For sanity, you might want to double-check that the "shared-fork"
repository has that symref on disk, and that for-each-ref reports it.
But...

> The trace file looks like this:
> 
>  trace 
> packet: receive-pack> 5285e1710002a06a379056b0d21357a999f3c642 
> refs/heads/master\0report-status delete-refs side-band-64k quiet atomic 
> ofs-delta agent=git/2.13.0.497.g5af12421b0
> packet: receive-pack> 5285e1710002a06a379056b0d21357a999f3c642 
> refs/remotes/origin/HEAD

Wait, there it is in the receive-pack output. But if we look at what
"push" receives:

> packet: push< 5285e1710002a06a379056b0d21357a999f3c642 
> refs/heads/master\0report-status delete-refs side-band-64k quiet atomic 
> ofs-delta agent=git/2.13.0.497.g5af12421b0
> packet: receive-pack> 5285e1710002a06a379056b0d21357a999f3c642 
> refs/remotes/origin/master
> packet: push< 5285e1710002a06a379056b0d21357a999f3c642 
> refs/remotes/origin/master

It's skipped! So either:

  1. receive-pack wrote the trace line without sending it (which seems
 an unlikely bug to differ between platforms)

  2. push some didn't receive it (which makes no sense; this is a stream
 socket and it was able to parse the next pktline)

  3. push did get it but for some reason didn't write the trace (also
 unlikely as with option 1)

  4. There is something racy and unportable about both programs writing
 to the same trace file. It's opened with O_APPEND, which means that
 each write should atomically position the pointer at the end of the
 file. Is it possible there's a problem with that in the way
 O_APPEND works on Windows?

 If that was the case, though, I'd generally expect a sheared write
 or a partial overwrite. The two origin/HEAD lines from the two
 programs are the exact same length, but I'd find it more likely for
 the overwrite to happen with one of the follow-on lines.

So all of those sound kind of implausible. If you run the test over and
over, does it happen consistently, and is it always the same line?

-Peff


Re: [PATCH 05/19] diff.c: emit_line_0 can handle no color setting

2017-05-15 Thread Stefan Beller
On Mon, May 15, 2017 at 11:31 AM, Jonathan Tan  wrote:
> On 05/13/2017 09:01 PM, Stefan Beller wrote:
>>
>> In a later patch, I want to propose an option to detect
>> moved lines in a diff, which cannot be done in a one-pass over
>> the diff. Instead we need to go over the whole diff twice,
>> because we cannot detect the first line of the two corresponding
>> lines (+ and -) that got moved.
>>
>> So to prepare the diff machinery for two pass algorithms
>> (i.e. buffer it all up and then operate on the result),
>> move all emissions to places, such that the only emitting
>> function is emit_line_0.
>>
>> In later patches we may pass lines that are not colored to
>> the central function emit_line_0, so we
>> need to emit the color only when it is non-NULL.
>
>
> The diff below seems to just make emit_line_0 allow NULL for set and reset,
> unlike what the commit message above describes. (And is that necessary?
> Couldn't the caller just pass "" if they don't want any setting and/or
> resetting?)
>

They could just give ""; but instead of having an empty system
call I thought about this short cut.

I'll reword the commit message.

Thanks,
Stefan


Re: [PATCH v7] fetch-pack: always allow fetching of literal SHA1s

2017-05-15 Thread Jeff King
On Mon, May 15, 2017 at 10:32:20AM -0700, Jonathan Tan wrote:

> OK, one combined function (for lazy initialization and checking
> containment in the oidset) with comment it is.
> 
> Change from v6: changed back to "tip_oids_contain", and included Peff's
> comment.

Thanks, looks good to me.

-Peff


Re: Git 2.13.0 segfaults on Solaris SPARC due to DC_SHA1=YesPlease being on by default

2017-05-15 Thread Jeff King
On Mon, May 15, 2017 at 04:13:58PM +0200, Ævar Arnfjörð Bjarmason wrote:

> On Mon, May 15, 2017 at 3:58 PM, Marc Stevens  wrote:
> > Hi Aevar,
> >
> > Thank you for notifying us of this issue.
> > Big endianness is a tricky issue, also since I don't have access or 
> > accurate knowledge about all big endian systems.
> > Our github repo does check correct functioning, including an endianness 
> > mistake, with 'make test'.
> > But I guess this is not included for SHA1DC in Git.
> >
> > Anyway, we can easily add the _BIG_ENDIAN macrotest to the git repo and 
> > will do so soon.
> >
> > I don't think the segfault is caused by buffer overflow, inproper access, 
> > or the endianness issue.
> > But I did notice an unexpected issue: the message block pointer m=0x398ad5 
> > is odd.
> > Can you confirm whether loading an uint32_t from an odd address triggers a 
> > hardware interrupt on your platform?
> > This is not problem for x86, but maybe for your platform it is?
> > If it is then we should always copy buffer contents to the sha1context to 
> > avoid this issue.
> 
> I don't have access to the box in question, Michael was testing this
> code for me. But unaligned access is probably the cause, although
> according to some info I found online that should give a SIGBUS not a
> SIGSEGV, but that may have changed:

Yeah, I would have expected SIGBUS there. If we have alignment issues,
though, I'd expect that ARM systems will experience problems.

Block-sha1 uses a macro which allows unaligned loads on platforms that
support it, and otherwise does the endian conversion on the fly as we
load the bytes into a local variable (which presumably happens all
in-register). That may be faster than doing a mass copy of the buffer.

-Peff


Re: [PATCH 15/19] diff.c: convert diff_flush to use emit_line_*

2017-05-15 Thread Stefan Beller
On Mon, May 15, 2017 at 1:21 PM, Jonathan Tan  wrote:
> On 05/13/2017 09:01 PM, Stefan Beller wrote:
>>
>> In a later patch, I want to propose an option to detect
>> moved lines in a diff, which cannot be done in a one-pass over
>> the diff. Instead we need to go over the whole diff twice,
>> because we cannot detect the first line of the two corresponding
>> lines (+ and -) that got moved.
>>
>> So to prepare the diff machinery for two pass algorithms
>> (i.e. buffer it all up and then operate on the result),
>> move all emissions to places, such that the only emitting
>> function is emit_line_0.
>>
>> This covers diff_flush.
>>
>> Signed-off-by: Stefan Beller 
>> ---
>>  diff.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/diff.c b/diff.c
>> index 07041a35ad..386b28cf47 100644
>> --- a/diff.c
>> +++ b/diff.c
>> @@ -4873,7 +4873,9 @@ void diff_flush(struct diff_options *options)
>>   term, 1);
>> if (options->stat_sep) {
>> /* attach patch instead of inline */
>> -   fputs(options->stat_sep, options->file);
>> +   emit_line(options, NULL, NULL,
>> + options->stat_sep,
>> + strlen(options->stat_sep));
>
>
> Same comment as patch 10 - is it OK that we now output a prefix too?

In this case, I would think it is ok. The stat_sep is only used in
"format-patch --attach" for example, which makes NO sense to
use in combination with --line-prefix.

(That is already broken; at least the line-prefix part, as we
do *not* prefix all the lines with the given prefix. That is because
stat_sep is a multiline string emitted, starting with '\n'.)

Thanks,
Stefan


Re: clone vs submodule operation with HTTP cURL

2017-05-15 Thread Jean-Francois Bouchard
Hello,

Follow-up to this post.

I seems that setting GIT_HTTP_PROXY_AUTHMETHOD=anyauth is a workaround
to this issue.

So now the issue is : git is not setting AUTHMETHOD correctly when
doing a submodule update

Note that a user is needed in the URL string. :@ does not work,
empty.auth variable does not work.

Any ideas ?

Thanks,
JF

On Thu, May 11, 2017 at 6:04 PM, Jean-Francois Bouchard
 wrote:
> Hello everyone,
>
> In our usage of GSSAPI via HTTPS, all our operation with git are very
> well handle, however, when trying to update a submodule, git seems to
> be managing cURL differently. cURL drop the ball quickly.
>
> Example (No other setup needed on the client) :
> git clone HTTPrepo -> GET -> 401 -> GET -> 401 -> GET (this time with
> Authorization: Negotiate)  -> 200 OK
>
> git submodule update -> GET -> 401 -> git prompt for username
>
> Is the codepath for clone regarding cURL is different than with submodule ?
>
> Using : 2.13.0, I have also tried the emptyAuth stuff with no avail.
>
> Thanks,
> JF

-- 


Avis de confidentialité

Les informations contenues dans le présent message et dans toute pièce qui 
lui est jointe sont confidentielles et peuvent être protégées par le secret 
professionnel. Ces informations sont à l’usage exclusif de son ou de ses 
destinataires. Si vous recevez ce message par erreur, veuillez s’il vous 
plait communiquer immédiatement avec l’expéditeur et en détruire tout 
exemplaire. De plus, il vous est strictement interdit de le divulguer, de 
le distribuer ou de le reproduire sans l’autorisation de l’expéditeur. 
Merci.

Confidentiality notice

This e-mail message and any attachment hereto contain confidential 
information which may be privileged and which is intended for the exclusive 
use of its addressee(s). If you receive this message in error, please 
inform sender immediately and destroy any copy thereof. Furthermore, any 
disclosure, distribution or copying of this message and/or any attachment 
hereto without the consent of the sender is strictly prohibited. Thank you.


RE: [PATCH v1 2/5] Teach git to optionally utilize a file system monitor to speed up detecting new or changed files.

2017-05-15 Thread David Turner

> -Original Message-
> From: Ben Peart [mailto:peart...@gmail.com]
> Sent: Monday, May 15, 2017 3:14 PM
> To: git@vger.kernel.org
> Cc: gits...@pobox.com; benpe...@microsoft.com; pclo...@gmail.com;
> johannes.schinde...@gmx.de; David Turner ;
> p...@peff.net
> Subject: [PATCH v1 2/5] Teach git to optionally utilize a file system monitor 
> to
> speed up detecting new or changed files.

> @@ -342,6 +344,8 @@ struct index_state {
>   struct hashmap dir_hash;
>   unsigned char sha1[20];
>   struct untracked_cache *untracked;
> + time_t last_update;
> + struct ewah_bitmap *bitmap;

The name 'bitmap' doesn't tell the reader much about what it used for.

> +static int update_istate(const char *name, void *is) {

Rename to mark_file_dirty?  Also why does it take a void pointer?  Or return 
int (rather than void)?

> +void refresh_by_fsmonitor(struct index_state *istate) {
> + static has_run_once = FALSE;
> + struct strbuf buffer = STRBUF_INIT;

Rename to query_result? Also I think you're leaking it.



Re: Is it possible to use new conditional includes outside of git dir?

2017-05-15 Thread Ævar Arnfjörð Bjarmason
On Mon, May 15, 2017 at 8:09 PM, Mihails Strasuns
 wrote:
> I was very excited to try out new conditional include feature but have
> quickly found out that it doesn't work with some of my custom scripts
> because of the following behaviour:
>
> # .gitconfig
> [includeIf "gitdir:~/work/"]
> path = ~/.gitconfig.work
> [user]
> name1 = Someone
>
> # .gitconfig.work
> [user]
> name2 = SomeoneElse
> $ cd ~/work; git config user.name1 # "Someone"
> $ cd ~/work ; git config user.name2 # 
>
> I see that it matches current `git config` documentation - indeed, it
> mentions only matching on git directory and not arbitrary file path. I
> wonder though, if this limitation is intentional and what is the
> rationale behind it?

It's intentional, but in the "that's how this works" sense, not "we'll
never add what you just asked for".

This came up on list last week:
https://public-inbox.org/git/f55dc360-9c1e-45b9-b8ba-39e1001bd...@gmail.com/T/#u

So I'd like to ask you the same question I asked Raphael in that thread.


Re: Git just passed Subversion on openhub.net

2017-05-15 Thread Ævar Arnfjörð Bjarmason
On Mon, May 15, 2017 at 5:54 PM, Øyvind A. Holm  wrote:
> openhub.net has a comparion of the number of public repositories on the
> net, based on searching public hosting services on the net. Git just
> passed Subversion after the number of Git repositories has exploded
> lately. It seems as lots of new repositories were created after cpython
> changed to Git in February.
>
> I've been tracking the development on
>  since 2014-08, and all
> the data since then are availble on
>
>   https://gitlab.com/sunny256/openhub-repositories
> and
>   https://github.com/sunny256/openhub-repositories
>
> Current status:

Thanks, that's really interesting!

>  https://gitlab.com/sunny256/openhub-repositories/blob/master/status.txt

ITYM: https://github.com/sunny256/openhub-repositories/blob/master/status.txt

> SVG graphs are available on
>
>   https://gitlab.com/sunny256/openhub-repositories/tree/master/graph

ITYM: 
https://github.com/sunny256/openhub-repositories/blob/master/graph/repos.svg

> Regards,
> Øyvind
>
> +-| Øyvind A. Holm  - N 60.37604° E 5.9° |-+
> | OpenPGP: 0xFB0CBEE894A506E5 - http://www.sunbase.org/pubkey.asc |
> | Fingerprint: A006 05D6 E676 B319 55E2  E77E FB0C BEE8 94A5 06E5 |
> +| 27e0042e-3985-11e7-b3fc-db5caa6d21d3 |-+


[GSoC] Update: Week 0 (Community Bonding Period)

2017-05-15 Thread Prathamesh Chavan
SUMMARY OF MY PROJECT:

Git submodule subcommands are currently implemented by using shell script
'git-submodule.sh'. There are several reasons why we'll prefer not to
use the shell script. My project intends to convert the subcommands into
C code, thus making them builtins. This will increase Git's portability
and hence the efficiency of working with the git-submodule commands.
Link to the complete proposal: [1]

Mentors:
Stefan Beller 
Christian Couder 

UPDATES:

As proposed, till 15th May I mostly spend my time updating my information
about the codebase.

Along with this, I even spend some time working on the patch:
submodule: port subcommand foreach from shell to C[2]
I even updated my patch by sending v3, and adding a test[3] to
the existing ones. Currently, the patch passes all the test,
except the newly added one.

PLAN FOR WEEK-1:

As proposed, I would be starting to code officially from 16th May
(tomorrow), and hence have plans for working as follows:

Improvise submodule: port subcommand foreach from shell to C patch,
as suggested and also work on getting the newly added test passed.
I have recently received review from Brandon Williams 
and my mentor Stefan Beller  which I am really
thankful for, and wish to implement these suggestions as soon as
possible.

Adding to this, I am also waiting for the community's review on the
issue regarding the $path variable, which I added along with the
patch[2]. This issue is also highlighted by the new test[3] added.
Any suggestion would be of huge help to work further on getting
all the tests clear.

Along with this, I plan to port submodule subcommand status.
This required first porting of the function set_name_rev from
shell to C, and then later port the subcommand.

[1]: 
https://docs.google.com/document/d/1krxVLooWl--75Pot3dazhfygR3wCUUWZWzTXtK1L-xU/edit
[2]: https://public-inbox.org/git/20170512114404.10008-2-pc44...@gmail.com/
[3]: https://public-inbox.org/git/20170512114404.10008-1-pc44...@gmail.com/


Re: [PATCH 15/19] diff.c: convert diff_flush to use emit_line_*

2017-05-15 Thread Jonathan Tan

On 05/13/2017 09:01 PM, Stefan Beller wrote:

In a later patch, I want to propose an option to detect
moved lines in a diff, which cannot be done in a one-pass over
the diff. Instead we need to go over the whole diff twice,
because we cannot detect the first line of the two corresponding
lines (+ and -) that got moved.

So to prepare the diff machinery for two pass algorithms
(i.e. buffer it all up and then operate on the result),
move all emissions to places, such that the only emitting
function is emit_line_0.

This covers diff_flush.

Signed-off-by: Stefan Beller 
---
 diff.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index 07041a35ad..386b28cf47 100644
--- a/diff.c
+++ b/diff.c
@@ -4873,7 +4873,9 @@ void diff_flush(struct diff_options *options)
  term, 1);
if (options->stat_sep) {
/* attach patch instead of inline */
-   fputs(options->stat_sep, options->file);
+   emit_line(options, NULL, NULL,
+ options->stat_sep,
+ strlen(options->stat_sep));


Same comment as patch 10 - is it OK that we now output a prefix too?


}
}




Re: [PATCH v1 5/5] Add a sample query-fsmonitor hook script that integrates with the cross platform Watchman file watching service.

2017-05-15 Thread Ben Peart



On 5/15/2017 3:50 PM, David Turner wrote:



-Original Message-
From: Ben Peart [mailto:peart...@gmail.com]
Sent: Monday, May 15, 2017 3:14 PM
To: git@vger.kernel.org
Cc: gits...@pobox.com; benpe...@microsoft.com; pclo...@gmail.com;
johannes.schinde...@gmx.de; David Turner ;
p...@peff.net
Subject: [PATCH v1 5/5] Add a sample query-fsmonitor hook script that
integrates with the cross platform Watchman file watching service.

To use the script:

Download and install Watchman from https://facebook.github.io/watchman/
and instruct Watchman to watch your working directory for changes
('watchman watch-project /usr/src/git').

Rename the sample integration hook from query-fsmonitor.sample to query-
fsmonitor.

Configure git to use the extension ('git config core.fsmonitor true') and
optionally turn on the untracked cache for optimal performance ('git config
core.untrackedcache true').

Signed-off-by: Ben Peart 
Signed-off-by: Johannes Schindelin 
---
  templates/hooks--query-fsmonitor.sample | 27
+++
  1 file changed, 27 insertions(+)
  create mode 100644 templates/hooks--query-fsmonitor.sample

diff --git a/templates/hooks--query-fsmonitor.sample b/templates/hooks--
query-fsmonitor.sample
new file mode 100644
index 00..4bd22f21d8
--- /dev/null
+++ b/templates/hooks--query-fsmonitor.sample
@@ -0,0 +1,27 @@
+#!/bin/sh
+#
+# An example hook script to integrate Watchman #
+(https://facebook.github.io/watchman/) with git to provide fast # git
+status.
+#
+# The hook is passed a time_t formatted as a string and outputs to #
+stdout all files that have been modified since the given time.
+# Paths must be relative to the root of the working tree and #
+separated by a single NUL.
+#
+# To enable this hook, rename this file to "query-fsmonitor"
+
+# Convert unix style paths to escaped Windows style paths case "$(uname
+-s)" in
+MINGW*|MSYS_NT*)
+  GIT_WORK_TREE="$(cygpath -aw "$PWD" | sed 's,\\,,g')"
+  ;;
+*)
+  GIT_WORK_TREE="$PWD"
+  ;;
+esac
+
+# Query Watchman for all the changes since the requested time echo
+"[\"query\", \"$GIT_WORK_TREE\", {\"since\": $1,
+\"fields\":[\"name\"]}]" | \ watchman -j | \ perl -e 'use JSON::PP; my
+$o = JSON::PP->new->utf8->decode(join("", <>)); die "Watchman: $o-

{'error'}.\nFalling back to scanning...\n" if defined($o->{"error"});

print(join("\0", @{$o->{"files"}}));'

Last time I checked, the argument to 'since' was not a time_t -- it was a
watchman clock spec.  Have you tested this?  Does it work?



Watchman also accepts a Unix time value for "since" as documented here 
(https://facebook.github.io/watchman/docs/expr/since.html).


Yes, this has been tested and works correctly as long as you have a 
recent version that contains the patch 
(https://github.com/facebook/watchman/commit/67b26a8938336f08918fc7187129b6c1a571f35b) 
that made sure it was greedy when using the Unix time.




t5400 failure on Windows

2017-05-15 Thread Johannes Sixt
I observe the following failure on Windws with origin/next, in
t5400-send-pack.sh

+++ diff -u expect refs
--- expect  Mon May 15 06:54:59 2017
+++ refsMon May 15 06:54:59 2017
@@ -1,4 +1,3 @@
 5285e1710002a06a379056b0d21357a999f3c642 refs/heads/master
-5285e1710002a06a379056b0d21357a999f3c642 refs/remotes/origin/HEAD
 5285e1710002a06a379056b0d21357a999f3c642 refs/remotes/origin/master
 53d9066ca10f2e103b33caf3a16a723553c33b00 .have
error: last command exited with $?=1
not ok 16 - receive-pack de-dupes .have lines


The trace file looks like this:

 trace 
packet: receive-pack> 5285e1710002a06a379056b0d21357a999f3c642 
refs/heads/master\0report-status delete-refs side-band-64k quiet atomic 
ofs-delta agent=git/2.13.0.497.g5af12421b0
packet: receive-pack> 5285e1710002a06a379056b0d21357a999f3c642 
refs/remotes/origin/HEAD
packet: push< 5285e1710002a06a379056b0d21357a999f3c642 
refs/heads/master\0report-status delete-refs side-band-64k quiet atomic 
ofs-delta agent=git/2.13.0.497.g5af12421b0
packet: receive-pack> 5285e1710002a06a379056b0d21357a999f3c642 
refs/remotes/origin/master
packet: push< 5285e1710002a06a379056b0d21357a999f3c642 
refs/remotes/origin/master
packet: receive-pack> 53d9066ca10f2e103b33caf3a16a723553c33b00 .have
packet: push< 53d9066ca10f2e103b33caf3a16a723553c33b00 .have
packet: receive-pack> 
packet: push< 
packet: push>  
b1a1c97e94b6388c108e195d28a3e89f00c81698 refs/heads/foo\0 report-status 
agent=git/2.13.0.497.g5af12421b0
packet: push> 
packet: receive-pack<  
b1a1c97e94b6388c108e195d28a3e89f00c81698 refs/heads/foo\0 report-status 
agent=git/2.13.0.497.g5af12421b0
packet: receive-pack< 
packet: receive-pack> unpack ok
packet: receive-pack> ok refs/heads/foo
packet: receive-pack> 
packet: push< unpack ok
packet: push< ok refs/heads/foo
packet: push< 
 end of trace 

Where should I start looking? On a Linux box, the test case does
produce an additional line

packet: push< 5285e1710002a06a379056b0d21357a999f3c642 
refs/remotes/origin/HEAD

in the trace file. I also do not see that any tests would be skipped
on Windows. (But I forgot to check whether refs/remotes/origin/HEAD
is present in any of the repositories.)

-- Hannes


RE: [PATCH v1 5/5] Add a sample query-fsmonitor hook script that integrates with the cross platform Watchman file watching service.

2017-05-15 Thread David Turner


> -Original Message-
> From: Ben Peart [mailto:peart...@gmail.com]
> Sent: Monday, May 15, 2017 3:14 PM
> To: git@vger.kernel.org
> Cc: gits...@pobox.com; benpe...@microsoft.com; pclo...@gmail.com;
> johannes.schinde...@gmx.de; David Turner ;
> p...@peff.net
> Subject: [PATCH v1 5/5] Add a sample query-fsmonitor hook script that
> integrates with the cross platform Watchman file watching service.
> 
> To use the script:
> 
> Download and install Watchman from https://facebook.github.io/watchman/
> and instruct Watchman to watch your working directory for changes
> ('watchman watch-project /usr/src/git').
> 
> Rename the sample integration hook from query-fsmonitor.sample to query-
> fsmonitor.
> 
> Configure git to use the extension ('git config core.fsmonitor true') and
> optionally turn on the untracked cache for optimal performance ('git config
> core.untrackedcache true').
> 
> Signed-off-by: Ben Peart 
> Signed-off-by: Johannes Schindelin 
> ---
>  templates/hooks--query-fsmonitor.sample | 27
> +++
>  1 file changed, 27 insertions(+)
>  create mode 100644 templates/hooks--query-fsmonitor.sample
> 
> diff --git a/templates/hooks--query-fsmonitor.sample b/templates/hooks--
> query-fsmonitor.sample
> new file mode 100644
> index 00..4bd22f21d8
> --- /dev/null
> +++ b/templates/hooks--query-fsmonitor.sample
> @@ -0,0 +1,27 @@
> +#!/bin/sh
> +#
> +# An example hook script to integrate Watchman #
> +(https://facebook.github.io/watchman/) with git to provide fast # git
> +status.
> +#
> +# The hook is passed a time_t formatted as a string and outputs to #
> +stdout all files that have been modified since the given time.
> +# Paths must be relative to the root of the working tree and #
> +separated by a single NUL.
> +#
> +# To enable this hook, rename this file to "query-fsmonitor"
> +
> +# Convert unix style paths to escaped Windows style paths case "$(uname
> +-s)" in
> +MINGW*|MSYS_NT*)
> +  GIT_WORK_TREE="$(cygpath -aw "$PWD" | sed 's,\\,,g')"
> +  ;;
> +*)
> +  GIT_WORK_TREE="$PWD"
> +  ;;
> +esac
> +
> +# Query Watchman for all the changes since the requested time echo
> +"[\"query\", \"$GIT_WORK_TREE\", {\"since\": $1,
> +\"fields\":[\"name\"]}]" | \ watchman -j | \ perl -e 'use JSON::PP; my
> +$o = JSON::PP->new->utf8->decode(join("", <>)); die "Watchman: $o-
> >{'error'}.\nFalling back to scanning...\n" if defined($o->{"error"});
> print(join("\0", @{$o->{"files"}}));'

Last time I checked, the argument to 'since' was not a time_t -- it was a 
watchman clock spec.  Have you tested this?  Does it work?



Re: [PATCH 03/19] diff.c: drop 'nofirst' from emit_line_0

2017-05-15 Thread Brandon Williams
On 05/15, Stefan Beller wrote:
> On Mon, May 15, 2017 at 12:22 PM, Brandon Williams  wrote:
> 
> > Does the order of newline/carriage return always the same?
> 
> https://en.wikipedia.org/wiki/Newline
> 
> There are operating systems that like it the other way round.
> The BBC micro is no longer relevant (IMHO), but RISC OS
> spooled text output *may* be relevant as they released a stable
> version not that long ago.
> 
> But I would think this code would have issues with RISC OS
> text spooling without this patch as well.
> 
> Stefan

Fair enough, its not relevant to the series.  I was just pointing it
out.

-- 
Brandon Williams


Re: [PATCH 03/19] diff.c: drop 'nofirst' from emit_line_0

2017-05-15 Thread Stefan Beller
On Mon, May 15, 2017 at 12:22 PM, Brandon Williams  wrote:

> Does the order of newline/carriage return always the same?

https://en.wikipedia.org/wiki/Newline

There are operating systems that like it the other way round.
The BBC micro is no longer relevant (IMHO), but RISC OS
spooled text output *may* be relevant as they released a stable
version not that long ago.

But I would think this code would have issues with RISC OS
text spooling without this patch as well.

Stefan


Re: [PATCH 06/19] diff: add emit_line_fmt

2017-05-15 Thread Brandon Williams
On 05/13, Stefan Beller wrote:
> In the following patches we'll convert all printing functions to use
> the emit_line_* family of functions.
> 
> Many of the printing functions to be converted are formatted. So offer
> a formatted function in the emit_line function family as well.
> 
> Signed-off-by: Stefan Beller 
> ---
>  diff.c | 14 ++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/diff.c b/diff.c
> index 48f0fb98dc..aef159a919 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -552,6 +552,20 @@ static void emit_line(struct diff_options *o, const char 
> *set, const char *reset
>   emit_line_0(o, set, reset, 0, line, len);
>  }
>  
> +static void emit_line_fmt(struct diff_options *o,
> +   const char *set, const char *reset,
> +   const char *fmt, ...)
> +{
> + struct strbuf sb = STRBUF_INIT;
> + va_list ap;
> + va_start(ap, fmt);
> + strbuf_vaddf(, fmt, ap);
> + va_end(ap);
> +
> + emit_line(o, set, reset, sb.buf, sb.len);
> + strbuf_release();
> +}
> +
>  static int new_blank_line_at_eof(struct emit_callback *ecbdata, const char 
> *line, int len)
>  {
>   if (!((ecbdata->ws_rule & WS_BLANK_AT_EOF) &&
> -- 
> 2.13.0.18.g183880de0a
> 

Since this is a new function, and it is marked static, this patch
shouldn't compile or the compiler should throw a warning or something.

-- 
Brandon Williams


Re: [PATCH 10/19] diff.c: convert emit_rewrite_lines to use emit_line_*

2017-05-15 Thread Stefan Beller
On Mon, May 15, 2017 at 12:09 PM, Jonathan Tan  wrote:
>
>
> On 05/13/2017 09:01 PM, Stefan Beller wrote:
>>
>> In a later patch, I want to propose an option to detect
>> moved lines in a diff, which cannot be done in a one-pass over
>> the diff. Instead we need to go over the whole diff twice,
>> because we cannot detect the first line of the two corresponding
>> lines (+ and -) that got moved.
>>
>> So to prepare the diff machinery for two pass algorithms
>> (i.e. buffer it all up and then operate on the result),
>> move all emissions to places, such that the only emitting
>> function is emit_line_0.
>>
>> This covers emit_rewrite_lines.
>>
>> Signed-off-by: Stefan Beller 
>> ---
>>  diff.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/diff.c b/diff.c
>> index e4b46fee4f..369c804f03 100644
>> --- a/diff.c
>> +++ b/diff.c
>> @@ -748,7 +748,7 @@ static void emit_rewrite_lines(struct emit_callback
>> *ecb,
>> if (!endp) {
>> const char *context = diff_get_color(ecb->color_diff,
>>  DIFF_CONTEXT);
>> -   putc('\n', ecb->opt->file);
>> +   emit_line(ecb->opt, NULL, NULL, "\n", 1);
>
>
> This outputs diff_line_prefix(ecb->opt) - is that OK?

It shows this area is not covered well by our test suite.

My first reaction was that this is not ok, but we'd have
to inspect the situation. It was introduced in
35e2d03c2c (Fix '\ No newline...' annotation in rewrite diffs,
2012-08-04).

And looking at the code of the function I think this is broken.

I wonder what the best way forward is for this patch series here,
as we'd need to buffer the last line. That should be fine as it is
a corner case, maybe:

diff --git a/diff.c b/diff.c
index 0f10736ee6..f46e52135d 100644
--- a/diff.c
+++ b/diff.c
@@ -1011,15 +1011,27 @@ static void add_line_count(struct strbuf *out,
int count)
 static void emit_rewrite_lines(struct emit_callback *ecb,
   int prefix, const char *data, int size)
 {
-   const char *endp = NULL;
-   static const char *nneof = " No newline at end of file\n";
+   static const char *nneof = "\\ No newline at end of file\n";
const char *reset = diff_get_color(ecb->color_diff, DIFF_RESET);
+   struct strbuf sb = STRBUF_INIT;

while (0 < size) {
int len;

endp = memchr(data, '\n', size);
-   len = endp ? (endp - data + 1) : size;
+   if (endp)
+   len = endp - data + 1;
+   else {
+   /* last line has no \n */
+   while (0 < size) {
+   strbuf_addch(, *data);
+   size -= len;
+   data += len;
+   }
+   strbuf_addch(, '\n');
+   data = sb.buf;
+   len = sb.len;
+   }
if (prefix != '+') {
ecb->lno_in_preimage++;
emit_del_line(reset, ecb, data, len);
@@ -1030,12 +1042,12 @@ static void emit_rewrite_lines(struct
emit_callback *ecb,
size -= len;
data += len;
}
-   if (!endp) {
+   if (sb.len) {
const char *context = diff_get_color(ecb->color_diff,
 DIFF_CONTEXT);
-   emit_line(ecb->opt, NULL, NULL, "\n", 1);
-   emit_line_0(ecb->opt, context, reset, '\\',
+   emit_line(ecb->opt, context, reset,
nneof, strlen(nneof));
+   strbuf_release();
}
 }


Re: [PATCH 03/19] diff.c: drop 'nofirst' from emit_line_0

2017-05-15 Thread Brandon Williams
On 05/13, Stefan Beller wrote:
> In 250f79930d (diff.c: split emit_line() from the first char and the rest
> of the line, 2009-09-14) we introduced the local variable 'nofirst' that
> indicates if we have no first sign character. With the given implementation
> we had to use an extra variable unlike reusing 'first' because the lines
> first character could be '\0'.
> 
> Change the meaning of the 'first' argument to not mean the first character
> of the line, but rather just containing the sign that is prepended to the
> line. Refactor emit_line to not include the lines first character, but pass
> the complete line as well as a '\0' sign, which now serves as an indication
> not to print a sign.
> 
> With this patch other callers hard code the sign (which are '+', '-',
> ' ' and '\\') such that we do not run into unexpectedly emitting an
> error-nous '\0'.
> 
> The audit of the caller revealed that the sign cannot be '\n' or '\r',
> so remove that condition for trailing newline or carriage return in the
> sign; the else part of the condition handles the len==0 perfectly,
> so we can drop the if/else construct.
> 
> Signed-off-by: Stefan Beller 
> ---
>  diff.c | 40 +---
>  1 file changed, 17 insertions(+), 23 deletions(-)
> 
> diff --git a/diff.c b/diff.c
> index c2ed605cd0..4269b8dccf 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -517,33 +517,24 @@ static void check_blank_at_eof(mmfile_t *mf1, mmfile_t 
> *mf2,
>  }
>  
>  static void emit_line_0(struct diff_options *o, const char *set, const char 
> *reset,
> - int first, const char *line, int len)
> + int sign, const char *line, int len)
>  {
>   int has_trailing_newline, has_trailing_carriage_return;
> - int nofirst;
>   FILE *file = o->file;
>  
>   fputs(diff_line_prefix(o), file);
>  
> - if (len == 0) {
> - has_trailing_newline = (first == '\n');
> - has_trailing_carriage_return = (!has_trailing_newline &&
> - (first == '\r'));
> - nofirst = has_trailing_newline || has_trailing_carriage_return;
> - } else {
> - has_trailing_newline = (len > 0 && line[len-1] == '\n');
> - if (has_trailing_newline)
> - len--;
> - has_trailing_carriage_return = (len > 0 && line[len-1] == '\r');
> - if (has_trailing_carriage_return)
> - len--;
> - nofirst = 0;
> - }
> + has_trailing_newline = (len > 0 && line[len-1] == '\n');
> + if (has_trailing_newline)
> + len--;
> + has_trailing_carriage_return = (len > 0 && line[len-1] == '\r');
> + if (has_trailing_carriage_return)
> + len--;

Does the order of newline/carriage return always the same?

>  
> - if (len || !nofirst) {
> + if (len || sign) {
>   fputs(set, file);
> - if (!nofirst)
> - fputc(first, file);
> + if (sign)
> + fputc(sign, file);
>   fwrite(line, len, 1, file);
>   fputs(reset, file);
>   }
> @@ -556,7 +547,7 @@ static void emit_line_0(struct diff_options *o, const 
> char *set, const char *res
>  static void emit_line(struct diff_options *o, const char *set, const char 
> *reset,
> const char *line, int len)
>  {
> - emit_line_0(o, set, reset, line[0], line+1, len-1);
> + emit_line_0(o, set, reset, 0, line, len);
>  }
>  
>  static int new_blank_line_at_eof(struct emit_callback *ecbdata, const char 
> *line, int len)
> @@ -4822,9 +4813,12 @@ void diff_flush(struct diff_options *options)
>  
>   if (output_format & DIFF_FORMAT_PATCH) {
>   if (separator) {
> - fprintf(options->file, "%s%c",
> - diff_line_prefix(options),
> - options->line_termination);
> + char term[2];
> + term[0] = options->line_termination;
> + term[1] = '\0';
> +
> + emit_line(options, NULL, NULL,
> +   term, 1);
>   if (options->stat_sep) {
>   /* attach patch instead of inline */
>   fputs(options->stat_sep, options->file);
> -- 
> 2.13.0.18.g183880de0a
> 

-- 
Brandon Williams


[PATCH] config: match both symlink & realpath versions in IncludeIf.gitdir:*

2017-05-15 Thread Ævar Arnfjörð Bjarmason
Change the conditional inclusion mechanism to support
e.g. gitdir:~/git_tree/repo where ~/git_tree is a symlink to to
/mnt/stuff/repo.

This worked in the initial version of this facility[1], but regressed
later in the series while solving a related bug[2].

Now gitdir: will match against the symlinked
path (e.g. gitdir:~/git_tree/repo) in addition to the current
/mnt/stuff/repo path.

Since this is already in a release version note in the documentation
that this behavior changed, so users who expect their configuration to
work on both v2.13.0 and some future version of git with this fix
aren't utterly confused.

1. commit 3efd0bedc6 ("config: add conditional include", 2017-03-01)
2. commit 86f9515708 ("config: resolve symlinks in conditional
   include's patterns", 2017-04-05)

Signed-off-by: Ævar Arnfjörð Bjarmason 
---

Here's a non-RFC patch to fix this bug.

 Documentation/config.txt  | 11 +++
 config.c  | 16 
 t/t1305-config-include.sh | 23 +++
 3 files changed, 50 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 475e874d51..137502a289 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -140,6 +140,17 @@ A few more notes on matching via `gitdir` and `gitdir/i`:
 
  * Symlinks in `$GIT_DIR` are not resolved before matching.
 
+ * Both the symlink & realpath versions of paths will be matched
+   outside of `$GIT_DIR`. E.g. if ~/git is a symlink to
+   /mnt/storage/git, both `gitdir:~/git` and `gitdir:/mnt/storage/git`
+   will match.
+
+   This was not the case in the initial release of this feature in
+   v2.13.0, which only matched the realpath version. Configuration
+   that wants to be compatible with the initial release of this
+   feature needs to either specify only the realpath version, or both
+   versions.
+
  * Note that "../" is not special and will match literally, which is
unlikely what you want.
 
diff --git a/config.c b/config.c
index b4a3205da3..0498746112 100644
--- a/config.c
+++ b/config.c
@@ -214,6 +214,7 @@ static int include_by_gitdir(const struct config_options 
*opts,
struct strbuf pattern = STRBUF_INIT;
int ret = 0, prefix;
const char *git_dir;
+   int retry = 1;
 
if (opts->git_dir)
git_dir = opts->git_dir;
@@ -226,6 +227,7 @@ static int include_by_gitdir(const struct config_options 
*opts,
strbuf_add(, cond, cond_len);
prefix = prepare_include_condition_pattern();
 
+again:
if (prefix < 0)
goto done;
 
@@ -245,6 +247,20 @@ static int include_by_gitdir(const struct config_options 
*opts,
ret = !wildmatch(pattern.buf + prefix, text.buf + prefix,
 icase ? WM_CASEFOLD : 0, NULL);
 
+   if (!ret && retry) {
+   /*
+* We've tried e.g. matching gitdir:~/work, but if
+* ~/work is a symlink to /mnt/storage/work
+* strbuf_realpath() will expand it, so the rule won't
+* match. Let's match against a
+* strbuf_add_absolute_path() version of the path,
+* which'll do the right thing
+*/
+   strbuf_reset();
+   strbuf_add_absolute_path(, git_dir);
+   retry = 0;
+   goto again;
+   }
 done:
strbuf_release();
strbuf_release();
diff --git a/t/t1305-config-include.sh b/t/t1305-config-include.sh
index 933915ec06..d9d2f545a4 100755
--- a/t/t1305-config-include.sh
+++ b/t/t1305-config-include.sh
@@ -273,6 +273,29 @@ test_expect_success SYMLINKS 'conditional include, 
relative path with symlinks'
)
 '
 
+test_expect_success SYMLINKS 'conditional include, gitdir matching symlink' '
+   ln -s foo bar &&
+   (
+   cd bar &&
+   echo "[includeIf \"gitdir:bar/\"]path=bar7" >>.git/config &&
+   echo "[test]seven=7" >.git/bar7 &&
+   echo 7 >expect &&
+   git config test.seven >actual &&
+   test_cmp expect actual
+   )
+'
+
+test_expect_success SYMLINKS 'conditional include, gitdir matching symlink, 
icase' '
+   (
+   cd bar &&
+   echo "[includeIf \"gitdir/i:BAR/\"]path=bar8" >>.git/config &&
+   echo "[test]eight=8" >.git/bar8 &&
+   echo 8 >expect &&
+   git config test.eight >actual &&
+   test_cmp expect actual
+   )
+'
+
 test_expect_success 'include cycles are detected' '
cat >.gitconfig <<-\EOF &&
[test]value = gitconfig
-- 
2.13.0.rc2.291.g57267f2277



[PATCH v1 3/5] fsmonitor: add test cases for fsmonitor extension

2017-05-15 Thread Ben Peart
Add test cases that ensure status results are correct when using the new
fsmonitor extension.  Test untracked, modified, and new files by
ensuring the results are identical to when not using the extension.

Add a test to ensure updates to the index properly mark corresponding
entries in the index extension as dirty so that the status is correct
after commands that modify the index but don't trigger changes in the
working directory.

Add a test that verifies that if the fsmonitor extension doesn't tell
git about a change, it doesn't discover it on its own.  This ensures
git is honoring the extension and that we get the performance benefits
desired.

Signed-off-by: Ben Peart 
---
 t/t7519-status-fsmonitor.sh | 134 
 1 file changed, 134 insertions(+)
 create mode 100644 t/t7519-status-fsmonitor.sh

diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh
new file mode 100644
index 00..2d63efc27b
--- /dev/null
+++ b/t/t7519-status-fsmonitor.sh
@@ -0,0 +1,134 @@
+#!/bin/sh
+
+test_description='git status with file system watcher'
+
+. ./test-lib.sh
+
+clean_repo () {
+   git reset --hard HEAD
+   git clean -fd
+}
+
+dirty_repo () {
+   : >untracked
+   : >dir1/untracked
+   : >dir2/untracked
+   echo 1 >modified
+   echo 2 >dir1/modified
+   echo 3 >dir2/modified
+   echo 4 >new
+   echo 5 >dir1/new
+   echo 6 >dir2/new
+   git add new
+   git add dir1/new
+   git add dir2/new
+}
+
+test_expect_success 'setup' '
+   mkdir -p .git/hooks &&
+   write_script .git/hooks/query-fsmonitor<<-\EOF &&
+   printf "untracked\0"
+   printf "dir1/untracked\0"
+   printf "dir2/untracked\0"
+   printf "modified\0"
+   printf "dir1/modified\0"
+   printf "dir2/modified\0"
+   printf "new\0""
+   printf "dir1/new\0"
+   printf "dir2/new\0"
+   EOF
+   : >tracked &&
+   : >modified &&
+   mkdir dir1 &&
+   : >dir1/tracked &&
+   : >dir1/modified &&
+   mkdir dir2 &&
+   : >dir2/tracked &&
+   : >dir2/modified &&
+   git add . &&
+   test_tick &&
+   git commit -m initial &&
+   dirty_repo
+'
+
+cat >.gitignore <<\EOF
+.gitignore
+expect*
+output*
+EOF
+
+# Status is well tested elsewhere so we'll just ensure that the results are
+# the same when using core.fsmonitor. First call after turning on the option
+# does a complete scan so need to do two calls to ensure we test the new
+# codepath.
+
+test_expect_success 'status with core.untrackedcache true' '
+   git config core.fsmonitor true  &&
+   git config core.untrackedcache true &&
+   git -c core.fsmonitor=false -c core.untrackedcache=true status >expect 
&&
+   clean_repo &&
+   git status &&
+   dirty_repo &&
+   git status >output &&
+   test_i18ncmp expect output
+'
+
+test_expect_success 'status with core.untrackedcache false' '
+   git config core.fsmonitor true &&
+   git config core.untrackedcache false &&
+   git -c core.fsmonitor=false -c core.untrackedcache=false status >expect 
&&
+   clean_repo &&
+   git status &&
+   dirty_repo &&
+   git status >output &&
+   test_i18ncmp expect output
+'
+
+# Ensure commands that call refresh_index() to move the index back in time
+# properly invalidate the fsmonitor cache
+
+test_expect_success 'refresh_index() invalidates fsmonitor cache' '
+   git config core.fsmonitor true &&
+   git config core.untrackedcache true &&
+   clean_repo &&
+   git status &&
+   dirty_repo &&
+   write_script .git/hooks/query-fsmonitor<<-\EOF &&
+   EOF
+   git add . &&
+   git commit -m "to reset" &&
+   git status &&
+   git reset HEAD~1 &&
+   git status >output &&
+   git -c core.fsmonitor=false status >expect &&
+   test_i18ncmp expect output
+'
+
+# Now make sure it's actually skipping the check for modified and untracked
+# files unless it is told about them.  Note, after "git reset --hard HEAD" no
+# extensions exist other than 'TREE' so do a "git status" to get the extension
+# written before testing the results.
+
+test_expect_success 'status doesnt detect unreported modifications' '
+   git config core.fsmonitor true &&
+   git config core.untrackedcache true &&
+   write_script .git/hooks/query-fsmonitor<<-\EOF &&
+   :
+   EOF
+   clean_repo &&
+   git status &&
+   : >untracked &&
+   echo 2 >dir1/modified &&
+   git status >output &&
+   test_i18ngrep ! "Changes not staged for commit:" output &&
+   test_i18ngrep ! "Untracked files:" output &&
+   write_script .git/hooks/query-fsmonitor<<-\EOF &&
+   printf "untracked%s\0"
+   printf "dir1/modified\0"
+   EOF
+   git status >output &&
+   test_i18ngrep "Changes not staged for commit:" output &&
+   test_i18ngrep "Untracked files:" output
+'
+

[PATCH v1 5/5] Add a sample query-fsmonitor hook script that integrates with the cross platform Watchman file watching service.

2017-05-15 Thread Ben Peart
To use the script:

Download and install Watchman from https://facebook.github.io/watchman/
and instruct Watchman to watch your working directory for changes
('watchman watch-project /usr/src/git').

Rename the sample integration hook from query-fsmonitor.sample to
query-fsmonitor.

Configure git to use the extension ('git config core.fsmonitor true')
and optionally turn on the untracked cache for optimal performance
('git config core.untrackedcache true').

Signed-off-by: Ben Peart 
Signed-off-by: Johannes Schindelin 
---
 templates/hooks--query-fsmonitor.sample | 27 +++
 1 file changed, 27 insertions(+)
 create mode 100644 templates/hooks--query-fsmonitor.sample

diff --git a/templates/hooks--query-fsmonitor.sample 
b/templates/hooks--query-fsmonitor.sample
new file mode 100644
index 00..4bd22f21d8
--- /dev/null
+++ b/templates/hooks--query-fsmonitor.sample
@@ -0,0 +1,27 @@
+#!/bin/sh
+#
+# An example hook script to integrate Watchman
+# (https://facebook.github.io/watchman/) with git to provide fast
+# git status.
+#
+# The hook is passed a time_t formatted as a string and outputs to
+# stdout all files that have been modified since the given time.
+# Paths must be relative to the root of the working tree and
+# separated by a single NUL.
+#
+# To enable this hook, rename this file to "query-fsmonitor"
+
+# Convert unix style paths to escaped Windows style paths
+case "$(uname -s)" in
+MINGW*|MSYS_NT*)
+  GIT_WORK_TREE="$(cygpath -aw "$PWD" | sed 's,\\,,g')"
+  ;;
+*)
+  GIT_WORK_TREE="$PWD"
+  ;;
+esac
+
+# Query Watchman for all the changes since the requested time
+echo "[\"query\", \"$GIT_WORK_TREE\", {\"since\": $1, \"fields\":[\"name\"]}]" 
| \
+watchman -j | \
+perl -e 'use JSON::PP; my $o = JSON::PP->new->utf8->decode(join("", <>)); die 
"Watchman: $o->{'error'}.\nFalling back to scanning...\n" if 
defined($o->{"error"}); print(join("\0", @{$o->{"files"}}));'
-- 
2.13.0.windows.1.6.g4597375fc3



[PATCH v1 2/5] Teach git to optionally utilize a file system monitor to speed up detecting new or changed files.

2017-05-15 Thread Ben Peart
When the index is read from disk, the query-fsmonitor index extension is
used to flag the last known potentially dirty index and untracked cach
entries.

If git finds out some entries are 'fsmonitor-dirty', but are really
unchanged (e.g. the file was changed, then reverted back), then Git will
clear the marking in the extension. If git adds or updates an index
entry, it is marked 'fsmonitor-dirty' to ensure it is checked for
changes in the working directory.

Before the 'fsmonitor-dirty' flags are used to limit the scope of the
files to be checked, the query-fsmonitor hook proc is called with the
time the index was last updated.  The hook proc returns the list of
files changed since that last updated time and the list of
potentially dirty entries is updated to reflect the current state.

refresh_index() and valid_cached_dir() are updated so that any entry not
flagged as potentially dirty is not checked as it cannot have any
changes.

Signed-off-by: Ben Peart 

---
 Makefile   |   1 +
 builtin/update-index.c |   1 +
 cache.h|   5 ++
 config.c   |   5 ++
 dir.c  |  13 +++
 dir.h  |   2 +
 entry.c|   1 +
 environment.c  |   1 +
 fsmonitor.c| 233 +
 fsmonitor.h|   9 ++
 read-cache.c   |  28 +-
 unpack-trees.c |   1 +
 12 files changed, 298 insertions(+), 2 deletions(-)
 create mode 100644 fsmonitor.c
 create mode 100644 fsmonitor.h

diff --git a/Makefile b/Makefile
index 94cce645a5..89acff1f46 100644
--- a/Makefile
+++ b/Makefile
@@ -761,6 +761,7 @@ LIB_OBJS += ewah/ewah_rlw.o
 LIB_OBJS += exec_cmd.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
+LIB_OBJS += fsmonitor.o
 LIB_OBJS += gettext.o
 LIB_OBJS += gpg-interface.o
 LIB_OBJS += graph.o
diff --git a/builtin/update-index.c b/builtin/update-index.c
index ebfc09faa0..32fd977b43 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -232,6 +232,7 @@ static int mark_ce_flags(const char *path, int flag, int 
mark)
else
active_cache[pos]->ce_flags &= ~flag;
active_cache[pos]->ce_flags |= CE_UPDATE_IN_BASE;
+   active_cache[pos]->ce_flags |= CE_FSMONITOR_DIRTY;
cache_tree_invalidate_path(_index, path);
active_cache_changed |= CE_ENTRY_CHANGED;
return 0;
diff --git a/cache.h b/cache.h
index 40ec032a2d..64aa6e57cd 100644
--- a/cache.h
+++ b/cache.h
@@ -201,6 +201,7 @@ struct cache_entry {
 #define CE_ADDED (1 << 19)
 
 #define CE_HASHED(1 << 20)
+#define CE_FSMONITOR_DIRTY   (1 << 21)
 #define CE_WT_REMOVE (1 << 22) /* remove in work directory */
 #define CE_CONFLICTED(1 << 23)
 
@@ -324,6 +325,7 @@ static inline unsigned int canon_mode(unsigned int mode)
 #define CACHE_TREE_CHANGED (1 << 5)
 #define SPLIT_INDEX_ORDERED(1 << 6)
 #define UNTRACKED_CHANGED  (1 << 7)
+#define FSMONITOR_CHANGED  (1 << 8)
 
 struct split_index;
 struct untracked_cache;
@@ -342,6 +344,8 @@ struct index_state {
struct hashmap dir_hash;
unsigned char sha1[20];
struct untracked_cache *untracked;
+   time_t last_update;
+   struct ewah_bitmap *bitmap;
 };
 
 extern struct index_state the_index;
@@ -767,6 +771,7 @@ extern int precomposed_unicode;
 extern int protect_hfs;
 extern int protect_ntfs;
 extern int git_db_env, git_index_env, git_graft_env, git_common_dir_env;
+extern int core_fsmonitor;
 
 /*
  * Include broken refs in all ref iterations, which will
diff --git a/config.c b/config.c
index d971cc3474..d146c88399 100644
--- a/config.c
+++ b/config.c
@@ -1224,6 +1224,11 @@ static int git_default_core_config(const char *var, 
const char *value)
return 0;
}
 
+   if (!strcmp(var, "core.fsmonitor")) {
+   core_fsmonitor = git_config_bool(var, value);
+   return 0;
+   }
+
/* Add other config variables here and to Documentation/config.txt. */
return platform_core_config(var, value);
 }
diff --git a/dir.c b/dir.c
index 1b5558fdf9..da428489e2 100644
--- a/dir.c
+++ b/dir.c
@@ -1652,6 +1652,18 @@ static int valid_cached_dir(struct dir_struct *dir,
if (!untracked)
return 0;
 
+   refresh_by_fsmonitor(_index);
+   if (dir->untracked->use_fsmonitor) {
+   /*
+* With fsmonitor, we can trust the untracked cache's
+* valid field.
+*/
+   if (untracked->valid)
+   goto skip_stat;
+   else
+   invalidate_directory(dir->untracked, untracked);
+   }
+
if (stat(path->len ? path->buf : ".", )) {
invalidate_directory(dir->untracked, untracked);
memset(>stat_data, 0, sizeof(untracked->stat_data));
@@ -1665,6 +1677,7 @@ 

[PATCH v1 4/5] Add documentation for the fsmonitor extension. This includes the core.fsmonitor setting, the query-fsmonitor hook, and the fsmonitor index extension.

2017-05-15 Thread Ben Peart
Signed-off-by: Ben Peart 
---
 Documentation/config.txt |  7 +++
 Documentation/githooks.txt   | 23 +++
 Documentation/technical/index-format.txt | 18 ++
 3 files changed, 48 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index bc7088b287..a9a58cb8a6 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -391,6 +391,13 @@ core.protectNTFS::
8.3 "short" names.
Defaults to `true` on Windows, and `false` elsewhere.
 
+core.fsmonitor::
+   If set to true, call the query-fsmonitor hook proc which will
+   identify all files that may have had changes since the last
+   request. This information is used to speed up operations like
+   'git commit' and 'git status' by limiting what git must scan to
+   detect changes.
+
 core.trustctime::
If false, the ctime differences between the index and the
working tree are ignored; useful when the inode change time
diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index 706091a569..f7b4b4a844 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -448,6 +448,29 @@ The commits are guaranteed to be listed in the order that 
they were
 processed by rebase.
 
 
+[[query-fsmonitor]]
+query-fsmonitor
+
+
+This hook is invoked when the configuration option core.fsmonitor is
+set and git needs to identify changed or untracked files.  It takes
+a single argument which is the time in elapsed seconds since midnight,
+January 1, 1970.
+
+The hook should output to stdout the list of all files in the working
+directory that may have changed since the requested time.  The logic
+should be inclusive so that it does not miss any potential changes.
+The paths should be relative to the root of the working directory
+and be separated by a single NUL.
+
+Git will limit what files it checks for changes as well as which
+directories are checked for untracked files based on the path names
+given.
+
+The exit status determines whether git will use the data from the
+hook to limit its search.  On error, it will fall back to verifying
+all files and folders.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/Documentation/technical/index-format.txt 
b/Documentation/technical/index-format.txt
index ade0b0c445..b002d23c05 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -295,3 +295,21 @@ The remaining data of each directory block is grouped by 
type:
 in the previous ewah bitmap.
 
   - One NUL.
+
+== File System Monitor cache
+
+  The file system monitor cache tracks files for which the query-fsmonitor
+  hook has told us about changes.  The signature for this extension is
+  { 'F', 'S', 'M', 'N' }.
+
+  The extension starts with
+
+  - 32-bit version number: the current supported version is 1.
+
+  - 64-bit time: the extension data reflects all changes through the given
+   time which is stored as the seconds elapsed since midnight, January 1, 
1970.
+
+  - 32-bit bitmap size: the size of the CE_FSMONITOR_DIRTY bitmap.
+
+  - An ewah bitmap, the n-th bit indicates whether the n-th index entry
+is CE_FSMONITOR_DIRTY.
-- 
2.13.0.windows.1.6.g4597375fc3



[PATCH v1 1/5] dir: make lookup_untracked() available outside of dir.c

2017-05-15 Thread Ben Peart
Remove the static qualifier from lookup_untracked() and make it
available to other modules by exporting it from dir.h.

Signed-off-by: Ben Peart 
---
 dir.c | 2 +-
 dir.h | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index f451bfa48c..1b5558fdf9 100644
--- a/dir.c
+++ b/dir.c
@@ -660,7 +660,7 @@ static void trim_trailing_spaces(char *buf)
  *
  * If "name" has the trailing slash, it'll be excluded in the search.
  */
-static struct untracked_cache_dir *lookup_untracked(struct untracked_cache *uc,
+struct untracked_cache_dir *lookup_untracked(struct untracked_cache *uc,
struct untracked_cache_dir 
*dir,
const char *name, int len)
 {
diff --git a/dir.h b/dir.h
index bf23a470af..9e387551bd 100644
--- a/dir.h
+++ b/dir.h
@@ -339,4 +339,7 @@ extern void connect_work_tree_and_git_dir(const char 
*work_tree, const char *git
 extern void relocate_gitdir(const char *path,
const char *old_git_dir,
const char *new_git_dir);
+struct untracked_cache_dir *lookup_untracked(struct untracked_cache *uc,
+struct untracked_cache_dir *dir,
+const char *name, int len);
 #endif
-- 
2.13.0.windows.1.6.g4597375fc3



[PATCH v1 0/5] Fast git status via a file system watcher

2017-05-15 Thread Ben Peart
Goal

?
Today, git must check existing files to see if there have been changes
and scan the working directory looking for new, untracked files.  As the
number of files and folders in the working directory increases, the time
to perform these checks can become very expensive O(# files in working
directory).

Given the number of new or modified files is typically a very small
percentage of the total number of files, it would be much more
performant if git only had to check files and folders that potentially
had changes. This reduces the cost to O(# modified files).

This patch series makes it possible to optionally add a hook process
that can return the set of files that may have been changed since the
requested time.  Git can then use this to limit its scan to only those
files and folders that potentially have changes.

Design
~~

A new git hook (query-fsmonitor) must exist and be enabled 
(core.fsmonitor=true) that takes a time_t formatted as a string and
outputs to stdout all files that have been modified since the requested
time.

A new 'fsmonitor' index extension has been added to store the time the
fsmonitor hook was last queried and a ewah bitmap of the current
'fsmonitor-dirty' files. Unmarked entries are 'fsmonitor-clean', marked
entries are 'fsmonitor-dirty.'

As needed, git will call the query-fsmonitor hook proc for the set of
changes since the index was last updated. Git then uses this set of
files along with the list saved in the fsmonitor index extension to flag
the potentially dirty index and untracked cache entries.  

refresh_index() and valid_cached_dir() are updated so that any entry not
flagged as potentially dirty is not checked as it cannot have any
changes. This saves all the work of checking files and folders for
changes that are already known to be clean.

If git finds out some entries are 'fsmonitor-dirty', but are really
unchanged (e.g. the file was changed, then reverted back), then Git will
clear the marking in the extension. If git adds or updates an index
entry, it is marked 'fsmonitor-dirty' to ensure it is checked for
changes.

The code is conservative so in case of any error (missing index
extension, error from hook, etc) it falls back to normal logic of
checking everything.

A sample hook is provided in query-fsmonitor.sample to integrate with
the cross platform Watchman file watching service
https://facebook.github.io/watchman/


Performance
~~~

The performance wins of this model are pretty dramatic. Each test was
run 3 times and averaged.  "Files" is the number of files in the working
directory.  Tests were done with a cold file system cache as well as
with a warm file system cache on a HDD.  SSD speeds were typically about
10x faster than the HDD.  Typical real world results would fall
somewhere between these extremes. 

**
| Repo on HDD | Cache | fsmonitor=false | fsmonitor=true |
**
| 3K Files| Cold  |   0.77s |  0.55s |
++
| 100K Files  | Cold  |  38.76s |  2.17s |
++
| 3M Files| Cold  | 421.55s | 18.57s |
++
| 3K Files| Warm  |   0.05s |  0.24s |
++
| 100K Files  | Warm  |   1.13s |  0.40s |
++
| 3M Files| Warm  |  59.33s |  4.19s |
++

Note that with the smallest repo, warm times actually increase slightly
as the overhead of calling the hook, watchman and perl outweighs the
savings of not scanning the working directory.


Open Issues
~~~

The index extension currently has a 32 bit version number, a 64 bit time
and a 32 bit bitmap size.  Do I need to quad-align the version and
bitmap size in the index extension or can all supported platforms handle
dereferencing memory that isn't quad aligned?


Credits
~~~

Idea taken and code refactored from 
http://public-inbox.org/git/1466914464-10358-1-git-send-email-nova...@novalis.org/

Current version as a fork of GFW on GitHub here: 
https://github.com/benpeart/git-for-windows/tree/fsmonitor


Ben Peart (5):
  dir: make lookup_untracked() available outside of dir.c
  Teach git to optionally utilize a file system monitor to speed up
detecting new or changed files.
  fsmonitor: add test cases for fsmonitor extension
  Add documentation for the fsmonitor extension.  This includes the
core.fsmonitor setting, the query-fsmonitor hook, and the fsmonitor
index extension.
  Add a sample query-fsmonitor hook script that integrates with the
cross platform Watchman file watching service.

 Documentation/config.txt |   7 +
 

Re: [PATCH 10/19] diff.c: convert emit_rewrite_lines to use emit_line_*

2017-05-15 Thread Jonathan Tan



On 05/13/2017 09:01 PM, Stefan Beller wrote:

In a later patch, I want to propose an option to detect
moved lines in a diff, which cannot be done in a one-pass over
the diff. Instead we need to go over the whole diff twice,
because we cannot detect the first line of the two corresponding
lines (+ and -) that got moved.

So to prepare the diff machinery for two pass algorithms
(i.e. buffer it all up and then operate on the result),
move all emissions to places, such that the only emitting
function is emit_line_0.

This covers emit_rewrite_lines.

Signed-off-by: Stefan Beller 
---
 diff.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index e4b46fee4f..369c804f03 100644
--- a/diff.c
+++ b/diff.c
@@ -748,7 +748,7 @@ static void emit_rewrite_lines(struct emit_callback *ecb,
if (!endp) {
const char *context = diff_get_color(ecb->color_diff,
 DIFF_CONTEXT);
-   putc('\n', ecb->opt->file);
+   emit_line(ecb->opt, NULL, NULL, "\n", 1);


This outputs diff_line_prefix(ecb->opt) - is that OK?


emit_line_0(ecb->opt, context, reset, '\\',
nneof, strlen(nneof));
}



Re: [PATCH] fixup! log: add exhaustive tests for pattern style options & config

2017-05-15 Thread Jonathan Nieder
Hi,

Johannes Schindelin wrote:
> On Fri, 12 May 2017, Jonathan Nieder wrote:

>> Would something like
>>
>>  test PIPE_IN_FILENAME '
>>  >"a|b" &&
>>  test -f "a|b"
>>  '
>>
>> work?
[...]
> Back to the subject: The MSYS2 emulation layer inherits a neat trick from
> Cygwin, where it *can* create file names containing pipe symbols. They
> will be quietly mapped into a private UTF-8 page, and when Cygwin or MSYS2
> read the file back, the file name maps from this page back to ASCII
> transparently.
>
> That strategy is all good and dandy, as long as you stay within the POSIX
> emulation layer.
>
> Git for Windows avoids the POSIX emulation layer as much as possible, for
> speed, and also for robustness.
>
> Which means that Git does *not* map the file names using said private
> UTF-8 code page. And therefore, your test would succeed (because the shell
> script would stay within the POSIX emulation layer, which creates that
> file using above-mentioned strategy), but Git (being a regular Win32
> program) *still* would fail to create said file.

Wow.  Thanks for a clear explaination.

I'll be keeping a copy of this message handy for the next time I'm
confused about filename handling in the testsuite on Windows.

Sincerely,
Jonathan


Re: [PATCH 08/19] diff.c: convert builtin_diff to use emit_line_*

2017-05-15 Thread Jonathan Tan

On 05/13/2017 09:01 PM, Stefan Beller wrote:

In a later patch, I want to propose an option to detect
moved lines in a diff, which cannot be done in a one-pass over
the diff. Instead we need to go over the whole diff twice,
because we cannot detect the first line of the two corresponding
lines (+ and -) that got moved.

So to prepare the diff machinery for two pass algorithms
(i.e. buffer it all up and then operate on the result),
move all emissions to places, such that the only emitting
function is emit_line_0.

This covers builtin_diff.

Signed-off-by: Stefan Beller 
---
 diff.c | 43 ++-
 1 file changed, 26 insertions(+), 17 deletions(-)

diff --git a/diff.c b/diff.c
index 93343a9ccc..8e00010bf4 100644
--- a/diff.c
+++ b/diff.c
@@ -1293,8 +1293,9 @@ static void fn_out_consume(void *priv, char *line, 
unsigned long len)
o->found_changes = 1;

if (ecbdata->header) {
-   fprintf(o->file, "%s", ecbdata->header->buf);
-   strbuf_reset(ecbdata->header);
+   emit_line(o, NULL, NULL,
+ ecbdata->header->buf, ecbdata->header->len);
+   strbuf_release(ecbdata->header);
ecbdata->header = NULL;
}

@@ -2407,7 +2408,7 @@ static void builtin_diff(const char *name_a,
b_two = quote_two(b_prefix, name_b + (*name_b == '/'));
lbl[0] = DIFF_FILE_VALID(one) ? a_one : "/dev/null";
lbl[1] = DIFF_FILE_VALID(two) ? b_two : "/dev/null";
-   strbuf_addf(, "%s%sdiff --git %s %s%s\n", line_prefix, meta, 
a_one, b_two, reset);
+   strbuf_addf(, "%sdiff --git %s %s%s\n", meta, a_one, b_two, 
reset);
if (lbl[0][0] == '/') {
/* /dev/null */
strbuf_addf(, "%s%snew file mode %06o%s\n", line_prefix, 
meta, two->mode, reset);
@@ -2439,7 +2440,7 @@ static void builtin_diff(const char *name_a,
if (complete_rewrite &&
(textconv_one || !diff_filespec_is_binary(one)) &&
(textconv_two || !diff_filespec_is_binary(two))) {
-   fprintf(o->file, "%s", header.buf);
+   emit_line(o, NULL, NULL, header.buf, header.len);
strbuf_reset();
emit_rewrite_diff(name_a, name_b, one, two,
textconv_one, textconv_two, o);
@@ -2449,7 +2450,8 @@ static void builtin_diff(const char *name_a,
}

if (o->irreversible_delete && lbl[1][0] == '/') {
-   fprintf(o->file, "%s", header.buf);
+   if (header.len)
+   emit_line(o, NULL, NULL, header.buf, header.len);


This used to unconditionally output diff_line_prefix(0), but now outputs 
it only if the non-prefix part is blank. Is that expected? (Same 
comments below.)



strbuf_reset();
goto free_ab_and_return;
} else if (!DIFF_OPT_TST(o, TEXT) &&
@@ -2459,13 +2461,16 @@ static void builtin_diff(const char *name_a,
S_ISREG(one->mode) && S_ISREG(two->mode) &&
!DIFF_OPT_TST(o, BINARY)) {
if (!oidcmp(>oid, >oid)) {
-   if (must_show_header)
-   fprintf(o->file, "%s", header.buf);
+   if (must_show_header && header.len)
+   emit_line(o, NULL, NULL,
+ header.buf, header.len);
goto free_ab_and_return;
}
-   fprintf(o->file, "%s", header.buf);
-   fprintf(o->file, "%sBinary files %s and %s differ\n",
-   line_prefix, lbl[0], lbl[1]);
+   if (header.len)
+   emit_line(o, NULL, NULL,
+ header.buf, header.len);
+   emit_line_fmt(o, 0, 0, "Binary files %s and %s 
differ\n",
+ lbl[0], lbl[1]);
goto free_ab_and_return;
}
if (fill_mmfile(, one) < 0 || fill_mmfile(, two) < 0)
@@ -2473,17 +2478,21 @@ static void builtin_diff(const char *name_a,
/* Quite common confusing case */
if (mf1.size == mf2.size &&
!memcmp(mf1.ptr, mf2.ptr, mf1.size)) {
-   if (must_show_header)
-   fprintf(o->file, "%s", header.buf);
+   if (must_show_header && header.len)
+   emit_line(o, NULL, NULL,
+ header.buf, header.len);
goto free_ab_and_return;
}
-   fprintf(o->file, "%s", header.buf);
+   if (header.len)
+   

Re: [GSoC][RFC/PATCH v3 2/2] submodule: port subcommand foreach from shell to C

2017-05-15 Thread Brandon Williams
On 05/12, Prathamesh Chavan wrote:
> This aims to make git-submodule foreach a builtin. This is the very
> first step taken in this direction. Hence, 'foreach' is ported to
> submodule--helper, and submodule--helper is called from git-submodule.sh.
> The code is split up to have one function to obtain all the list of
> submodules. This function acts as the front-end of git-submodule foreach
> subcommand. It calls the function for_each_submodule_list, which basically
> loops through the list and calls function fn, which in this case is
> runcommand_in_submodule. This third function is a calling function that
> takes care of running the command in that submodule, and recursively
> perform the same when --recursive is flagged.
> 
> The first function module_foreach first parses the options present in
> argv, and then with the help of module_list_compute, generates the list of
> submodules present in the current working tree.
> 
> The second function for_each_submodule_list traverses through the
> list, and calls function fn (which in case of submodule subcommand
> foreach is runcommand_in_submodule) is called for each entry.
> 
> The third function runcommand_in_submodule, generates a submodule struct sub
> for $name value and then later prepends name=sub->name and other
> value assignments to the env argv_array structure of a child_process.
> Also the  of submodule-foreach is push to args argv_array
> structure and finally, using run_command the commands are executed
> using a shell.
> 
> The third function also takes care of the recursive flag, by creating
> a separate child_process structure and prepending "--super-prefix 
> displaypath",
> to the args argv_array structure. Other required arguments and the
> input  of submodule-foreach is also appended to this argv_array.
> 
> The commit 1c4fb136db (submodule foreach: skip eval for more than one
> argument, 2013-09-27), which explains that why for the case when argc>1,
> we do not use eval. But since in this patch, we are calling the
> command in a separate shell itself for all values of argc, this case
> is not considered separately.
> 
> Both env variable $path and $sm_path were added since both are used in
> tests in t7407.
> 
> Mentored-by: Christian Couder 
> Mentored-by: Stefan Beller 
> Signed-off-by: Prathamesh Chavan 
> ---
> 
> In this new version of patch, following changes were added:
> 
> foreach_submodule was renamed to runcommand_in_submodule
> 
> We module_foreach into two parts, such that, module foreach
> generates the list of submodules present in the cwd. And later
> calls for_each_submodule_list which basically loops through the
> list of submodule and calls the passed function fn.
> 
> Additionally, this patch also pass all those test, which it
> failed earlier.
> 
> Since in the run-command API errors out when child
> process with no arguments is passed, for test #10
> from t7407-submodule-foreach, additional condition
> was added in runcommand_in_submodule funciton to
> handle this case instead of modifying the run-command.
> 
> Complete build report is available at:
> https://travis-ci.org/pratham-pc/git/builds/
> branch: foreach
> build #52
> 
> It can been seen that the patch fails in test #9
> of t7407-submodule-foreach, which is the newly added
> test to that suite. The main reason of adding this test
> was to bring the behavior of $path for the submodule
> foreach --recursive case.
> 
> The observation made was as follows:
> 
> For a project - super containing dir (not a submodule)
> and a submodule sub which contains another submodule
> subsub. When we run a command from super/dir:
> 
> git submodule foreach "echo \$path-\$sm_path"
> 
> actual results:
> Entering '../sub'
> ../sub-../sub
> Entering '../sub/subsub'
> ../subsub-../subsub
> 
> ported function's result:
> Entering '../sub'
> sub-../sub
> Entering '../sub/subsub'
> subsub-../sub/subsub
> 
> This is occurring since in cmd_foreach of git-submodule.sh
> when we use to recurse, we call cmd_foreach
> and hence the process ran in the same shell.
> Because of this, the variable $wt_prefix is set only once
> which is at the beginning of the submodule foreach execution.
> wt_prefix=$(git rev-parse --show-prefix)
> 
> And since sm_path and path are set using $wt_prefix as :
> sm_path=$(git submodule--helper relative-path "$sm_path" "$wt_prefix") &&
> path=$sm_path
> It differs with the value of displaypath as well.
> 
> This make the value of $path confusing and I also feel it
> deviates from its documentation:
> $path is the name of the submodule directory relative
> to the superproject.
> 
> But since in refactoring the code, we wish to maintain the
> code in same way, we need to pass wt_prefix on every
> recursive call, which may result in complex C code.
> Another option could be to first correct the $path value
> in git-submodule.sh and then port the updated cmd_foreach.
> 
>  

Re: [PATCH 03/19] diff.c: drop 'nofirst' from emit_line_0

2017-05-15 Thread Stefan Beller
On Mon, May 15, 2017 at 11:26 AM, Jonathan Tan  wrote:

> "erroneous"?

yep, words are hard.

>
> I also don't understand the meaning of this paragraph - if you mean that
> this patch teaches other callers to hardcode the sign, I don't see any such
> changes in the diff below.

The last two hunks of the patch switch two callers that call with a sign
that is hard to reason about.

> After reading the patch below, would this commit message be better:
>
> [begin]
> diff.c: teach emit_line_0 to accept sign parameter
>
> Instead of a separate "first" parameter representing the first character of
> the line to be printed, make emit_line_0 take an optional "sign" parameter
> specifically intended to hold the sign of the line. Callers that store the
> sign and line separately can use the "sign" parameter like they used the
> "first" parameter previously, and callers that store the sign and line
> together (or do not have a sign) no longer need to manipulate their
> arguments to fit the requirements of emit_line_0.
>
> (And then mention that you have checked all the callers and that none of
> them send '\n' or '\r' as the sign, as you have done in this version.)
> [end]

That describes the situation better, indeed.

>> @@ -556,7 +547,7 @@ static void emit_line_0(struct diff_options *o, const
>> char *set, const char *res
>>  static void emit_line(struct diff_options *o, const char *set, const char
>> *reset,
>>   const char *line, int len)
>>  {
>> -   emit_line_0(o, set, reset, line[0], line+1, len-1);
>> +   emit_line_0(o, set, reset, 0, line, len);
>>  }
>
>
> Maybe this function is unnecessary now that emit_line_0 can take the line
> directly.

That would produce a lot of code churn, specifically in later patches;
but I can remove that function if anyone feels strongly about it.

>> +   char term[2];
>> +   term[0] = options->line_termination;
>> +   term[1] = '\0';
>> +
>> +   emit_line(options, NULL, NULL,
>> + term, 1);
>
>
> If options->line_termination is 0, this is actually a zero-length string
> (not 1).

So passing in !!options->line_termination should be fine?

Thanks,
Stefan


Re: [PATCH 05/19] diff.c: emit_line_0 can handle no color setting

2017-05-15 Thread Jonathan Tan

On 05/13/2017 09:01 PM, Stefan Beller wrote:

In a later patch, I want to propose an option to detect
moved lines in a diff, which cannot be done in a one-pass over
the diff. Instead we need to go over the whole diff twice,
because we cannot detect the first line of the two corresponding
lines (+ and -) that got moved.

So to prepare the diff machinery for two pass algorithms
(i.e. buffer it all up and then operate on the result),
move all emissions to places, such that the only emitting
function is emit_line_0.

In later patches we may pass lines that are not colored to
the central function emit_line_0, so we
need to emit the color only when it is non-NULL.


The diff below seems to just make emit_line_0 allow NULL for set and 
reset, unlike what the commit message above describes. (And is that 
necessary? Couldn't the caller just pass "" if they don't want any 
setting and/or resetting?)




Signed-off-by: Stefan Beller 
---
 diff.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/diff.c b/diff.c
index 381b572d76..48f0fb98dc 100644
--- a/diff.c
+++ b/diff.c
@@ -532,11 +532,13 @@ static void emit_line_0(struct diff_options *o, const 
char *set, const char *res
len--;

if (len || sign) {
-   fputs(set, file);
+   if (set)
+   fputs(set, file);
if (sign)
fputc(sign, file);
fwrite(line, len, 1, file);
-   fputs(reset, file);
+   if (reset)
+   fputs(reset, file);
}
if (has_trailing_carriage_return)
fputc('\r', file);



Re: [PATCH/RFC] The new IncludeIf facility doesn't DWIM when the repo is symlinked

2017-05-15 Thread Ævar Arnfjörð Bjarmason
On Mon, May 15, 2017 at 5:20 PM, Ævar Arnfjörð Bjarmason
 wrote:
> I have a ~/git_tree in my homedir that's symlinked to an external
> drive, and doing "gitdir:~/git_tree/" doesn't work, because instead of
> matching against ~/git_tree it's matched against
> /mnt/some-other-storage/.
>
> Here's a WIP patch that makes this work for me, any reason I shouldn't
> finish this up & that we shouldn't be doing this? The doc don't say
> "we'll only match gitdir against the absolute resolved path" or
> anything like that, so until I checked out the implementation I didn't
> realize what was going on:

On closer inspection I can see that this was actually broken in
86f9515708 ("config: resolve symlinks in conditional include's
patterns", 2017-04-05), but works in the initial addition of
IncludeIf.

> diff --git a/config.c b/config.c
> index b4a3205da3..606acaa3f1 100644
> --- a/config.c
> +++ b/config.c
> @@ -214,6 +214,7 @@ static int include_by_gitdir(const struct
> config_options *opts,
> struct strbuf pattern = STRBUF_INIT;
> int ret = 0, prefix;
> const char *git_dir;
> +   int tried_absolute = 0;
>
> if (opts->git_dir)
> git_dir = opts->git_dir;
> @@ -226,6 +227,7 @@ static int include_by_gitdir(const struct
> config_options *opts,
> strbuf_add(, cond, cond_len);
> prefix = prepare_include_condition_pattern();
>
> +again:
> if (prefix < 0)
> goto done;
>
> @@ -245,6 +247,12 @@ static int include_by_gitdir(const struct
> config_options *opts,
> ret = !wildmatch(pattern.buf + prefix, text.buf + prefix,
>  icase ? WM_CASEFOLD : 0, NULL);
>
> +   if (!ret && !tried_absolute) {
> +   tried_absolute = 1;
> +   strbuf_reset();
> +   strbuf_add_absolute_path(, git_dir);
> +   goto again;
> +   }
>  done:
> strbuf_release();
> strbuf_release();


Re: [PATCH 03/19] diff.c: drop 'nofirst' from emit_line_0

2017-05-15 Thread Jonathan Tan

On 05/13/2017 09:01 PM, Stefan Beller wrote:

In 250f79930d (diff.c: split emit_line() from the first char and the rest
of the line, 2009-09-14) we introduced the local variable 'nofirst' that
indicates if we have no first sign character. With the given implementation
we had to use an extra variable unlike reusing 'first' because the lines
first character could be '\0'.

Change the meaning of the 'first' argument to not mean the first character
of the line, but rather just containing the sign that is prepended to the
line. Refactor emit_line to not include the lines first character, but pass
the complete line as well as a '\0' sign, which now serves as an indication
not to print a sign.

With this patch other callers hard code the sign (which are '+', '-',
' ' and '\\') such that we do not run into unexpectedly emitting an
error-nous '\0'.


"erroneous"?

I also don't understand the meaning of this paragraph - if you mean that 
this patch teaches other callers to hardcode the sign, I don't see any 
such changes in the diff below.


After reading the patch below, would this commit message be better:

[begin]
diff.c: teach emit_line_0 to accept sign parameter

Instead of a separate "first" parameter representing the first character 
of the line to be printed, make emit_line_0 take an optional "sign" 
parameter specifically intended to hold the sign of the line. Callers 
that store the sign and line separately can use the "sign" parameter 
like they used the "first" parameter previously, and callers that store 
the sign and line together (or do not have a sign) no longer need to 
manipulate their arguments to fit the requirements of emit_line_0.


(And then mention that you have checked all the callers and that none of 
them send '\n' or '\r' as the sign, as you have done in this version.)

[end]



The audit of the caller revealed that the sign cannot be '\n' or '\r',
so remove that condition for trailing newline or carriage return in the
sign; the else part of the condition handles the len==0 perfectly,
so we can drop the if/else construct.

Signed-off-by: Stefan Beller 
---
 diff.c | 40 +---
 1 file changed, 17 insertions(+), 23 deletions(-)

diff --git a/diff.c b/diff.c
index c2ed605cd0..4269b8dccf 100644
--- a/diff.c
+++ b/diff.c
@@ -517,33 +517,24 @@ static void check_blank_at_eof(mmfile_t *mf1, mmfile_t 
*mf2,
 }

 static void emit_line_0(struct diff_options *o, const char *set, const char 
*reset,
-   int first, const char *line, int len)
+   int sign, const char *line, int len)
 {
int has_trailing_newline, has_trailing_carriage_return;
-   int nofirst;
FILE *file = o->file;

fputs(diff_line_prefix(o), file);

-   if (len == 0) {
-   has_trailing_newline = (first == '\n');
-   has_trailing_carriage_return = (!has_trailing_newline &&
-   (first == '\r'));
-   nofirst = has_trailing_newline || has_trailing_carriage_return;
-   } else {
-   has_trailing_newline = (len > 0 && line[len-1] == '\n');
-   if (has_trailing_newline)
-   len--;
-   has_trailing_carriage_return = (len > 0 && line[len-1] == '\r');
-   if (has_trailing_carriage_return)
-   len--;
-   nofirst = 0;
-   }
+   has_trailing_newline = (len > 0 && line[len-1] == '\n');
+   if (has_trailing_newline)
+   len--;
+   has_trailing_carriage_return = (len > 0 && line[len-1] == '\r');
+   if (has_trailing_carriage_return)
+   len--;

-   if (len || !nofirst) {
+   if (len || sign) {
fputs(set, file);
-   if (!nofirst)
-   fputc(first, file);
+   if (sign)
+   fputc(sign, file);
fwrite(line, len, 1, file);
fputs(reset, file);
}
@@ -556,7 +547,7 @@ static void emit_line_0(struct diff_options *o, const char 
*set, const char *res
 static void emit_line(struct diff_options *o, const char *set, const char 
*reset,
  const char *line, int len)
 {
-   emit_line_0(o, set, reset, line[0], line+1, len-1);
+   emit_line_0(o, set, reset, 0, line, len);
 }


Maybe this function is unnecessary now that emit_line_0 can take the 
line directly.




 static int new_blank_line_at_eof(struct emit_callback *ecbdata, const char 
*line, int len)
@@ -4822,9 +4813,12 @@ void diff_flush(struct diff_options *options)

if (output_format & DIFF_FORMAT_PATCH) {
if (separator) {
-   fprintf(options->file, "%s%c",
-   diff_line_prefix(options),
-   options->line_termination);
+   char term[2];
+   term[0] = options->line_termination;
+ 

Re: [PATCH v2] send-email: support validate hook

2017-05-15 Thread Jonathan Tan

On 05/14/2017 06:55 PM, Junio C Hamano wrote:

Jonathan Tan  writes:


diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index 706091a56..b2514f4d4 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -447,6 +447,14 @@ rebase::
 The commits are guaranteed to be listed in the order that they were
 processed by rebase.

+sendemail-validate
+~~
+
+This hook is invoked by 'git send-email'.  It takes a single parameter,
+the name of the file that holds the e-mail to be sent.  Exiting with a
+non-zero status causes 'git send-email' to abort before sending any
+e-mails.
+


I briefly wondered if an interface that allows only the name of the
file will close the door to future enhancements, but in the end
decided it is OK.  E.g. users may want "here is the contents, is it
appropriate to be sent to this and that address?"---but this hook is
meant to enhances/extends the existing "make sure the contents do
not syntactically bust the technical limit of underlying transport",
and sits at a place best to do so in the codeflow, i.e. before we
even start to decide who the recipients of the patch are.  So those
who want "given the contents of the patch and list of the recipients,
decide if it is OK to send the patch to all of them" would be better
served by a separate hook, not this one.


+   write_script .git/hooks/sendemail-validate <<-\EOF &&
+   # test that we have the correct environment variable, pwd, and
+   # argument
+   case "$GIT_DIR" in
+   *.git)
+   true
+   ;;
+   *)
+   false
+   ;;
+   esac &&
+   test -e 0001-add-master.patch &&
+   grep "add master" "$1"
+   EOF


I'd squash in cosmetic changes to de-dent the contents of
write_script (our standard style is that the body of the script is
written as if the column at which write_script..EOF starts is the
left-edge of the page; I think this file already has a few style
violations that may want to be updated with a separate clean-up
patch when the file is quiet), and then de-dent the case arm (case
arms are indented at the same depth as case/esac).  Also I think we
care that 0001-add-master.patch not just exists but is a file, so
I'd do s/test -e/test -f/ while at it.

Thanks.


Thanks to you too. I agree with these changes - sorry, your previous 
reply must have slipped my mind when I was writing v2.


git rebase regression: cannot pass a shell expression directly to --exec

2017-05-15 Thread Eric Rannaud
Hi all,

It used to be possible to run a sequence like:

  foo() { echo X; }
  export -f foo
  git rebase --exec foo HEAD~10

Since upgrading to 2.13.0, I had to update my scripts to run:

  git rebase --exec "bash -c foo" HEAD~10

I'm not sure if this was an intended change. Bisecting with the
following script:

  #!/usr/bin/env bash

  make -j8 || exit 3

  function foo() {
  echo OK
  }
  export -f foo

  pushd tmp
  ../git --exec-path=.. rebase --exec foo HEAD^^
  ret=$?
  # Cleanup if failure
  ../git --exec-path=.. rebase --abort &> /dev/null
  popd
  exit $ret

It points to this commit:

commit 18633e1a22a68bbe8e6311a1039d13ebbf6fd041 (refs/bisect/bad)
Author: Johannes Schindelin 
Date:   Thu Feb 9 23:23:11 2017 +0100

rebase -i: use the rebase--helper builtin

Now that the sequencer learned to process a "normal" interactive rebase,
we use it. The original shell script is still used for "non-normal"
interactive rebases, i.e. when --root or --preserve-merges was passed.

Please note that the --root option (via the $squash_onto variable) needs
special handling only for the very first command, hence it is still okay
to use the helper upon continue/skip.

Also please note that the --no-ff setting is volatile, i.e. when the
interactive rebase is interrupted at any stage, there is no record of
it. Therefore, we have to pass it from the shell script to the
rebase--helper.

Note: the test t3404 had to be adjusted because the the error messages
produced by the sequencer comply with our current convention to start with
a lower-case letter.

Signed-off-by: Johannes Schindelin 
Signed-off-by: Junio C Hamano 


Thanks,
Eric


Re: [PATCH v2 21/29] grep: factor test for \0 in grep patterns into a function

2017-05-15 Thread Ævar Arnfjörð Bjarmason
On Mon, May 15, 2017 at 8:24 AM, Junio C Hamano  wrote:
> Ævar Arnfjörð Bjarmason   writes:
>
>> Factor the test for \0 in grep patterns into a function. Since commit
>> 9eceddeec6 ("Use kwset in grep", 2011-08-21) any pattern containing a
>> \0 is considered fixed as regcomp() can't handle it.
>>
>> This limitation was never documented, and other some regular
>> expression engines are capable of compiling a pattern containing a
>> \0. Factoring this out makes a subsequent change which does that
>> smaller.
>
> The latter half of the first sentence of this paragraph does not
> parse well, around "and other some".  "never documented" makes
> readers expect "... so let's document it", but I think you are
> driving in a different direction, something like...

I'll try to reword it.

> Since some other regular expression engines are capable of a
> pattern to match a string with NUL in it, it would be nice
> if we can use such an engine and match against NUL; as this
> "patterns with NUL always use --fixed match" is not documented,
> let's hope that nobody depend on that current behaviour.
> This step does not yet change the behaviour yet, but it
> makes it easier to do so in later steps.
>
> perhaps?  I tend to agree that it is OK to break users who depended
> on that "patterns with NULL match with --fixed" (partly because it
> is so unintuitive and not useful behaviour, and partly because it is
> easy for them to explicitly say -F), but let's make sure we clearly
> say that we will be knowingly breaking them.

I think there's a few different things going on here, which I'll
address accordingly.

 * Yes, if someone is relying on the undocumented behavior of a \0
pattern implying -F if and when I change that their stuff will break.
I'm going to actually just drop this whole discussion from this
commit, we can have an explanation for that when and if I actually
change it in a later series.

* I'm making things more confusing by saying "engines" here, but it's
important to note that git has never in its docs promised to use the C
library implementation of POSIX regexes, it's just advertised that it
uses POSIX regular *expressions*, which are a different thing.
Implementation != syntax.

E.g. GNU grep uses POSIX regex syntax, but its engine does not have
the same limitation as ours:

$ (command cd /tmp && printf "æ\0ð\n" >a && printf "æ\n" >> a;
printf "æ\0[öð]" >f; grep -a -f f a; echo $?)
æð
0

I went down this particular rabbit hole because I was genuinely
surprised that this didn't work with git-grep, until I realized that
of course the implementation detail of using C strings for this under
the hood was bleeding through.


Re: [PATCH v7] fetch-pack: always allow fetching of literal SHA1s

2017-05-15 Thread Jonathan Nieder
Jonathan Tan wrote:

> fetch-pack, when fetching a literal SHA-1 from a server that is not
> configured with uploadpack.allowtipsha1inwant (or similar), always
> returns an error message of the form "Server does not allow request for
> unadvertised object %s". However, it is sometimes the case that such
> object is advertised. This situation would occur, for example, if a user
> or a script was provided a SHA-1 instead of a branch or tag name for
> fetching, and wanted to invoke "git fetch" or "git fetch-pack" using
> that SHA-1.
>
> Teach fetch-pack to also check the SHA-1s of the refs in the received
> ref advertisement if a literal SHA-1 was given by the user.
>
> Helped-by: Jeff King 
> Signed-off-by: Jonathan Tan 
> ---
>  fetch-pack.c  | 40 ++--
>  t/t5500-fetch-pack.sh | 35 +++
>  2 files changed, 73 insertions(+), 2 deletions(-)

Reviewed-by: Jonathan Nieder 

I think we've done enough golf on this one.

Thanks for your patient work.
Jonathan


Re: [PATCH v2 20/29] grep: remove redundant `regflags &= ~REG_EXTENDED` assignments

2017-05-15 Thread Ævar Arnfjörð Bjarmason
On Mon, May 15, 2017 at 8:14 AM, Junio C Hamano  wrote:
> Ævar Arnfjörð Bjarmason   writes:
>
>> Remove redundant assignments to the "regflags" variable. There are no
>> code paths that have previously set the regflags to anything, and
>> certainly not to `|= REG_EXTENDED`.
>>
>> This code gave the impression that it had to reset its environment,
>> but it doesn't. This dates back to the initial introduction of
>> git-grep in commit 5010cb5fcc ("built-in "git grep"", 2006-04-30).
>
> Back in 5010cb5fcc, we did do "opt.regflags &= ~REG_EXTENDED" upon
> seeing "-G" on the command line and flipped the bit on upon seeing
> "-E", but I think that was perfectly sensible and it would have been
> a bug if we didn't.  They were part of the command line parsing that
> could have seen "-E" on the command line earlier.
>
> If we want to find a commit to assign blame to, I think it is more
> correct to say that this came from the same one as 19/29 fixes.
>
> When cca2c172 ("git-grep: do not die upon -F/-P when
> grep.extendedRegexp is set.", 2011-05-09) switched the command line
> parsing to "read into a 'tentatively this is what we saw the last'
> variable and then finally commit just once", we didn't touch
> opt.regflags for PCRE and FIXED, but we still had to flip regflags
> between BRE and ERE, because parsing of grep.extendedregexp
> configuration variable directly touched opt.regflags back then,
> which was done by b22520a3 ("grep: allow -E and -n to be turned on
> by default via configuration", 2011-03-30).  When 84befcd0 ("grep:
> add a grep.patternType configuration setting", 2012-08-03)
> introduced extended_regexp_option field, we stopped flipping
> regflags while reading the configuration, and that was when we
> should have noticed and stopped dropping REG_EXTENDED bit in the
> "now we can commit what type to use" helper function.

Thanks for the history. I'll amend the commit message to note this.

> In any case, I think this change is safe to do in the current
> codebase.  I wonder if this and 19/29 should be a single patch,
> though, as the unnecessary bit-flipping all are blamed to the same
> origin.

Squashing these two makes sense. I'll do that.

>> Signed-off-by: Ævar Arnfjörð Bjarmason 
>> ---
>>  grep.c | 5 +
>>  1 file changed, 1 insertion(+), 4 deletions(-)
>>
>> diff --git a/grep.c b/grep.c
>> index 59ae7809f2..bf6c2494fd 100644
>> --- a/grep.c
>> +++ b/grep.c
>> @@ -179,7 +179,6 @@ static void grep_set_pattern_type_option(enum 
>> grep_pattern_type pattern_type, st
>>   case GREP_PATTERN_TYPE_BRE:
>>   opt->fixed = 0;
>>   opt->pcre = 0;
>> - opt->regflags &= ~REG_EXTENDED;
>>   break;
>>
>>   case GREP_PATTERN_TYPE_ERE:
>> @@ -191,7 +190,6 @@ static void grep_set_pattern_type_option(enum 
>> grep_pattern_type pattern_type, st
>>   case GREP_PATTERN_TYPE_FIXED:
>>   opt->fixed = 1;
>>   opt->pcre = 0;
>> - opt->regflags &= ~REG_EXTENDED;
>>   break;
>>
>>   case GREP_PATTERN_TYPE_PCRE:
>> @@ -414,10 +412,9 @@ static void compile_fixed_regexp(struct grep_pat *p, 
>> struct grep_opt *opt)
>>  {
>>   struct strbuf sb = STRBUF_INIT;
>>   int err;
>> - int regflags;
>> + int regflags = opt->regflags;
>>
>>   basic_regex_quote_buf(, p->pattern);
>> - regflags = opt->regflags & ~REG_EXTENDED;
>>   if (opt->ignore_case)
>>   regflags |= REG_ICASE;
>>   err = regcomp(>regexp, sb.buf, regflags);


Re: [PATCH v2 18/29] grep: catch a missing enum in switch statement

2017-05-15 Thread Ævar Arnfjörð Bjarmason
On Mon, May 15, 2017 at 7:50 AM, Junio C Hamano  wrote:
> Ævar Arnfjörð Bjarmason   writes:
>
>> Add a die(...) to a default case for the switch statement selecting
>> between grep pattern types under --recurse-submodules.
>>
>> Normally this would be caught by -Wswitch, but the grep_pattern_type
>> type is converted to int by going through parse_options(). Changing
>> the argument type passed to compile_submodule_options() won't work,
>> the value will just get coerced. The -Wswitch-default warning will
>> warn about it, but that produces a lot of noise across the codebase,
>> this potential issue would be drowned in that noise.
>>
>> Thus catching this at runtime is the least worst option. This won't
>
> least bad, I guess?

Will fix.

>> ever trigger in practice, but if a new pattern type were to be added
>> this catches an otherwise silent bug during development.
>
> Good future-proofing.  When this and Peff's "BUG" thing both
> graduates, we can turn this into a BUG, I think.

Yup, this and a bunch of other stuff presumably. The BUG feature is nice.

> Thanks.
>
>>
>> See commit 0281e487fd ("grep: optionally recurse into submodules",
>> 2016-12-16) for the initial addition of this code.
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason 
>> ---
>>  builtin/grep.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/builtin/grep.c b/builtin/grep.c
>> index 3ffb5b4e81..a191e2976b 100644
>> --- a/builtin/grep.c
>> +++ b/builtin/grep.c
>> @@ -495,6 +495,8 @@ static void compile_submodule_options(const struct 
>> grep_opt *opt,
>>   break;
>>   case GREP_PATTERN_TYPE_UNSPECIFIED:
>>   break;
>> + default:
>> + die("BUG: Added a new grep pattern type without updating 
>> switch statement");
>>   }
>>
>>   for (pattern = opt->pattern_list; pattern != NULL;


Re: [PATCH v2 04/29] log: add exhaustive tests for pattern style options & config

2017-05-15 Thread Ævar Arnfjörð Bjarmason
On Mon, May 15, 2017 at 6:57 AM, Junio C Hamano  wrote:
> Ævar Arnfjörð Bjarmason   writes:
>
>> + echo 2e >expect &&
>> + # In PCRE \d in [\d] is like saying "0-9", and matches the 2
>> + # in 2e...
>> + git -C num_commits log -1 --pretty="tformat:%s" -F -E --perl-regexp 
>> --grep="[\d]" >actual &&
>> + test_cmp expect actual &&
>> + echo 1d >expect &&
>> + # ...in POSIX basic & extended it is the same as [d],
>> + # i.e. "d", which matches 1d, but not and does not match 2e.
>
> s/not and//; I think.

Will fix.

>> + git -C num_commits log -1 --pretty="tformat:%s" -F -E --grep="[\d]" 
>> >actual &&
>>   test_cmp expect actual
>>  '
>>
>> @@ -280,6 +301,77 @@ test_expect_success 'log with grep.patternType 
>> configuration and command line' '
>>   test_cmp expect actual
>>  '
>>
>> +test_expect_success 'log with various grep.patternType configurations & 
>> command-lines' '
>> + git init pattern-type &&
>> + (
>> + cd pattern-type &&
>> + test_commit 1 file A &&
>> +
>> + # The tagname is overridden here because creating a
>> + # tag called "(1|2)" as test_commit would otherwise
>> + # implicitly do would fail on e.g. MINGW.
>
> Thanks.
>
>> + # POSIX extended needs to have | escaped to match it
>> + # literally, whereas under basic this is the same as
>> + # (|2), i.e. it would also match "1". This test checks
>> + # for extended by asserting that it is not matching
>> + # what basic would match.
>> + git -c grep.patternType=extended log --pretty=tformat:%s \
>> + --grep="\|2" >actual.extended &&
>
> Makes sense.
>
>> + if test_have_prereq PCRE
>> + then
>> + # Only PCRE would match [\d]\| with only
>> + # "(1|2)" due to [\d]. POSIX basic would match
>> + # both it and "1", and POSIX extended would
>> + # match neither.
>
> OK.  BRE would match because the other side of "\|" is empty to
> match anything?

Yes. I'll clarify this. It's not just a POSIX basic feature. The same
is true of extended and perl. E.g.:

git grep [-E|-P] 'foo|bar'

Both match the same as:

git grep [-E|-P] '(foo|bar)'

>> + git -c grep.patternType=perl log --pretty=tformat:%s \
>> + --grep="[\d]\|" >actual.perl &&
>> + test_cmp expect.perl actual.perl
>> + fi &&
>


[PATCH v7] fetch-pack: always allow fetching of literal SHA1s

2017-05-15 Thread Jonathan Tan
fetch-pack, when fetching a literal SHA-1 from a server that is not
configured with uploadpack.allowtipsha1inwant (or similar), always
returns an error message of the form "Server does not allow request for
unadvertised object %s". However, it is sometimes the case that such
object is advertised. This situation would occur, for example, if a user
or a script was provided a SHA-1 instead of a branch or tag name for
fetching, and wanted to invoke "git fetch" or "git fetch-pack" using
that SHA-1.

Teach fetch-pack to also check the SHA-1s of the refs in the received
ref advertisement if a literal SHA-1 was given by the user.

Helped-by: Jeff King 
Signed-off-by: Jonathan Tan 
---

OK, one combined function (for lazy initialization and checking
containment in the oidset) with comment it is.

Change from v6: changed back to "tip_oids_contain", and included Peff's
comment.
---
 fetch-pack.c  | 40 ++--
 t/t5500-fetch-pack.sh | 35 +++
 2 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index afb8b0502..703e7ec78 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -15,6 +15,7 @@
 #include "version.h"
 #include "prio-queue.h"
 #include "sha1-array.h"
+#include "oidset.h"
 
 static int transfer_unpack_limit = -1;
 static int fetch_unpack_limit = -1;
@@ -592,13 +593,38 @@ static void mark_recent_complete_commits(struct 
fetch_pack_args *args,
}
 }
 
+static void add_refs_to_oidset(struct oidset *oids, struct ref *refs)
+{
+   for (; refs; refs = refs->next)
+   oidset_insert(oids, >old_oid);
+}
+
+static int tip_oids_contain(struct oidset *tip_oids,
+   struct ref *unmatched, struct ref *newlist,
+   const struct object_id *id)
+{
+   /*
+* Note that this only looks at the ref lists the first time it's
+* called. This works out in filter_refs() because even though it may
+* add to "newlist" between calls, the additions will always be for
+* oids that are already in the set.
+*/
+   if (!tip_oids->map.tablesize) {
+   add_refs_to_oidset(tip_oids, unmatched);
+   add_refs_to_oidset(tip_oids, newlist);
+   }
+   return oidset_contains(tip_oids, id);
+}
+
 static void filter_refs(struct fetch_pack_args *args,
struct ref **refs,
struct ref **sought, int nr_sought)
 {
struct ref *newlist = NULL;
struct ref **newtail = 
+   struct ref *unmatched = NULL;
struct ref *ref, *next;
+   struct oidset tip_oids = OIDSET_INIT;
int i;
 
i = 0;
@@ -631,7 +657,8 @@ static void filter_refs(struct fetch_pack_args *args,
ref->next = NULL;
newtail = >next;
} else {
-   free(ref);
+   ref->next = unmatched;
+   unmatched = ref;
}
}
 
@@ -648,7 +675,9 @@ static void filter_refs(struct fetch_pack_args *args,
continue;
 
if ((allow_unadvertised_object_request &
-   (ALLOW_TIP_SHA1 | ALLOW_REACHABLE_SHA1))) {
+(ALLOW_TIP_SHA1 | ALLOW_REACHABLE_SHA1)) ||
+   tip_oids_contain(_oids, unmatched, newlist,
+>old_oid)) {
ref->match_status = REF_MATCHED;
*newtail = copy_ref(ref);
newtail = &(*newtail)->next;
@@ -656,6 +685,13 @@ static void filter_refs(struct fetch_pack_args *args,
ref->match_status = REF_UNADVERTISED_NOT_ALLOWED;
}
}
+
+   oidset_clear(_oids);
+   for (ref = unmatched; ref; ref = next) {
+   next = ref->next;
+   free(ref);
+   }
+
*refs = newlist;
 }
 
diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index b5865b385..80a1a3239 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -547,6 +547,41 @@ test_expect_success 'fetch-pack can fetch a raw sha1' '
git fetch-pack hidden $(git -C hidden rev-parse refs/hidden/one)
 '
 
+test_expect_success 'fetch-pack can fetch a raw sha1 that is advertised as a 
ref' '
+   rm -rf server client &&
+   git init server &&
+   test_commit -C server 1 &&
+
+   git init client &&
+   git -C client fetch-pack ../server \
+   $(git -C server rev-parse refs/heads/master)
+'
+
+test_expect_success 'fetch-pack can fetch a raw sha1 overlapping a named ref' '
+   rm -rf server client &&
+   git init server &&
+   test_commit -C server 1 &&
+   test_commit -C server 2 &&
+
+   git init client &&
+   git -C client fetch-pack ../server \
+   $(git -C server rev-parse refs/tags/1) refs/tags/1

Re: [GSoC][RFC/PATCH v3 2/2] submodule: port subcommand foreach from shell to C

2017-05-15 Thread Stefan Beller
So we're looking for more reviewers for this patch,
one way to do it is e.g. via
$ git shortlog --since 12.month -sne -- ./git-submodule.sh
(so I cc'd Brandon)
Another way to find reviewers is
$ git blame ./git-submodule.sh
(so I cc'd Anders and Johan)

> It can been seen that the patch fails in test #9
> of t7407-submodule-foreach, which is the newly added
> test to that suite. The main reason of adding this test
> was to bring the behavior of $path for the submodule
> foreach --recursive case.
>
> The observation made was as follows:
>
> For a project - super containing dir (not a submodule)
> and a submodule sub which contains another submodule
> subsub. When we run a command from super/dir:
>
> git submodule foreach "echo \$path-\$sm_path"
>
> actual results:
> Entering '../sub'
> ../sub-../sub
> Entering '../sub/subsub'
> ../subsub-../subsub
>
> ported function's result:
> Entering '../sub'
> sub-../sub
> Entering '../sub/subsub'
> subsub-../sub/subsub
>
> This is occurring since in cmd_foreach of git-submodule.sh
> when we use to recurse, we call cmd_foreach
> and hence the process ran in the same shell.
> Because of this, the variable $wt_prefix is set only once
> which is at the beginning of the submodule foreach execution.
> wt_prefix=$(git rev-parse --show-prefix)
>
> And since sm_path and path are set using $wt_prefix as :
> sm_path=$(git submodule--helper relative-path "$sm_path" "$wt_prefix") &&
> path=$sm_path
> It differs with the value of displaypath as well.
>
> This make the value of $path confusing and I also feel it
> deviates from its documentation:
> $path is the name of the submodule directory relative
> to the superproject.
>
> But since in refactoring the code, we wish to maintain the
> code in same way, we need to pass wt_prefix on every
> recursive call, which may result in complex C code.
> Another option could be to first correct the $path value
> in git-submodule.sh and then port the updated cmd_foreach.
>

We discussed this patch off list before and I think the code is fine,
two minor nits below. But this observation is something we'd want
to talk about here on list. Sorry for dropping the ball for 3 days.

If we do not apply patch 1, the test suite passes, as far as I observe
the situation and patch 1 introduces a test that passes the shell
foreach, but breaks here, IIUC.

Specifically the $sm_path breaks the test in the sense that it is
actually correct now for nested submodules when the command
in the superproject is run from within a directory.

I think this is a rare bug, that we can just fix along the way, i.e.
mention in the commit message that we fix it and adapt the test
such that it passes with the rewrite in C, here.

> +   argv_array_pushf(_array, "path=%s", list_item->name);

git-am claims there are trailing white spaces.
$ git am 
Applying: submodule: port subcommand foreach from shell to C
.git/rebase-apply/patch:158: trailing whitespace.
argv_array_pushf(_array, "path=%s", list_item->name);
.git/rebase-apply/patch:172: trailing whitespace.
warning: 2 lines add whitespace errors.

Not sure which editor you use, but I could configure my editor to strip
trailing whitespaces on save, and had never any issues with them
since.

> +   struct cb_foreach info;
..
> +
> +   memset(, 0, sizeof(info));

Instead of this memset, you could also have a #define CB_FOREACH_INIT
similar to e.g. MODULE_LIST_INIT, STRBUF_INIT, STRING_LIST_INIT and so on.
This memset works, too.

Thanks,
Stefan


Re: [PATCH v2 27/29] pack-objects: fix buggy warning about threads

2017-05-15 Thread Ævar Arnfjörð Bjarmason
On Mon, May 15, 2017 at 10:59 AM, Junio C Hamano  wrote:
> Ævar Arnfjörð Bjarmason   writes:
>
>> Fix a buggy warning about threads under NO_PTHREADS=YesPlease. Due to
>> re-using the delta_search_threads variable for both the state of the
>> "pack.threads" config & the --threads option, setting "pack.threads"
>> but not supplying --threads would trigger the warning for both
>> "pack.threads" & --threads.
>>
>> Solve this bug by resetting the delta_search_threads variable in
>> git_pack_config(), it might then be set by --threads again and be
>> subsequently warned about, as the test I'm changing here asserts.
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason 
>> ---
>>  builtin/pack-objects.c | 4 +++-
>>  t/t5300-pack-object.sh | 3 +--
>>  2 files changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
>> index 0fe35d1b5a..f1baf05dfe 100644
>> --- a/builtin/pack-objects.c
>> +++ b/builtin/pack-objects.c
>> @@ -2472,8 +2472,10 @@ static int git_pack_config(const char *k, const char 
>> *v, void *cb)
>>   die("invalid number of threads specified (%d)",
>>   delta_search_threads);
>>  #ifdef NO_PTHREADS
>> - if (delta_search_threads != 1)
>> + if (delta_search_threads != 1) {
>>   warning("no threads support, ignoring %s", k);
>> + delta_search_threads = 0;
>> + }
>>  #endif
>>   return 0;
>>   }
>> diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
>> index 1629fa80b0..0ec25c4966 100755
>> --- a/t/t5300-pack-object.sh
>> +++ b/t/t5300-pack-object.sh
>> @@ -445,8 +445,7 @@ test_expect_success !PTHREADS,C_LOCALE_OUTPUT 
>> 'pack-objects --threads=N or pack.
>>   git -c pack.threads=2 pack-objects --stdout --all > >/dev/null 2>err &&
>>   cat err &&
>>   grep ^warning: err >warnings &&
>> - test_line_count = 2 warnings &&
>> - grep -F "no threads support, ignoring --threads" err &&
>> + test_line_count = 1 warnings &&
>>   grep -F "no threads support, ignoring pack.threads" err &&
>>   git -c pack.threads=2 pack-objects --threads=4 --stdout --all 
>> /dev/null 2>err &&
>>   grep ^warning: err >warnings &&
>
> Commenting on both 26 and 27.
>
> The usual way we document a known breakage to be fixed is to write a
> test that checks for the desired outcome with test_expect_failure,
> and when a patch corrects the behaviour we just flip it to expect
> success instead.  On the other hand, when we document a behaviour
> that is accepted/acceptable we would have a test that checks for the
> then-accepted behaviour with test_expect_success, and a patch that
> improves the behaviour would update the expectation.
>
> This follows the second pattern, even though the log message for the
> patches claim this is about an existing bug and its fix.
>
> Now, I am not saying (at least not yet) that 26 & 27 violates the
> established practice and they must be changed to expect seeing only
> one line of output in warnings with test_expect_failure in patch 26
> which is flipped to test_expect_success in patch 27.  Yes, it does
> not follow the usual pattern, but it gives a good food-for-thought.
>
> Perhaps our usual pattern may be suboptimal in illustrating in what
> way the current behaviour is not desirable, and the way these two
> patches document the current breakage and then documents the fixed
> behaviour may be a better example to follow?  With our usual way, it
> is not apparent until you actually run the tests with the current
> code what the questionable current behaviour is, but with the way
> patch 26 is written, we can tell that two warnings are given,
> instead of one.
>
> I dunno.  What do others think?

I think it makes sense to make use of test_expect_failure here. I'll
do that in v3.


Is it possible to use new conditional includes outside of git dir?

2017-05-15 Thread Mihails Strasuns
Hello,

I was very excited to try out new conditional include feature but have
quickly found out that it doesn't work with some of my custom scripts
because of the following behaviour:

# .gitconfig
[includeIf "gitdir:~/work/"]
path = ~/.gitconfig.work
[user]
name1 = Someone

# .gitconfig.work
[user]
name2 = SomeoneElse
$ cd ~/work; git config user.name1 # "Someone"
$ cd ~/work ; git config user.name2 # 

I see that it matches current `git config` documentation - indeed, it
mentions only matching on git directory and not arbitrary file path. I
wonder though, if this limitation is intentional and what is the
rationale behind it?

Thank you,
Michael



signature.asc
Description: OpenPGP digital signature


Re: [RFC PATCH 00/19] Diff machine: highlight moved lines.

2017-05-15 Thread Stefan Beller
On Mon, May 15, 2017 at 5:43 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> For details on *why* see the commit message of the last commit.
>
> Luckily, we have a good test case to see how effective this approach
> is in the flight.  Running
>
>   $ git diff master...'pu^{/Merge branch .js/blame-lib}'^2
>
> with your new feature should tell us that bulk of blame.[ch] that
> are new files came from builtin/blame.c with some symbols renamed.

Oh! Yeah, that looks nice. Though looking through that[1] it seems not yet
optimal to me.

So we have 2 additional colors for moved code to differentiate between
adjacent moved blocks. However in this implementation we toggle between
these two colors even if we're not adjacent, such that the moved_alternative
color shows up quite frequently.

So if we have normal lines in between, we may want to reset to "default"
moved color.

[1] http://i.imgur.com/djAoTf8.png

Thanks for pointing out this series,
Stefan


Git just passed Subversion on openhub.net

2017-05-15 Thread Øyvind A . Holm
openhub.net has a comparion of the number of public repositories on the 
net, based on searching public hosting services on the net. Git just 
passed Subversion after the number of Git repositories has exploded 
lately. It seems as lots of new repositories were created after cpython 
changed to Git in February.

I've been tracking the development on 
 since 2014-08, and all 
the data since then are availble on

  https://gitlab.com/sunny256/openhub-repositories
and
  https://github.com/sunny256/openhub-repositories

Current status:

 https://gitlab.com/sunny256/openhub-repositories/blob/master/status.txt

SVG graphs are available on

  https://gitlab.com/sunny256/openhub-repositories/tree/master/graph

Regards,
Øyvind

+-| Øyvind A. Holm  - N 60.37604° E 5.9° |-+
| OpenPGP: 0xFB0CBEE894A506E5 - http://www.sunbase.org/pubkey.asc |
| Fingerprint: A006 05D6 E676 B319 55E2  E77E FB0C BEE8 94A5 06E5 |
+| 27e0042e-3985-11e7-b3fc-db5caa6d21d3 |-+


signature.asc
Description: PGP signature


Re: [PATCH v3] builtin/log: honor log.decorate

2017-05-15 Thread Jonathan Nieder
Junio C Hamano wrote:
> "brian m. carlson"  writes:

>> The recent change that introduced autodecorating of refs accidentally
>> broke the ability of users to set log.decorate = false to override it.
>> When the git_log_config was traversed a second time with an option other
>> than log.decorate, the decoration style would be set to the automatic
>> style, even if the user had already overridden it.  Instead of setting
>> the option in config parsing, set it in init_log_defaults instead.
>>
>> Add a test for this case.  The actual additional config option doesn't
>> matter, but it needs to be something not already set in the
>> configuration file.
>>
>> Signed-off-by: brian m. carlson 
>> ---
>> Changes from v2:
>> * Add a test.  I tested that the config parsing both works with
>>   additional options and also can be overridden from the command line.
>
> Thanks, all.
>
> Will queue with Acked-by by Alex and Reviewed-by by Jonathan.

For completeness: yeah, this version is Reviewed-by: me.  Thanks, all.


Re: [PATCH 02/19] diff: move line ending check into emit_hunk_header

2017-05-15 Thread Stefan Beller
On Sun, May 14, 2017 at 11:48 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> In a later patch, I want to propose an option to detect
>> moved lines in a diff, which cannot be done in a one-pass over
>> the diff. Instead we need to go over the whole diff twice,
>> because we cannot detect the first line of the two corresponding
>> lines (+ and -) that got moved.
>>
>> So to prepare the diff machinery for two pass algorithms
>> (i.e. buffer it all up and then operate on the result),
>> move all emissions to places, such that the only emitting
>> function is emit_line_0.
>>
>> This patch moves code that is conceptually part of
>> emit_hunk_header, but was using output in fn_out_consume,
>> back to emit_hunk_header.
>
> Makes sort-of sense.  If I were selling this patch, I'd remove the
> first two paragraph and stress on how completing the line inside
> emit_hunk_header() is conceptually cleaner than doing it outside.
>
> emit_hunk_header() function is responsible for assembling a
> hunk header and calling emit_line() to send the hunk header
> to the output file.  Its only caller fn_out_consume() needs
> to prepare for a case where the function emits an incomplete
> line and add the terminating LF.
>
> Instead make sure emit_hunk_header() to always send a
> completed line to emit_line().
>
> or something like that.
>
> Note that I am not saying "buffering the entire diff in-core?  why
> should we support such a use case?".  I am saying that this change
> is a clean-up that is justifiable, without having to answer such a
> question.

Right, the first couple patches are more cleanup than preparation.
I considered sending them on their own, but then decided to rather
include it in this series.

I'll reword the commit message for a resend.

Thanks,
Stefan


Re: [PATCH] tag: duplicate mention of --contains should mention --no-contains

2017-05-15 Thread Marc Branchaud

On 2017-05-15 11:01 AM, Ævar Arnfjörð Bjarmason wrote:

On Mon, May 15, 2017 at 4:20 PM, Marc Branchaud  wrote:

On 2017-05-15 08:23 AM, Ævar Arnfjörð Bjarmason wrote:


Fix a duplicate mention of --contains in the SYNOPSIS to mention
--no-contains.

This fixes an error introduced in my commit ac3f5a3468 ("ref-filter:
add --no-contains option to tag/branch/for-each-ref", 2017-03-24).

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/git-tag.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
index f8a0b787f4..1eb15afa1c 100644
--- a/Documentation/git-tag.txt
+++ b/Documentation/git-tag.txt
@@ -12,7 +12,7 @@ SYNOPSIS
 'git tag' [-a | -s | -u ] [-f] [-m  | -F ]
 [ | ]
 'git tag' -d ...
-'git tag' [-n[]] -l [--contains ] [--contains ]
+'git tag' [-n[]] -l [--contains ] [--no-contains ]



I think

[--[no-]contains ]

is the usual pattern...


[--points-at ] [--column[=] | --no-column]
[--create-reflog] [--sort=] [--format=]
[--[no-]merged []] [...]



... like with --[no-]merged, there.

M.


Thanks for the feedback, this was discussed earlier in the series and
we decided on the current format I'm submitting here.

Saying --[no-]merged is the convention we use for options where the
two are mutually exclusive, as is the case with the --[no-]merged
options:

$ git tag --merged v2.12.0 --no-merged v2.13.0
error: option `no-merged' is incompatible with --merged
[...]

But in the case of --contains and --no-contains you can provide both:

$ git tag --contains v2.12.0 --no-contains v2.13.0 'v*'
v2.12.0
v2.12.1
v2.12.2
v2.12.3
v2.13.0-rc0
v2.13.0-rc1
v2.13.0-rc2

So they don't use that convention, since it would imply that they're
mutually exclusive, rather than both being optional.


Ah, thanks.  My mistake!

M.



[PATCH 2/2] mingw: Suppress warning that :.gitattributes does not exist

2017-05-15 Thread Johannes Schindelin
On Windows, a file name containing a colon is illegal. We should
therefore expect the corresponding errno when `fopen()` is called for a
path of the form :.gitattributes.

This fixes the symptom reported in

https://github.com/git-for-windows/git/issues/255

Signed-off-by: Johannes Schindelin 
---
 attr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 7e2134471cb..a0297088633 100644
--- a/attr.c
+++ b/attr.c
@@ -726,7 +726,7 @@ static struct attr_stack *read_attr_from_file(const char 
*path, int macro_ok)
int lineno = 0;
 
if (!fp) {
-   if (errno != ENOENT && errno != ENOTDIR)
+   if (errno != ENOENT && errno != ENOTDIR && errno != EINVAL)
warn_on_inaccessible(path);
return NULL;
}
-- 
2.13.0.windows.1


[PATCH 1/2] gitattributes: demonstrate that Git tries to read a bogus file

2017-05-15 Thread Johannes Schindelin
This problem has been reported originally in August 2015, as

https://github.com/git-for-windows/git/issues/255

The symptom: when passing :/ style arguments to
`git diff`, Git tries to read the attributes from a file called
:/.gitattributes.

This symptom is more prominent on Windows because the colon in the file
name is illegal, and therefore reported to the user. On Linux, the colon
is legal, and it just so happens that that file typically does not
exist, and therefore there are no adverse consequences.

However, it is still a bug: Git should not even attempt to open that
file. Let's add a test case to demonstrate that problem, even on Linux
and MacOSX.

The underlying problem will be really tricky to fix: the run_diff*()
family of functions expects the path passed via the diff_filespec
structs to be the path as if it were in the worktree. However, when
processing the `git diff  ` invocation, Git uses
setup_revisions()'s parsing of the pending objects to fill in this
information, and setup_revisions() simply copies the command-line
argument, rather than reconstructing the actual *file* path.

Signed-off-by: Johannes Schindelin 
---
 t/t0003-attributes.sh | 21 +
 1 file changed, 21 insertions(+)

diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index f19ae4f8ccd..a7820022d8d 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -323,4 +323,25 @@ test_expect_success 'bare repository: test 
info/attributes' '
)
 '
 
+test_expect_failure 'a file :.gitattributes is ignored' '
+   git init bogus-file &&
+   (
+   cd bogus-file &&
+   mkdir sub &&
+   test_commit sub/file &&
+   test_commit sub/file2 &&
+   commit=$(git rev-parse HEAD) &&
+   if test_have_prereq !MINGW
+   then
+   # Windows does not support colons in filenames
+   mkdir $commit:sub &&
+   echo "* -inva/id" >$commit:sub/.gitattributes
+   fi &&
+   git diff $commit:sub/file.t..$commit:sub/file2.t >out 2>err &&
+   ! grep "is not a valid attribute name" err &&
+   # On Windows, there will be a warning because of the colon
+   ! grep "warning: unable to access .$commit:sub" err
+   )
+'
+
 test_done
-- 
2.13.0.windows.1




  1   2   >