What's cooking in git.git (Oct 2018, #04; Fri, 19)

2018-10-18 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

Two large set of topics on "rebase in C" and "rebase -i in C" are
now in 'next'.  A handful of improvements around "git help", "git
grep", and "git status" are in 'master', as well as other internal
changes.

Note that "cf. " are not meant as "clear all of these and
you are home free".  They are primarily to prevent me from merging
topics that are not ready to 'next' by mistake and remind me where
to go back to read the last state of the discussion in the archive.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* bp/read-cache-parallel (2018-10-11) 7 commits
  (merged to 'next' on 2018-10-12 at ed6edde799)
 + read-cache: load cache entries on worker threads
 + ieot: add Index Entry Offset Table (IEOT) extension
 + read-cache: load cache extensions on a worker thread
 + config: add new index.threads config setting
 + eoie: add End of Index Entry (EOIE) extension
 + read-cache: clean up casting and byte decoding
 + read-cache.c: optimize reading index format v4

 A new extension to the index file has been introduced, which allows
 the file to be read in parallel.


* bp/rename-test-env-var (2018-09-28) 6 commits
  (merged to 'next' on 2018-10-12 at 201e451d20)
 + t: do not get self-test disrupted by environment warnings
 + preload-index: update GIT_FORCE_PRELOAD_TEST support
 + read-cache: update TEST_GIT_INDEX_VERSION support
 + fsmonitor: update GIT_TEST_FSMONITOR support
 + preload-index: use git_env_bool() not getenv() for customization
 + t/README: correct spelling of "uncommon"

 Some environment variables that control the runtime options of Git
 used during tests are getting renamed for consistency.


* ds/commit-graph-leakfix (2018-10-07) 3 commits
  (merged to 'next' on 2018-10-12 at 8cc7f2f1e9)
 + commit-graph: reduce initial oid allocation
 + builtin/commit-graph.c: UNLEAK variables
 + commit-graph: clean up leaked memory during write

 Code clean-up.


* jc/how-to-document-api (2018-09-29) 1 commit
  (merged to 'next' on 2018-10-12 at 7c9bd82285)
 + CodingGuidelines: document the API in *.h files

 Doc update.


* jt/avoid-ls-refs (2018-10-07) 4 commits
  (merged to 'next' on 2018-10-12 at 5775aabbc1)
 + fetch: do not list refs if fetching only hashes
 + transport: list refs before fetch if necessary
 + transport: do not list refs if possible
 + transport: allow skipping of ref listing

 Over some transports, fetching objects with an exact commit object
 name can be done without first seeing the ref advertisements.  The
 code has been optimized to exploit this.


* jt/cache-tree-allow-missing-object-in-partial-clone (2018-10-10) 1 commit
  (merged to 'next' on 2018-10-12 at 152ad8e336)
 + cache-tree: skip some blob checks in partial clone

 In a partial clone that will lazily be hydrated from the
 originating repository, we generally want to avoid "does this
 object exist (locally)?" on objects that we deliberately omitted
 when we created the clone.  The cache-tree codepath (which is used
 to write a tree object out of the index) however insisted that the
 object exists, even for paths that are outside of the partial
 checkout area.  The code has been updated to avoid such a check.


* jt/fetch-tips-in-partial-clone (2018-09-21) 2 commits
  (merged to 'next' on 2018-10-12 at 521b3fb44d)
 + fetch: in partial clone, check presence of targets
 + connected: document connectivity in partial clones

 "git fetch $repo $object" in a partial clone did not correctly
 fetch the asked-for object that is referenced by an object in
 promisor packfile, which has been fixed.


* jt/non-blob-lazy-fetch (2018-10-04) 2 commits
  (merged to 'next' on 2018-10-12 at 7466c6bd7d)
 + fetch-pack: exclude blobs when lazy-fetching trees
 + fetch-pack: avoid object flags if no_dependents

 A partial clone that is configured to lazily fetch missing objects
 will on-demand issue a "git fetch" request to the originating
 repository to fill not-yet-obtained objects.  The request has been
 optimized for requesting a tree object (and not the leaf blob
 objects contained in it) by telling the originating repository that
 no blobs are needed.


* nd/complete-fetch-multiple-args (2018-09-21) 1 commit
  (merged to 'next' on 2018-10-10 at f78e14123c)
 + completion: support "git fetch --multiple"

 Teach bash completion that "git fetch --multiple" only takes remote
 names as arguments and no refspecs.


* nd/help-commands-verbose-by-default (2018-10-03) 1 commit
  (merged to 'next' on 2018-10-12 at 32de8f53e0)
 + help -a: improve and make --verbose default

 "git help -a" and "git 

Re: [PATCH] receive: denyCurrentBranch=updateinstead should not blindly update

2018-10-18 Thread Eric Sunshine
On Fri, Oct 19, 2018 at 12:57 AM Junio C Hamano  wrote:
> The handling of receive.denyCurrentBranch=updateInstead was added to
> a switch statement that handles other values of the variable, but
> all the other case arms only checked a condition to reject the
> attempted push, or let later logic in the same function to still
> intervene, so that a push that does not fast-forward (which is
> checked after the switch statement in question) is still rejected.
>
> But the handling of updateInstead incorrectly took immediate effect,
> without giving other checks a chance to intervene.
>
> Instead of calling update_worktree() that causes the side effect
> immediately, just note the fact that we will need to call the
> funciton later, and first give other checks chance to reject the

s/funciton/function
s/chance/a &/

> request.  After the update-hook gets a chance to reject the push
> (which happens as the last step in a series of checks), call
> update_worktree() when we earlier detected the need to.
>
> Reported-by: Rajesh Madamanchi
> Signed-off-by: Junio C Hamano 


Re: [PATCH 1/1] commit-reach: fix first-parent heuristic

2018-10-18 Thread Junio C Hamano
"Derrick Stolee via GitGitGadget"  writes:

> We can also re-run the performance tests from commit 4fbcca4e
> "commit-reach: make can_all_from_reach... linear".
>
> Performance was measured on the Linux repository using
> 'test-tool reach can_all_from_reach'. The input included rows seeded by
> tag values. The "small" case included X-rows as v4.[0-9]* and Y-rows as
> v3.[0-9]*. This mimics a (very large) fetch that says "I have all major
> v3 releases and want all major v4 releases." The "large" case included
> X-rows as "v4.*" and Y-rows as "v3.*". This adds all release-candidate
> tags to the set, which does not greatly increase the number of objects
> that are considered, but does increase the number of 'from' commits,
> demonstrating the quadratic nature of the previous code.

These micro-benchmarks are interesting, but we should also remember
to describe the impact of the bug being fixed in the larger picture.
What end-user visible operations are affected?  Computing merge-base?
Finding if a push fast-forwards?  Something else?


[PATCH] receive: denyCurrentBranch=updateinstead should not blindly update

2018-10-18 Thread Junio C Hamano
The handling of receive.denyCurrentBranch=updateInstead was added to
a switch statement that handles other values of the variable, but
all the other case arms only checked a condition to reject the
attempted push, or let later logic in the same function to still
intervene, so that a push that does not fast-forward (which is
checked after the switch statement in question) is still rejected.

But the handling of updateInstead incorrectly took immediate effect,
without giving other checks a chance to intervene.

Instead of calling update_worktree() that causes the side effect
immediately, just note the fact that we will need to call the
funciton later, and first give other checks chance to reject the
request.  After the update-hook gets a chance to reject the push
(which happens as the last step in a series of checks), call
update_worktree() when we earlier detected the need to.

Reported-by: Rajesh Madamanchi
Signed-off-by: Junio C Hamano 
---
 builtin/receive-pack.c | 12 +---
 t/t5516-fetch-push.sh  |  8 +++-
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 95740f4f0e..79ee320948 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1026,6 +1026,7 @@ static const char *update(struct command *cmd, struct 
shallow_info *si)
const char *ret;
struct object_id *old_oid = &cmd->old_oid;
struct object_id *new_oid = &cmd->new_oid;
+   int do_update_worktree = 0;
 
/* only refs/... are allowed */
if (!starts_with(name, "refs/") || check_refname_format(name + 5, 0)) {
@@ -1051,9 +1052,8 @@ static const char *update(struct command *cmd, struct 
shallow_info *si)
refuse_unconfigured_deny();
return "branch is currently checked out";
case DENY_UPDATE_INSTEAD:
-   ret = update_worktree(new_oid->hash);
-   if (ret)
-   return ret;
+   /* pass -- let other checks intervene first */
+   do_update_worktree = 1;
break;
}
}
@@ -1118,6 +1118,12 @@ static const char *update(struct command *cmd, struct 
shallow_info *si)
return "hook declined";
}
 
+   if (do_update_worktree) {
+   ret = update_worktree(new_oid->hash);
+   if (ret)
+   return ret;
+   }
+
if (is_null_oid(new_oid)) {
struct strbuf err = STRBUF_INIT;
if (!parse_object(the_repository, old_oid)) {
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index 7a8f56db53..7316365a24 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -1577,7 +1577,13 @@ test_expect_success 'receive.denyCurrentBranch = 
updateInstead' '
test $(git -C .. rev-parse master) = $(git rev-parse HEAD) &&
git diff --quiet &&
git diff --cached --quiet
-   )
+   ) &&
+
+   # (6) updateInstead intervened by fast-forward check
+   test_must_fail git push void master^:master &&
+   test $(git -C void rev-parse HEAD) = $(git rev-parse master) &&
+   git -C void diff --quiet &&
+   git -C void diff --cached --quiet
 '
 
 test_expect_success 'updateInstead with push-to-checkout hook' '
-- 
2.19.1-450-ga4b8ab5363



[PATCH v3] fetch: replace string-list used as a look-up table with a hashmap

2018-10-18 Thread Junio C Hamano
In find_non_local_tags() helper function (used to implement the
"follow tags"), we use string_list_has_string() on two string lists
as a way to see if a refname has already been processed, etc.

All this code predates more modern in-core lookup API like hashmap;
replace them with two hashmaps and one string list---the hashmaps
are used for look-ups and the string list is to keep the order of
items in the returned result stable (which is the only single thing
hashmap does worse than lookups on string-list).

Similarly, get_ref_map() uses another string-list as a look-up table
for existing refs.  Replace it with a hashmap.

Signed-off-by: Junio C Hamano 
---

 * This converts another string-list user in the same file.  For now
   I am done with this topic; I'm willing to fix bugs in this patch,
   but do not intend to spend time killing more string-lists used as
   look-up tables.

 builtin/fetch.c | 152 +---
 1 file changed, 106 insertions(+), 46 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 0696abfc2a..0f8e333022 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -223,18 +223,6 @@ static void add_merge_config(struct ref **head,
}
 }
 
-static int add_existing(const char *refname, const struct object_id *oid,
-   int flag, void *cbdata)
-{
-   struct string_list *list = (struct string_list *)cbdata;
-   struct string_list_item *item = string_list_insert(list, refname);
-   struct object_id *old_oid = xmalloc(sizeof(*old_oid));
-
-   oidcpy(old_oid, oid);
-   item->util = old_oid;
-   return 0;
-}
-
 static int will_fetch(struct ref **head, const unsigned char *sha1)
 {
struct ref *rm = *head;
@@ -246,16 +234,76 @@ static int will_fetch(struct ref **head, const unsigned 
char *sha1)
return 0;
 }
 
+struct refname_hash_entry {
+   struct hashmap_entry ent; /* must be the first member */
+   struct object_id oid;
+   char refname[FLEX_ARRAY];
+};
+
+static int refname_hash_entry_cmp(const void *hashmap_cmp_fn_data,
+ const void *e1_,
+ const void *e2_,
+ const void *keydata)
+{
+   const struct refname_hash_entry *e1 = e1_;
+   const struct refname_hash_entry *e2 = e2_;
+
+   return strcmp(e1->refname, keydata ? keydata : e2->refname);
+}
+
+static struct refname_hash_entry *refname_hash_add(struct hashmap *map,
+  const char *refname,
+  const struct object_id *oid)
+{
+   struct refname_hash_entry *ent;
+   size_t len = strlen(refname);
+
+   FLEX_ALLOC_MEM(ent, refname, refname, len);
+   hashmap_entry_init(ent, memhash(refname, len));
+   oidcpy(&ent->oid, oid);
+   hashmap_add(map, ent);
+   return ent;
+}
+
+static int add_one_refname(const char *refname,
+  const struct object_id *oid,
+  int flag, void *cbdata)
+{
+   struct hashmap *refname_map = cbdata;
+
+   (void) refname_hash_add(refname_map, refname, oid);
+   return 0;
+}
+
+static void refname_hash_init(struct hashmap *map)
+{
+   hashmap_init(map, refname_hash_entry_cmp, NULL, 0);
+}
+
+static int refname_hash_exists(struct hashmap *map, const char *refname)
+{
+   struct hashmap_entry key;
+   size_t len = strlen(refname);
+   hashmap_entry_init(&key, memhash(refname, len));
+
+   return !!hashmap_get(map, &key, refname);
+}
+
 static void find_non_local_tags(const struct ref *refs,
struct ref **head,
struct ref ***tail)
 {
-   struct string_list existing_refs = STRING_LIST_INIT_DUP;
-   struct string_list remote_refs = STRING_LIST_INIT_NODUP;
+   struct hashmap existing_refs;
+   struct hashmap remote_refs;
+   struct string_list remote_refs_list = STRING_LIST_INIT_NODUP;
+   struct string_list_item *remote_ref_item;
const struct ref *ref;
-   struct string_list_item *item = NULL;
+   struct refname_hash_entry *item = NULL;
 
-   for_each_ref(add_existing, &existing_refs);
+   refname_hash_init(&existing_refs);
+   refname_hash_init(&remote_refs);
+
+   for_each_ref(add_one_refname, &existing_refs);
for (ref = refs; ref; ref = ref->next) {
if (!starts_with(ref->name, "refs/tags/"))
continue;
@@ -271,10 +319,10 @@ static void find_non_local_tags(const struct ref *refs,
!has_object_file_with_flags(&ref->old_oid,
OBJECT_INFO_QUICK) &&
!will_fetch(head, ref->old_oid.hash) &&
-   !has_sha1_file_with_flags(item->util,
+   !has_sha1_file_with_flags(item->oid.hash,
   

Re: receive.denyCurrentBranch=updateInstead updates working tree even when receive.denyNonFastForwards rejects push

2018-10-18 Thread Junio C Hamano
Junio C Hamano  writes:

> Rajesh Madamanchi  writes:
>
>> Hi, I am looking to report the below behavior when seems incorrect to
>> me when receive.denyCurrentBranch is set to updateInstead and
>> receive.denyNonFastForwards is set to true.
>
> It seems that we took a lazy but incorrect route while adding the
> DENY_UPDATE_INSTEAD hack to builtin/receive-pack.c::update() and new
> code went to a wrong place in a series of checks.  Everythng else in
> the same switch() statement either refuses or just decides to let
> later step to update without taking actual action, so that later
> checks such as "the new tip commit must have been transferred", "the
> new tip must be a fast-forward of the old tip", etc., but the one
> for DENY_UPDATE_INSTEAD immediately calls update_worktree() there.
> It should be changed to decide to later call the function when
> everybody else in the series of checks agrees that it is OK to let
> this push accepted, and then the actual call is made somewhere near
> where we call run_update_hook(), probably after the hook says it is
> OK to update.

So here is a lunch-time hack that is not even compile tested but
illustrates the idea outlined above.  We'd need to add tests to
protect the fix from future breakages (if the fix is correct, that
is, which I do not quite know---but it feels right ;-).

 builtin/receive-pack.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 7f089be11e..4bf316dbba 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1050,9 +1050,7 @@ static const char *update(struct command *cmd, struct 
shallow_info *si)
refuse_unconfigured_deny();
return "branch is currently checked out";
case DENY_UPDATE_INSTEAD:
-   ret = update_worktree(new_oid->hash);
-   if (ret)
-   return ret;
+   /* pass -- let other checks intervene first */
break;
}
}
@@ -1117,6 +1115,12 @@ static const char *update(struct command *cmd, struct 
shallow_info *si)
return "hook declined";
}
 
+   if (deny_current_branch == DENY_UPDATE_INSTEAD) {
+   ret = update_worktree(new_oid->hash);
+   if (ret)
+   return ret;
+   }
+
if (is_null_oid(new_oid)) {
struct strbuf err = STRBUF_INIT;
if (!parse_object(the_repository, old_oid)) {


Re: [PATCH v2 0/3] Clear flags before each v2 request

2018-10-18 Thread Junio C Hamano
Jonathan Tan  writes:

> Jonathan Tan (3):
>   upload-pack: make have_obj not global
>   upload-pack: make want_obj not global
>   upload-pack: clear flags before each v2 request

It took a bit of time why 2/3 did not apply cleanly but it turns out
this is based on a slightly older tip of 'master' (but still ahead
of 'maint'), that lacks 829a3215 ("commit-graph: close_commit_graph
before shallow walk", 2018-08-20).  Applying and merging it to make
it up-to-date with the tip of 'master' went smoothly once I figured
it out.

The first two clean-up patches are probably overdue and worth doing
regardless of the bugfix.  Nicely done.

The first two steps use static objects local to a transport method
to "preserve the existing behaviour", and because this codepath
happens to want a clean slate every time it gets called, the third
step manages to lose it, which is a nice progression.  But it makes
me wonder if it also hints that there may be a need to invent a
state object that is passed around from the transport layer across
requests, if we want to fulfill a request by calling multiple
transport methods in the future.  In any case, there is no immediate
need to address this comment ;-).

Will replace.  Thanks.



Re: [PATCH v4 9/9] Documentation/config: add odb..promisorRemote

2018-10-18 Thread Junio C Hamano
Jonathan Nieder  writes:

> Junio C Hamano wrote:
> ...
>> It is a good idea to implicitly include the promisor-remote to the
>> set of secondary places to consult to help existing versions of Git,
>> but once the repository starts fetching incomplete subgraphs and
>> adding new object.missingobjectremote [*1*], these versions of Git
>> will stop working correctly, so I am not sure if it is all that
>> useful approach for compatibility in practice.
>
> Can you spell this out for me more?  Do you mean that a remote from
> this list might make a promise that the original partialClone remote
> can't keep?

It was my failed attempt to demonstrate that I understood what was
being discussed by rephrasing JTan's

Or allow extensions.partialClone= wherein  is not in the
missingObjectRemote, in which case  is tried first, so that
we don't have to reject some configurations.



Re: [PATCH 0/1] commit-reach: fix first-parent heuristic

2018-10-18 Thread Junio C Hamano
"Derrick Stolee via GitGitGadget"  writes:

> I originally reported this fix [1] after playing around with the trace2
> series for measuring performance. Since trace2 isn't merging quickly, I
> pulled the performance fix patch out and am sending it on its own. The only
> difference here is that we don't have the tracing to verify the performance
> fix in the test script.

Thanks for sending this separately.  What's the current status of
the tracing patch series, by the way?


Re: [PATCH] multi-pack-index: avoid dead store for struct progress

2018-10-18 Thread Junio C Hamano
Carlo Marcelo Arenas Belón   writes:

> it is initialized unconditionally by a call to start_progress
> below.
>
> Signed-off-by: Carlo Marcelo Arenas Belón 
> ---
>  midx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/midx.c b/midx.c
> index ea2f3ffe2e..4fac0cd08a 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -941,7 +941,7 @@ static void midx_report(const char *fmt, ...)
>  int verify_midx_file(const char *object_dir)
>  {
>   uint32_t i;
> - struct progress *progress = NULL;
> + struct progress *progress;
>   struct multi_pack_index *m = load_multi_pack_index(object_dir, 1);
>   verify_midx_error = 0;

Both changes make sense.  It's kind of surprising that this still
matters, though; I would have expected a compiler would notice.


Re: [PATCH v2 02/13] ci/lib.sh: encapsulate Travis-specific things

2018-10-18 Thread Junio C Hamano
SZEDER Gábor  writes:

> On Mon, Oct 15, 2018 at 03:12:00AM -0700, Johannes Schindelin via 
> GitGitGadget wrote:
>> diff --git a/ci/lib.sh b/ci/lib.sh
>> index 06970f7213..8532555b4e 100755
>> --- a/ci/lib.sh
>> +++ b/ci/lib.sh
>> @@ -1,5 +1,26 @@
>>  # Library of functions shared by all CI scripts
>>  
>> +if test true = "$TRAVIS"
>> +then
>> +...
>> +export GIT_PROVE_OPTS="--timer --jobs 3 --state=failed,slow,save"
>> +export GIT_TEST_OPTS="--verbose-log -x --immediate"
>> +fi
>
> Please set all these variables ...

Do you mean "VAR=VAL; export VAR" is kosher, "export VAR=VAL" is
not?

>> @@ -81,7 +102,6 @@ check_unignored_build_artifacts ()
>>  # and installing dependencies.
>>  set -ex
>
> ... after we turn on 'set -x', so the variables' values will be
> visible in the logs.

Ah, no, you didn't.  Although I think both are valid points, I think
ci/lib.sh is expected to be used only inside a more predictable
environment (e.g. we know the shell used is not a random POSIX shell
but one that is happy with "export VAR=VAL"), so it should be OK.
Showing the values of these variables in the log may still be good
idea.

> (Or move this 'set -ex' to the beginning of the script?  Then we
> could perhaps avoid similar issues in the future.)

Sure (provided that it is an issue to begin with---if we are
interested in the value of TRAVIS_BRANCH, for example, being able to
see it only because "CI_BRANCH=$TRAVIS_BRANCH" assignment is made
feels a bit sideways---we'd be better off explicitly logging
anything we are interested in in the longer term, no?).


Re: [PATCH v1 1/1] index_bulk_checkin(): Take off_t, not size_t

2018-10-18 Thread Junio C Hamano
tbo...@web.de writes:

> bulk-checkin.c | 4 ++--
>  bulk-checkin.h | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)

If you lost SP in your editor, then it is OK but if format-patch
lost it for some reason, plasee tell me as we need to find the bug.

>
> diff --git a/bulk-checkin.c b/bulk-checkin.c
> index 409ecb566b..2631e82d6c 100644
> --- a/bulk-checkin.c
> +++ b/bulk-checkin.c
> @@ -189,7 +189,7 @@ static void prepare_to_stream(struct bulk_checkin_state 
> *state,
>  
>  static int deflate_to_pack(struct bulk_checkin_state *state,
>  struct object_id *result_oid,
> -int fd, size_t size,
> +int fd, off_t size,

The size is once casted to uintmax_t for recording in the object
header (which is fine), and then passed to stream_to_pack(), which
still takes, and more importantly, does comparisons and chunking in,
size_t after this patch.  Without xsize_t() around size passed in
the call to stream_to_pack(), you may silently be truncating off_t
down to size_t in this function.

> @@ -258,7 +258,7 @@ static int deflate_to_pack(struct bulk_checkin_state 
> *state,
>  }
>  
>  int index_bulk_checkin(struct object_id *oid,
> -int fd, size_t size, enum object_type type,
> +int fd, off_t size, enum object_type type,
>  const char *path, unsigned flags)
>  {
>   int status = deflate_to_pack(&state, oid, fd, size, type,

This one is a thin wrapper around deflate_to_pack() above.

Its sole caller is sha1-file.c::index_stream() and takes size_t from
its callers, and passes size_t to index_bulk_checkin().

The sole caller of index_stream(), sha1-file.c::index_fd(), wants to
pass st->st_size, and it uses xsize_t() because index_stream() and
callchain underneath currently take size_t.  You want that callchain
to take off_t with this patch.

The whole purpose of stream_to_pack() is to take potentially large
input from the file on the filesystem, chop that into manageable
chunks and feed the underlying hashing and deflating machinery that
takes possibly smaller integer types to represent the sizes of data
they take in a single call, so once that function is taught to take
ofs_t I think you can say you converted the entire callchain from
index_fd() down.

So, this looks like a good starting step, but I suspect it needs one
more level of digging for it to become truly useful.


Re: [PATCH] diff: don't attempt to strip prefix from absolute Windows paths

2018-10-18 Thread Junio C Hamano
Johannes Sixt  writes:

> Use is_absolute_path() to detect Windows style absolute paths.

When cd676a51 ("diff --relative: output paths as relative to the
current subdirectory", 2008-02-12) was done, neither "is_dir_sep()"
nor "has_dos_drive_prefix()" existed---the latter had to wait until
25fe217b ("Windows: Treat Windows style path names.", 2008-03-05),
but we didn't notice that the above code needs to use the Windows
aware is_absolute_path() when we applied the change.

> One might wonder whether the check for a directory separator that
> is visible in the patch context should be changed from == '/' to
> is_dir_sep() or not. It turns out not to be necessary. That code
> only ever investigates paths that have undergone pathspec
> normalization, after which there are only forward slashes even on
> Windows.

Thanks for carefully explaining.

>  static void strip_prefix(int prefix_length, const char **namep, const char 
> **otherp)
>  {
>   /* Strip the prefix but do not molest /dev/null and absolute paths */
> - if (*namep && **namep != '/') {
> + if (*namep && !is_absolute_path(*namep)) {
>   *namep += prefix_length;
>   if (**namep == '/')
>   ++*namep;
>   }
> - if (*otherp && **otherp != '/') {
> + if (*otherp && !is_absolute_path(*otherp)) {
>   *otherp += prefix_length;
>   if (**otherp == '/')
>   ++*otherp;

When I read the initial report and guessed the root cause without
looking at the code, I didn't expect the problematic area to be this
isolated and the solution to be this simple.

Nicely done.


Re: [PATCH] config.mak.dev: enable -Wunused-function

2018-10-18 Thread Junio C Hamano
Ramsay Jones  writes:

> I haven't looked too deeply, but this seems to be caused by
> Junio's commit 42c89ea70a ("SQUASH??? - convert the other user of
> string-list as db", 2018-10-17) which removes a call to the
> add_existing() function - the subject of the warning.

That is very understandable and it is immediately obvious to me that
the unused-function warning is being useful there ;-)


Re: [PATCH 1/3] t6501: use --quiet when testing gc stderr

2018-10-18 Thread Junio C Hamano
Derrick Stolee  writes:

> This code from builtin/gc.c makes it look like we are doing that:
>
>     if (gc_write_commit_graph)
>     write_commit_graph_reachable(get_object_directory(), 0,
>  !quiet && !daemonized);
>
> But really, daemonized is only for when running in the background.

But that is something we can easily fix, no?

"git grep isatty" tells you that we use isatty(2) to decide if we
want to go verbose or show progress.  Another frequent use of isatty
in our codebase of course is to use of a pager and columnar output
formats.


Re: [PATCH v1 1/2] reset: don't compute unstaged changes after reset when --quiet

2018-10-18 Thread Junio C Hamano
Ben Peart  writes:

> Note the status command after the reset doesn't really change as it
> still must lstat() every file (the 0.02 difference is well within the
> variability of run to run differences).

Of course, it would not make an iota of difference, whether reset
refreshes the cached stat index fully, to the cost of later lstat().
What the refreshing saves is having to scan the contents to find that
the file is unchanged at runtime.

If your lstat() is not significantly faster than opening and
scanning the file, the optimization based on the cached-stat
information becomes moot.  In a working tree full of unmodified
files, stale cached-stat info in the index will cause us to compare
the contents and waste a lot of time, and that is what refreshing
avoids.  If the "status" in your test sequence do not have to do
that (e.g. the cached-stat information is already up-to-date and
there is no point running refresh in reset), then I'd expect no
difference between these two tests.

> To move this forward, here is what I propose:
>
> 1) If the '--quiet' flag is passed, we silently take advantage of the
> fact we can avoid having to do an "extra" lstat() of every file and
> scope the refresh_index() call to those paths that we know have
> changed.

That's pretty much what the patch under discussion does.

> 2) I can remove the note in the documentation of --quiet which I only
> added to facilitate discoverability.

Quite honestly, I am not sure if this (meaning #1 above) alone need
to be even discoverable.  Those who want --quiet output would use
it, those who want to be told which paths are modified would not,
and those who want to quickly be told which paths are modified would
not be helped by the limited refresh anyway, so "with --quiet you
can make it go faster" would not help anybody.

> 3) I can also edit the documentation for reset.quietDefault (maybe I
> should rename that to "reset.quiet"?) so that it does not discuss the
> potential performance impact.

I think reset.quiet (or reset.verbosity) is a good thing to have
regardless.



Re: [PATCH v4 9/9] Documentation/config: add odb..promisorRemote

2018-10-18 Thread Jonathan Nieder
Junio C Hamano wrote:
> Jonathan Tan  writes:
>> Jonathan Nieder wrote:

>>> [object]
>>> missingObjectRemote = local-cache-remote
>>> missingObjectRemote = origin
>>
>> In the presence of missingObjectRemote, old versions of Git, when lazily
>> fetching, would only know to try the extensions.partialClone remote. But
>> this is safe because existing data wouldn't be clobbered (since we're
>> not using ideas like adding meaning to the contents of the .promisor
>> file). Also, other things like fsck and gc still work.
>
> It is a good idea to implicitly include the promisor-remote to the
> set of secondary places to consult to help existing versions of Git,
> but once the repository starts fetching incomplete subgraphs and
> adding new object.missingobjectremote [*1*], these versions of Git
> will stop working correctly, so I am not sure if it is all that
> useful approach for compatibility in practice.

Can you spell this out for me more?  Do you mean that a remote from
this list might make a promise that the original partialClone remote
can't keep?

If we're careful to only page in objects that were promised by the
original partialClone remote, then a promise "I promise to supply you
on demand with any object directly or indirectly reachable from any
object you have fetched from me" from the partialClone remote should
be enough.

> [Footnote]
>
> *1* That name with two "object" in it sounds horrible.

Sorry about that.  Another name used while discussing this was
objectAccess.promisorRemote.  As long as the idea is clear (that this
means "remotes to use when attempting to obtain an object that is not
already available locally"), I am not attached to any particular name.

Thanks,
Jonathan


Re: receive.denyCurrentBranch=updateInstead updates working tree even when receive.denyNonFastForwards rejects push

2018-10-18 Thread Junio C Hamano
Rajesh Madamanchi  writes:

> Hi, I am looking to report the below behavior when seems incorrect to
> me when receive.denyCurrentBranch is set to updateInstead and
> receive.denyNonFastForwards is set to true.

It seems that we took a lazy but incorrect route while adding the
DENY_UPDATE_INSTEAD hack to builtin/receive-pack.c::update() and new
code went to a wrong place in a series of checks.  Everythng else in
the same switch() statement either refuses or just decides to let
later step to update without taking actual action, so that later
checks such as "the new tip commit must have been transferred", "the
new tip must be a fast-forward of the old tip", etc., but the one
for DENY_UPDATE_INSTEAD immediately calls update_worktree() there.
It should be changed to decide to later call the function when
everybody else in the series of checks agrees that it is OK to let
this push accepted, and then the actual call is made somewhere near
where we call run_update_hook(), probably after the hook says it is
OK to update.



Re: [PATCH v4 9/9] Documentation/config: add odb..promisorRemote

2018-10-18 Thread Junio C Hamano
Jonathan Tan  writes:

>>  [object]
>>  missingObjectRemote = local-cache-remote
>>  missingObjectRemote = origin
>> 
> In the presence of missingObjectRemote, old versions of Git, when lazily
> fetching, would only know to try the extensions.partialClone remote. But
> this is safe because existing data wouldn't be clobbered (since we're
> not using ideas like adding meaning to the contents of the .promisor
> file). Also, other things like fsck and gc still work.

It is a good idea to implicitly include the promisor-remote to the
set of secondary places to consult to help existing versions of Git,
but once the repository starts fetching incomplete subgraphs and
adding new object.missingobjectremote [*1*], these versions of Git
will stop working correctly, so I am not sure if it is all that
useful approach for compatibility in practice.


[Footnote]

*1* That name with two "object" in it sounds horrible.  I think the
same keyname in 'core' section may sit better (this feature sounds
more 'core' than other cruft that crept into 'core' section over
time).  

Or "odb.remoteAlternate" (as opposed to object/info/alternates that
are local alternates), perhaps.


Re: [RFC PATCH 0/2] Bring the_repository into cmd_foo

2018-10-18 Thread Stefan Beller
On Thu, Oct 18, 2018 at 2:01 PM Jonathan Tan  wrote:
>
> > > Or instead we could accelerate the long term plan of removing a
> > > hard coded the_repository and have each cmd builtin take an additional
> > > repository pointer from the init code, such that we'd bring all of Git to
> > > work on arbitrary repositories. Then the standard test suite should be
> > > okay, as there is no special case for the_repository any more.
> >
> > Demo'd in this RFC series for git-merge-base.
> >
> > The core idea is found in patch 1,
> > and the proof of concept is found in patch 2.
>
> I don't think working around the_repository is sufficient, as there are
> other ways to access the same repository state (the_index, directly
> accessing files through file I/O).

Sure, but that would stick out like a sore thumb?

> Instead I would prefer a test like in
> t/test-repository.c - each patch set would probably only need one test
> for the last function converted, since typically the last function uses
> every other function converted.

I'll look into that.

>
> Also, even if we decided that working around the_repository is
> sufficient, I don't think this get_the_repository() is a good approach.
> If (or when) we decide to convert all builtins to not use
> the_repository, we would have to clean up all such calls.

Just like we have to cleanup the calls to the_repository or the_index
in general now (c.f. nd/the-index)

> Better would be to pass the_repository from the invoker of the cmd
> builtin, and reuse NO_THE_REPOSITORY_COMPATIBILITY_MACROS in the
> builtin. (I haven't thought much about how to transition to this, but
> one way might be to extend "struct cmd_struct" in git.c to also have a
> new-style function pointer, and #define GIT_CMD(c, f, o) {c, NULL, o, f}
> or something like that.)

This sounds like the next step to me.

> This doesn't directly address the fact that the builtin might call lib
> code that indirectly references the_repository, but I think that this
> won't be an issue because by the time we're ready to convert builtins to
> not use the_repository, most if not all of the lib code would have
> NO_THE_REPOSITORY_COMPATIBILITY_MACROS defined anyway.

And until then we double up on tests, one time the regular end-to-end tests
and additional tests for repository agnostic units in test-repository.c ?

The whole point of this approach is to keep the testing at the level
that we currently have and make the tests more powerful in doubling
for both (a) testing existing behavior, (b) getting fairly good coverage
of repository-fication of the code base by these 2 simple knobs.


[PATCH] alias: detect loops in mixed execution mode

2018-10-18 Thread Ævar Arnfjörð Bjarmason
Add detection for aliasing loops in cases where one of the aliases
re-invokes git as a shell command. This catches cases like:

[alias]
foo = !git bar
bar = !git foo

Before this change running "git {foo,bar}" would create a
forkbomb. Now using the aliasing loop detection and call history
reporting added in 82f71d9a5a ("alias: show the call history when an
alias is looping", 2018-09-16) and c6d75bc17a ("alias: add support for
aliases of an alias", 2018-09-16) we'll instead report:

fatal: alias loop detected: expansion of 'foo' does not terminate:
  foo <==
  bar ==>

Since the implementation carries the call history in an environment
variable, using the same sort of trick as used for -c (see
2b64fc894d ("pass "git -c foo=bar" params through environment",
2010-08-23) ). For example:

[alias]
one = two
two = !git three
three = four
four = !git five
five = two

Will, on "git one" report:

fatal: alias loop detected: expansion of 'one' does not terminate:
  one
  two <==
  three
  four
  five ==>

Signed-off-by: Ævar Arnfjörð Bjarmason 
---

Implements what I suggested in
https://public-inbox.org/git/87o9dar9qc@evledraar.gmail.com/

 cache.h  |  1 +
 git.c| 36 ++--
 t/t0001-init.sh  |  1 +
 t/t0014-alias.sh | 15 ++-
 4 files changed, 42 insertions(+), 11 deletions(-)

diff --git a/cache.h b/cache.h
index d508f3d4f8..00cbd25f1c 100644
--- a/cache.h
+++ b/cache.h
@@ -478,6 +478,7 @@ static inline enum object_type object_type(unsigned int 
mode)
 #define TEMPLATE_DIR_ENVIRONMENT "GIT_TEMPLATE_DIR"
 #define CONFIG_ENVIRONMENT "GIT_CONFIG"
 #define CONFIG_DATA_ENVIRONMENT "GIT_CONFIG_PARAMETERS"
+#define COMMAND_HISTORY_ENVIRONMENT "GIT_COMMAND_HISTORY"
 #define EXEC_PATH_ENVIRONMENT "GIT_EXEC_PATH"
 #define CEILING_DIRECTORIES_ENVIRONMENT "GIT_CEILING_DIRECTORIES"
 #define NO_REPLACE_OBJECTS_ENVIRONMENT "GIT_NO_REPLACE_OBJECTS"
diff --git a/git.c b/git.c
index 5920f8019b..cba242836c 100644
--- a/git.c
+++ b/git.c
@@ -672,12 +672,43 @@ static void execv_dashed_external(const char **argv)
exit(128);
 }
 
+static void init_cmd_history(struct strbuf *env, struct string_list *cmd_list)
+{
+   const char *old = getenv(COMMAND_HISTORY_ENVIRONMENT);
+   struct strbuf **cmd_history, **ptr;
+
+   if (!old || !*old)
+   return;
+
+   strbuf_addstr(env, old);
+   strbuf_rtrim(env);
+
+   cmd_history = strbuf_split_buf(old, strlen(old), ' ', 0);
+   for (ptr = cmd_history; *ptr; ptr++) {
+   strbuf_rtrim(*ptr);
+   string_list_append(cmd_list, (*ptr)->buf);
+   }
+   strbuf_list_free(cmd_history);
+}
+
+static void add_cmd_history(struct strbuf *env, struct string_list *cmd_list,
+   const char *cmd)
+{
+   string_list_append(cmd_list, cmd);
+   if (env->len)
+   strbuf_addch(env, ' ');
+   strbuf_addstr(env, cmd);
+   setenv(COMMAND_HISTORY_ENVIRONMENT, env->buf, 1);
+}
+
 static int run_argv(int *argcp, const char ***argv)
 {
int done_alias = 0;
-   struct string_list cmd_list = STRING_LIST_INIT_NODUP;
+   struct string_list cmd_list = STRING_LIST_INIT_DUP;
struct string_list_item *seen;
+   struct strbuf env = STRBUF_INIT;
 
+   init_cmd_history(&env, &cmd_list);
while (1) {
/*
 * If we tried alias and futzed with our environment,
@@ -711,7 +742,7 @@ static int run_argv(int *argcp, const char ***argv)
  " not terminate:%s"), cmd_list.items[0].string, 
sb.buf);
}
 
-   string_list_append(&cmd_list, *argv[0]);
+   add_cmd_history(&env, &cmd_list, *argv[0]);
 
/*
 * It could be an alias -- this works around the insanity
@@ -724,6 +755,7 @@ static int run_argv(int *argcp, const char ***argv)
}
 
string_list_clear(&cmd_list, 0);
+   strbuf_release(&env);
 
return done_alias;
 }
diff --git a/t/t0001-init.sh b/t/t0001-init.sh
index 182da069f1..eb2ca8a172 100755
--- a/t/t0001-init.sh
+++ b/t/t0001-init.sh
@@ -93,6 +93,7 @@ test_expect_success 'No extra GIT_* on alias scripts' '
sed -n \
-e "/^GIT_PREFIX=/d" \
-e "/^GIT_TEXTDOMAINDIR=/d" \
+   -e "/^GIT_COMMAND_HISTORY=/d" \
-e "/^GIT_/s/=.*//p" |
sort
EOF
diff --git a/t/t0014-alias.sh b/t/t0014-alias.sh
index a070e645d7..9ed03a4a4f 100755
--- a/t/t0014-alias.sh
+++ b/t/t0014-alias.sh
@@ -27,14 +27,11 @@ test_expect_success 'looping aliases - internal execution' '
test_i18ngrep "^fatal: alias loop detected: expansion of" output
 '
 
-# This test is disabled until external loops are fixed, because would block
-# the test suite for a full minute.
-#
-#test_expect_failure 'loo

Re: [PATCH] send-email: explicitly disable authentication

2018-10-18 Thread Joshua Watt
On Thu, 2018-10-18 at 17:53 -0400, Eric Sunshine wrote:
> On Thu, Oct 18, 2018 at 5:16 PM Joshua Watt 
> wrote:
> > It can be necessary to disable SMTP authentication by a mechanism
> > other
> > than sendemail.smtpuser being undefined. For example, if the user
> > has
> > sendemail.smtpuser set globally but wants to disable authentication
> > locally in one repository.
> > 
> > --smtp-auth and sendemail.smtpauth now understand the value 'none'
> > which
> > means to disable authentication completely, even if an
> > authentication
> > user is specified.
> 
> Implementation complexity aside, spelling the option --no-smtp-auth
> might be more intuitive and consistent than --smtp-auth=none.

One advantage of --smtp-auth=none is that it can also be done with a
config variable sendemail.smtpauth="none". Would be also add a config
variable like sendemail.nosmtpauth (the negative seems strange to me)? 

Or maybe --no-smtp-auth is just a shorthand alias for --smtp-auth=none?

> 
> > The value 'none' is lower case to avoid conflicts with any RFC 4422
> > authentication mechanisms.
> > 
> > Signed-off-by: Joshua Watt 
-- 
Joshua Watt 


Re: [PATCH v2 02/13] ci/lib.sh: encapsulate Travis-specific things

2018-10-18 Thread SZEDER Gábor
On Mon, Oct 15, 2018 at 03:12:00AM -0700, Johannes Schindelin via GitGitGadget 
wrote:
> diff --git a/ci/lib.sh b/ci/lib.sh
> index 06970f7213..8532555b4e 100755
> --- a/ci/lib.sh
> +++ b/ci/lib.sh
> @@ -1,5 +1,26 @@
>  # Library of functions shared by all CI scripts
>  
> +if test true = "$TRAVIS"
> +then
> + # We are running within Travis CI
> + CI_BRANCH="$TRAVIS_BRANCH"
> + CI_COMMIT="$TRAVIS_COMMIT"
> + CI_JOB_ID="$TRAVIS_JOB_ID"
> + CI_JOB_NUMBER="$TRAVIS_JOB_NUMBER"
> + CI_OS_NAME="$TRAVIS_OS_NAME"
> + CI_REPO_SLUG="$TRAVIS_REPO_SLUG"
> +
> + cache_dir="$HOME/travis-cache"
> +
> + url_for_job_id () {
> + echo "https://travis-ci.org/$CI_REPO_SLUG/jobs/$1";
> + }
> +
> + BREW_INSTALL_PACKAGES="git-lfs gettext"
> + export GIT_PROVE_OPTS="--timer --jobs 3 --state=failed,slow,save"
> + export GIT_TEST_OPTS="--verbose-log -x --immediate"
> +fi

Please set all these variables ...

> +
>  skip_branch_tip_with_tag () {
>   # Sometimes, a branch is pushed at the same time the tag that points
>   # at the same commit as the tip of the branch is pushed, and building
> @@ -13,10 +34,10 @@ skip_branch_tip_with_tag () {
>   # we can skip the build because we won't be skipping a build
>   # of a tag.
>  
> - if TAG=$(git describe --exact-match "$TRAVIS_BRANCH" 2>/dev/null) &&
> - test "$TAG" != "$TRAVIS_BRANCH"
> + if TAG=$(git describe --exact-match "$CI_BRANCH" 2>/dev/null) &&
> + test "$TAG" != "$CI_BRANCH"
>   then
> - echo "$(tput setaf 2)Tip of $TRAVIS_BRANCH is exactly at 
> $TAG$(tput sgr0)"
> + echo "$(tput setaf 2)Tip of $CI_BRANCH is exactly at $TAG$(tput 
> sgr0)"
>   exit 0
>   fi
>  }
> @@ -25,7 +46,7 @@ skip_branch_tip_with_tag () {
>  # job if we encounter the same tree again and can provide a useful info
>  # message.
>  save_good_tree () {
> - echo "$(git rev-parse $TRAVIS_COMMIT^{tree}) $TRAVIS_COMMIT 
> $TRAVIS_JOB_NUMBER $TRAVIS_JOB_ID" >>"$good_trees_file"
> + echo "$(git rev-parse $CI_COMMIT^{tree}) $CI_COMMIT $CI_JOB_NUMBER 
> $CI_JOB_ID" >>"$good_trees_file"
>   # limit the file size
>   tail -1000 "$good_trees_file" >"$good_trees_file".tmp
>   mv "$good_trees_file".tmp "$good_trees_file"
> @@ -35,7 +56,7 @@ save_good_tree () {
>  # successfully before (e.g. because the branch got rebased, changing only
>  # the commit messages).
>  skip_good_tree () {
> - if ! good_tree_info="$(grep "^$(git rev-parse $TRAVIS_COMMIT^{tree}) " 
> "$good_trees_file")"
> + if ! good_tree_info="$(grep "^$(git rev-parse $CI_COMMIT^{tree}) " 
> "$good_trees_file")"
>   then
>   # Haven't seen this tree yet, or no cached good trees file yet.
>   # Continue the build job.
> @@ -45,18 +66,18 @@ skip_good_tree () {
>   echo "$good_tree_info" | {
>   read tree prev_good_commit prev_good_job_number prev_good_job_id
>  
> - if test "$TRAVIS_JOB_ID" = "$prev_good_job_id"
> + if test "$CI_JOB_ID" = "$prev_good_job_id"
>   then
>   cat <<-EOF
> - $(tput setaf 2)Skipping build job for commit 
> $TRAVIS_COMMIT.$(tput sgr0)
> + $(tput setaf 2)Skipping build job for commit 
> $CI_COMMIT.$(tput sgr0)
>   This commit has already been built and tested 
> successfully by this build job.
>   To force a re-build delete the branch's cache and then 
> hit 'Restart job'.
>   EOF
>   else
>   cat <<-EOF
> - $(tput setaf 2)Skipping build job for commit 
> $TRAVIS_COMMIT.$(tput sgr0)
> + $(tput setaf 2)Skipping build job for commit 
> $CI_COMMIT.$(tput sgr0)
>   This commit's tree has already been built and tested 
> successfully in build job $prev_good_job_number for commit $prev_good_commit.
> - The log of that build job is available at 
> https://travis-ci.org/$TRAVIS_REPO_SLUG/jobs/$prev_good_job_id
> + The log of that build job is available at 
> $(url_for_job_id $prev_good_job_id)
>   To force a re-build delete the branch's cache and then 
> hit 'Restart job'.
>   EOF
>   fi
> @@ -81,7 +102,6 @@ check_unignored_build_artifacts ()
>  # and installing dependencies.
>  set -ex

... after we turn on 'set -x', so the variables' values will be
visible in the logs.

(Or move this 'set -ex' to the beginning of the script?  Then we
could perhaps avoid similar issues in the future.)

> -cache_dir="$HOME/travis-cache"
>  good_trees_file="$cache_dir/good-trees"
>  
>  mkdir -p "$cache_dir"


Re: [PATCH] send-email: explicitly disable authentication

2018-10-18 Thread Eric Sunshine
On Thu, Oct 18, 2018 at 5:16 PM Joshua Watt  wrote:
> It can be necessary to disable SMTP authentication by a mechanism other
> than sendemail.smtpuser being undefined. For example, if the user has
> sendemail.smtpuser set globally but wants to disable authentication
> locally in one repository.
>
> --smtp-auth and sendemail.smtpauth now understand the value 'none' which
> means to disable authentication completely, even if an authentication
> user is specified.

Implementation complexity aside, spelling the option --no-smtp-auth
might be more intuitive and consistent than --smtp-auth=none.

> The value 'none' is lower case to avoid conflicts with any RFC 4422
> authentication mechanisms.
>
> Signed-off-by: Joshua Watt 


[PATCH] send-email: explicitly disable authentication

2018-10-18 Thread Joshua Watt
It can be necessary to disable SMTP authentication by a mechanism other
than sendemail.smtpuser being undefined. For example, if the user has
sendemail.smtpuser set globally but wants to disable authentication
locally in one repository.

--smtp-auth and sendemail.smtpauth now understand the value 'none' which
means to disable authentication completely, even if an authentication
user is specified.

The value 'none' is lower case to avoid conflicts with any RFC 4422
authentication mechanisms.

Signed-off-by: Joshua Watt 
---
 Documentation/git-send-email.txt | 4 +++-
 git-send-email.perl  | 5 +++--
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-send-email.txt b/Documentation/git-send-email.txt
index 465a4ecbe..751a4851e 100644
--- a/Documentation/git-send-email.txt
+++ b/Documentation/git-send-email.txt
@@ -190,7 +190,9 @@ $ git send-email --smtp-auth="PLAIN LOGIN GSSAPI" ...
 If at least one of the specified mechanisms matches the ones advertised by the
 SMTP server and if it is supported by the utilized SASL library, the mechanism
 is used for authentication. If neither 'sendemail.smtpAuth' nor `--smtp-auth`
-is specified, all mechanisms supported by the SASL library can be used.
+is specified, all mechanisms supported by the SASL library can be used. The
+special value 'none' maybe specified to completely disable authentication
+independently of `--smtp-user`
 
 --smtp-pass[=]::
Password for SMTP-AUTH. The argument is optional: If no
diff --git a/git-send-email.perl b/git-send-email.perl
index 2be5dac33..4a74cd350 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -82,7 +82,8 @@ sub usage {
  Pass an empty string to disable 
certificate
  verification.
 --smtp-domain * The domain name sent to HELO/EHLO 
handshake
---smtp-auth   * Space-separated list of allowed AUTH 
mechanisms.
+--smtp-auth   * Space-separated list of allowed AUTH 
mechanisms, or
+ "none" to disable authentication.
  This setting forces to use one of the 
listed mechanisms.
 --smtp-debug<0|1>  * Disable, enable Net::SMTP debug.
 
@@ -1241,7 +1242,7 @@ sub smtp_host_string {
 # (smtp_user was not specified), and 0 otherwise.
 
 sub smtp_auth_maybe {
-   if (!defined $smtp_authuser || $auth) {
+   if (!defined $smtp_authuser || $auth || $smtp_auth eq "none") {
return 1;
}
 
-- 
2.17.1



Re: [RFC PATCH 0/2] Bring the_repository into cmd_foo

2018-10-18 Thread Jonathan Tan
> > Or instead we could accelerate the long term plan of removing a
> > hard coded the_repository and have each cmd builtin take an additional
> > repository pointer from the init code, such that we'd bring all of Git to
> > work on arbitrary repositories. Then the standard test suite should be
> > okay, as there is no special case for the_repository any more.
> 
> Demo'd in this RFC series for git-merge-base.
> 
> The core idea is found in patch 1,
> and the proof of concept is found in patch 2.

I don't think working around the_repository is sufficient, as there are
other ways to access the same repository state (the_index, directly
accessing files through file I/O). Instead I would prefer a test like in
t/test-repository.c - each patch set would probably only need one test
for the last function converted, since typically the last function uses
every other function converted.

Also, even if we decided that working around the_repository is
sufficient, I don't think this get_the_repository() is a good approach.
If (or when) we decide to convert all builtins to not use
the_repository, we would have to clean up all such calls.

Better would be to pass the_repository from the invoker of the cmd
builtin, and reuse NO_THE_REPOSITORY_COMPATIBILITY_MACROS in the
builtin. (I haven't thought much about how to transition to this, but
one way might be to extend "struct cmd_struct" in git.c to also have a
new-style function pointer, and #define GIT_CMD(c, f, o) {c, NULL, o, f}
or something like that.)

This doesn't directly address the fact that the builtin might call lib
code that indirectly references the_repository, but I think that this
won't be an issue because by the time we're ready to convert builtins to
not use the_repository, most if not all of the lib code would have
NO_THE_REPOSITORY_COMPATIBILITY_MACROS defined anyway.


[PATCH v2 2/3] upload-pack: make want_obj not global

2018-10-18 Thread Jonathan Tan
Because upload_pack_v2() can be invoked multiple times in the same
process, the static variable want_obj may not be empty when it is
invoked. To make further analysis of this situation easier, make the
variable local; analysis will be done in a subsequent patch.

The new local variable in upload_pack_v2() is static to preserve
existing behavior; this is not necessary in upload_pack() because
upload_pack() is only invoked once per process.

Signed-off-by: Jonathan Tan 
---
 upload-pack.c | 116 --
 1 file changed, 66 insertions(+), 50 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index cb2401f90d..451bf47e7f 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -52,7 +52,6 @@ static int no_progress, daemon_mode;
 #define ALLOW_ANY_SHA1 07
 static unsigned int allow_unadvertised_object_request;
 static int shallow_nr;
-static struct object_array want_obj;
 static struct object_array extra_edge_obj;
 static unsigned int timeout;
 static int keepalive = 5;
@@ -98,7 +97,8 @@ static int write_one_shallow(const struct commit_graft 
*graft, void *cb_data)
return 0;
 }
 
-static void create_pack_file(const struct object_array *have_obj)
+static void create_pack_file(const struct object_array *have_obj,
+const struct object_array *want_obj)
 {
struct child_process pack_objects = CHILD_PROCESS_INIT;
char data[8193], progress[128];
@@ -159,9 +159,9 @@ static void create_pack_file(const struct object_array 
*have_obj)
if (shallow_nr)
for_each_commit_graft(write_one_shallow, pipe_fd);
 
-   for (i = 0; i < want_obj.nr; i++)
+   for (i = 0; i < want_obj->nr; i++)
fprintf(pipe_fd, "%s\n",
-   oid_to_hex(&want_obj.objects[i].item->oid));
+   oid_to_hex(&want_obj->objects[i].item->oid));
fprintf(pipe_fd, "--not\n");
for (i = 0; i < have_obj->nr; i++)
fprintf(pipe_fd, "%s\n",
@@ -337,19 +337,21 @@ static int got_oid(const char *hex, struct object_id *oid,
return 0;
 }
 
-static int ok_to_give_up(const struct object_array *have_obj)
+static int ok_to_give_up(const struct object_array *have_obj,
+struct object_array *want_obj)
 {
uint32_t min_generation = GENERATION_NUMBER_ZERO;
 
if (!have_obj->nr)
return 0;
 
-   return can_all_from_reach_with_flag(&want_obj, THEY_HAVE,
+   return can_all_from_reach_with_flag(want_obj, THEY_HAVE,
COMMON_KNOWN, oldest_have,
min_generation);
 }
 
-static int get_common_commits(struct object_array *have_obj)
+static int get_common_commits(struct object_array *have_obj,
+ struct object_array *want_obj)
 {
struct object_id oid;
char last_hex[GIT_MAX_HEXSZ + 1];
@@ -367,7 +369,7 @@ static int get_common_commits(struct object_array *have_obj)
 
if (!line) {
if (multi_ack == 2 && got_common
-   && !got_other && ok_to_give_up(have_obj)) {
+   && !got_other && ok_to_give_up(have_obj, want_obj)) 
{
sent_ready = 1;
packet_write_fmt(1, "ACK %s ready\n", last_hex);
}
@@ -388,7 +390,7 @@ static int get_common_commits(struct object_array *have_obj)
switch (got_oid(arg, &oid, have_obj)) {
case -1: /* they have what we do not */
got_other = 1;
-   if (multi_ack && ok_to_give_up(have_obj)) {
+   if (multi_ack && ok_to_give_up(have_obj, 
want_obj)) {
const char *hex = oid_to_hex(&oid);
if (multi_ack == 2) {
sent_ready = 1;
@@ -581,7 +583,7 @@ static int has_unreachable(struct object_array *src)
return 1;
 }
 
-static void check_non_tip(void)
+static void check_non_tip(struct object_array *want_obj)
 {
int i;
 
@@ -592,14 +594,14 @@ static void check_non_tip(void)
 */
if (!stateless_rpc && !(allow_unadvertised_object_request & 
ALLOW_REACHABLE_SHA1))
goto error;
-   if (!has_unreachable(&want_obj))
+   if (!has_unreachable(want_obj))
/* All the non-tip ones are ancestors of what we advertised */
return;
 
 error:
/* Pick one of them (we know there at least is one) */
-   for (i = 0; i < want_obj.nr; i++) {
-   struct object *o = want_obj.objects[i].item;
+   for (i = 0; i < want_obj->nr; i++) {
+   struct object *o = want_obj->objects[i].item;
if (!is_our_ref(o))
die("git upload-pack: not our r

[PATCH v2 0/3] Clear flags before each v2 request

2018-10-18 Thread Jonathan Tan
To explain the differences in this version of the patch set, I'll quote
an email [1] from Junio:

[1] https://public-inbox.org/git/xmqq5zxzvnq1@gitster-ct.c.googlers.com/

> The change to the code itself sort-of makes sense (I say sort-of
> because I didn't carefully look at the callers to see if they mind
> getting all these flags cleared, but the ones that are cleared are
> the ones that are involved mostly in the negotiation and shold be
> OK).

I have included 2 additional patches for these reasons:

 - After reading the section of Junio's email quoted above, I took
   another look at the flags, and found that not only is state stored in
   the flags between invocations of upload_pack_v2(), state is also
   stored in the want_obj and have_obj global variables. The additional
   patches help clean those up.

 - To help reviewers who want to see if the callers mind getting all 8
   flags cleared, I have included a discussion of all 8 flags in the
   commit message of patch 3. The additional patches made the discussion
   easier.

Responses to other points:

> Hmph, what if commit O had a long history behind it?  
> 
> Should fetching of B result in fetching the whole history?

I think so - when we fetch without --depth or any similar arguments, I
think it's reasonable to have all objects referenced by the fetched
tips.

> Would we
> notice that now all of A's parents are available locally and declare
> that the repository is no longer shallow?

We could, but I think this is outside the scope of this patch set.

> Use test_seq instead, or you'll get hit by test-lint?

Thanks for the pointer to test-lint. I've used test_seq, and checked
that test-lint doesn't print any errors.

> Applied on 'master' or 'maint', this new test does not pass even
> with s/seq/test_&/, so there may be something else wrong with it,
> though.

Thanks - there was a copy-and-paste error (should have grepped for
"fetch< version 2", not "git< version 2").

Jonathan Tan (3):
  upload-pack: make have_obj not global
  upload-pack: make want_obj not global
  upload-pack: clear flags before each v2 request

 t/t5702-protocol-v2.sh |  25 +++
 upload-pack.c  | 153 -
 2 files changed, 115 insertions(+), 63 deletions(-)

-- 
2.19.0.271.gfe8321ec05.dirty



[PATCH v2 1/3] upload-pack: make have_obj not global

2018-10-18 Thread Jonathan Tan
Because upload_pack_v2() can be invoked multiple times in the same
process, the static variable have_obj may not be empty when it is
invoked. To make further analysis of this situation easier, make the
variable local; analysis will be done in a subsequent patch.

The new local variable in upload_pack_v2() is static to preserve
existing behavior; this is not necessary in upload_pack() because
upload_pack() is only invoked once per process.

Signed-off-by: Jonathan Tan 
---
 upload-pack.c | 58 ---
 1 file changed, 32 insertions(+), 26 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index 62a1000f44..cb2401f90d 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -52,7 +52,6 @@ static int no_progress, daemon_mode;
 #define ALLOW_ANY_SHA1 07
 static unsigned int allow_unadvertised_object_request;
 static int shallow_nr;
-static struct object_array have_obj;
 static struct object_array want_obj;
 static struct object_array extra_edge_obj;
 static unsigned int timeout;
@@ -99,7 +98,7 @@ static int write_one_shallow(const struct commit_graft 
*graft, void *cb_data)
return 0;
 }
 
-static void create_pack_file(void)
+static void create_pack_file(const struct object_array *have_obj)
 {
struct child_process pack_objects = CHILD_PROCESS_INIT;
char data[8193], progress[128];
@@ -164,9 +163,9 @@ static void create_pack_file(void)
fprintf(pipe_fd, "%s\n",
oid_to_hex(&want_obj.objects[i].item->oid));
fprintf(pipe_fd, "--not\n");
-   for (i = 0; i < have_obj.nr; i++)
+   for (i = 0; i < have_obj->nr; i++)
fprintf(pipe_fd, "%s\n",
-   oid_to_hex(&have_obj.objects[i].item->oid));
+   oid_to_hex(&have_obj->objects[i].item->oid));
for (i = 0; i < extra_edge_obj.nr; i++)
fprintf(pipe_fd, "%s\n",
oid_to_hex(&extra_edge_obj.objects[i].item->oid));
@@ -303,7 +302,8 @@ static void create_pack_file(void)
die("git upload-pack: %s", abort_msg);
 }
 
-static int got_oid(const char *hex, struct object_id *oid)
+static int got_oid(const char *hex, struct object_id *oid,
+  struct object_array *have_obj)
 {
struct object *o;
int we_knew_they_have = 0;
@@ -331,17 +331,17 @@ static int got_oid(const char *hex, struct object_id *oid)
parents->item->object.flags |= THEY_HAVE;
}
if (!we_knew_they_have) {
-   add_object_array(o, NULL, &have_obj);
+   add_object_array(o, NULL, have_obj);
return 1;
}
return 0;
 }
 
-static int ok_to_give_up(void)
+static int ok_to_give_up(const struct object_array *have_obj)
 {
uint32_t min_generation = GENERATION_NUMBER_ZERO;
 
-   if (!have_obj.nr)
+   if (!have_obj->nr)
return 0;
 
return can_all_from_reach_with_flag(&want_obj, THEY_HAVE,
@@ -349,7 +349,7 @@ static int ok_to_give_up(void)
min_generation);
 }
 
-static int get_common_commits(void)
+static int get_common_commits(struct object_array *have_obj)
 {
struct object_id oid;
char last_hex[GIT_MAX_HEXSZ + 1];
@@ -367,11 +367,11 @@ static int get_common_commits(void)
 
if (!line) {
if (multi_ack == 2 && got_common
-   && !got_other && ok_to_give_up()) {
+   && !got_other && ok_to_give_up(have_obj)) {
sent_ready = 1;
packet_write_fmt(1, "ACK %s ready\n", last_hex);
}
-   if (have_obj.nr == 0 || multi_ack)
+   if (have_obj->nr == 0 || multi_ack)
packet_write_fmt(1, "NAK\n");
 
if (no_done && sent_ready) {
@@ -385,10 +385,10 @@ static int get_common_commits(void)
continue;
}
if (skip_prefix(line, "have ", &arg)) {
-   switch (got_oid(arg, &oid)) {
+   switch (got_oid(arg, &oid, have_obj)) {
case -1: /* they have what we do not */
got_other = 1;
-   if (multi_ack && ok_to_give_up()) {
+   if (multi_ack && ok_to_give_up(have_obj)) {
const char *hex = oid_to_hex(&oid);
if (multi_ack == 2) {
sent_ready = 1;
@@ -404,14 +404,14 @@ static int get_common_commits(void)
packet_write_fmt(1, "ACK %s common\n", 
last_hex);
else if (multi_ack)
packet_write_fmt(1, "ACK %s 
continue\n", last_hex);
-   

[PATCH v2 3/3] upload-pack: clear flags before each v2 request

2018-10-18 Thread Jonathan Tan
Suppose a server has the following commit graph:

 A   B
  \ /
   O

We create a client by cloning A from the server with depth 1, and add
many commits to it (so that future fetches span multiple requests due to
lengthy negotiation). If it then fetches B using protocol v2, the fetch
spanning multiple requests, the resulting packfile does not contain O
even though the client did report that A is shallow.

This is because upload_pack_v2() can be called multiple times while
processing the same session. During the 2nd and all subsequent
invocations, some object flags remain from the previous invocations. In
particular, CLIENT_SHALLOW remains, preventing process_shallow() from
adding client-reported shallows to the "shallows" array, and hence
pack-objects not knowing about these client-reported shallows.

Therefore, teach upload_pack_v2() to clear object flags at the start of
each invocation. This has some other results:

 - THEY_HAVE gates addition of objects to have_obj in process_haves().
   Previously in upload_pack_v2(), have_obj needed to be static because
   once an object is added to have_obj, it is never readded and thus we
   needed to retain the contents of have_obj between invocations. Now
   that flags are cleared, this is no longer necessary. This patch does
   not change the behavior of ok_to_give_up() (THEY_HAVE is still set on
   each "have") and got_oid() (used only in non-v2)); THEY_HAVE is not
   used in any other function.

 - WANTED gates addition of objects to want_obj in parse_want() and
   parse_want_ref(). It is also used in receive_needs(), but that is
   only used in non-v2. For the same reasons as THEY_HAVE, want_obj no
   longer needs to be static in upload_pack_v2().

 - CLIENT_SHALLOW is changed as discussed above.

Clearing of the other 5 flags does not affect functionality in v2. (Note
that in non-v2, upload_pack() is only called once per process, so each
invocation starts with blank flags anyway.)

 - OUR_REF is only used in non-v2.

 - COMMON_KNOWN is only used as a scratch flag in ok_to_give_up().

 - SHALLOW is passed to invocations in deepen() and
   deepen_by_rev_list(), but upload-pack doesn't use it.

 - NOT_SHALLOW is used by send_shallow() and send_unshallow(), but
   invocations of those functions are always preceded by code that sets
   NOT_SHALLOW on the appropriate objects.

 - HIDDEN_REF is only used in non-v2.

Signed-off-by: Jonathan Tan 
---
 t/t5702-protocol-v2.sh | 25 +
 upload-pack.c  | 13 +
 2 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 88a886975d..4adc4b00e3 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -429,6 +429,31 @@ test_expect_success 'fetch supports include-tag and tag 
following' '
git -C client cat-file -e $(git -C client rev-parse annotated_tag)
 '
 
+test_expect_success 'upload-pack respects client shallows' '
+   rm -rf server client trace &&
+
+   git init server &&
+   test_commit -C server base &&
+   test_commit -C server client_has &&
+
+   git clone --depth=1 "file://$(pwd)/server" client &&
+
+   # Add extra commits to the client so that the whole fetch takes more
+   # than 1 request (due to negotiation)
+   for i in $(test_seq 1 32)
+   do
+   test_commit -C client c$i
+   done &&
+
+   git -C server checkout -b newbranch base &&
+   test_commit -C server client_wants &&
+
+   GIT_TRACE_PACKET="$(pwd)/trace" git -C client -c protocol.version=2 \
+   fetch origin newbranch &&
+   # Ensure that protocol v2 is used
+   grep "fetch< version 2" trace
+'
+
 # Test protocol v2 with 'http://' transport
 #
 . "$TEST_DIRECTORY"/lib-httpd.sh
diff --git a/upload-pack.c b/upload-pack.c
index 451bf47e7f..42c5ec44cf 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -37,6 +37,9 @@
 #define CLIENT_SHALLOW (1u << 18)
 #define HIDDEN_REF (1u << 19)
 
+#define ALL_FLAGS (THEY_HAVE | OUR_REF | WANTED | COMMON_KNOWN | SHALLOW | \
+   NOT_SHALLOW | CLIENT_SHALLOW | HIDDEN_REF)
+
 static timestamp_t oldest_have;
 
 static int deepen_relative;
@@ -1409,10 +1412,10 @@ int upload_pack_v2(struct repository *r, struct 
argv_array *keys,
 {
enum fetch_state state = FETCH_PROCESS_ARGS;
struct upload_pack_data data;
-   /* NEEDSWORK: make this non-static */
-   static struct object_array have_obj;
-   /* NEEDSWORK: make this non-static */
-   static struct object_array want_obj;
+   struct object_array have_obj = OBJECT_ARRAY_INIT;
+   struct object_array want_obj = OBJECT_ARRAY_INIT;
+
+   clear_object_flags(ALL_FLAGS);
 
git_config(upload_pack_config, NULL);
 
@@ -1464,6 +1467,8 @@ int upload_pack_v2(struct repository *r, struct 
argv_array *keys,
}
 
upload_pack_data_clear(&data);
+   object_array_clear(&have_obj);
+   object_array_clear(&want_obj

Shouldn't git be able to apply diffs that it created with --ignore-whitespace?

2018-10-18 Thread Mahmoud Al-Qudsi
Hello again all,

I think I've previously broached this subject before, but I think I perhaps
wasn't clear enough about what I was trying to do or why I feel that git is at
fault here.

(I'm running git 2.19.1)

Starting with a fully-committed, not-dirty codebase, I open(ed) a poorly
formatted, mixed-whitespace file (that I absolutely did not author!) under
version control and make some very localized changes. My editor, being very
smart and helpful, fixes up the line ending on save, and I exit.

At this point, my source file contains a) deliberate changes I want, and b)
whitespace changes I wish I could commit but that should not be a part of my
patch.

Shouldn't the following workflow be supported:

~> git diff -w > foo.diff
~> git reset --hard
~> git apply [--ignore-whitespace] < foo.diff

Because that throws an error in this case:

> error: patch failed: includes/helpers/class.phpmailer.php:1182
> error: includes/helpers/class.phpmailer.php: patch does not apply

I feel like this did work, once upon a time. Perhaps prior to the same that
broke `git add -p` when whitespace was mangled on editor exit/save (2b8ea7f3c7)?

To help debug this, I'm attaching the output of the following taken after I've
made my changes to the file and wish to generate a clean diff, the former of
which applies just fine after `git reset --hard`, while the latter does not:

~> git diff > with_whitespace.diff
~> git diff -w > without_whitespace.diff

(I can also privately share the original file off-list if needed.)

I don't believe the list allows binary attachments and because we're dealing
with line-ending mangling I definitely do not want to include them inline, so
I've uploaded them here:

* http://share.neosmart.net/View/Index/nGujqm.diff
* http://share.neosmart.net/View/Index/f4dkVF.diff

(You can download them as-is by clicking the floppy icon in the top-right)

With thanks,

Mahmoud Al-Qudsi
NeoSmart Technologies


Re: [PATCH] multi-pack-index: avoid dead store for struct progress

2018-10-18 Thread Carlo Arenas
On Thu, Oct 18, 2018 at 12:36 PM Derrick Stolee  wrote:
>
> Is there a tool that reports a wasted
> initialization that you used to find this?

I'd used clang's analyzer recently to track a similar issue before in a
different codebase, but not for this specific case.

Carlo


Re: [PATCH] multi-pack-index: avoid dead store for struct progress

2018-10-18 Thread Derrick Stolee

On 10/18/2018 2:59 PM, Carlo Marcelo Arenas Belón wrote:

it is initialized unconditionally by a call to start_progress
below.

Signed-off-by: Carlo Marcelo Arenas Belón 
---
  midx.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/midx.c b/midx.c
index ea2f3ffe2e..4fac0cd08a 100644
--- a/midx.c
+++ b/midx.c
@@ -941,7 +941,7 @@ static void midx_report(const char *fmt, ...)
  int verify_midx_file(const char *object_dir)
  {
uint32_t i;
-   struct progress *progress = NULL;
+   struct progress *progress;
struct multi_pack_index *m = load_multi_pack_index(object_dir, 1);
verify_midx_error = 0;
  
This makes sense as a cleanup. Is there a tool that reports a wasted 
initialization that you used to find this?


-Stolee


Re: [PATCH] diff: don't attempt to strip prefix from absolute Windows paths

2018-10-18 Thread Johannes Sixt
Am 18.10.18 um 20:49 schrieb Stefan Beller:
> On Thu, Oct 18, 2018 at 11:38 AM Johannes Sixt  wrote:
> 
>> There is one peculiarity, though: [...]
> 
> The explanation makes sense, and the code looks good.
> Do we want to have a test for this niche case?
> 

Good point. That would be the following. But give me a day or two to
cross-check on Windows and whether it really catches the breakage.

diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
index 453e6c35eb..6e0dd6f9e5 100755
--- a/t/t4053-diff-no-index.sh
+++ b/t/t4053-diff-no-index.sh
@@ -127,4 +127,14 @@ test_expect_success 'diff --no-index from repo subdir 
respects config (implicit)
test_cmp expect actual.head
 '
 
+test_expect_success 'diff --no-index from repo subdir with absolute paths' '
+   cat <<-EOF >expect &&
+   1   1   $(pwd)/non/git/{a => b}
+   EOF
+   test_expect_code 1 \
+   git -C repo/sub diff --numstat \
+   "$(pwd)/non/git/a" "$(pwd)/non/git/b" >actual &&
+   test_cmp expect actual
+'
+
 test_done


[PATCH v1 1/1] index_bulk_checkin(): Take off_t, not size_t

2018-10-18 Thread tboegi
From: Torsten Bögershausen 

When streaming data from disk into a blob, use off_t instead of
size_t, which is a better choice for file length.

Signed-off-by: Torsten Bögershausen 
---

This is based on an old patch from 2017, which never made it to the list.
I think it make sense to have off_t/size_t more consistent,
reviews/comments are welcome.

bulk-checkin.c | 4 ++--
 bulk-checkin.h | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/bulk-checkin.c b/bulk-checkin.c
index 409ecb566b..2631e82d6c 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -189,7 +189,7 @@ static void prepare_to_stream(struct bulk_checkin_state 
*state,
 
 static int deflate_to_pack(struct bulk_checkin_state *state,
   struct object_id *result_oid,
-  int fd, size_t size,
+  int fd, off_t size,
   enum object_type type, const char *path,
   unsigned flags)
 {
@@ -258,7 +258,7 @@ static int deflate_to_pack(struct bulk_checkin_state *state,
 }
 
 int index_bulk_checkin(struct object_id *oid,
-  int fd, size_t size, enum object_type type,
+  int fd, off_t size, enum object_type type,
   const char *path, unsigned flags)
 {
int status = deflate_to_pack(&state, oid, fd, size, type,
diff --git a/bulk-checkin.h b/bulk-checkin.h
index f438f93811..09b2affdf3 100644
--- a/bulk-checkin.h
+++ b/bulk-checkin.h
@@ -7,7 +7,7 @@
 #include "cache.h"
 
 extern int index_bulk_checkin(struct object_id *oid,
- int fd, size_t size, enum object_type type,
+ int fd, off_t size, enum object_type type,
  const char *path, unsigned flags);
 
 extern void plug_bulk_checkin(void);
-- 
2.19.0.271.gfe8321ec05



Re: [PATCH 4/9] submodule.c: move global changed_submodule_names into fetch submodule struct

2018-10-18 Thread Stefan Beller
On Wed, Oct 17, 2018 at 2:26 PM Jonathan Tan  wrote:
>
> > The `changed_submodule_names` are only used for fetching, so let's make it
> > part of the struct that is passed around for fetching submodules.
>
> Keep the titles of commit messages to 50 characters or under.

renamed

>
> > +static void calculate_changed_submodule_paths(
> > + struct submodule_parallel_fetch *spf)
>
> Instead of taking the entire struct, could this just take the list of
> changed submodule names instead?

I think so, done.


Re: [PATCH v1 1/2] reset: don't compute unstaged changes after reset when --quiet

2018-10-18 Thread Ben Peart




On 10/18/2018 2:26 PM, Duy Nguyen wrote:

On Thu, Oct 18, 2018 at 8:18 PM Ben Peart  wrote:

I actually started my effort to speed up reset by attempting to
multi-thread refresh_index().  You can see a work in progress at:

https://github.com/benpeart/git/pull/new/refresh-index-multithread-gvfs

The patch doesn't always work as it is still not thread safe.  When it
works, it's great but I ran into to many difficulties trying to debug
the remaining threading issues (even adding print statements would
change the timing and the repro would disappear).  It will take a lot of
code review to discover and fix the remaining non-thread safe code paths.

In addition, the optimized code path that takes advantage of fsmonitor,
uses multiple threads, fscache, etc _already exists_ in preload_index().
   Trying to recreate all those optimizations in refresh_index() is (as I
discovered) a daunting task.


Why not make refresh_index() run preload_index() first (or the
parallel lstat part to be precise), and only do the heavy
content-based refresh in single thread mode?



Head smack! Why didn't I think of that?

That is a terrific suggestion.  Calling preload_index() right before the 
big for loop in refresh_index() is a trivial and effective way to do the 
bulk of the updating with the optimized code.  After doing that, most of 
the cache entries can bail out quickly down in refresh_cache_ent() when 
it tests ce_uptodate(ce).


Here are the numbers using that optimization (hot cache, averaged across 
3 runs):


0.32 git add asdf
1.67 git reset asdf
1.68 git status
3.67 Total

vs without it:

0.32 git add asdf
2.48 git reset asdf
1.50 git status
4.30 Total

For a savings in the reset command of 32% and 15% overall.

Clearly doing the refresh_index() faster is not as much savings as not 
doing it at all.  Given how simple this patch is, I think it makes sense 
to do both so that we have optimized each path to is fullest.


[PATCH] multi-pack-index: avoid dead store for struct progress

2018-10-18 Thread Carlo Marcelo Arenas Belón
it is initialized unconditionally by a call to start_progress
below.

Signed-off-by: Carlo Marcelo Arenas Belón 
---
 midx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/midx.c b/midx.c
index ea2f3ffe2e..4fac0cd08a 100644
--- a/midx.c
+++ b/midx.c
@@ -941,7 +941,7 @@ static void midx_report(const char *fmt, ...)
 int verify_midx_file(const char *object_dir)
 {
uint32_t i;
-   struct progress *progress = NULL;
+   struct progress *progress;
struct multi_pack_index *m = load_multi_pack_index(object_dir, 1);
verify_midx_error = 0;
 
-- 
2.19.1



Re: [PATCH] diff: don't attempt to strip prefix from absolute Windows paths

2018-10-18 Thread Stefan Beller
On Thu, Oct 18, 2018 at 11:38 AM Johannes Sixt  wrote:

> There is one peculiarity, though: [...]

The explanation makes sense, and the code looks good.
Do we want to have a test for this niche case?


[PATCH] unpack-trees: avoid dead store for struct progress

2018-10-18 Thread Carlo Marcelo Arenas Belón
it is unconditionally initialized a few lines below

Signed-off-by: Carlo Marcelo Arenas Belón 
---
 unpack-trees.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index f25089b878..88dc9a615e 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -340,7 +340,7 @@ static int check_updates(struct unpack_trees_options *o)
 {
unsigned cnt = 0;
int errs = 0;
-   struct progress *progress = NULL;
+   struct progress *progress;
struct index_state *index = &o->result;
struct checkout state = CHECKOUT_INIT;
int i;
-- 
2.19.1



[RFC PATCH 2/2 (BREAKS BUILD)] builtin/merge-base.c: do not rely on the_repository any more

2018-10-18 Thread Stefan Beller
To avoid creeping in the dependency of the_repository,
use GIT_NO_THE_REPOSITORY in the test to prove it still works.

Signed-off-by: Stefan Beller 
---

This doesn't work yet, as we have not converted get_oid, yet.
It proves that GIT_NO_THE_REPOSITORY works, though.

Stefan

 builtin/merge-base.c  | 67 ++-
 t/t6010-merge-base.sh |  3 +-
 2 files changed, 43 insertions(+), 27 deletions(-)

diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index 1c92099070..29341f8839 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -1,3 +1,4 @@
+#define NO_THE_REPOSITORY_COMPATIBILITY_MACROS
 #include "builtin.h"
 #include "cache.h"
 #include "config.h"
@@ -37,27 +38,28 @@ static const char * const merge_base_usage[] = {
NULL
 };
 
-static struct commit *get_commit_reference(const char *arg)
+static struct commit *get_commit_reference(struct repository *r,
+  const char *arg)
 {
struct object_id revkey;
-   struct commit *r;
+   struct commit *ref;
 
if (get_oid(arg, &revkey))
die("Not a valid object name %s", arg);
-   r = lookup_commit_reference(the_repository, &revkey);
-   if (!r)
+   ref = lookup_commit_reference(r, &revkey);
+   if (!ref)
die("Not a valid commit name %s", arg);
 
-   return r;
+   return ref;
 }
 
-static int handle_independent(int count, const char **args)
+static int handle_independent(struct repository *r, int count, const char 
**args)
 {
struct commit_list *revs = NULL, *rev;
int i;
 
for (i = count - 1; i >= 0; i--)
-   commit_list_insert(get_commit_reference(args[i]), &revs);
+   commit_list_insert(get_commit_reference(r, args[i]), &revs);
 
reduce_heads_replace(&revs);
 
@@ -71,14 +73,16 @@ static int handle_independent(int count, const char **args)
return 0;
 }
 
-static int handle_octopus(int count, const char **args, int show_all)
+static int handle_octopus(struct repository *r,
+ int count, const char **args,
+ int show_all)
 {
struct commit_list *revs = NULL;
struct commit_list *result, *rev;
int i;
 
for (i = count - 1; i >= 0; i--)
-   commit_list_insert(get_commit_reference(args[i]), &revs);
+   commit_list_insert(get_commit_reference(r, args[i]), &revs);
 
result = get_octopus_merge_bases(revs);
free_commit_list(revs);
@@ -97,15 +101,15 @@ static int handle_octopus(int count, const char **args, 
int show_all)
return 0;
 }
 
-static int handle_is_ancestor(int argc, const char **argv)
+static int handle_is_ancestor(struct repository *r, int argc, const char 
**argv)
 {
struct commit *one, *two;
 
if (argc != 2)
die("--is-ancestor takes exactly two commits");
-   one = get_commit_reference(argv[0]);
-   two = get_commit_reference(argv[1]);
-   if (in_merge_bases(one, two))
+   one = get_commit_reference(r, argv[0]);
+   two = get_commit_reference(r, argv[1]);
+   if (repo_in_merge_bases(r, one, two))
return 0;
else
return 1;
@@ -116,19 +120,22 @@ struct rev_collect {
int nr;
int alloc;
unsigned int initial : 1;
+   struct repository *repo;
 };
 
-static void add_one_commit(struct object_id *oid, struct rev_collect *revs)
+static void add_one_commit(struct repository *r,
+  struct object_id *oid,
+  struct rev_collect *revs)
 {
struct commit *commit;
 
if (is_null_oid(oid))
return;
 
-   commit = lookup_commit(the_repository, oid);
+   commit = lookup_commit(r, oid);
if (!commit ||
(commit->object.flags & TMP_MARK) ||
-   parse_commit(commit))
+   repo_parse_commit(r, commit))
return;
 
ALLOC_GROW(revs->commit, revs->nr + 1, revs->alloc);
@@ -144,13 +151,13 @@ static int collect_one_reflog_ent(struct object_id *ooid, 
struct object_id *noid
 
if (revs->initial) {
revs->initial = 0;
-   add_one_commit(ooid, revs);
+   add_one_commit(revs->repo, ooid, revs);
}
-   add_one_commit(noid, revs);
+   add_one_commit(revs->repo, noid, revs);
return 0;
 }
 
-static int handle_fork_point(int argc, const char **argv)
+static int handle_fork_point(struct repository *r, int argc, const char **argv)
 {
struct object_id oid;
char *refname;
@@ -173,13 +180,14 @@ static int handle_fork_point(int argc, const char **argv)
if (get_oid(commitname, &oid))
die("Not a valid object name: '%s'", commitname);
 
-   derived = lookup_commit_reference(the_repository, &oid);
+   derived = lookup_commit_reference(r, &oid);
memset(&revs, 0, sizeof(revs))

[RFC PATCH 1/2] repository: have get_the_repository() to remove the_repository dependency

2018-10-18 Thread Stefan Beller
The struct 'the_repo' contains all data that of the main repository.
As we move more and more globals into this struct, the usual way of
accessing these is using 'the_repository', which can be used as a drop in
replacement for accessing the migrated globals.

However during the migration of globals into the repository object, it is
not clear if some code path rely on the_repository or can work on an
arbitrary repository (as we'd eventually want for submodules) due to the
excessive use of the_repository throughout the code base.

To address this, introduce a function 'get_the_repository(void)' which
will return the main repository and set the_repository to NULL when the
environment variable GIT_NO_THE_REPOSITORY is set.

This function is to be strictly used only at the beginning of builtin
command to assign it to a local repository pointer that we'll use to
pass through the code base.

By having the possibility to set the_repository to NULL, we'll get
a segfault when we try to access the_repository instead of using the
handle that we pass around.

This approach let's us have the_repository in the setup code, which
in its current form is not yet able to transition into a world where
the repository handle is passed around and only test the passing around
of the repository handle for later stage code.

Eventually in the future the setup code will produce the repository
handle and each 'cmd_foo(int argc, char **argv)' builtin would get the
repository via an additional parameter.

Signed-off-by: Stefan Beller 
---
 repository.c | 10 ++
 repository.h | 13 -
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/repository.c b/repository.c
index 5dd1486718..babb88 100644
--- a/repository.c
+++ b/repository.c
@@ -20,6 +20,16 @@ void initialize_the_repository(void)
repo_set_hash_algo(&the_repo, GIT_HASH_SHA1);
 }
 
+struct repository *get_the_repository(void)
+{
+   struct repository *r = the_repository;
+
+   if (getenv("GIT_NO_THE_REPOSITORY"))
+   the_repository = NULL;
+
+   return r;
+}
+
 static void expand_base_dir(char **out, const char *in,
const char *base_dir, const char *def_in)
 {
diff --git a/repository.h b/repository.h
index 9f16c42c1e..26f5d64f68 100644
--- a/repository.h
+++ b/repository.h
@@ -114,13 +114,24 @@ void repo_set_gitdir(struct repository *repo, const char 
*root,
 const struct set_gitdir_args *extra_args);
 void repo_set_worktree(struct repository *repo, const char *path);
 void repo_set_hash_algo(struct repository *repo, int algo);
-void initialize_the_repository(void);
 int repo_init(struct repository *r, const char *gitdir, const char *worktree);
 int repo_submodule_init(struct repository *submodule,
struct repository *superproject,
const char *path);
 void repo_clear(struct repository *repo);
 
+/*
+ * Initializes the repository 'the_repository', which is used in the transition
+ * phase of moving globals into the repository struct.
+ */
+void initialize_the_repository(void);
+
+/*
+ * To be called once; after the call use only returned repository, and do not
+ * use the_repository any more
+ */
+struct repository *get_the_repository(void);
+
 /*
  * Populates the repository's index from its index_file, an index struct will
  * be allocated if needed.
-- 
2.19.0



[PATCH] diff: don't attempt to strip prefix from absolute Windows paths

2018-10-18 Thread Johannes Sixt
git diff can be invoked with absolute paths. Typically, this triggers
the --no-index case. Then the absolute paths remain in the file names
that are printed in the output.

There is one peculiarity, though: When the command is invoked from a
a sub-directory in a repository, then it is attempted to strip the
sub-directory from the beginning of relative paths. Yet, to detect a
relative path the code just checks for an initial forward slash.
This mistakes a Windows style path like D:/base as a relative path
and the output looks like this, for example:

  D:\test\one>git -P diff --numstat D:\base D:\diff
  1   1   {ase => iff}/1.txt

where the correct output should be

  D:\test\one>git -P diff --numstat D:\base D:\diff
  1   1   D:/{base => diff}/1.txt

If the sub-directory where 'git diff' is invoked is sufficiently deep
that the prefix becomes longer than the path to be printed, then the
subsequent code even accesses the paths out of bounds!

Use is_absolute_path() to detect Windows style absolute paths.

One might wonder whether the check for a directory separator that
is visible in the patch context should be changed from == '/' to
is_dir_sep() or not. It turns out not to be necessary. That code
only ever investigates paths that have undergone pathspec
normalization, after which there are only forward slashes even on
Windows.

Signed-off-by: Johannes Sixt 
---
 diff.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/diff.c b/diff.c
index f0c7557b40..d18eb198f2 100644
--- a/diff.c
+++ b/diff.c
@@ -4267,12 +4267,12 @@ static void diff_fill_oid_info(struct diff_filespec 
*one)
 static void strip_prefix(int prefix_length, const char **namep, const char 
**otherp)
 {
/* Strip the prefix but do not molest /dev/null and absolute paths */
-   if (*namep && **namep != '/') {
+   if (*namep && !is_absolute_path(*namep)) {
*namep += prefix_length;
if (**namep == '/')
++*namep;
}
-   if (*otherp && **otherp != '/') {
+   if (*otherp && !is_absolute_path(*otherp)) {
*otherp += prefix_length;
if (**otherp == '/')
++*otherp;
-- 
2.19.1.406.g1aa3f475f3


[RFC PATCH 0/2] Bring the_repository into cmd_foo

2018-10-18 Thread Stefan Beller
> On Wed, Oct 17, 2018 at 5:41 AM Derrick Stolee  wrote:
>> I had one high-level question: How are we testing that these "arbitrary
>> repository" changes are safe?
> [...]
> Or instead we could accelerate the long term plan of removing a
> hard coded the_repository and have each cmd builtin take an additional
> repository pointer from the init code, such that we'd bring all of Git to
> work on arbitrary repositories. Then the standard test suite should be
> okay, as there is no special case for the_repository any more.

Demo'd in this RFC series for git-merge-base.

The core idea is found in patch 1,
and the proof of concept is found in patch 2.

What do you think?

Thanks,
Stefan

Stefan Beller (3):
  repository: have get_the_repository() to remove the_repository
dependency
  builtin/merge-base.c: do not rely on the_repository any more

 builtin/merge-base.c  | 67 ++-
 repository.c  | 10 +++
 repository.h  | 13 -
 t/t6010-merge-base.sh |  3 +-

-- 
2.19.0



Re: [PATCH v1 1/2] reset: don't compute unstaged changes after reset when --quiet

2018-10-18 Thread Duy Nguyen
On Thu, Oct 18, 2018 at 8:18 PM Ben Peart  wrote:
> I actually started my effort to speed up reset by attempting to
> multi-thread refresh_index().  You can see a work in progress at:
>
> https://github.com/benpeart/git/pull/new/refresh-index-multithread-gvfs
>
> The patch doesn't always work as it is still not thread safe.  When it
> works, it's great but I ran into to many difficulties trying to debug
> the remaining threading issues (even adding print statements would
> change the timing and the repro would disappear).  It will take a lot of
> code review to discover and fix the remaining non-thread safe code paths.
>
> In addition, the optimized code path that takes advantage of fsmonitor,
> uses multiple threads, fscache, etc _already exists_ in preload_index().
>   Trying to recreate all those optimizations in refresh_index() is (as I
> discovered) a daunting task.

Why not make refresh_index() run preload_index() first (or the
parallel lstat part to be precise), and only do the heavy
content-based refresh in single thread mode?
-- 
Duy


Re: [PATCH v1 1/2] reset: don't compute unstaged changes after reset when --quiet

2018-10-18 Thread Ben Peart




On 10/18/2018 2:36 AM, Jeff King wrote:

On Thu, Oct 18, 2018 at 12:40:48PM +0900, Junio C Hamano wrote:


Jeff King  writes:


Whereas for the new config variable, you'd probably set it not because
you want it quiet all the time, but because you want to get some time
savings. So there it does make sense to me to explain.

Other than that, this seems like an obvious and easy win. It does feel a
little hacky (you're really losing something in the output, and ideally
we'd just be able to give that answer quickly), but this may be OK as a
hack in the interim.


After "git reset --quiet -- this/area/" with this change, any
operation you'd do next that needs to learn if working tree files
are different from what is recorded in the index outside that area
will have to spend more cycles, because the refresh done by "reset"
is now limited to the area.  So if your final goal is "make 'reset'
as fast as possible", this is an obvious and easy win.  For other
goals, i.e. "make the overall experience of using Git feel faster",
it is not so obvious to me, though.


The final goal is to make git faster (especially on larger repos) and 
this proposal accomplishes that.  Let's look at why that is.


By scoping down (or eliminating) what refresh_index() has to lstat() at 
the end of the reset command, clearly the reset command is faster.  Yes, 
the index isn't as "fresh" because not everything was updated but that 
doesn't typically impact the performance of subsequent commands.


On the next command, git still has to lstat() every file because it 
isn't sure what changes could have happened in the file system.  As a 
result, the overall impact is that we have had to lstat() every file one 
fewer times between the two commands.  A net win overall.


In addition, the preload_index() code that does the lstat() command is 
highly optimized across multiple threads (and on Windows takes advantage 
of the fscache).  This means that it can lstat() every file _much_ 
faster than the single threaded loop in refresh_index().  This also 
makes the overall performance of the pair of git commands faster as we 
got rid of the slow lstat() loop and kept the fast one.


Here are some numbers to demonstrate that.  These are hot cache numbers 
as they are easier to generate.  Cold cache numbers make the net perf 
win significantly better as the cost for the reset jumps from 2.43 
seconds to 7.16 seconds.


0.32 git add asdf
0.31 git -c reset.quiet=true reset asdf
1.34 git status
1.97 Total


0.32 git add asdf
2.43 git -c reset.quiet=false reset asdf
1.32 git status
4.07 Total

Note the status command after the reset doesn't really change as it 
still must lstat() every file (the 0.02 difference is well within the 
variability of run to run differences).


FWIW, none of these numbers are using fsmonitor.



There was additional discussion about whether this should be tied to the 
"--quiet" option and how it should be documented.


One option would be to change the default behavior of reset so that it 
doesn't do the refresh_index() call at all.  This speeds up reset by 
default so there are no user discoverability issues but changes the 
default behavior which is an issue.


Another option that was suggested was to add a separate flag that could 
be passed to reset so that the "quiet" and "fast" options don't get 
conflated.  I don't care for that option because the two options (at 
this point and for the foreseeable future) would be identical in 
behavior from the end users perspective.


It was also suggested that there be a single "fast and quiet" option for 
all of git instead of separate options for each command.  I worry about 
that because now we're forcing users to choose between the "fast and 
quiet" version of git and the "slow and noisy" version.  How do we help 
them decide which they want?  That seems difficult to explain so that 
they can make a rational choice and also hard to discover.  I also have 
to wonder who would say "give me the slow and noisy version please." :)


I'd prefer we systematically move towards a model where the default 
values that are chosen for various settings throughout the code are all 
configurable via settings.  All defaults by necessity make certain 
assumptions about user preference, data shape, machine performance, etc 
and if those assumptions don't match the user's environment then the 
hard coded defaults aren't appropriate.  We do our best but its going to 
be hit or miss.


A consistent way to be able to change those defaults would be very 
useful in those circumstances.  To be clear, I'm not proposing we do a 
wholesale update of our preferences model at this point in time - that 
seems like a significant undertaking and I don't want to tie this 
specific optimization to a potential change in how default settings work.



To move this forward, here is what I propose:

1) If the '--quiet' flag is passed, we silently take advantage of the 
fact we can avoid having to do an "extra" lstat() 

[PATCH/RFC] thread-utils: better wrapper to avoid #ifdef NO_PTHREADS

2018-10-18 Thread Nguyễn Thái Ngọc Duy
On Thu, Oct 18, 2018 at 7:09 PM Jeff King  wrote:
> > In this particular case though I think we should be able to avoid so
> > much #if if we make a wrapper for pthread api that would return an
> > error or something when pthread is not available. But similar
> > situation may happen elsewhere too.
>
> Yeah, I think that is generally the preferred method anyway, just
> because of readability and simplicity.

I've wanted to do this for a while, so let's test the water and see if
it's well received.

This patch is a proof of concept that adds just enough macros so that
I can build index-pack.c on a single thread mode with zero #ifdef
related to NO_PTHREADS.

Besides readability and simplicity, it reduces the chances of breaking
conditional builds (e.g. you rename a variable name but forgot that
the variable is in #if block that is not used by your
compiler/platform).

Performance-wise I don't think there is any loss for single thread
mode. I rely on compilers recognizing HAVE_THREADS being a constant
and remove dead code or at least optimize in favor of non-dead code.

Memory-wise, yes we use some more memory in single thread mode. But we
don't have zillions of mutexes or thread id, so a bit extra memory
does not worry me so much.

Hmm?
---
 Makefile |  2 +-
 builtin/index-pack.c | 68 
 thread-utils.c   | 30 +++
 thread-utils.h   | 38 +++--
 4 files changed, 84 insertions(+), 54 deletions(-)

diff --git a/Makefile b/Makefile
index 5bf1af369e..ef852031bd 100644
--- a/Makefile
+++ b/Makefile
@@ -981,6 +981,7 @@ LIB_OBJS += sub-process.o
 LIB_OBJS += symlinks.o
 LIB_OBJS += tag.o
 LIB_OBJS += tempfile.o
+LIB_OBJS += thread-utils.o
 LIB_OBJS += tmp-objdir.o
 LIB_OBJS += trace.o
 LIB_OBJS += trailer.o
@@ -1664,7 +1665,6 @@ ifdef NO_PTHREADS
 else
BASIC_CFLAGS += $(PTHREAD_CFLAGS)
EXTLIBS += $(PTHREAD_LIBS)
-   LIB_OBJS += thread-utils.o
 endif
 
 ifdef HAVE_PATHS_H
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 2004e25da2..bbd66ca025 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -42,9 +42,7 @@ struct base_data {
 };
 
 struct thread_local {
-#ifndef NO_PTHREADS
pthread_t thread;
-#endif
struct base_data *base_cache;
size_t base_cache_used;
int pack_fd;
@@ -98,8 +96,6 @@ static uint32_t input_crc32;
 static int input_fd, output_fd;
 static const char *curr_pack;
 
-#ifndef NO_PTHREADS
-
 static struct thread_local *thread_data;
 static int nr_dispatched;
 static int threads_active;
@@ -179,26 +175,6 @@ static void cleanup_thread(void)
free(thread_data);
 }
 
-#else
-
-#define read_lock()
-#define read_unlock()
-
-#define counter_lock()
-#define counter_unlock()
-
-#define work_lock()
-#define work_unlock()
-
-#define deepest_delta_lock()
-#define deepest_delta_unlock()
-
-#define type_cas_lock()
-#define type_cas_unlock()
-
-#endif
-
-
 static int mark_link(struct object *obj, int type, void *data, struct 
fsck_options *options)
 {
if (!obj)
@@ -364,22 +340,20 @@ static NORETURN void bad_object(off_t offset, const char 
*format, ...)
 
 static inline struct thread_local *get_thread_data(void)
 {
-#ifndef NO_PTHREADS
-   if (threads_active)
-   return pthread_getspecific(key);
-   assert(!threads_active &&
-  "This should only be reached when all threads are gone");
-#endif
+   if (HAVE_THREADS) {
+   if (threads_active)
+   return pthread_getspecific(key);
+   assert(!threads_active &&
+  "This should only be reached when all threads are gone");
+   }
return ¬hread_data;
 }
 
-#ifndef NO_PTHREADS
 static void set_thread_data(struct thread_local *data)
 {
if (threads_active)
pthread_setspecific(key, data);
 }
-#endif
 
 static struct base_data *alloc_base_data(void)
 {
@@ -1092,7 +1066,6 @@ static void resolve_base(struct object_entry *obj)
find_unresolved_deltas(base_obj);
 }
 
-#ifndef NO_PTHREADS
 static void *threaded_second_pass(void *data)
 {
set_thread_data(data);
@@ -1116,7 +1089,6 @@ static void *threaded_second_pass(void *data)
}
return NULL;
 }
-#endif
 
 /*
  * First pass:
@@ -1213,7 +1185,6 @@ static void resolve_deltas(void)
progress = start_progress(_("Resolving deltas"),
  nr_ref_deltas + nr_ofs_deltas);
 
-#ifndef NO_PTHREADS
nr_dispatched = 0;
if (nr_threads > 1 || getenv("GIT_FORCE_THREADS")) {
init_thread();
@@ -1229,7 +1200,6 @@ static void resolve_deltas(void)
cleanup_thread();
return;
}
-#endif
 
for (i = 0; i < nr_objects; i++) {
struct object_entry *obj = &objects[i];
@@ -1531,11 +1501,11 @@ static int git_index_pack_config(const char *k, const 
char *v, void *cb)
  

Re: [PATCH 0/9] Resending sb/submodule-recursive-fetch-gets-the-tip

2018-10-18 Thread Stefan Beller
On Thu, Oct 18, 2018 at 12:30 AM Junio C Hamano  wrote:
>
> Stefan Beller  writes:
>
> > This is based on ao/submodule-wo-gitmodules-checked-out.
> >
> > This resends origin/sb/submodule-recursive-fetch-gets-the-tip, resolving
> > the issues pointed out via 
> > origin/xxx/sb-submodule-recursive-fetch-gets-the-tip-in-pu
> > by basing this series on origin/ao/submodule-wo-gitmodules-checked-out
>
> Applying this round to the result of merging ao/submodule-* to
> 'master' requires this to work, it seems, as you've introduced a
> call to repo-init thing in the meantime with another topic.

Unfortunately, yes, sb/submodule-update-in-c had one such call.

> > This is based on ao/submodule-wo-gitmodules-checked-out.
> Thanks.  I had an impression that we were not entirely happy with
> the topic and are expecting a reroll, but let's hope that the part
> we expect to be updated won't have much overlaps.

I slowly came to the realization that this topic might be ripped up
into 2 or more topics, as some of the cleanups seem to be orthogonal.


[PATCH 1/1] commit-reach: fix first-parent heuristic

2018-10-18 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee 

The algorithm in can_all_from_reach_with_flags() performs a depth-
first-search, terminated by generation number, intending to use
a hueristic that "important" commits are found in the first-parent
history. This heuristic is valuable in scenarios like fetch
negotiation.

However, there is a problem! After the search finds a target commit,
it should pop all commits off the stack and mark them as "can reach".
This logic is incorrect, so the algorithm instead walks all reachable
commits above the generation-number cutoff.

The existing algorithm is still an improvement over the previous
algorithm, as the worst-case complexity went from quadratic to linear.
The performance measurement at the time was good, but not dramatic.
By fixing this heuristic, we reduce the number of walked commits.

We can also re-run the performance tests from commit 4fbcca4e
"commit-reach: make can_all_from_reach... linear".

Performance was measured on the Linux repository using
'test-tool reach can_all_from_reach'. The input included rows seeded by
tag values. The "small" case included X-rows as v4.[0-9]* and Y-rows as
v3.[0-9]*. This mimics a (very large) fetch that says "I have all major
v3 releases and want all major v4 releases." The "large" case included
X-rows as "v4.*" and Y-rows as "v3.*". This adds all release-candidate
tags to the set, which does not greatly increase the number of objects
that are considered, but does increase the number of 'from' commits,
demonstrating the quadratic nature of the previous code.

Small Case:

4fbcca4e~1: 0.85 s
  4fbcca4e: 0.26 s (num_walked: 1,011,035)
  HEAD: 0.14 s (num_walked: 8,601)

Large Case:

4fbcca4e~1: 24.0  s
  4fbcca4e:  0.12 s (num_walked:  503,925)
  HEAD:  0.06 s (num_walked:  217,243)

Signed-off-by: Derrick Stolee 
---
 commit-reach.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/commit-reach.c b/commit-reach.c
index 9f79ce0a22..79419be8af 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -593,8 +593,10 @@ int can_all_from_reach_with_flag(struct object_array *from,
while (stack) {
struct commit_list *parent;
 
-   if (stack->item->object.flags & with_flag) {
+   if (stack->item->object.flags & (with_flag | RESULT)) {
pop_commit(&stack);
+   if (stack)
+   stack->item->object.flags |= RESULT;
continue;
}
 
-- 
gitgitgadget


[PATCH 0/1] commit-reach: fix first-parent heuristic

2018-10-18 Thread Derrick Stolee via GitGitGadget
I originally reported this fix [1] after playing around with the trace2
series for measuring performance. Since trace2 isn't merging quickly, I
pulled the performance fix patch out and am sending it on its own. The only
difference here is that we don't have the tracing to verify the performance
fix in the test script.

See the patch message for details about the fix.

Thanks, -Stolee

[1] 
https://public-inbox.org/git/20180906151309.66712-7-dsto...@microsoft.com/

[RFC PATCH 6/6] commit-reach: fix first-parent heuristic

Derrick Stolee (1):
  commit-reach: fix first-parent heuristic

 commit-reach.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)


base-commit: a4b8ab5363a32f283a61ef3a962853556d136c0e
Published-As: 
https://github.com/gitgitgadget/git/releases/tags/pr-51%2Fderrickstolee%2Ffirst-parent-heuristic-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git 
pr-51/derrickstolee/first-parent-heuristic-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/51
-- 
gitgitgadget


Re: [PATCH] config.mak.dev: enable -Wunused-function

2018-10-18 Thread Jeff King
On Thu, Oct 18, 2018 at 05:48:16PM +0200, Duy Nguyen wrote:

> >   - conditional compilation, where we may or may not need a
> > static helper. These generally fall into one of two
> > categories:
> >
> >   - the call should not be conditional, but rather the
> > function body itself should be (and may just be a
> > no-op on one side of the #if). That keeps the
> > conditional pollution out of the main code.
> >
> >   - call-chains of static helpers should all be in the
> > same #if block, so they are all-or-nothing
> 
> Grouping is not always desired because it could break better function
> layout. Have a look at read-cache.c where _ieot_extension functions
> are #if'd because the only call sites are from pthread code (#if'd far
> away).

True, though as long as they are triggered by the same set of #if
conditions, that is fine. Putting them in the same block  is just an
easy way to make sure that is the case. ;)

> In this particular case though I think we should be able to avoid so
> much #if if we make a wrapper for pthread api that would return an
> error or something when pthread is not available. But similar
> situation may happen elsewhere too.

Yeah, I think that is generally the preferred method anyway, just
because of readability and simplicity.

> Having said that, if people do consider MAYBE_UNUSED before #if'ing
> everywhere (and opening up more conditional build problems in future),
> I think this change is fine.

I'd like to use it as a last resort, certainly. Mostly the fact that we
compile cleanly _now_ makes me think that it probably won't be that hard
to keep it going.

I think the biggest potential problem with this is going to be obscure
configurations where some functions are used or not used. So somebody
silencing a compiler warning may inadvertently break another case if
they're not careful. But that's already a problem to some degree (and
part of why we try to push the conditionality out to the whole-function
level).

-Peff


Re: [PATCH] config.mak.dev: enable -Wunused-function

2018-10-18 Thread Ramsay Jones



On 18/10/2018 08:05, Jeff King wrote:
> We explicitly omitted -Wunused-function when we added
> -Wextra, but there is no need: the current code compiles
> cleanly with it. And it's worth having, since it can let you
> know when there are cascading effects from a cleanup (e.g.,
> deleting one function lets you delete its static helpers).
> 
> There are cases where we may need an unused function to
> exist, but we can handle these easily:
> 
>   - macro-generated code like commit-slab; there we have the
> MAYBE_UNUSED annotation to silence the compiler
> 
>   - conditional compilation, where we may or may not need a
> static helper. These generally fall into one of two
> categories:
> 
>   - the call should not be conditional, but rather the
>   function body itself should be (and may just be a
>   no-op on one side of the #if). That keeps the
>   conditional pollution out of the main code.
> 
>   - call-chains of static helpers should all be in the
> same #if block, so they are all-or-nothing
> 
> And if there's some case that doesn't cover, we still
> have MAYBE_UNUSED as a fallback.
> 
> Signed-off-by: Jeff King 
> ---
>  config.mak.dev | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/config.mak.dev b/config.mak.dev
> index 92d268137f..bbeeff44fe 100644
> --- a/config.mak.dev
> +++ b/config.mak.dev
> @@ -34,7 +34,6 @@ ifeq ($(filter extra-all,$(DEVOPTS)),)
>  CFLAGS += -Wno-empty-body
>  CFLAGS += -Wno-missing-field-initializers
>  CFLAGS += -Wno-sign-compare
> -CFLAGS += -Wno-unused-function
>  CFLAGS += -Wno-unused-parameter
>  endif
>  endif
> 

Heh, as luck would have it, tonight I had an -Wunused-function
warning on cygwin, but not Linux, when building the 'pu' branch.

[On linux, I am using DEVELOPER=1 in my config.mak etc., whereas
on cygwin I am still using an explicit (but very similar) list
of -W args.]

I haven't looked too deeply, but this seems to be caused by
Junio's commit 42c89ea70a ("SQUASH??? - convert the other user of
string-list as db", 2018-10-17) which removes a call to the
add_existing() function - the subject of the warning.

[BTW there is another 'static add_existing()' in builtin/show_ref.c]

ATB,
Ramsay Jones



No --no-gpg-sign option with "git rebase"

2018-10-18 Thread Luca Weiss
See subject, would be quite useful to have this. ("Countermand commit.gpgSign 
configuration variable that is set to force each and every commit to be 
signed.")

Thanks, Luca




Re: [PATCH] config.mak.dev: enable -Wunused-function

2018-10-18 Thread Duy Nguyen
On Thu, Oct 18, 2018 at 9:05 AM Jeff King  wrote:
>
> We explicitly omitted -Wunused-function when we added
> -Wextra, but there is no need: the current code compiles
> cleanly with it. And it's worth having, since it can let you
> know when there are cascading effects from a cleanup (e.g.,
> deleting one function lets you delete its static helpers).
>
> There are cases where we may need an unused function to
> exist, but we can handle these easily:
>
>   - macro-generated code like commit-slab; there we have the
> MAYBE_UNUSED annotation to silence the compiler
>
>   - conditional compilation, where we may or may not need a
> static helper. These generally fall into one of two
> categories:
>
>   - the call should not be conditional, but rather the
> function body itself should be (and may just be a
> no-op on one side of the #if). That keeps the
> conditional pollution out of the main code.
>
>   - call-chains of static helpers should all be in the
> same #if block, so they are all-or-nothing

Grouping is not always desired because it could break better function
layout. Have a look at read-cache.c where _ieot_extension functions
are #if'd because the only call sites are from pthread code (#if'd far
away).

In this particular case though I think we should be able to avoid so
much #if if we make a wrapper for pthread api that would return an
error or something when pthread is not available. But similar
situation may happen elsewhere too.

Having said that, if people do consider MAYBE_UNUSED before #if'ing
everywhere (and opening up more conditional build problems in future),
I think this change is fine.
-- 
Duy


Re: [PATCH v4] branch: introduce --show-current display option

2018-10-18 Thread Eric Sunshine
On Thu, Oct 18, 2018 at 5:51 AM Johannes Schindelin
 wrote:
> On Wed, 17 Oct 2018, Eric Sunshine wrote:
> > On Wed, Oct 17, 2018 at 6:18 AM Johannes Schindelin 
> >  wrote:
> > > My suspicion: it is essentially the `(exit 117)` that adds about 100ms to
> > > every of those 67 test cases.
> >
> > The subshell chain-linter adds a 'sed' and 'grep' invocation to each test 
> > which doesn't help. (v1 of the subshell chain-linter only added a 'sed', 
> > but that changed with v2.)
> > You could disable the subshell chain-linter like this if you want test the 
> > (exit 117) goop in isolation:
>
> You're right! This is actually responsible for about five of those seven
> seconds. The subshell still hurts a little, as it means that every single
> of the almost 20,000 test cases we have gets slowed down by ~0.03s, which
> amounts to almost 10 minutes.
>
> This is "only" for the Windows phase of our Continuous Testing, of course.
> Yet I think we can do better than this.
>
> How difficult/involved, do you think, would it be to add a t/helper/
> command for chain linting?

Probably more effort than it's worth, and it would only save one
process invocation.

Since the  subshell portion of the chain-linting is done by pure
textual inspection, an alternative I had considered was to just
perform it as a preprocess over the entire test suite, much like the
other t/Makefile "test-lint" targets. In other words, the entire test
suite might be tested in one go with something like this:

sed -f chainlint.sed t*.sh | grep -q '?![A-Z][A-Z]*?!' &&
echo "BROKEN &&-chain"

That won't work today since chainlint.sed isn't written to understand
everything which we might see outside of a test_expect_*, but doing it
that way is within the realm of possibility. There were two reasons
why I didn't pursue that approach.

First, although I was expecting Windows folks to complain (or at least
speak up) about the extra 'sed' and 'grep', nobody did, so my
impression was that those two extra commands were likely lost in the
noise of the rest of the boilerplate commands invoked by
test_expect_success(), test_run_(), test_eval_(), etc., and by
whatever expensive commands are invoked by each test itself. Second,
the top-level &&-chain "(exit 117)" linting kicks in even when you run
a single test script manually, say after editing a test, which is
exactly when you want to discover that you botched a &&-chain, so it
seemed a good idea for the subshell &&-chain linter to follow suit.
The t/Makefile "test-lint" targets, on the other hand, don't kick in
when running test scripts in isolation.

However, a pragmatic way to gain back those 10 minutes might be simply
to disable the chain-linter for continuous integration testing on
Windows, but leave it enabled on other platforms. This way, we'd still
catch broken &&-chains, with the exception of tests which are specific
to Windows, of which I think there are very few.


Re: [PATCH v2 13/13] commit-graph: specify OID version for SHA-256

2018-10-18 Thread Derrick Stolee

On 10/17/2018 8:06 PM, brian m. carlson wrote:

On Wed, Oct 17, 2018 at 04:31:19PM +0200, Duy Nguyen wrote:

On Wed, Oct 17, 2018 at 12:44 AM brian m. carlson
 wrote:

Honestly, anything in the .git directory that is not the v3 pack indexes
or the loose object file should be in exactly one hash algorithm.  We
could simply just leave this value at 1 all the time and ignore the
field, since we already know what algorithm it will use.

In this particular case, I agree, but not as a general principle. It's
nice to have independence for fsck-like tools. I don't know if we have
a tool that simply validates commit-graph file format (and not trying
to access any real object). But for such a tool, I guess we can just
pass the hash algorithm from command line. The user would have to
guess a bit.

I'm going to drop this patch for now.  I'll send a follow-up series
later which bumps the format version for this and the multi-pack index
and serializes them with the four-byte value.  I probably should have
caught this earlier, but unfortunately I don't always have the time to
look at every series that hits the list.
We should coordinate before incrementing the version number. I was 
working on making the file formats incremental (think split-index) but 
couldn't come up with a way to do it without incrementing the file 
format. It would be best to combine these features into v2 of each format.


Thanks,
-Stolee


Re: [PATCH 0/3] Use commit-graph by default

2018-10-18 Thread Derrick Stolee

On 10/17/2018 11:47 PM, Junio C Hamano wrote:

If I recall correctly, one more task that was discussed but hasn't
been addressed well is how the generation and incremental update of
it should integrate with the normal repository maintenance workflow
(perhaps "gc --auto").  If we are going to turn it on by default, it
would be good to see if we can avoid multiple independent walks done
over the same history graph by repack, prune and now commit-graph,
before such a change happens.
I don't remember a discussion on this, but it is an interesting point. 
I'll give it some thought to see if we can combine these walks.


Thanks,
-Stolee


Re: [PATCH 1/3] t6501: use --quiet when testing gc stderr

2018-10-18 Thread Derrick Stolee




On 10/18/2018 1:23 AM, Junio C Hamano wrote:

"Derrick Stolee via GitGitGadget"  writes:


From: Derrick Stolee 

The test script t6501-freshen-objects.sh has some tests that care
if 'git gc' has any output to stderr. This is intended to say that
no warnings occurred related to broken links. However, when we
have operations that output progress (like writing the commit-graph)
this causes the test to fail.

I see that the descriptor #2 is redirected into a regular file.  Why
should we be writing the progress indicator in that case in the
first place?  Shoudln't we be doing the usual "are we showing this
to an interactive terminal?" test?


This code from builtin/gc.c makes it look like we are doing that:

    if (gc_write_commit_graph)
    write_commit_graph_reachable(get_object_directory(), 0,
 !quiet && !daemonized);

But really, daemonized is only for when running in the background.

Do you have an example of a builtin that checks this interactive 
terminal behavior?


Thanks,
-Stolee


Re: [RFC v2] revision: Add --sticky-default option

2018-10-18 Thread Andreas Gruenbacher
On Thu, 18 Oct 2018 at 08:53, Jeff King  wrote:
> On Wed, Oct 17, 2018 at 03:49:47PM +0200, Andreas Gruenbacher wrote:
> > @@ -2431,7 +2446,11 @@ int setup_revisions(int argc, const char **argv, 
> > struct rev_info *revs, struct s
> >   opt->tweak(revs, opt);
> >   if (revs->show_merge)
> >   prepare_show_merge(revs);
> > - if (revs->def && !revs->pending.nr && !revs->rev_input_given && 
> > !got_rev_arg) {
> > + if (revs->sticky_default)
> > + cancel_default = has_interesting_revisions();
> > + else
> > + cancel_default = got_rev_arg;
> > + if (revs->def && !revs->rev_input_given && !cancel_default) {
>
> How do you want to handle "maybe has a ref" options like --stdin,
> --tags, etc? Those set revs->rev_input_given.
>
> With the code you have here, rev_input_given overrides any
> sticky_default decision, and you cannot do this:
>
>   git log --not --remotes=origin
>
> and get the default, because even though you have only UNINTERESTING
> commits, rev_input_given is true.
>
> If you move rev_input_given to the cancel_default line above the final
> conditional and turn that conditional into:
>
>   if (revs->def && !cancel_default)
>
> then it works.

Yes, this still needs fixing, I'm just not sure yet if this is the right way.

Thanks,
Andreas


How to start contributing

2018-10-18 Thread Πλάτων Κιορπελίδης
Hello,

I’m a computer science student and I’m interested in contributing to git.
I’ve read the GSoC git page with the ideas and micro-projects as I’m
interested in participating next summer.
I’ve also read the Documentation at the GitHub mirror.
I’ve never worked on such large project and I don’t know where to start from.
I’ve picked this microproject from the GSoC page:

Make “git tag –contains ” less chatty if  is invalid
“git tag – contains ” prints the whole help text if  is
invalid. It should only show the error message instead.
Thread: 
https://public-inbox.org/git/20160118215433.gb24...@sigill.intra.peff.net/

This bug is solved using the 3rd option, but I suspect that it’s still
here because the 2nd option is preferred.

How should I tackle this?

Thanks.


Re: On overriding make variables from the environment...

2018-10-18 Thread Junio C Hamano
SZEDER Gábor  writes:

> So, then it's either 'config.mak', or passing a 'CC=$CC' argument to
> _all_ make commands, including those that are not supposed to build
> anything, but only run the tests.  I find the latter aesthetically not
> particularly pleasing.

The config.mak file is available for individual builder-testers to
customize their build, and in the context of this discussion, I
think the CI builder is just one particular individual who happens
to be non human.  If it is easy to throw suitable settings from within
the CI configuration .yaml files into config.mak, I'd think that is
exactly how the mechanism was invented to be used, so...


Re: [RFC] revision: Add --sticky-default option

2018-10-18 Thread Junio C Hamano
Andreas Gruenbacher  writes:

>> >   # is --stdin a selector, too?
>> >   branches | git log --stdin --not origin/master
>
> Yes, it's a positive selector (since --not doesn't apply to --stdin).

But you should be able to do

printf "%s\n" ^maint master | git rev-list --stdin

Replace the second one with ^master and now you have nothing but
negative.

So, no, the line is much blurrier than you would think.


Re: [RFC] revision: Add --sticky-default option

2018-10-18 Thread Andreas Gruenbacher
On Thu, 18 Oct 2018 at 05:23, Junio C Hamano  wrote:
> Jeff King  writes:
>
> > I'd probably call it something verbose and boring like
> > --use-default-with-uninteresting or --default-on-negative.
> > I dunno.
>
> These two names are improvement, but there needs a hint that the
> change we are interested in is to use default even when revs are
> given as long as ALL of them are negative ones.  Which in turn means
> there is NO positive ones given.
>
> So perhaps "--use-default-without-any-positive".
>
> Having said that, I have to wonder how serious a breakage we are
> going to cause to established users and scripts if we made this
> change without any explicit option.  After all, it would be rather
> obvious that people will get a history with some commits (or none at
> all) when they were expecting no output that the "default behaviour"
> has changed.  I also wonder how would scripts take advantage of the
> current "defeat --default as soon as we see any rev, even a negative
> one"---in short, I am not sure if the theoretical regression this
> new "option" is trying to avoid is worth avoiding in the first
> place.
>
> Is there a way to say "usually this command has built-in --default=HEAD
> behaviour, but I am declining that" already, i.e.
>
> $ git log --no-default $REVS
>
> that will result in an empty set if we accept the change proposed
> here but make it unconditional?  If so "This and future versions of
> Git will honor the --default even when there are other revisions
> given on the command line, as long as they are ALL negative ones.
> This is a backward incompatibile change, but you can update your
> scripts with '--no-default' if you do not like the new behaviour" in
> the release notes may be a viable alternative way forward.

That would certainly work for me.

Andreas

> If there is no such way in the released versions of Git, then that
> would not work, and a strict opt-in like the approach taken by the
> proposed patch would become necessary.


Re: [RFC] revision: Add --sticky-default option

2018-10-18 Thread Andreas Gruenbacher
On Thu, 18 Oct 2018 at 08:59, Junio C Hamano  wrote:
> Jeff King  writes:
> > Just to play devil's advocate, how about this:
> >
> >   git log --branches=jk/* --not origin/master
> >
> > Right now that shows nothing if there are no matching branches. But I
> > think under the proposed behavior, it would start showing HEAD, which
> > seems counter-intuitive.
> >
> > Or are we going to count any positive selector as a positive ref, even
> > if it matches nothing?
>
> That sounds like an intuitive behaviour of the command, but I may
> change my mind when I look at other examples.
>
> When viewing that "--branches=jk/*" example in isolation, yes, these
> positive selectors that could produce positive revs should defeat
> the --default, especially when it is built-in (like "log").

I agree, that's the kind of behavior I had in mind. (And I think
that's also what the code implements.)

> When given by the user, I am not sure.  With something like this:
>
> git rev-list --default=HEAD --branches=jk/* ^master
>
> clearly the user anticipates that jk/* may or may not produce any
> positive refs; otherwise there is no point specifying the default.

With positive selectors, it is meaningful if the selectors match
nothing. So in the above example, if jk/* matches nothing, I would
really expect no output, and the default should not be applied.
So we should just document that --default= only sets the default
in case the default is used.

> But anyway...
>
> > I could buy that, though it means that the
> > command above is subtly different from one or both of:
> >
> >   branches() {
> > git for-each-ref --format='%(refname)' refs/heads/jk/*
> >   }
> >
> >   # is --stdin a selector, too?
> >   branches | git log --stdin --not origin/master

Yes, it's a positive selector (since --not doesn't apply to --stdin).

> >   # here we have no idea that the user did a query and must show HEAD
> >   git log $(branches) --not origin/master

In this case, 'git log' is more used like git rev-list which doesn't
default to HEAD. The 'git log --no-default' would neatly restore
sanity here.

> OK, scrap that---just as I predicted a few minutes ago, I now think
> that "do we have a positive selector that can produce zero or more
> result?" is an ill-defined question X-<.
>
> Thanks for a doze of sanity.

Andreas


[PATCH v2 4/5] add read_author_script() to libgit

2018-10-18 Thread Phillip Wood
From: Phillip Wood 

Add read_author_script() to sequencer.c based on the implementation in
builtin/am.c and update read_am_author_script() to use
read_author_script(). The sequencer code that reads the author script
will be updated in the next commit.

Signed-off-by: Phillip Wood 
---

Notes:
changes since v1:
 - added comment above read_author_script()
 - rebased to reflect changes added in patch 2

 builtin/am.c |  86 +
 sequencer.c  | 105 +++
 sequencer.h  |   3 ++
 3 files changed, 110 insertions(+), 84 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 991d13f9a2..c5373158c0 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -260,36 +260,6 @@ static int read_state_file(struct strbuf *sb, const struct 
am_state *state,
die_errno(_("could not read '%s'"), am_path(state, file));
 }
 
-/**
- * Take a series of KEY='VALUE' lines where VALUE part is
- * sq-quoted, and append  at the end of the string list
- */
-static int parse_key_value_squoted(char *buf, struct string_list *list)
-{
-   while (*buf) {
-   struct string_list_item *item;
-   char *np;
-   char *cp = strchr(buf, '=');
-   if (!cp) {
-   np = strchrnul(buf, '\n');
-   return error(_("unable to parse '%.*s'"),
-(int) (np - buf), buf);
-   }
-   np = strchrnul(cp, '\n');
-   *cp++ = '\0';
-   item = string_list_append(list, buf);
-
-   buf = np + (*np == '\n');
-   *np = '\0';
-   cp = sq_dequote(cp);
-   if (!cp)
-   return error(_("unable to dequote value of '%s'"),
-item->string);
-   item->util = xstrdup(cp);
-   }
-   return 0;
-}
-
 /**
  * Reads and parses the state directory's "author-script" file, and sets
  * state->author_name, state->author_email and state->author_date accordingly.
@@ -309,65 +279,13 @@ static int parse_key_value_squoted(char *buf, struct 
string_list *list)
 static int read_am_author_script(struct am_state *state)
 {
const char *filename = am_path(state, "author-script");
-   struct strbuf buf = STRBUF_INIT;
-   struct string_list kv = STRING_LIST_INIT_DUP;
-   int retval = -1; /* assume failure */
-   int i, name_i = -2, email_i = -2, date_i = -2, err = 0;
-   int fd;
 
assert(!state->author_name);
assert(!state->author_email);
assert(!state->author_date);
 
-   fd = open(filename, O_RDONLY);
-   if (fd < 0) {
-   if (errno == ENOENT)
-   return 0;
-   return error_errno(_("could not open '%s' for reading"),
-  filename);
-   }
-   strbuf_read(&buf, fd, 0);
-   close(fd);
-   if (parse_key_value_squoted(buf.buf, &kv))
-   goto finish;
-
-   for (i = 0; i < kv.nr; i++) {
-   if (!strcmp(kv.items[i].string, "GIT_AUTHOR_NAME")) {
-   if (name_i >= 0)
-   name_i = error(_("'GIT_AUTHOR_NAME' already 
given"));
-   else
-   name_i = i;
-   } else if (!strcmp(kv.items[i].string, "GIT_AUTHOR_EMAIL")) {
-   if (email_i >= 0)
-   email_i = error(_("'GIT_AUTHOR_EMAIL' already 
given"));
-   else
-   email_i = i;
-   } else if (!strcmp(kv.items[i].string, "GIT_AUTHOR_DATE")) {
-   if (date_i >= 0)
-   date_i = error(_("'GIT_AUTHOR_DATE' already 
given"));
-   else
-   date_i = i;
-   } else {
-   err = error(_("unknown variable '%s'"),
-   kv.items[i].string);
-   }
-   }
-   if (name_i == -2)
-   error(_("missing 'GIT_AUTHOR_NAME'"));
-   if (email_i == -2)
-   error(_("missing 'GIT_AUTHOR_EMAIL'"));
-   if (date_i == -2)
-   error(_("missing 'GIT_AUTHOR_DATE'"));
-   if (date_i < 0 || email_i < 0 || date_i < 0 || err)
-   goto finish;
-   state->author_name = kv.items[name_i].util;
-   state->author_email = kv.items[email_i].util;
-   state->author_date = kv.items[date_i].util;
-   retval = 0;
-finish:
-   string_list_clear(&kv, !!retval);
-   strbuf_release(&buf);
-   return retval;
+   return read_author_script(filename, &state->author_name,
+ &state->author_email, &state->author_date, 1);
 }
 
 /**
diff --git a/sequencer.c b/sequencer.c
index dc2c58d464..3530dbeb6c 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -660,6 +660,111 

Re: On overriding make variables from the environment...

2018-10-18 Thread Johannes Schindelin
Hi Gábor,

On Wed, 17 Oct 2018, SZEDER Gábor wrote:

> On Tue, Oct 16, 2018 at 03:40:01PM -0700, Jonathan Nieder wrote:
> > SZEDER Gábor wrote:
> > > On Tue, Oct 16, 2018 at 02:54:56PM -0700, Jonathan Nieder wrote:
> > >> SZEDER Gábor wrote:
> > 
> > >>> Our Makefile has lines like these:
> > >>>
> > >>>   CFLAGS = -g -O2 -Wall
> > >>>   CC = cc
> > >>>   AR = ar
> > >>>   SPATCH = spatch
> > [...]
> > >>> I'm not sure what to do.  I'm fine with updating our 'ci/' scripts to
> > >>> explicitly respect CC in the environment (either by running 'make
> > >>> CC=$CC' or by writing $CC into 'config.mak').  Or I could update our
> > >>> Makefile to use '?=' for specific variables, but:
> > >>
> > >> That's a good question.  I don't have a strong opinion myself, so I
> > >> tend to trust larger projects like Linux to have thought this through
> > >> more, and they use 'CC = cc' as well.
> > >
> > > I don't think Linux is a good example to follow in this case, see e.g.
> > > 6d62c983f7 (Makefile: Change the default compiler from "gcc" to "cc",
> > > 2011-12-20) (in short: Git is supposed to be buildable with compilers
> > > other than GCC as well, while Linux not really, so they can hardcode
> > > 'CC = gcc').
> > 
> > Nowadays Linux builds with clang as well.  People also have other
> > reasons to override the CC setting (cross-compiling, etc) and to
> > override other settings like CFLAGS.
> > 
> > > Also, the projects I have on hand usually have 'CC = gcc' hardcoded in
> > > their Makefiles, too, but those Makefiles were generated by their
> > > ./configure script (which in turn by ./autogen.sh...), and those tend
> > > to write CC from the environment into the generated Makefiles.
> > 
> > Yes, I think that's what makes travis's setup normally work.  It makes
> > sense to me since ./configure is expected to have more implicit or
> > automatic behavior.  For "make", I kind of like that it doesn't depend
> > on environment variables that I am not thinking about when debugging a
> > reported build problem.
> > 
> > When building Git without autoconf, the corresponding place for
> > settings is config.mak.  Would it make sense for the ci scripts to
> > write a config.mak file?
> 
> A first I though it doesn't, but it turns out it acually does.
> 
> 'ci/run-build-and-tests.sh' basically runs:
> 
>   make
>   make test
> 
> I naively put a 'CC=$CC' argument at the end of the first command,
> thinking it should Just Work...  but then that second 'make test' got
> all clever on me, said "* new build flags", and then proceeded to
> rebuild everything with the default 'cc'.  (With the additional
> complication, that on Travis CI we actually run 'make --quiet test',
> which rebuilds everything, well, quietly...  so the rebuild itself is
> not even visible in the build logs.)
> 
> So, then it's either 'config.mak', or passing a 'CC=$CC' argument to
> _all_ make commands, including those that are not supposed to build
> anything, but only run the tests.  I find the latter aesthetically not
> particularly pleasing.

How about using `MAKEFLAGS`? I ran a quick test:

MAKEFLAGS='CC=blub' make -C .. git.o
make: Entering directory '/usr/src/git/wip'
* new build flags
CC git.o
/bin/sh: blub: command not found

In other words, you could add something like this to the ci/ script:

MAKEFLAGS="${MAKEFLAGS:+$MAKEFLAGS }CC=$CC"
export MAKEFLAGS

Ciao,
Dscho

[PATCH v2 5/5] sequencer: use read_author_script()

2018-10-18 Thread Phillip Wood
From: Phillip Wood 

Use the new function added in the last commit to read the author
script, updating read_env_script() and read_author_ident(). We now
have a single code path that reads the author script for am and all
flavors of rebase. This changes the behavior of read_env_script() as
previously it would set any environment variables that were in the
author-script file. Now it is an error if the file contains other
variables or any of GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL and
GIT_AUTHOR_DATE are missing. This is what am and the non interactive
version of rebase have been doing for several years so hopefully it
will not cause a problem for interactive rebase users. The advantage
is that we are reusing existing code from am which uses sq_dequote()
to properly dequote variables. This fixes potential problems with user
edited scripts as read_env_script() which did not track quotes
properly.

This commit also removes the fallback code for checking for a broken
author script after git is upgraded when a rebase is stopped. Now that
the parsing uses sq_dequote() it will reliably return an error if the
quoting is broken and the user will have to abort the rebase and
restart. This isn't ideal but it's a corner case and the detection of
the broken quoting could be confused by user edited author scripts.

Signed-off-by: Phillip Wood 
---

Notes:
changes since v1
 - use argv_array_pushf() as suggested by Eric
 - fixed strbuf handling as suggested by Eric
 - fix comments and commit message to reflect changed behavior of
   read_env_script()

 sequencer.c | 97 -
 1 file changed, 21 insertions(+), 76 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 3530dbeb6c..987542f67c 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -767,53 +767,24 @@ int read_author_script(const char *path, char **name, 
char **email, char **date,
 }
 
 /*
- * write_author_script() used to fail to terminate the last line with a "'" and
- * also escaped "'" incorrectly as "'''" rather than "'\\''". We check for
- * the terminating "'" on the last line to see how "'" has been escaped in case
- * git was upgraded while rebase was stopped.
- */
-static int quoting_is_broken(const char *s, size_t n)
-{
-   /* Skip any empty lines in case the file was hand edited */
-   while (n > 0 && s[--n] == '\n')
-   ; /* empty */
-   if (n > 0 && s[n] != '\'')
-   return 1;
-
-   return 0;
-}
-
-/*
- * Read a list of environment variable assignments (such as the author-script
- * file) into an environment block. Returns -1 on error, 0 otherwise.
+ * Read a GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL AND GIT_AUTHOR_DATE from a
+ * file with shell quoting into struct argv_array. Returns -1 on
+ * error, 0 otherwise.
  */
 static int read_env_script(struct argv_array *env)
 {
-   struct strbuf script = STRBUF_INIT;
-   int i, count = 0, sq_bug;
-   const char *p2;
-   char *p;
+   char *name, *email, *date;
 
-   if (strbuf_read_file(&script, rebase_path_author_script(), 256) <= 0)
+   if (read_author_script(rebase_path_author_script(),
+  &name, &email, &date, 0))
return -1;
-   /* write_author_script() used to quote incorrectly */
-   sq_bug = quoting_is_broken(script.buf, script.len);
-   for (p = script.buf; *p; p++)
-   if (sq_bug && skip_prefix(p, "'''", &p2))
-   strbuf_splice(&script, p - script.buf, p2 - p, "'", 1);
-   else if (skip_prefix(p, "'\\''", &p2))
-   strbuf_splice(&script, p - script.buf, p2 - p, "'", 1);
-   else if (*p == '\'')
-   strbuf_splice(&script, p-- - script.buf, 1, "", 0);
-   else if (*p == '\n') {
-   *p = '\0';
-   count++;
-   }
 
-   for (i = 0, p = script.buf; i < count; i++) {
-   argv_array_push(env, p);
-   p += strlen(p) + 1;
-   }
+   argv_array_pushf(env, "GIT_AUTHOR_NAME=%s", name);
+   argv_array_pushf(env, "GIT_AUTHOR_EMAIL=%s", email);
+   argv_array_pushf(env, "GIT_AUTHOR_DATE=%s", date);
+   free(name);
+   free(email);
+   free(date);
 
return 0;
 }
@@ -833,54 +804,28 @@ static char *get_author(const char *message)
 /* Read author-script and return an ident line (author  timestamp) */
 static const char *read_author_ident(struct strbuf *buf)
 {
-   const char *keys[] = {
-   "GIT_AUTHOR_NAME=", "GIT_AUTHOR_EMAIL=", "GIT_AUTHOR_DATE="
-   };
struct strbuf out = STRBUF_INIT;
-   char *in, *eol;
-   const char *val[3];
-   int i = 0;
+   char *name, *email, *date;
 
-   if (strbuf_read_file(buf, rebase_path_author_script(), 256) <= 0)
+   if (read_author_script(rebase_path_author_script(),
+  &name, &email, &date, 0))
 

[PATCH v2 1/5] am: don't die in read_author_script()

2018-10-18 Thread Phillip Wood
From: Phillip Wood 

The caller is already prepared to handle errors returned from this
function so there is no need for it to die if it cannot read the file.

Suggested-by: Eric Sunshine 
Signed-off-by: Phillip Wood 
---
 builtin/am.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/builtin/am.c b/builtin/am.c
index 5e866d17c7..b68578bc3f 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -318,7 +318,8 @@ static int read_author_script(struct am_state *state)
if (fd < 0) {
if (errno == ENOENT)
return 0;
-   die_errno(_("could not open '%s' for reading"), filename);
+   return error_errno(_("could not open '%s' for reading"),
+  filename);
}
strbuf_read(&buf, fd, 0);
close(fd);
-- 
2.19.0



[PATCH v2 0/5] am/rebase: share read_author_script()

2018-10-18 Thread Phillip Wood
From: Phillip Wood 

Thanks to Eric for his feedback on v1. I've rerolled based on
that. Patches 1 & 2 are new and try to address some of the concerns
Eric raised, particularly the error handling for a badly edited author
script. See the notes on patches 4 & 5 for the changes to those (they
were previously 2 & 3).

v1 cover letter:

This is a follow up to pw/rebase-i-author-script-fix, it reduces code
duplication and improves rebase's parsing of the author script. After
this I'll do another series to share the code to write the author
script.

Phillip Wood (5):
  am: don't die in read_author_script()
  am: improve author-script error reporting
  am: rename read_author_script()
  add read_author_script() to libgit
  sequencer: use read_author_script()

 builtin/am.c |  60 ++--
 sequencer.c  | 192 ---
 sequencer.h  |   3 +
 3 files changed, 128 insertions(+), 127 deletions(-)

-- 
2.19.0



[PATCH v2 3/5] am: rename read_author_script()

2018-10-18 Thread Phillip Wood
From: Phillip Wood 

Rename read_author_script() in preparation for adding a shared
read_author_script() function to libgit.

Signed-off-by: Phillip Wood 
---
 builtin/am.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index d42b725273..991d13f9a2 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -306,7 +306,7 @@ static int parse_key_value_squoted(char *buf, struct 
string_list *list)
  * script, and thus if the file differs from what this function expects, it is
  * better to bail out than to do something that the user does not expect.
  */
-static int read_author_script(struct am_state *state)
+static int read_am_author_script(struct am_state *state)
 {
const char *filename = am_path(state, "author-script");
struct strbuf buf = STRBUF_INIT;
@@ -441,7 +441,7 @@ static void am_load(struct am_state *state)
BUG("state file 'last' does not exist");
state->last = strtol(sb.buf, NULL, 10);
 
-   if (read_author_script(state) < 0)
+   if (read_am_author_script(state) < 0)
die(_("could not parse author script"));
 
read_commit_msg(state);
-- 
2.19.0



[PATCH v2 2/5] am: improve author-script error reporting

2018-10-18 Thread Phillip Wood
From: Phillip Wood 

If there are errors in a user edited author-script there was no
indication of what was wrong. This commit adds some specific error messages
depending on the problem. It also relaxes the requirement that the
variables appear in a specific order in the file to match the behavior
of 'rebase --interactive'.

Signed-off-by: Phillip Wood 
---
 builtin/am.c | 49 +++--
 1 file changed, 39 insertions(+), 10 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index b68578bc3f..d42b725273 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -270,8 +270,11 @@ static int parse_key_value_squoted(char *buf, struct 
string_list *list)
struct string_list_item *item;
char *np;
char *cp = strchr(buf, '=');
-   if (!cp)
-   return -1;
+   if (!cp) {
+   np = strchrnul(buf, '\n');
+   return error(_("unable to parse '%.*s'"),
+(int) (np - buf), buf);
+   }
np = strchrnul(cp, '\n');
*cp++ = '\0';
item = string_list_append(list, buf);
@@ -280,7 +283,8 @@ static int parse_key_value_squoted(char *buf, struct 
string_list *list)
*np = '\0';
cp = sq_dequote(cp);
if (!cp)
-   return -1;
+   return error(_("unable to dequote value of '%s'"),
+item->string);
item->util = xstrdup(cp);
}
return 0;
@@ -308,6 +312,7 @@ static int read_author_script(struct am_state *state)
struct strbuf buf = STRBUF_INIT;
struct string_list kv = STRING_LIST_INIT_DUP;
int retval = -1; /* assume failure */
+   int i, name_i = -2, email_i = -2, date_i = -2, err = 0;
int fd;
 
assert(!state->author_name);
@@ -326,14 +331,38 @@ static int read_author_script(struct am_state *state)
if (parse_key_value_squoted(buf.buf, &kv))
goto finish;
 
-   if (kv.nr != 3 ||
-   strcmp(kv.items[0].string, "GIT_AUTHOR_NAME") ||
-   strcmp(kv.items[1].string, "GIT_AUTHOR_EMAIL") ||
-   strcmp(kv.items[2].string, "GIT_AUTHOR_DATE"))
+   for (i = 0; i < kv.nr; i++) {
+   if (!strcmp(kv.items[i].string, "GIT_AUTHOR_NAME")) {
+   if (name_i >= 0)
+   name_i = error(_("'GIT_AUTHOR_NAME' already 
given"));
+   else
+   name_i = i;
+   } else if (!strcmp(kv.items[i].string, "GIT_AUTHOR_EMAIL")) {
+   if (email_i >= 0)
+   email_i = error(_("'GIT_AUTHOR_EMAIL' already 
given"));
+   else
+   email_i = i;
+   } else if (!strcmp(kv.items[i].string, "GIT_AUTHOR_DATE")) {
+   if (date_i >= 0)
+   date_i = error(_("'GIT_AUTHOR_DATE' already 
given"));
+   else
+   date_i = i;
+   } else {
+   err = error(_("unknown variable '%s'"),
+   kv.items[i].string);
+   }
+   }
+   if (name_i == -2)
+   error(_("missing 'GIT_AUTHOR_NAME'"));
+   if (email_i == -2)
+   error(_("missing 'GIT_AUTHOR_EMAIL'"));
+   if (date_i == -2)
+   error(_("missing 'GIT_AUTHOR_DATE'"));
+   if (date_i < 0 || email_i < 0 || date_i < 0 || err)
goto finish;
-   state->author_name = kv.items[0].util;
-   state->author_email = kv.items[1].util;
-   state->author_date = kv.items[2].util;
+   state->author_name = kv.items[name_i].util;
+   state->author_email = kv.items[email_i].util;
+   state->author_date = kv.items[date_i].util;
retval = 0;
 finish:
string_list_clear(&kv, !!retval);
-- 
2.19.0



Re: [PATCH v4] branch: introduce --show-current display option

2018-10-18 Thread Johannes Schindelin
Hi Eric,

On Wed, 17 Oct 2018, Eric Sunshine wrote:

> On Wed, Oct 17, 2018 at 6:18 AM Johannes Schindelin 
>  wrote:
> > I realized yesterday that the &&-chain linting we use for every single
> > test case takes a noticeable chunk of time:
> >
> > $ time ./t0006-date.sh --quiet
> > real0m20.973s
> > $ time ./t0006-date.sh --quiet --no-chain-lint
> > real0m13.607s
> >
> > My suspicion: it is essentially the `(exit 117)` that adds about 100ms to
> > every of those 67 test cases.
> 
> The subshell chain-linter adds a 'sed' and 'grep' invocation to each test 
> which doesn't help. (v1 of the subshell chain-linter only added a 'sed', but 
> that changed with v2.)
> 
> > With that in mind, I would like to suggest that we should start to be very
> > careful about using subshells in our test suite.
> 
> You could disable the subshell chain-linter like this if you want test the 
> (exit 117) goop in isolation:
> 
> --- 8< ---
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 3f95bfda60..48323e503c 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -675,8 +675,7 @@ test_run_ () {
>   trace=
>   # 117 is magic because it is unlikely to match the exit
>   # code of other programs
> - if $(printf '%s\n' "$1" | sed -f 
> "$GIT_BUILD_DIR/t/chainlint.sed" | grep -q '?![A-Z][A-Z]*?!') ||
> - test "OK-117" != "$(test_eval_ "(exit 117) && 
> $1${LF}${LF}echo OK-\$?" 3>&1)"
> + if test "OK-117" != "$(test_eval_ "(exit 117) && 
> $1${LF}${LF}echo OK-\$?" 3>&1)"
>   then
>   error "bug in the test script: broken &&-chain or 
> run-away HERE-DOC: $1"
>   fi
> --- 8< ---

You're right! This is actually responsible for about five of those seven
seconds. The subshell still hurts a little, as it means that every single
of the almost 20,000 test cases we have gets slowed down by ~0.03s, which
amounts to almost 10 minutes.

This is "only" for the Windows phase of our Continuous Testing, of course.
Yet I think we can do better than this.

How difficult/involved, do you think, would it be to add a t/helper/
command for chain linting?

Ciao,
Dscho


Re: [PATCH 1/1] subtree: make install targets depend on build targets

2018-10-18 Thread Christian Hesse
Junio C Hamano  on Thu, 2018/10/18 11:09:
> Jonathan Nieder  writes:
> 
> > The rule says
> >
> >  install-html: html
> > $(INSTALL) -d -m 755 $(DESTDIR)$(htmldir)
> > $(INSTALL) -m 644 $^ $(DESTDIR)$(htmldir)
> >
> > and $^ substitutes to "html" after this change.  
> 
> Sorry about that.
> 
> From: Junio C Hamano 
> Date: Thu, 18 Oct 2018 11:07:17 +0900
> Subject: [PATCH] Revert "subtree: make install targets depend on build
> targets"
> 
> This reverts commit 744f7c4c314dc0e7816ac05520e8358c8318187a.
> 
> These targets do depend on the fact that each prereq is explicitly
> listed via their use of $^, which I failed to notice, and broke the
> build.
>
> [...]
>
> @@ -98,4 +98,4 @@ clean:
>   $(RM) $(GIT_SUBTREE)
>   $(RM) *.xml *.html *.1
>  
> -.PHONY: FORCE man html install-man install-html
> +.PHONY: FORCE

We could keep the phony part at least...

-- 
main(a){char*c=/*Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/*Best regards my address:*/=0;b=c[a++];)
putchar(b-1/(/*Chriscc -ox -xc - && ./x*/b/42*2-3)*42);}


pgpjIFH2PdoOs.pgp
Description: OpenPGP digital signature


Re: [PATCH 0/9] Resending sb/submodule-recursive-fetch-gets-the-tip

2018-10-18 Thread Junio C Hamano
Stefan Beller  writes:

> This is based on ao/submodule-wo-gitmodules-checked-out.
>
> This resends origin/sb/submodule-recursive-fetch-gets-the-tip, resolving
> the issues pointed out via 
> origin/xxx/sb-submodule-recursive-fetch-gets-the-tip-in-pu
> by basing this series on origin/ao/submodule-wo-gitmodules-checked-out

Applying this round to the result of merging ao/submodule-* to
'master' requires this to work, it seems, as you've introduced a
call to repo-init thing in the meantime with another topic.

Subject: [PATCH] fixup! repository: repo_submodule_init to take a submodule 
struct

---
 builtin/submodule--helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 5f8a804a6e..015aa1471f 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -2037,7 +2037,7 @@ static int ensure_core_worktree(int argc, const char 
**argv, const char *prefix)
if (!sub)
BUG("We could get the submodule handle before?");
 
-   if (repo_submodule_init(&subrepo, the_repository, path))
+   if (repo_submodule_init(&subrepo, the_repository, sub))
die(_("could not get a repository handle for submodule '%s'"), 
path);
 
if (!repo_config_get_string(&subrepo, "core.worktree", &cw)) {
-- 
2.19.1-450-ga4b8ab5363



Re: [PATCH 1/3] ref-filter: free memory from used_atom

2018-10-18 Thread Оля Тележная
чт, 18 окт. 2018 г. в 9:51, Junio C Hamano :
>
> Jeff King  writes:
>
> > Presumably it came from the manual comment-style fixup.
>
> Wow, that was embarrassing.  Thanks for catching it.

Jeff, thanks a lot!
I just sent new version where I fixed all known issues including that comment.
>
> >
> > With that fix, the tests run fine for me under ASan/UBSan (with the
> > exception of t5310, but that's fixed already in a parallel topic).
> >
> > -Peff


[PATCH v2 1/3] ref-filter: free memory from used_atom

2018-10-18 Thread Olga Telezhnaya
Release memory from used_atom variable for reducing number of memory
leaks.

Signed-off-by: Olga Telezhnaia 
---
 ref-filter.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/ref-filter.c b/ref-filter.c
index e1bcb4ca8a197..70f1d13ab3beb 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1996,6 +1996,10 @@ void ref_array_clear(struct ref_array *array)
 {
int i;
 
+   for (i = 0; i < used_atom_cnt; i++)
+   free((char *)used_atom[i].name);
+   FREE_AND_NULL(used_atom);
+   used_atom_cnt = 0;
for (i = 0; i < array->nr; i++)
free_array_item(array->items[i]);
FREE_AND_NULL(array->items);

--
https://github.com/git/git/pull/538


[PATCH v2 3/3] ref-filter: free item->value and item->value->s

2018-10-18 Thread Olga Telezhnaya
Release item->value.
Initialize item->value->s dynamically and then release its resources.
Release some local variables.

Final goal of this patch is to reduce number of memory leaks.

Signed-off-by: Olga Telezhnaia 
---
 ref-filter.c | 96 +---
 1 file changed, 54 insertions(+), 42 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 70f1d13ab3beb..ca52ee4608c2a 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -875,7 +875,7 @@ static void grab_common_values(struct atom_value *val, int 
deref, struct expand_
if (deref)
name++;
if (!strcmp(name, "objecttype"))
-   v->s = type_name(oi->type);
+   v->s = xstrdup(type_name(oi->type));
else if (!strcmp(name, "objectsize")) {
v->value = oi->size;
v->s = xstrfmt("%lu", oi->size);
@@ -899,9 +899,9 @@ static void grab_tag_values(struct atom_value *val, int 
deref, struct object *ob
if (deref)
name++;
if (!strcmp(name, "tag"))
-   v->s = tag->tag;
+   v->s = xstrdup(tag->tag);
else if (!strcmp(name, "type") && tag->tagged)
-   v->s = type_name(tag->tagged->type);
+   v->s = xstrdup(type_name(tag->tagged->type));
else if (!strcmp(name, "object") && tag->tagged)
v->s = xstrdup(oid_to_hex(&tag->tagged->oid));
}
@@ -1032,7 +1032,7 @@ static void grab_date(const char *buf, struct atom_value 
*v, const char *atomnam
v->value = timestamp;
return;
  bad:
-   v->s = "";
+   v->s = xstrdup("");
v->value = 0;
 }
 
@@ -1227,7 +1227,7 @@ static void fill_missing_values(struct atom_value *val)
for (i = 0; i < used_atom_cnt; i++) {
struct atom_value *v = &val[i];
if (v->s == NULL)
-   v->s = "";
+   v->s = xstrdup("");
}
 }
 
@@ -1273,7 +1273,8 @@ static inline char *copy_advance(char *dst, const char 
*src)
 static const char *lstrip_ref_components(const char *refname, int len)
 {
long remaining = len;
-   const char *start = refname;
+   const char *start = xstrdup(refname);
+   const char *to_free = start;
 
if (len < 0) {
int i;
@@ -1294,20 +1295,24 @@ static const char *lstrip_ref_components(const char 
*refname, int len)
while (remaining > 0) {
switch (*start++) {
case '\0':
-   return "";
+   free((char *)to_free);
+   return xstrdup("");
case '/':
remaining--;
break;
}
}
 
+   start = xstrdup(start);
+   free((char *)to_free);
return start;
 }
 
 static const char *rstrip_ref_components(const char *refname, int len)
 {
long remaining = len;
-   char *start = xstrdup(refname);
+   const char *start = xstrdup(refname);
+   const char *to_free = start;
 
if (len < 0) {
int i;
@@ -1327,9 +1332,10 @@ static const char *rstrip_ref_components(const char 
*refname, int len)
 
while (remaining-- > 0) {
char *p = strrchr(start, '/');
-   if (p == NULL)
-   return "";
-   else
+   if (p == NULL) {
+   free((char *)to_free);
+   return xstrdup("");
+   } else
p[0] = '\0';
}
return start;
@@ -1344,7 +1350,7 @@ static const char *show_ref(struct refname_atom *atom, 
const char *refname)
else if (atom->option == R_RSTRIP)
return rstrip_ref_components(refname, atom->rstrip);
else
-   return refname;
+   return xstrdup(refname);
 }
 
 static void fill_remote_ref_details(struct used_atom *atom, const char 
*refname,
@@ -1358,7 +1364,7 @@ static void fill_remote_ref_details(struct used_atom 
*atom, const char *refname,
   NULL, AHEAD_BEHIND_FULL) < 0) {
*s = xstrdup(msgs.gone);
} else if (!num_ours && !num_theirs)
-   *s = "";
+   *s = xstrdup("");
else if (!num_ours)
*s = xstrfmt(msgs.behind, num_theirs);
else if (!num_theirs)
@@ -1373,36 +1379,31 @@ static void fill_remote_ref_details(struct used_atom 
*atom, const char *refname,
}
} else if (atom->u.remote_ref.option == RR_TRACKSHORT) {
if (stat_tracking_info(branch, &num_ours, &num_theirs,
-  NULL, AHEAD_BEHIND_FULL) < 0)
+  

[PATCH v2 2/3] ls-remote: release memory instead of UNLEAK

2018-10-18 Thread Olga Telezhnaya
Use ref_array_clear() to release memory instead of UNLEAK macros.

Signed-off-by: Olga Telezhnaia 
---
 builtin/ls-remote.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index 1a25df7ee15b4..6a0cdec30d2d7 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -151,6 +151,6 @@ int cmd_ls_remote(int argc, const char **argv, const char 
*prefix)
}
 
UNLEAK(sorting);
-   UNLEAK(ref_array);
+   ref_array_clear(&ref_array);
return status;
 }

--
https://github.com/git/git/pull/538


Re: [PATCH] config.mak.dev: enable -Wunused-function

2018-10-18 Thread Jeff King
On Thu, Oct 18, 2018 at 03:05:22AM -0400, Jeff King wrote:

> diff --git a/config.mak.dev b/config.mak.dev
> index 92d268137f..bbeeff44fe 100644
> --- a/config.mak.dev
> +++ b/config.mak.dev
> @@ -34,7 +34,6 @@ ifeq ($(filter extra-all,$(DEVOPTS)),)
>  CFLAGS += -Wno-empty-body
>  CFLAGS += -Wno-missing-field-initializers
>  CFLAGS += -Wno-sign-compare
> -CFLAGS += -Wno-unused-function
>  CFLAGS += -Wno-unused-parameter

By the way, I wondered how close we were to being able to use
-Wunused-parameter. The answer is "not very".

However, I've been digging into the results, and it does find a number
of bugs. I'm 168 (rough) patches deep right now, and I have it compiling
cleanly. Most of those are just annotations, but I'll start posting
fixes as I organize and clean them up.

-Peff


[PATCH] config.mak.dev: enable -Wunused-function

2018-10-18 Thread Jeff King
We explicitly omitted -Wunused-function when we added
-Wextra, but there is no need: the current code compiles
cleanly with it. And it's worth having, since it can let you
know when there are cascading effects from a cleanup (e.g.,
deleting one function lets you delete its static helpers).

There are cases where we may need an unused function to
exist, but we can handle these easily:

  - macro-generated code like commit-slab; there we have the
MAYBE_UNUSED annotation to silence the compiler

  - conditional compilation, where we may or may not need a
static helper. These generally fall into one of two
categories:

  - the call should not be conditional, but rather the
function body itself should be (and may just be a
no-op on one side of the #if). That keeps the
conditional pollution out of the main code.

  - call-chains of static helpers should all be in the
same #if block, so they are all-or-nothing

And if there's some case that doesn't cover, we still
have MAYBE_UNUSED as a fallback.

Signed-off-by: Jeff King 
---
 config.mak.dev | 1 -
 1 file changed, 1 deletion(-)

diff --git a/config.mak.dev b/config.mak.dev
index 92d268137f..bbeeff44fe 100644
--- a/config.mak.dev
+++ b/config.mak.dev
@@ -34,7 +34,6 @@ ifeq ($(filter extra-all,$(DEVOPTS)),)
 CFLAGS += -Wno-empty-body
 CFLAGS += -Wno-missing-field-initializers
 CFLAGS += -Wno-sign-compare
-CFLAGS += -Wno-unused-function
 CFLAGS += -Wno-unused-parameter
 endif
 endif
-- 
2.19.1.823.g84a1dd67bf