[PATCH] mailmap: use Kaartic Sivaraam's new address

2017-11-02 Thread Kaartic Sivaraam
Map the old address to the new, hopefully more permanent one.

Signed-off-by: Kaartic Sivaraam 
---
 .mailmap | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.mailmap b/.mailmap
index ab85e0d16..2634baef2 100644
--- a/.mailmap
+++ b/.mailmap
@@ -113,6 +113,7 @@ Junio C Hamano  
 Junio C Hamano  
 Junio C Hamano  
 Junio C Hamano  
+Kaartic Sivaraam  
 Karl Wiberg  Karl  Hasselström
 Karl Wiberg  
 Karsten Blees  
-- 
2.15.0.461.gf957c703b.dirty
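For anyone unfamiliar with how .mailmap takes effect, here is a minimal demonstration; the names, addresses, and repository name are made up for illustration:

```shell
git init -q mailmap-demo && cd mailmap-demo
# Map commits recorded under the old address to the preferred identity:
printf 'New Name <new@example.org> <old@example.org>\n' > .mailmap
git -c user.name='New Name' -c user.email=old@example.org \
    commit -q --allow-empty -m 'test commit'
git log -1 --format='%an <%ae>'   # raw author, as recorded in the commit
git log -1 --format='%aN <%aE>'   # author after .mailmap rewriting
```

The %aN/%aE placeholders apply the mapping, while %an/%ae show what was recorded in the commit object.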



Re: [PATCHv2 6/7] builtin/describe.c: describe a blob

2017-11-02 Thread Junio C Hamano
Jacob Keller  writes:

> I think both questions are valuable, the first is "which commit last
> had this blob", the second  is "which commit first introduced this
> blob", neither of which can always give a definitive answer. It really
> depends on what question you're asking up front.

Given that "describe" is about giving an object _a_ name that is
plausibly useful to refer to it, it is not a good match for the
above query that wants to know where it came from, how long it
remains in the tree and where it ceases to be relevant.  In order to
support that use case, a totally different and possibly more useful
avenue would be to think how this can be hooked into "git log"
machinery.

A refresher for how "git log [--options] <pathspec>" works may be
beneficial.  We walk history and compare the tree of the commit we
are looking at with the tree of its parent commits.  If everything
within <pathspec> is the same, we mark the transition between the
parent and our commit TREESAME (other commits, i.e. the ones that
have meaningful change within <pathspec>, are !TREESAME).  Then the
output routine presents the set of commits that includes commits
that are !TREESAME, within the constraints of the --options given
(e.g. with --full-history, sides of a merge that is TREESAME may
still be shown to preserve connectedness of the resulting graph).

It is easy to imagine that we can restrict "git log" traversal with
a "--blobchange=<blob>" option instead of (or in addition to) the
limitation <pathspec> gives us.  Instead of treating a commit whose
diff against its parent commit has any filepair that is different
within <pathspec> as "!TREESAME", we can treat a commit whose diff
against its parent commit has the <blob> on either side of a
filepair as "!TREESAME" (in other words, we ignore a transition
that is not about introducing or forgetting the <blob> we are
looking for as an "interesting change").  That would give you a
commit history graph containing only (and all) the commits that
either add or remove the <blob>.

Hmm?
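Until something like the sketched --blobchange exists (the option name above is only a proposal), the question can be approximated by scanning raw diffs for the blob's object id; the demo repository below is invented for illustration:

```shell
# Demo repository with two versions of README:
git init -q blobsearch-demo && cd blobsearch-demo
printf 'v1\n' > README
git add README
git -c user.name=t -c user.email=t@example.org commit -qm 'add README'
blob=$(git rev-parse HEAD:README)      # the blob version we search for
printf 'v2\n' > README
git -c user.name=t -c user.email=t@example.org commit -qam 'rewrite README'

# List commits whose raw diff has that blob on either side, i.e. the
# commits that introduce or drop exactly this version of the file:
git log --all --raw --no-abbrev --format='commit %H' |
awk -v blob="$blob" '
    /^commit / { commit = $2 }    # remember the enclosing commit
    $0 ~ blob  { print commit }   # a raw diff line mentions the blob
' | sort -u
```

Note this also matches filepairs with the blob on both sides (e.g. pure renames), so it is only a rough approximation of the TREESAME-style definition sketched above.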


Re: [RFC PATCH v3 3/4] branch: introduce dont_fail parameter for branchname validation

2017-11-02 Thread Kaartic Sivaraam

On Friday 03 November 2017 12:12 AM, Stefan Beller wrote:

On Thu, Nov 2, 2017 at 1:39 AM, Kaartic Sivaraam
 wrote:

On Thursday 02 November 2017 12:24 PM, Kaartic Sivaraam wrote:


Signed-off-by: Kaartic Sivaraam 
Signed-off-by: Kaartic Sivaraam 



I just now saw this small glitch as a consequence of recently
changing my email address. I would prefer to keep the second one
but as the other patches have the first one it's better to keep
the first one for now.


If you prefer one of them, you may have an incentive to
add an entry to the .mailmap file; otherwise I'd kindly ask you
to. :) (cf. `git log --no-merges -- .mailmap`)



Sure, I'll do that. My intuition says this doesn't remove the duplicated 
sign-off line. Anyway, there's surely going to be a v4 that updates the 
connector string in [4/4] and includes another update. I'll be careful to 
address these issues in v4.


---
Kaartic



Re: [RFC PATCH v3 4/4] builtin/branch: give more useful error messages when renaming

2017-11-02 Thread Kaartic Sivaraam

On Thursday 02 November 2017 07:51 PM, Eric Sunshine wrote:


Nicely explained; easily understood.



Good to hear that.




Translators can correct me, but this smells like "sentence lego"[1],
which we'd like to avoid. Translators lack full context when presented
with bits and pieces of a sentence like this, thus the translation may
be of poor quality; it may even be entirely incorrect since they don't
necessarily know how you will be composing the various pieces.

You _might_ be able to resolve this by dropping "and" from:

 "foo is moo, and bar is boo"

to turn the error messages into independent clauses:

 "foo is moo; bar is boo"
but I'm no translator, and even that may fail the lego sniff test.

Though I can't be very sure that the one you suggested will pass the 
"lego sniff test", it's better than the "and" I used. Further, my 
instinct says it wouldn't qualify as sentence lego (it's just a ";").




A sure-fire way to avoid lego construction would be simply to emit
each message on its own line:

 fatal: branch 'tset' doesn't exist
 fatal: branch 'master' already exists

(though, you might consider that too ugly).



Though it might not look that ugly, I don't know how you could make 
'git' die() twice (or am I missing something)! Of course, we could use 
error() to report the errors and then die() with a message like 
"Branch rename failed", but I find that to be a little too verbose; 
further, using the connector ";" instead of "and" does seem to improve 
the chances that the above program fragment passes the "lego sniff test".
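For what it's worth, the error()-then-die() shape discussed here can be sketched in shell; the function name and exact message texts are illustrative, not git's actual implementation:

```shell
# Sketch: report every problem with an error: line, then fail once at
# the end -- the shape a single die() call cannot produce.
rename_branch() {
    old=$1 new=$2 failed=0
    git show-ref -q --verify "refs/heads/$old" ||
        { echo "error: branch '$old' doesn't exist" >&2; failed=1; }
    if git show-ref -q --verify "refs/heads/$new"; then
        echo "error: branch '$new' already exists" >&2; failed=1
    fi
    if test $failed = 1; then
        echo "fatal: branch rename failed" >&2
        return 128
    fi
    git branch -m "$old" "$new"
}
```

Running `rename_branch tset master` in a repository where only 'master' exists emits both error: lines and then a single fatal: line.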


---
Kaartic


Re: [PATCHv3 0/7] git describe blob

2017-11-02 Thread Stefan Beller
On Thu, Nov 2, 2017 at 6:46 PM, Junio C Hamano  wrote:
> Jacob Keller  writes:
>
>> I agree, "something" is better than "nothing", and we can work to
>> improve "something" in the future, especially after we get more real
>> use and see what people think. Only question would be how much do we
>> need to document the current behavior might be open for improvement?
>
> If
>
>  - it always digs to the root of the history no matter what; and/or

this is fixed.

>  - it almost always yields correct but suboptimal result


this is not fixed, for lack of knowing what the optimal result is.

>
> then that fact must be documented in BUGS section, possibly a brief
> description of why that limitation is there, with a hint to invite
> people to look into fixing it.
>
> We should mark it prominently as experimental and advertise it as
> such.  "It's too slow in a real project to be usable"

I found it quite fast after fixing the history walking, but still.

> and "Its output
> bases the answer on an irrelevant commit" are not something we want
> our users to experience, except for those with inclination to (or
> ability and time to) help improve the feature.

I think the current documentation states exactly this.

Thanks.
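For reference, the usage this series enables looks roughly like this (repository contents invented for the example; requires a git with the describe-blob support):

```shell
git init -q describe-demo && cd describe-demo
echo hello > README
git add README
git -c user.name=t -c user.email=t@example.org commit -qm 'add README'
git -c user.name=t -c user.email=t@example.org tag -a v1.0 -m v1.0
blob=$(git rev-parse HEAD:README)
# The blob is described via a commit that contains it, as <commit-ish>:<path>:
git describe "$blob"
```

Here the commit containing the blob is exactly tagged, so the description is based on that tag with the path appended.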


Re: [PATCHv2] config: document blame configuration

2017-11-02 Thread Stefan Beller
On Thu, Nov 2, 2017 at 6:26 PM, SZEDER Gábor  wrote:
> On Thu, Nov 2, 2017 at 7:10 PM, Stefan Beller  wrote:
>
>> +blame.blankBoundary::
>> +   Show blank SHA-1 for boundary commits in linkgit:git-blame[1].
>
> This is still SHA-1 instead of object id (or perhaps "commit object
> name" would be even better).
> Not sure whether oversight or intentional.

definitely oversight.
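For context, here is what the option being documented does; the demo repository is invented for illustration:

```shell
git init -q blame-demo && cd blame-demo
echo one > f
git add f
git -c user.name=t -c user.email=t@example.org commit -qm first
echo two >> f
git -c user.name=t -c user.email=t@example.org commit -qam second
# Limiting blame to the newest commit makes the first commit a
# "boundary" commit, normally marked with a ^-prefixed hash:
git blame HEAD~1.. -- f
# With blame.blankBoundary set, the boundary hash is blanked out instead:
git -c blame.blankBoundary=true blame HEAD~1.. -- f
```

This is the same effect as passing `-b` on the command line.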


Re: [PATCH 3/3] mingw: document the experimental standard handle redirection

2017-11-02 Thread Junio C Hamano
Johannes Schindelin  writes:

>> If I was correct in assuming that "2>&1" is just as foreign as
>> ">/dev/null", then we should be shunning "2>&1" just like we shun
>> ">/dev/null".  That was all I meant to say.
>
> Did you know that `2>&1` works in Powershell?

No.  And that makes me curious if ">&-" is also there to replace
"off" ;-)



Re: [PATCH 1/2] wrapper.c: consistently quote filenames in error messages

2017-11-02 Thread Junio C Hamano
Simon Ruderich  writes:

> On Thu, Nov 02, 2017 at 02:16:52PM +0900, Junio C Hamano wrote:
>> Junio C Hamano writes:
>>> This patch is incomplete without adjusting a handful of tests to
>>> expect the updated messages, no?
>>
>> I'll squash these in while queuing, but there might be more that I
>> didn't notice.
>
> Sorry, didn't think about the tests.

Heh, tests are not something you need to think about, if you always
run them after making changes.

> I've re-checked and I think those are the only affected tests.
> The test suite passes with your squashed changes.

OK.  Thanks.


Re: [PATCHv3 0/7] git describe blob

2017-11-02 Thread Junio C Hamano
Jacob Keller  writes:

> I agree, "something" is better than "nothing", and we can work to
> improve "something" in the future, especially after we get more real
> use and see what people think. Only question would be how much do we
> need to document the current behavior might be open for improvement?

If 

 - it always digs to the root of the history no matter what; and/or
 - it almost always yields correct but suboptimal result

then that fact must be documented in BUGS section, possibly a brief
description of why that limitation is there, with a hint to invite
people to look into fixing it.

We should mark it prominently as experimental and advertise it as
such.  "It's too slow in a real project to be usable" and "Its output
bases the answer on an irrelevant commit" are not something we want
our users to experience, except for those with inclination to (or
ability and time to) help improve the feature.


Re: [PATCHv2] config: document blame configuration

2017-11-02 Thread SZEDER Gábor
On Thu, Nov 2, 2017 at 7:10 PM, Stefan Beller  wrote:

> +blame.blankBoundary::
> +   Show blank SHA-1 for boundary commits in linkgit:git-blame[1].

This is still SHA-1 instead of object id (or perhaps "commit object
name" would be even better).
Not sure whether oversight or intentional.


[PATCH] setup.c: don't try to access '//HEAD' during repo discovery

2017-11-02 Thread SZEDER Gábor
Commit ce9b8aab5 (setup_git_directory_1(): avoid changing global state,
2017-03-13) changed how the git directory is discovered, and as a side
effect when the discovery reaches the root directory Git tries to
access paths like '//HEAD' and '//objects'.  This interacts badly with
Cygwin, which interprets it as a UNC file share, and so demand-loads a
bunch of extra DLLs and attempts to resolve/contact the server named
HEAD.  This obviously doesn't work too well, especially over a slow
network link.

Special case the root directory in is_git_directory() to prevent
accessing paths with leading double slashes.

Reported-by: Andrew Baumann 
Signed-off-by: SZEDER Gábor 
---

I'm not quite sure whether this is the right or complete fix.  I can't
test it on Cygwin, and I don't know whether Git for Windows is
affected with its drive prefixes in paths.  Anyway, at least strace
output on Linux looks good to me.

 setup.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/setup.c b/setup.c
index 03f51e056..0cfc5e676 100644
--- a/setup.c
+++ b/setup.c
@@ -311,6 +311,10 @@ int is_git_directory(const char *suspect)
int ret = 0;
size_t len;
 
+   /* To avoid accessing '//HEAD' & co when checking the root directory */
+   if (!strcmp(suspect, "/"))
+   suspect = "";
+
/* Check worktree-related signatures */
strbuf_addf(&path, "%s/HEAD", suspect);
if (validate_headref(path.buf))
-- 
2.15.0.67.gb67a46776
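The path construction the patch fixes can be illustrated outside of git:

```shell
# The path probed by is_git_directory() for a given 'suspect' directory:
probe() { printf '%s/HEAD\n' "$1"; }

probe /        # prints "//HEAD" -- the problematic UNC-looking path
suspect=/
test "$suspect" = / && suspect=   # the patch's special case for the root
probe "$suspect"                  # prints "/HEAD"
```

Mapping "/" to the empty string makes the "%s/HEAD" format produce a single leading slash at the root, so Cygwin never sees a "//server"-style path.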



Re: git tries to stat //HEAD when searching for a repo, leading to huge delays on Cygwin

2017-11-02 Thread Jeff King
On Thu, Nov 02, 2017 at 11:45:55PM +, Andrew Baumann wrote:

> I have a workaround for this, but someone on stack overflow [1]
> suggested reporting it upstream, so here you go:
> 
> I have a fancy shell prompt that executes "git rev-parse
> --is-inside-work-tree" to determine whether we're currently inside a
> working directory. This causes git to walk up the directory hierarchy
> looking for a containing git repo. For example, when invoked from my
> home directory, it stats the following paths, in order:
> 
> /home/me/.git
> /home/me/.git/HEAD
> /home/me/HEAD
> /home
> /home/.git
> /home/.git/HEAD
> /home/HEAD
> /
> /.git
> /.git/HEAD
> //HEAD
> 
> The last name (//HEAD) interacts badly with Cygwin, which interprets
> it as a UNC file share, and so demand-loads a bunch of extra DLLs and
> attempts to resolve/contact the server named HEAD. This obviously
> doesn't work too well, especially over a slow network link.
> 
> I've tested with the latest Cygwin git (2.15.0); this was also present
> in a prior version.

Interesting. I can reproduce on Linux (but of course "//HEAD" is cheap
to look at there). It bisects to ce9b8aab5d (setup_git_directory_1():
avoid changing global state, 2017-03-13). Before that, the end of the
strace for "git rev-parse --git-dir" looks like:

  chdir("..") = 0
  stat(".git", 0x7fffba398e00)= -1 ENOENT (No such file or directory)
  lstat(".git/HEAD", 0x7fffba398dd0)  = -1 ENOENT (No such file or directory)
  lstat("./HEAD", 0x7fffba398dd0) = -1 ENOENT (No such file or directory)
  write(2, "fatal: Not a git repository (or "..., 69) = 69

and after:

  stat("/.git", 0x7ffdb28b7eb0)   = -1 ENOENT (No such file or directory)
  lstat("/.git/HEAD", 0x7ffdb28b7e80) = -1 ENOENT (No such file or directory)
  lstat("//HEAD", 0x7ffdb28b7e80) = -1 ENOENT (No such file or directory)
  write(2, "fatal: Not a git repository (or "..., 69) = 69

Switching to using absolute paths rather than chdir-ing around is
intentional for that commit, but it looks like we just need to
special-case the construction of the root path.

Like this, perhaps:

diff --git a/setup.c b/setup.c
index 27a31de33f..5d0b6a88e3 100644
--- a/setup.c
+++ b/setup.c
@@ -283,7 +283,9 @@ int is_git_directory(const char *suspect)
size_t len;
 
/* Check worktree-related signatures */
-   strbuf_addf(&path, "%s/HEAD", suspect);
+   strbuf_addstr(&path, suspect);
+   strbuf_complete(&path, '/');
+   strbuf_addstr(&path, "HEAD");
if (validate_headref(path.buf))
goto done;
 

-Peff
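Peff's version relies on strbuf_complete(), which appends the character only when the buffer is non-empty and doesn't already end with it; a rough shell rendering of that one call (not of the real strbuf API):

```shell
# complete_slash: append '/' unless the string is empty or already ends
# with '/', mirroring strbuf_complete(&path, '/').
complete_slash() {
    case "$1" in
        ''|*/) printf '%s' "$1" ;;
        *)     printf '%s/' "$1" ;;
    esac
}

printf '%sHEAD\n' "$(complete_slash /)"         # prints "/HEAD", not "//HEAD"
printf '%sHEAD\n' "$(complete_slash /home/me)"  # prints "/home/me/HEAD"
```

Because "/" already ends with the separator, nothing is appended for the root, which avoids constructing "//HEAD".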


Re: [PATCHv3 0/7] git describe blob

2017-11-02 Thread Jacob Keller
On Thu, Nov 2, 2017 at 12:41 PM, Stefan Beller  wrote:
> Thanks for the discussion on v2[1].
>
> Interdiff is below, just fixing minor things.
>
> We'll keep the original algorithm for now, deferring an improvement on
> the algorithm front towards a future developer.
>

I agree, "something" is better than "nothing", and we can work to
improve "something" in the future, especially after we get more real
use and see what people think. Only question would be how much do we
need to document the current behavior might be open for improvement?

> Thanks,
> Stefan
>
> [1] https://public-inbox.org/git/20171031211852.13001-1-sbel...@google.com/
>
> Stefan Beller (7):
>   t6120: fix typo in test name
>   list-objects.c: factor out traverse_trees_and_blobs
>   revision.h: introduce blob/tree walking in order of the commits
>   builtin/describe.c: rename `oid` to avoid variable shadowing
>   builtin/describe.c: print debug statements earlier
>   builtin/describe.c: factor out describe_commit
>   builtin/describe.c: describe a blob
>
>  Documentation/git-describe.txt |  13 +++-
>  Documentation/rev-list-options.txt |   5 ++
>  builtin/describe.c | 125 
> -
>  list-objects.c |  52 +--
>  revision.c |   2 +
>  revision.h |   3 +-
>  t/t6100-rev-list-in-order.sh   |  47 ++
>  t/t6120-describe.sh|  17 -
>  8 files changed, 214 insertions(+), 50 deletions(-)
>  create mode 100755 t/t6100-rev-list-in-order.sh
> diff --git c/Documentation/git-describe.txt w/Documentation/git-describe.txt
> index 077c3c2193..79ec0be62a 100644
> --- c/Documentation/git-describe.txt
> +++ w/Documentation/git-describe.txt
> @@ -11,7 +11,7 @@ SYNOPSIS
>  [verse]
>  'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] [<commit-ish>...]
>  'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] --dirty[=<mark>]
> -'git describe' [] 
> +'git describe' [] 
>
>  DESCRIPTION
>  ---
> diff --git c/builtin/describe.c w/builtin/describe.c
> index 382cbae908..cf08bef344 100644
> --- c/builtin/describe.c
> +++ w/builtin/describe.c
> @@ -440,6 +440,7 @@ struct process_commit_data {
> struct object_id current_commit;
> struct object_id looking_for;
> struct strbuf *dst;
> +   struct rev_info *revs;
>  };
>
>  static void process_commit(struct commit *commit, void *data)
> @@ -456,6 +457,7 @@ static void process_object(struct object *obj, const char *path, void *data)
> reset_revision_walk();
> describe_commit(&pcd->current_commit, pcd->dst);
> strbuf_addf(pcd->dst, ":%s", path);
> +   pcd->revs->max_count = 0;
> }
>  }
>
> @@ -463,7 +465,7 @@ static void describe_blob(struct object_id oid, struct strbuf *dst)
>  {
> struct rev_info revs;
> struct argv_array args = ARGV_ARRAY_INIT;
> -   struct process_commit_data pcd = { null_oid, oid, dst};
> +   struct process_commit_data pcd = { null_oid, oid, dst, &revs };
>
> argv_array_pushl(&args, "internal: The first arg is not parsed",
> "--all", "--reflog", /* as many starting points as possible */
> @@ -497,14 +499,12 @@ static void describe(const char *arg, int last_one)
> die(_("Not a valid object name %s"), arg);
> cmit = lookup_commit_reference_gently(&oid, 1);
>
> -   if (cmit) {
> +   if (cmit)
> describe_commit(&oid, &sb);
> -   } else {
> -   if (lookup_blob(&oid))
> -   describe_blob(oid, &sb);
> -   else
> -   die(_("%s is neither a commit nor blob"), arg);
> -   }
> +   else if (lookup_blob(&oid))
> +   describe_blob(oid, &sb);
> +   else
> +   die(_("%s is neither a commit nor blob"), arg);
>
> puts(sb.buf);
>
> diff --git c/list-objects.c w/list-objects.c
> index 03438e5a8b..07a92f35fe 100644
> --- c/list-objects.c
> +++ w/list-objects.c
> @@ -184,13 +184,13 @@ static void add_pending_tree(struct rev_info *revs, struct tree *tree)
>  }
>
>  static void traverse_trees_and_blobs(struct rev_info *revs,
> -struct strbuf *base_path,
> +struct strbuf *base,
>  show_object_fn show_object,
>  void *data)
>  {
> int i;
>
> -   assert(base_path->len == 0);
> +   assert(base->len == 0);
>
> for (i = 0; i < revs->pending.nr; i++) {
> struct object_array_entry *pending = revs->pending.objects + i;
> @@ -208,12 +208,12 @@ static void traverse_trees_and_blobs(struct rev_info *revs,
> path = "";
> if (obj->type == OBJ_TREE) {
> process_tree(revs, (struct tree *)obj, show_object,
> -




git tries to stat //HEAD when searching for a repo, leading to huge delays on Cygwin

2017-11-02 Thread Andrew Baumann
Hi,

I have a workaround for this, but someone on stack overflow [1] suggested 
reporting it upstream, so here you go:

I have a fancy shell prompt that executes "git rev-parse --is-inside-work-tree" 
to determine whether we're currently inside a working directory. This causes 
git to walk up the directory hierarchy looking for a containing git repo. For 
example, when invoked from my home directory, it stats the following paths, in 
order:

/home/me/.git
/home/me/.git/HEAD
/home/me/HEAD
/home
/home/.git
/home/.git/HEAD
/home/HEAD
/
/.git
/.git/HEAD
//HEAD

The last name (//HEAD) interacts badly with Cygwin, which interprets it as a 
UNC file share, and so demand-loads a bunch of extra DLLs and attempts to 
resolve/contact the server named HEAD. This obviously doesn't work too well, 
especially over a slow network link.

I've tested with the latest Cygwin git (2.15.0); this was also present in a 
prior version.

Cheers,
Andrew

[1] https://stackoverflow.com/questions/47084672


Re: [PATCH 00/14] WIP Partial clone part 3: clone, fetch, fetch-pack, upload-pack, and tests

2017-11-02 Thread Jonathan Tan
On Thu,  2 Nov 2017 20:31:15 +
Jeff Hostetler  wrote:

> From: Jeff Hostetler 
> 
> This is part 3 of 3 for partial clone.
> It assumes that part 1 [1] and part 2 [2] are in place.
> 
> Part 3 is concerned with the commands: clone, fetch, upload-pack, fetch-pack,
> remote-curl, index-pack, and the pack-protocol.
> 
> Jonathan and I independently started on this task.  This is a first
> pass at merging those efforts.  So there are several places that need
> refactoring and cleanup.  In particular, the test cases should be
> squashed and new tests added.

Thanks. What are your future plans with this patch set? In particular, the
tests don't pass at HEAD^.

I took a quick glance to see if there were any issues that I could
immediately spot, but couldn't find any. I thought of fetch_if_missing,
but it seems that it is indeed used in this patch set (as expected).

I'll look at it more thoroughly, and feel free to let me know if there is
anything in particular you would like comments on.


Re: [PATCH v3 7/7] remote-mediawiki: show progress while fetching namespaces

2017-11-02 Thread Antoine Beaupré
On 2017-11-02 22:31:02, Thomas Adam wrote:
> On Thu, Nov 02, 2017 at 06:26:43PM -0400, Antoine Beaupré wrote:
>> On 2017-11-02 22:18:07, Thomas Adam wrote:
>> > Hi,
>> >
>> > On Thu, Nov 02, 2017 at 05:25:18PM -0400, Antoine Beaupré wrote:
>> >> +print {*STDERR} "$#{$mw_pages} found in namespace $local_namespace ($namespace_id)\n";
>> >
>> > How is this any different to using warn()?  I appreciate you're using a
>> > globbed filehandle, but it seems superfluous to me.
>> 
>> It's what is used everywhere in the module, I'm just tagging along.
>> 
>> This was discussed before: there's an issue about cleaning up the
>> messaging in that module, that can be fixed separately.
>
> Understood.  That should happen sooner rather than later.

Actually, is there a standard way to do this in git with Perl
extensions? I know about "option verbosity N" but how should I translate
this into Perl? Carp? Warn? Log::Any? Log4perl?

Recommendations welcome...

A.

-- 
Si Dieu existe, j'espère qu'Il a une excuse valable
- Daniel Pennac


Re: [PATCH v3 4/7] remote-mediawiki: skip virtual namespaces

2017-11-02 Thread Antoine Beaupré
On 2017-11-02 18:43:00, Eric Sunshine wrote:
> On Thu, Nov 2, 2017 at 5:25 PM, Antoine Beaupré  wrote:
>> Virtual namespaces do not correspond to pages in the database and are
>> automatically generated by MediaWiki. It makes little sense,
>> therefore, to fetch pages from those namespaces and the MW API doesn't
>> support listing those pages.
>>
>> According to the documentation, those virtual namespaces are currently
>> "Special" (-1) and "Media" (-2) but we treat all negative namespaces
>> as "virtual" as a future-proofing mechanism.
>>
>> Reviewed-by: Eric Sunshine 
>
> It probably would be best to omit this Reviewed-by: since it was not
> provided explicitly. More importantly, I'm neither a user of nor
> familiar with MediaWiki or its API, so a Reviewed-by: from me has
> little or no value. Probably best would be for someone such as
> Matthieu to give his Reviewed-by: if he so desires.

Alright, I was wondering what the process was for those. I didn't want
to leave your contributions by the wayside...

I'll wait a little while longer for more feedback and then resend
without those. unless...

@junio: my github repo has the branch without those Reviewed-by tags,
iirc. so if you can to merge from there, that will keep me from sending
yet another pile of patches for such a trivial change...

a.

-- 
Semantics is the gravity of abstraction.


Re: [PATCH v3 7/7] remote-mediawiki: show progress while fetching namespaces

2017-11-02 Thread Eric Sunshine
On Thu, Nov 2, 2017 at 5:25 PM, Antoine Beaupré  wrote:
> Without this, the fetch process seems hanged while we fetch page
> listings across the namespaces. Obviously, it should be possible to
> silence this with -q, but that's an issue already present everywhere
> in the code and should be fixed separately:
>
> https://github.com/Git-Mediawiki/Git-Mediawiki/issues/30
>
> Reviewed-by: Eric Sunshine 

Ditto: It would be best to drop this Reviewed-by: since it has no
value with my name attached to it and was not provided explicitly.

> Signed-off-by: Antoine Beaupré 


Re: [PATCH v3 6/7] remote-mediawiki: process namespaces in order

2017-11-02 Thread Eric Sunshine
On Thu, Nov 2, 2017 at 5:25 PM, Antoine Beaupré  wrote:
> Ideally, we'd process them in numeric order since that is more
> logical, but we can't do that yet since this is where we find the
> numeric identifiers in the first place. Lexicographic order is a good
> compromise.
>
> Reviewed-by: Eric Sunshine 

Ditto: It would be best to drop this Reviewed-by: since it has no
value with my name attached to it and was not provided explicitly.

> Signed-off-by: Antoine Beaupré 


Re: [PATCH v3 5/7] remote-mediawiki: support fetching from (Main) namespace

2017-11-02 Thread Eric Sunshine
On Thu, Nov 2, 2017 at 5:25 PM, Antoine Beaupré  wrote:
> When we specify a list of namespaces to fetch from, by default the MW
> API will not fetch from the default namespace, referred to as "(Main)"
> in the documentation:
>
> https://www.mediawiki.org/wiki/Manual:Namespace#Built-in_namespaces
>
> I haven't found a way to address that "(Main)" namespace when getting
> the namespace ids: indeed, when listing namespaces, there is no
> "canonical" field for the main namespace, although there is a "*"
> field that is set to "" (empty). So in theory, we could specify the
> empty namespace to get the main namespace, but that would make
> specifying namespaces harder for the user: we would need to teach
> users about the "empty" default namespace. It would also make the code
> more complicated: we'd need to parse quotes in the configuration.
>
> So we simply override the query here and allow the user to specify
> "(Main)" since that is the publicly documented name.
>
> Reviewed-by: Eric Sunshine 

As with the previous patch, it would be best to drop this Reviewed-by:
since it has no value with my name attached to it and was not provided
explicitly.

> Signed-off-by: Antoine Beaupré 


Re: [PATCH v3 4/7] remote-mediawiki: skip virtual namespaces

2017-11-02 Thread Eric Sunshine
On Thu, Nov 2, 2017 at 5:25 PM, Antoine Beaupré  wrote:
> Virtual namespaces do not correspond to pages in the database and are
> automatically generated by MediaWiki. It makes little sense,
> therefore, to fetch pages from those namespaces and the MW API doesn't
> support listing those pages.
>
> According to the documentation, those virtual namespaces are currently
> "Special" (-1) and "Media" (-2) but we treat all negative namespaces
> as "virtual" as a future-proofing mechanism.
>
> Reviewed-by: Eric Sunshine 

It probably would be best to omit this Reviewed-by: since it was not
provided explicitly. More importantly, I'm neither a user of nor
familiar with MediaWiki or its API, so a Reviewed-by: from me has
little or no value. Probably best would be for someone such as
Matthieu to give his Reviewed-by: if he so desires.

> Signed-off-by: Antoine Beaupré 


Re: [PATCH v3 7/7] remote-mediawiki: show progress while fetching namespaces

2017-11-02 Thread Thomas Adam
On Thu, Nov 02, 2017 at 06:26:43PM -0400, Antoine Beaupré wrote:
> On 2017-11-02 22:18:07, Thomas Adam wrote:
> > Hi,
> >
> > On Thu, Nov 02, 2017 at 05:25:18PM -0400, Antoine Beaupré wrote:
> >> +print {*STDERR} "$#{$mw_pages} found in namespace $local_namespace ($namespace_id)\n";
> >
> > How is this any different to using warn()?  I appreciate you're using a
> > globbed filehandle, but it seems superfluous to me.
> 
> It's what is used everywhere in the module, I'm just tagging along.
> 
> This was discussed before: there's an issue about cleaning up the
> messaging in that module, that can be fixed separately.

Understood.  That should happen sooner rather than later.

-- Thomas Adam


Re: [PATCH v3 7/7] remote-mediawiki: show progress while fetching namespaces

2017-11-02 Thread Antoine Beaupré
On 2017-11-02 22:18:07, Thomas Adam wrote:
> Hi,
>
> On Thu, Nov 02, 2017 at 05:25:18PM -0400, Antoine Beaupré wrote:
>> +print {*STDERR} "$#{$mw_pages} found in namespace $local_namespace ($namespace_id)\n";
>
> How is this any different to using warn()?  I appreciate you're using a
> globbed filehandle, but it seems superfluous to me.

It's what is used everywhere in the module, I'm just tagging along.

This was discussed before: there's an issue about cleaning up the
messaging in that module, that can be fixed separately.

A.
-- 
N'aimer qu'un seul est barbarie, car c'est au détriment de tous les
autres. Fût-ce l'amour de Dieu.
- Nietzsche, "Par delà le bien et le mal"


Re: [PATCH 1/9] extension.partialclone: introduce partial clone extension

2017-11-02 Thread Jonathan Tan
On Thu,  2 Nov 2017 20:20:44 +
Jeff Hostetler  wrote:

> From: Jeff Hostetler 
> 
> Introduce the ability to have missing objects in a repo.  This
> functionality is guarded by new repository extension options:
> `extensions.partialcloneremote` and
> `extensions.partialclonefilter`.

With this, it is unclear what happens if extensions.partialcloneremote
is not set but extensions.partialclonefilter is set. For something as
significant as a repository extension (which Git uses to determine if it
will even attempt to interact with a repo), I would prefer
just extensions.partialclone (or extensions.partialcloneremote, though I
prefer the former), which determines the remote (the important part,
which controls the dynamic object fetching), and have another option
"core.partialclonefilter" which is only useful if
"extensions.partialclone" is set.

I also don't think extensions.partialclonefilter (or
core.partialclonefilter, if we use my suggestion) needs to be introduced
so early in the patch set when it will only be used once we start
fetching/cloning.

> +void partial_clone_utils_register(
> + const struct list_objects_filter_options *filter_options,
> + const char *remote,
> + const char *cmd_name)
> +{

This function is useful once we have fetch/clone, but probably not
before that. Since the fetch/clone patches are several patches ahead,
could this be moved there?

> @@ -420,6 +420,19 @@ static int check_repo_format(const char *var, const char *value, void *vdata)
>   ;
>   else if (!strcmp(ext, "preciousobjects"))
>   data->precious_objects = git_config_bool(var, value);
> +
> + else if (!strcmp(ext, KEY_PARTIALCLONEREMOTE))
> + if (!value)
> + return config_error_nonbool(var);
> + else
> + data->partial_clone_remote = xstrdup(value);
> +
> + else if (!strcmp(ext, KEY_PARTIALCLONEFILTER))
> + if (!value)
> + return config_error_nonbool(var);
> + else
> + data->partial_clone_filter = xstrdup(value);
> +
>   else
>   string_list_append(&data->unknown_extensions, ext);
>   } else if (strcmp(var, "core.bare") == 0) {

With a complicated block, probably better to use braces in these
clauses.


Re: [PATCH v3 7/7] remote-mediawiki: show progress while fetching namespaces

2017-11-02 Thread Thomas Adam
Hi,

On Thu, Nov 02, 2017 at 05:25:18PM -0400, Antoine Beaupré wrote:
> +print {*STDERR} "$#{$mw_pages} found in namespace $local_namespace ($namespace_id)\n";

How is this any different to using warn()?  I appreciate you're using a
globbed filehandle, but it seems superfluous to me.

Kindly,
Thomas


Re: Regression[2.14.3->2.15]: Interactive rebase fails if submodule is modified

2017-11-02 Thread Orgad Shaneh
On Thu, Nov 2, 2017 at 8:34 PM, Stefan Beller  wrote:
> On Thu, Nov 2, 2017 at 1:30 AM, Orgad Shaneh  wrote:
>> I can't reproduce this with a minimal example, but it happens in my project.
>>
>> What I tried to do for reproducing is:
>> rm -rf super sub
>> mkdir sub; cd sub; git init
>> git commit --allow-empty -m 'Initial commit'
>> mkdir ../super; cd ../super
>> git init
>> git submodule add ../sub
>> touch foo; git add foo sub
>> git commit -m 'Initial commit'
>> touch a; git add a; git commit -m 'a'
>> touch b; git add b; git commit -m 'b'
>> cd sub; git commit --allow-empty -m 'New commit'; cd ..
>> git rebase -i HEAD^^
>>
>> Then drop a.
>>
>> In my project I get:
>> error: cannot rebase: You have unstaged changes.
>>
>> This works fine with 2.14.3.
>
>   git log --oneline v2.14.3..v2.15.0 -- submodule.c
> doesn't give any promising hints (i.e. I don't think one of a
> submodule related series introduced this either by chance or
> on purpose)
>
> "rebase -i" was rewritten into C in 570676e011, though
> that series was extensively tested by DScho, so I wouldn't
> want to point fingers here quickly.
>
> Would you be willing to bisect this behavior?

Bisected to ff6f1f564c48def1f8e1852826bab58af5044b06:
submodule-config: lazy-load a repository's .gitmodules file

- Orgad



[PATCH v3 1/7] remote-mediawiki: add namespace support

2017-11-02 Thread Antoine Beaupré
From: Kevin 

This introduces a new remote.origin.namespaces argument that is a
space-separated list of namespaces. The list of pages to extract is then
taken from all the specified namespaces.

Reviewed-by: Antoine Beaupré 
Signed-off-by: Antoine Beaupré 
---
 contrib/mw-to-git/git-remote-mediawiki.perl | 25 +
 1 file changed, 25 insertions(+)

diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl 
b/contrib/mw-to-git/git-remote-mediawiki.perl
index e7f857c1a..5ffb57595 100755
--- a/contrib/mw-to-git/git-remote-mediawiki.perl
+++ b/contrib/mw-to-git/git-remote-mediawiki.perl
@@ -63,6 +63,10 @@ chomp(@tracked_pages);
 my @tracked_categories = split(/[ \n]/, run_git("config --get-all 
remote.${remotename}.categories"));
 chomp(@tracked_categories);
 
+# Just like @tracked_categories, but for MediaWiki namespaces.
+my @tracked_namespaces = split(/[ \n]/, run_git("config --get-all 
remote.${remotename}.namespaces"));
+chomp(@tracked_namespaces);
+
 # Import media files on pull
 my $import_media = run_git("config --get --bool 
remote.${remotename}.mediaimport");
 chomp($import_media);
@@ -256,6 +260,23 @@ sub get_mw_tracked_categories {
return;
 }
 
+sub get_mw_tracked_namespaces {
+my $pages = shift;
+foreach my $local_namespace (@tracked_namespaces) {
+my $mw_pages = $mediawiki->list( {
+action => 'query',
+list => 'allpages',
+apnamespace => get_mw_namespace_id($local_namespace),
+aplimit => 'max' } )
+|| die $mediawiki->{error}->{code} . ': '
+. $mediawiki->{error}->{details} . "\n";
+foreach my $page (@{$mw_pages}) {
+$pages->{$page->{title}} = $page;
+}
+}
+return;
+}
+
 sub get_mw_all_pages {
my $pages = shift;
# No user-provided list, get the list of pages from the API.
@@ -319,6 +340,10 @@ sub get_mw_pages {
$user_defined = 1;
get_mw_tracked_categories(\%pages);
}
+   if (@tracked_namespaces) {
+   $user_defined = 1;
+   get_mw_tracked_namespaces(\%pages);
+   }
if (!$user_defined) {
get_mw_all_pages(\%pages);
}
-- 
2.11.0



[PATCH v3 4/7] remote-mediawiki: skip virtual namespaces

2017-11-02 Thread Antoine Beaupré
Virtual namespaces do not correspond to pages in the database and are
automatically generated by MediaWiki. It therefore makes little sense
to fetch pages from those namespaces, and the MW API doesn't support
listing them anyway.

According to the documentation, those virtual namespaces are currently
"Special" (-1) and "Media" (-2) but we treat all negative namespaces
as "virtual" as a future-proofing mechanism.

Reviewed-by: Eric Sunshine 
Signed-off-by: Antoine Beaupré 
---
 contrib/mw-to-git/git-remote-mediawiki.perl | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl 
b/contrib/mw-to-git/git-remote-mediawiki.perl
index e7616e1a2..21fb2e302 100755
--- a/contrib/mw-to-git/git-remote-mediawiki.perl
+++ b/contrib/mw-to-git/git-remote-mediawiki.perl
@@ -264,10 +264,13 @@ sub get_mw_tracked_categories {
 sub get_mw_tracked_namespaces {
 my $pages = shift;
 foreach my $local_namespace (@tracked_namespaces) {
+my $namespace_id = get_mw_namespace_id($local_namespace);
+# virtual namespaces don't support allpages
+next if !defined($namespace_id) || $namespace_id < 0;
 my $mw_pages = $mediawiki->list( {
 action => 'query',
 list => 'allpages',
-apnamespace => get_mw_namespace_id($local_namespace),
+apnamespace => $namespace_id,
 aplimit => 'max' } )
 || die $mediawiki->{error}->{code} . ': '
 . $mediawiki->{error}->{details} . "\n";
-- 
2.11.0



[PATCH v3 0/7] remote-mediawiki: namespace support

2017-11-02 Thread Antoine Beaupré
This should be the final roll of patches for namespace support. I
included the undef check even though that problem occurs elsewhere in
the code. I also removed the needless "my" move.

Hopefully that should be the last in the queue!



[PATCH v3 2/7] remote-mediawiki: allow fetching namespaces with spaces

2017-11-02 Thread Antoine Beaupré
From: Ingo Ruhnke 

We still want to use spaces as separators in the config, but we should
allow the user to specify namespaces whose names contain spaces, so
underscores in the config are mapped to spaces.

Reviewed-by: Antoine Beaupré 
Signed-off-by: Antoine Beaupré 
---
 contrib/mw-to-git/git-remote-mediawiki.perl | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl 
b/contrib/mw-to-git/git-remote-mediawiki.perl
index 5ffb57595..a1d783789 100755
--- a/contrib/mw-to-git/git-remote-mediawiki.perl
+++ b/contrib/mw-to-git/git-remote-mediawiki.perl
@@ -65,6 +65,7 @@ chomp(@tracked_categories);
 
 # Just like @tracked_categories, but for MediaWiki namespaces.
 my @tracked_namespaces = split(/[ \n]/, run_git("config --get-all 
remote.${remotename}.namespaces"));
+for (@tracked_namespaces) { s/_/ /g; }
 chomp(@tracked_namespaces);
 
 # Import media files on pull
-- 
2.11.0
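
The underscore mapping in this patch, together with the config splitting added
in patch 1/7, can be sketched outside Perl. A hedged Python re-expression (the
function name is mine, not part of the series):

```python
def parse_tracked_namespaces(config_value):
    """Split a space/newline-separated config value into namespace names.

    Underscores map to spaces so that a namespace such as "User talk"
    can be written "User_talk" in the space-separated config list.
    """
    return [name.replace("_", " ") for name in config_value.split()]
```

So `remote.origin.namespaces` set to "Talk User_talk" would track the "Talk"
and "User talk" namespaces.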



[PATCH v3 3/7] remote-mediawiki: show known namespace choices on failure

2017-11-02 Thread Antoine Beaupré
If we fail to find a requested namespace, we should tell the user
which ones we know about, since those were already fetched. This
allows users to fetch all namespaces: specify a dummy namespace, let
it fail, then copy the reported list of namespaces into the config.

Eventually, we should have a flag that allows fetching all namespaces
automatically.

Reviewed-by: Antoine Beaupré 
Signed-off-by: Antoine Beaupré 
---
 contrib/mw-to-git/git-remote-mediawiki.perl | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl 
b/contrib/mw-to-git/git-remote-mediawiki.perl
index a1d783789..e7616e1a2 100755
--- a/contrib/mw-to-git/git-remote-mediawiki.perl
+++ b/contrib/mw-to-git/git-remote-mediawiki.perl
@@ -1334,7 +1334,9 @@ sub get_mw_namespace_id {
my $id;
 
if (!defined $ns) {
-   print {*STDERR} "No such namespace ${name} on MediaWiki.\n";
+   my @namespaces = sort keys %namespace_id;
+   for (@namespaces) { s/ /_/g; }
+   print {*STDERR} "No such namespace ${name} on MediaWiki, known 
namespaces: @namespaces\n";
$ns = {is_namespace => 0};
$namespace_id{$name} = $ns;
}
-- 
2.11.0



[PATCH v3 6/7] remote-mediawiki: process namespaces in order

2017-11-02 Thread Antoine Beaupré
Ideally, we'd process them in numeric order since that is more
logical, but we can't do that yet since this is where we find the
numeric identifiers in the first place. Lexicographic order is a good
compromise.

Reviewed-by: Eric Sunshine 
Signed-off-by: Antoine Beaupré 
---
 contrib/mw-to-git/git-remote-mediawiki.perl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl 
b/contrib/mw-to-git/git-remote-mediawiki.perl
index 898541a9f..f53e638cf 100755
--- a/contrib/mw-to-git/git-remote-mediawiki.perl
+++ b/contrib/mw-to-git/git-remote-mediawiki.perl
@@ -263,7 +263,7 @@ sub get_mw_tracked_categories {
 
 sub get_mw_tracked_namespaces {
 my $pages = shift;
-foreach my $local_namespace (@tracked_namespaces) {
+foreach my $local_namespace (sort @tracked_namespaces) {
 my $namespace_id;
 if ($local_namespace eq "(Main)") {
 $namespace_id = 0;
-- 
2.11.0



[PATCH v3 7/7] remote-mediawiki: show progress while fetching namespaces

2017-11-02 Thread Antoine Beaupré
Without this, the fetch process appears hung while we fetch page
listings across the namespaces. Obviously, it should be possible to
silence this with -q, but that's an issue already present everywhere
in the code and should be fixed separately:

https://github.com/Git-Mediawiki/Git-Mediawiki/issues/30

Reviewed-by: Eric Sunshine 
Signed-off-by: Antoine Beaupré 
---
 contrib/mw-to-git/git-remote-mediawiki.perl | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl 
b/contrib/mw-to-git/git-remote-mediawiki.perl
index f53e638cf..dc43a950b 100755
--- a/contrib/mw-to-git/git-remote-mediawiki.perl
+++ b/contrib/mw-to-git/git-remote-mediawiki.perl
@@ -279,6 +279,7 @@ sub get_mw_tracked_namespaces {
 aplimit => 'max' } )
 || die $mediawiki->{error}->{code} . ': '
 . $mediawiki->{error}->{details} . "\n";
+print {*STDERR} "$#{$mw_pages} found in namespace $local_namespace 
($namespace_id)\n";
 foreach my $page (@{$mw_pages}) {
 $pages->{$page->{title}} = $page;
 }
-- 
2.11.0



[PATCH v3 5/7] remote-mediawiki: support fetching from (Main) namespace

2017-11-02 Thread Antoine Beaupré
When we specify a list of namespaces to fetch from, by default the MW
API will not fetch from the default namespace, referred to as "(Main)"
in the documentation:

https://www.mediawiki.org/wiki/Manual:Namespace#Built-in_namespaces

I haven't found a way to address that "(Main)" namespace when getting
the namespace ids: indeed, when listing namespaces, there is no
"canonical" field for the main namespace, although there is a "*"
field that is set to "" (empty). So in theory, we could specify the
empty namespace to get the main namespace, but that would make
specifying namespaces harder for the user: we would need to teach
users about the "empty" default namespace. It would also make the code
more complicated: we'd need to parse quotes in the configuration.

So we simply override the query here and allow the user to specify
"(Main)" since that is the publicly documented name.

Reviewed-by: Eric Sunshine 
Signed-off-by: Antoine Beaupré 
---
 contrib/mw-to-git/git-remote-mediawiki.perl | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl 
b/contrib/mw-to-git/git-remote-mediawiki.perl
index 21fb2e302..898541a9f 100755
--- a/contrib/mw-to-git/git-remote-mediawiki.perl
+++ b/contrib/mw-to-git/git-remote-mediawiki.perl
@@ -264,7 +264,12 @@ sub get_mw_tracked_categories {
 sub get_mw_tracked_namespaces {
 my $pages = shift;
 foreach my $local_namespace (@tracked_namespaces) {
-my $namespace_id = get_mw_namespace_id($local_namespace);
+my $namespace_id;
+if ($local_namespace eq "(Main)") {
+$namespace_id = 0;
+} else {
+$namespace_id = get_mw_namespace_id($local_namespace);
+}
 # virtual namespaces don't support allpages
 next if !defined($namespace_id) || $namespace_id < 0;
 my $mw_pages = $mediawiki->list( {
-- 
2.11.0
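
Patches 4/7 through 6/7 combine into a small decision procedure for each
configured namespace: hard-code id 0 for "(Main)", skip names the wiki doesn't
know, skip negative (virtual) ids, and process names in lexicographic order. A
rough Python sketch (function names and the lookup dict are illustrative, not
code from the series):

```python
def resolve_namespace_id(name, lookup):
    # "(Main)" has no canonical name in the MediaWiki API; hard-code id 0.
    if name == "(Main)":
        return 0
    return lookup.get(name)  # None when the wiki doesn't know the name

def namespaces_to_fetch(names, lookup):
    out = []
    for name in sorted(names):  # lexicographic order, as in patch 6/7
        ns_id = resolve_namespace_id(name, lookup)
        if ns_id is None or ns_id < 0:  # virtual namespaces lack allpages
            continue
        out.append((name, ns_id))
    return out
```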



Re: [PATCH 4/7] remote-mediawiki: skip virtual namespaces

2017-11-02 Thread Antoine Beaupré
On 2017-11-02 10:24:40, Junio C Hamano wrote:
> Antoine Beaupré  writes:
>
>> It might still worth fixing this, but I'm not sure what the process is
>> here - in the latest "what's cooking" Junio said this patchset would be
>> merged in "next". Should I reroll the patchset to fix this or not?
>
> The process is for you (the contributor of the topic) to yell at me,
> "don't merge it yet, there still are updates to come".

YELL! "don't merge it yet, there still are updates to come". :)

> That message _may_ come to late, in which case we may have to go
> incremental, but I usually try to leave at least a few days between
> the time I mark a topic as "will merge" and the time I actually do
> the merge, for this exact reason.

Awesome, thanks for the update.

i'll roll a v4 with the last tweaks, hopefully that will be the last.

a.

-- 
How inappropriate to call this planet 'Earth' when it is quite clearly
'Ocean'.
- Arthur C. Clarke


Re: No log --no-decorate completion?

2017-11-02 Thread Stefan Beller
On Thu, Nov 2, 2017 at 1:25 PM, Max Rothman  wrote:
> No problem! Let me know if I've done something wrong with this patch,
> I'm new to git's contributor process.

Thanks for coming up with a patch!
Yeah, the contribution process has some initial road blocks;
I'll point them out below.

> completion: add missing completions for log, diff, show

I would think this is a good commit subject, as it summarizes the
changes made at a high level.

> * Add bash completion for the missing --no-* options on git log
> * Add bash completion for --textconv and --indent-heuristic families to
>   git diff and all commands that use diff's options
> * Add bash completion for --no-abbrev-commit, --expand-tabs, and
>   --no-expand-tabs to git show

This describes what happens in this patch, but not why; the "why" helps
future readers of the commit message more than the "what".
I'm also just guessing but maybe:

A user might have configured a repo to show diffs in a certain way,
add the --no-* options to the autocompletion to override it easier
from the command line.
New in this patch is autocompletion for --textconv as well as abbreviation
options as that needs to be flipped all the time in .

I wonder if we really want to expose indent-heuristic;
since it is no longer experimental, I guess it makes sense to do so.
c.f. https://public-inbox.org/git/20171029151228.607834-1-...@dwim.me/

At the end of a commit message, the Git project requires a sign off.
(See section (5) in Documentation/SubmittingPatches;
tl;dr: add Signed-off-by: NAME  if you can agree to
https://developercertificate.org/)

> ---
>  contrib/completion/git-completion.bash | 18 ++
>  1 file changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/contrib/completion/git-completion.bash
> b/contrib/completion/git-completion.bash
> index 0e16f017a4144..252a6c8c0c80c 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -1412,6 +1412,8 @@ __git_diff_common_options="--stat --numstat
> --shortstat --summary
> --dirstat-by-file= --cumulative
> --diff-algorithm=
> --submodule --submodule=
> +   --indent-heuristic --no-indent-heuristic
> +   --textconv --no-textconv
>  "
>
>  _git_diff ()
> @@ -1752,6 +1754,10 @@ _git_log ()
> __gitcomp "$__git_diff_submodule_formats" ""
> "${cur##--submodule=}"
> return
> ;;
> +   --no-walk=*)
> +   __gitcomp "sorted unsorted" "" "${cur##--no-walk=}"
> +   return
> +   ;;
> --*)
> __gitcomp "
> $__git_log_common_options
> @@ -1759,16 +1765,19 @@ _git_log ()
> $__git_log_gitk_options
> --root --topo-order --date-order --reverse
> --follow --full-diff
> -   --abbrev-commit --abbrev=
> +   --abbrev-commit --no-abbrev-commit --abbrev=
> --relative-date --date=
> --pretty= --format= --oneline
> --show-signature
> --cherry-mark
> --cherry-pick
> --graph
> -   --decorate --decorate=
> +   --decorate --decorate= --no-decorate
> --walk-reflogs
> +   --no-walk --no-walk= --do-walk
> --parents --children
> +   --expand-tabs --expand-tabs= --no-expand-tabs
> +   --patch
> $merge
> $__git_diff_common_options
> --pickaxe-all --pickaxe-regex
> @@ -2816,8 +2825,9 @@ _git_show ()
> return
> ;;
> --*)
> -   __gitcomp "--pretty= --format= --abbrev-commit --oneline
> -   --show-signature
> +   __gitcomp "--pretty= --format= --abbrev-commit
> --no-abbrev-commit
> +   --oneline --show-signature --patch
> +   --expand-tabs --expand-tabs= --no-expand-tabs
> $__git_diff_common_options
> "
> return
>

The patch looks good, but doesn't apply because the email contains
white spaces instead of tabs. Maybe try https://submitgit.herokuapp.com/
(or fix/change your email client to send a patch; the gmail web interface
doesn't work. I personally got 'git send-email' up and running;
The Documentation/SubmittingPatches has a section on email clients, too)

Thanks,
Stefan


Re: [PATCH 5/7] remote-mediawiki: support fetching from (Main) namespace

2017-11-02 Thread Antoine Beaupré
On 2017-11-01 15:56:51, Eric Sunshine wrote:
>> diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl 
>> b/contrib/mw-to-git/git-remote-mediawiki.perl
>> @@ -264,9 +264,14 @@ sub get_mw_tracked_categories {
>>  sub get_mw_tracked_namespaces {
>>  my $pages = shift;
>>  foreach my $local_namespace (@tracked_namespaces) {
>> -my $namespace_id = get_mw_namespace_id($local_namespace);
>> +my ($namespace_id, $mw_pages);
>> +if ($local_namespace eq "(Main)") {
>> +$namespace_id = 0;
>> +} else {
>> +$namespace_id = get_mw_namespace_id($local_namespace);
>> +}
>
> I meant to ask this in the previous round, but with the earlier patch
> mixing several distinct changes into one, I plumb forgot: Would it
> make sense to move this "(Main)" special case into
> get_mw_namespace_id() itself? After all, that function is all about
> determining an ID associated with a name, and "(Main)" is a name.

Right. At first sight, I agree: get_mw_namespace_id should do the right
thing. But then, I look at the code of that function, and it strikes me
as ... well... really hard to actually do this the right way.

In fact, I suspect that passing "" to get_mw_namespace_id would actually
do the right thing. The problem, as I explained before, is that passing
that in the configuration is pretty hard: it would needlessly complicate
the configuration setting, so I think it's a fair shortcut to do it
here.

>>  next if $namespace_id < 0; # virtual namespaces don't support 
>> allpages
>> -my $mw_pages = $mediawiki->list( {
>> +$mw_pages = $mediawiki->list( {
>
> Why did the "my" of $mw_pages get moved up to the top of the foreach
> loop? I can't seem to see any reason for it. Is this an unrelated
> change accidentally included in this patch?

Just a habit of declaring variables at the beginning of a block. Maybe
it's because I'm old? :)

I'll reroll a last patchset with those fixes.

A.

-- 
One of the strongest motives that leads men to art and science is
escape from everyday life with its painful crudity and hopeless
dreariness. Such men make this cosmos and its construction the pivot
of their emotional life, in order to find the peace and security which
they cannot find in the narrow whirlpool of personal experience.
   - Albert Einstein


[PATCH 01/14] upload-pack: add object filtering for partial clone

2017-11-02 Thread Jeff Hostetler
From: Jeff Hostetler 

Teach upload-pack to negotiate object filtering over the protocol and
to send filter parameters to pack-objects.  This is intended for partial
clone and fetch.

The idea to make upload-pack configurable using uploadpack.allowFilter
comes from Jonathan Tan's work in [1].

[1] 
https://public-inbox.org/git/f211093280b422c32cc1b7034130072f35c5ed51.1506714999.git.jonathanta...@google.com/

Signed-off-by: Jeff Hostetler 
---
 Documentation/config.txt  |  4 
 Documentation/technical/pack-protocol.txt |  8 
 Documentation/technical/protocol-capabilities.txt |  8 
 upload-pack.c | 20 +++-
 4 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 1ac0ae6..e528210 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -3268,6 +3268,10 @@ uploadpack.packObjectsHook::
was run. I.e., `upload-pack` will feed input intended for
`pack-objects` to the hook, and expects a completed packfile on
stdout.
+
+uploadpack.allowFilter::
+   If this option is set, `upload-pack` will advertise partial
+   clone and partial fetch object filtering.
 +
 Note that this configuration variable is ignored if it is seen in the
 repository-level config (this is a safety measure against fetching from
diff --git a/Documentation/technical/pack-protocol.txt 
b/Documentation/technical/pack-protocol.txt
index ed1eae8..a43a113 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -212,6 +212,7 @@ out of what the server said it could do with the first 
'want' line.
   upload-request=  want-list
   *shallow-line
   *1depth-request
+  [filter-request]
   flush-pkt
 
   want-list =  first-want
@@ -227,6 +228,8 @@ out of what the server said it could do with the first 
'want' line.
   additional-want   =  PKT-LINE("want" SP obj-id)
 
   depth =  1*DIGIT
+
+  filter-request=  PKT-LINE("filter" SP filter-spec)
 
 
 Clients MUST send all the obj-ids it wants from the reference
@@ -249,6 +252,11 @@ complete those commits. Commits whose parents are not 
received as a
 result are defined as shallow and marked as such in the server. This
 information is sent back to the client in the next step.
 
+The client can optionally request that pack-objects omit various
+objects from the packfile using one of several filtering techniques.
+These are intended for use with partial clone and partial fetch
+operations.  See `rev-list` for possible "filter-spec" values.
+
 Once all the 'want's and 'shallow's (and optional 'deepen') are
 transferred, clients MUST send a flush-pkt, to tell the server side
 that it is done sending the list.
diff --git a/Documentation/technical/protocol-capabilities.txt 
b/Documentation/technical/protocol-capabilities.txt
index 26dcc6f..332d209 100644
--- a/Documentation/technical/protocol-capabilities.txt
+++ b/Documentation/technical/protocol-capabilities.txt
@@ -309,3 +309,11 @@ to accept a signed push certificate, and asks the <nonce> to be
included in the push certificate.  A send-pack client MUST NOT
 included in the push certificate.  A send-pack client MUST NOT
 send a push-cert packet unless the receive-pack server advertises
 this capability.
+
+filter
+--
+
+If the upload-pack server advertises the 'filter' capability,
+fetch-pack may send "filter" commands to request a partial clone
+or partial fetch and request that the server omit various objects
+from the packfile.
diff --git a/upload-pack.c b/upload-pack.c
index e25f725..64a57a4 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -10,6 +10,8 @@
 #include "diff.h"
 #include "revision.h"
 #include "list-objects.h"
+#include "list-objects-filter.h"
+#include "list-objects-filter-options.h"
 #include "run-command.h"
 #include "connect.h"
 #include "sigchain.h"
@@ -64,6 +66,10 @@ static int advertise_refs;
 static int stateless_rpc;
 static const char *pack_objects_hook;
 
+static int filter_capability_requested;
+static int filter_advertise;
+static struct list_objects_filter_options filter_options;
+
 static void reset_timeout(void)
 {
alarm(timeout);
@@ -131,6 +137,7 @@ static void create_pack_file(void)
argv_array_push(_objects.args, "--delta-base-offset");
if (use_include_tag)
argv_array_push(_objects.args, "--include-tag");
+   arg_format_list_objects_filter(&pack_objects.args, &filter_options);
 
pack_objects.in = -1;
pack_objects.out = -1;
@@ -794,6 +801,12 @@ static void receive_needs(void)
deepen_rev_list = 1;
continue;
}
+   if (skip_prefix(line, "filter ", &arg)) {
+   if (!filter_capability_requested)
+   die("git 

[PATCH 07/14] fetch-pack: test support excluding large blobs

2017-11-02 Thread Jeff Hostetler
From: Jonathan Tan 

Created tests to verify fetch-pack and upload-pack support
for excluding large blobs using --filter=blobs:limit=
parameter.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 t/t5500-fetch-pack.sh | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index 80a1a32..fdb98a8 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -755,4 +755,31 @@ test_expect_success 'fetching deepen' '
)
 '
 
+test_expect_success 'filtering by size' '
+   rm -rf server client &&
+   test_create_repo server &&
+   test_commit -C server one &&
+   test_config -C server uploadpack.allowfilter 1 &&
+
+   test_create_repo client &&
+   git -C client fetch-pack --filter=blobs:limit=0 ../server HEAD &&
+
+   # Ensure that object is not inadvertently fetched
+   test_must_fail git -C client cat-file -e $(git hash-object server/one.t)
+'
+
+test_expect_success 'filtering by size has no effect if support for it is not 
advertised' '
+   rm -rf server client &&
+   test_create_repo server &&
+   test_commit -C server one &&
+
+   test_create_repo client &&
+   git -C client fetch-pack --filter=blobs:limit=0 ../server HEAD 2> err &&
+
+   # Ensure that object is fetched
+   git -C client cat-file -e $(git hash-object server/one.t) &&
+
+   test_i18ngrep "filtering not recognized by server" err
+'
+
 test_done
-- 
2.9.3



[PATCH 03/14] fetch: refactor calculation of remote list

2017-11-02 Thread Jeff Hostetler
From: Jonathan Tan 

Separate out the calculation of remotes to be fetched from and the
actual fetching. This will allow us to include an additional step before
the actual fetching in a subsequent commit.

Signed-off-by: Jonathan Tan 
---
 builtin/fetch.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 225c734..1b1f039 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1322,7 +1322,7 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
 {
int i;
struct string_list list = STRING_LIST_INIT_DUP;
-   struct remote *remote;
+   struct remote *remote = NULL;
int result = 0;
struct argv_array argv_gc_auto = ARGV_ARRAY_INIT;
 
@@ -1367,17 +1367,14 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
else if (argc > 1)
die(_("fetch --all does not make sense with refspecs"));
(void) for_each_remote(get_one_remote_for_fetch, &list);
-   result = fetch_multiple(&list);
} else if (argc == 0) {
/* No arguments -- use default remote */
remote = remote_get(NULL);
-   result = fetch_one(remote, argc, argv);
} else if (multiple) {
/* All arguments are assumed to be remotes or groups */
for (i = 0; i < argc; i++)
if (!add_remote_or_group(argv[i], &list))
die(_("No such remote or remote group: %s"), 
argv[i]);
-   result = fetch_multiple(&list);
} else {
/* Single remote or group */
(void) add_remote_or_group(argv[0], &list);
@@ -1385,14 +1382,19 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
/* More than one remote */
if (argc > 1)
die(_("Fetching a group and specifying refspecs 
does not make sense"));
-   result = fetch_multiple(&list);
} else {
/* Zero or one remotes */
remote = remote_get(argv[0]);
-   result = fetch_one(remote, argc-1, argv+1);
+   argc--;
+   argv++;
}
}
 
+   if (remote)
+   result = fetch_one(remote, argc, argv);
+   else
+   result = fetch_multiple(&list);
+
if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
struct argv_array options = ARGV_ARRAY_INIT;
 
-- 
2.9.3
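
The shape of the refactor — decide between a single remote and a remote list
up front, then fetch exactly once at the end — can be sketched in Python. This
is a simplified sketch that ignores the --all/--multiple flags and error
handling; all names are illustrative:

```python
def cmd_fetch(args, get_remote, get_group):
    """Decide what to fetch first; perform the fetch once, at the end,
    instead of calling the fetch helpers from inside every branch."""
    remote = None
    group = []
    if not args:
        remote = get_remote(None)                 # default remote
    elif len(args) == 1 and not get_group(args[0]):
        remote = get_remote(args[0])              # single remote
    else:
        for name in args:                         # groups and/or many remotes
            group.extend(get_group(name) or [name])
    # single exit point, as in the refactored cmd_fetch()
    if remote is not None:
        return ("one", remote)
    return ("multiple", group)
```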



[PATCH 13/14] fetch-pack: restore save_commit_buffer after use

2017-11-02 Thread Jeff Hostetler
From: Jonathan Tan 

In fetch-pack, the global variable save_commit_buffer is set to 0, but
not restored to its original value after use.

In particular, if show_log() (in log-tree.c) is invoked after
fetch_pack() in the same process, show_log() will return before printing
out the commit message (because the invocation to
get_cached_commit_buffer() returns NULL, because the commit buffer was
not saved). I discovered this when attempting to run "git log -S" in a
partial clone, triggering the case where revision walking lazily loads
missing objects.

Therefore, restore save_commit_buffer to its original value after use.

An alternative to solve the problem I had is to replace
get_cached_commit_buffer() with get_commit_buffer(). That invocation was
introduced in commit a97934d ("use get_cached_commit_buffer where
appropriate", 2014-06-13) to replace "commit->buffer" introduced in
commit 3131b71 ("Add "--show-all" revision walker flag for debugging",
2008-02-13). In the latter commit, the commit author seems to be
deciding between not showing an unparsed commit at all and showing an
unparsed commit without the message (which is what the commit does), and
did not mention parsing the unparsed commit, so I prefer to preserve the
existing behavior.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 fetch-pack.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/fetch-pack.c b/fetch-pack.c
index 895e8f9..121f03e 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -717,6 +717,7 @@ static int everything_local(struct fetch_pack_args *args,
 {
struct ref *ref;
int retval;
+   int old_save_commit_buffer = save_commit_buffer;
timestamp_t cutoff = 0;
 
save_commit_buffer = 0;
@@ -784,6 +785,9 @@ static int everything_local(struct fetch_pack_args *args,
print_verbose(args, _("already have %s (%s)"), 
oid_to_hex(remote),
  ref->name);
}
+
+   save_commit_buffer = old_save_commit_buffer;
+
return retval;
 }
 
-- 
2.9.3
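
The save/restore pattern the patch adds can be shown with a Python context
manager (illustrative only; C has no try/finally, so the patch does this by
hand, saving old_save_commit_buffer and restoring it before returning):

```python
import contextlib

save_commit_buffer = 1  # stands in for the C global

@contextlib.contextmanager
def overridden(value):
    """Temporarily override the global, restoring the caller's value
    on every exit path."""
    global save_commit_buffer
    old = save_commit_buffer
    save_commit_buffer = value
    try:
        yield
    finally:
        save_commit_buffer = old

def everything_local():
    with overridden(0):
        # walk refs without caching commit buffers
        assert save_commit_buffer == 0
    # restored here, so a later show_log() still finds cached buffers
```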



[PATCH 06/14] pack-objects: test support for blob filtering

2017-11-02 Thread Jeff Hostetler
From: Jonathan Tan 

As part of an effort to improve Git support for very large repositories
in which clients typically have only a subset of all version-controlled
blobs, test pack-objects support for --filter=blobs:limit=, packing only
blobs not exceeding that size unless the blob corresponds to a file
whose name starts with ".git". upload-pack will eventually be taught to
use this new parameter if needed to exclude certain blobs during a fetch
or clone, potentially drastically reducing network consumption when
serving these very large repositories.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 t/t5300-pack-object.sh  | 45 +
 t/test-lib-functions.sh | 12 
 2 files changed, 57 insertions(+)

diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
index 9c68b99..0739a07 100755
--- a/t/t5300-pack-object.sh
+++ b/t/t5300-pack-object.sh
@@ -457,6 +457,51 @@ test_expect_success !PTHREADS,C_LOCALE_OUTPUT 
'pack-objects --threads=N or pack.
grep -F "no threads support, ignoring pack.threads" err
 '
 
+lcut () {
+   perl -e '$/ = undef; $_ = <>; s/^.{'$1'}//s; print $_'
+}
+
+test_expect_success 'filtering by size works with multiple excluded' '
+   rm -rf server &&
+   git init server &&
+   printf a > server/a &&
+   printf b > server/b &&
+   printf c-very-long-file > server/c &&
+   printf d-very-long-file > server/d &&
+   git -C server add a b c d &&
+   git -C server commit -m x &&
+
+   git -C server rev-parse HEAD >objects &&
+   git -C server pack-objects --revs --stdout --filter=blobs:limit=10 <objects >my.pack &&
+
+   # Ensure that only the small blobs are in the packfile
+   git index-pack my.pack &&
+   git verify-pack -v my.idx >objectlist &&
+   grep $(git hash-object server/a) objectlist &&
+   grep $(git hash-object server/b) objectlist &&
+   ! grep $(git hash-object server/c) objectlist &&
+   ! grep $(git hash-object server/d) objectlist
+'
+
+test_expect_success 'filtering by size never excludes special files' '
+   rm -rf server &&
+   git init server &&
+   printf a-very-long-file > server/a &&
+   printf a-very-long-file > server/.git-a &&
+   printf b-very-long-file > server/b &&
+   git -C server add a .git-a b &&
+   git -C server commit -m x &&
+
+   git -C server rev-parse HEAD >objects &&
+   git -C server pack-objects --revs --stdout --filter=blobs:limit=10 <objects >my.pack &&
+
+   # Ensure that the .git-a blob is in the packfile, despite also
+   # appearing as a non-.git file
+   git index-pack my.pack &&
+   git verify-pack -v my.idx >objectlist &&
+   grep $(git hash-object server/a) objectlist
+'
+
 #
 # WARNING!
 #
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 1701fe2..07b79c7 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1020,3 +1020,15 @@ nongit () {
"$@"
)
 }
+
+# Converts big-endian pairs of hexadecimal digits into bytes. For example,
+# "printf 61620d0a | hex_pack" results in "ab\r\n".
+hex_pack () {
+   perl -e '$/ = undef; $input = <>; print pack("H*", $input)'
+}
+
+# Converts bytes into big-endian pairs of hexadecimal digits. For example,
+# "printf 'ab\r\n' | hex_unpack" results in "61620d0a".
+hex_unpack () {
+   perl -e '$/ = undef; $input = <>; print unpack("H2" x length($input), 
$input)'
+}
-- 
2.9.3
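
For reference, the Perl helpers added here (and the lcut helper earlier in the
patch) have direct Python equivalents, shown below together with a sketch of
the blobs:limit keep/drop rule the tests exercise. The predicate is my reading
of the tests, not code from the series:

```python
import posixpath

def hex_pack(s):
    """Hex digit pairs -> bytes, like the Perl hex_pack helper."""
    return bytes.fromhex(s)

def hex_unpack(b):
    """Bytes -> hex digit pairs, like the Perl hex_unpack helper."""
    return b.hex()

def lcut(data, n):
    """Drop the first n bytes, like the lcut helper in t5300."""
    return data[n:]

def keep_blob(path, size, limit):
    """blobs:limit=<n> rule as the tests exercise it: keep blobs not
    exceeding the limit, but never drop ".git"-prefixed file names."""
    return size <= limit or posixpath.basename(path).startswith(".git")
```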



[PATCH 09/14] t5500: add fetch-pack tests for partial clone

2017-11-02 Thread Jeff Hostetler
From: Jonathan Tan 

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 t/t5500-fetch-pack.sh | 36 
 1 file changed, 36 insertions(+)

diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index fdb98a8..7c8339f 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -782,4 +782,40 @@ test_expect_success 'filtering by size has no effect if support for it is not ad
test_i18ngrep "filtering not recognized by server" err
 '
 
+fetch_blob_max_bytes () {
+ SERVER="$1"
+ URL="$2"
+
+   rm -rf "$SERVER" client &&
+   test_create_repo "$SERVER" &&
+   test_commit -C "$SERVER" one &&
+   test_config -C "$SERVER" uploadpack.allowfilter 1 &&
+
+   git clone "$URL" client &&
+   test_config -C client extensions.partialcloneremote origin &&
+
+   test_commit -C "$SERVER" two &&
+
+   git -C client fetch --filter=blobs:limit=0 origin HEAD:somewhere &&
+
+   # Ensure that commit is fetched, but blob is not
+   test_config -C client extensions.partialcloneremote "arbitrary string" &&
+   git -C client cat-file -e $(git -C "$SERVER" rev-parse two) &&
+   test_must_fail git -C client cat-file -e $(git hash-object "$SERVER/two.t")
+}
+
+test_expect_success 'fetch with filtering' '
+fetch_blob_max_bytes server server
+'
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'fetch with filtering and HTTP' '
+fetch_blob_max_bytes "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/smart/server"
+'
+
+stop_httpd
+
+
 test_done
-- 
2.9.3



[PATCH 12/14] unpack-trees: batch fetching of missing blobs

2017-11-02 Thread Jeff Hostetler
From: Jonathan Tan 

When running checkout, first prefetch all blobs that are to be updated
but are missing. This means that only one pack is downloaded during such
operations, instead of one per missing blob.

This operates only at the blob level - if a repository has missing
trees, they are still fetched one at a time.

This does not use the delayed checkout mechanism introduced in commit
2841e8f ("convert: add "status=delayed" to filter process protocol",
2017-06-30) due to significant conceptual differences - in particular,
for partial clones, we already know what needs to be fetched based on
the contents of the local repo alone, whereas for status=delayed, it is
the filter process that tells us what needs to be checked in the end.
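The batching idea described above can be sketched outside of C. In this toy shell sketch, `fetch_batch` is a hypothetical stand-in for fetch_objects() (it is not a real git command): misses are collected first, then fetched in a single call rather than one call per missing blob.

```shell
# Hypothetical stand-in for fetch_objects(): one network round per call.
fetch_calls=0
fetch_batch () {
	fetch_calls=$((fetch_calls + 1))
}

# Pretend b and d are the blobs missing from the local object store.
batch=
for id in a b c d; do
	case "$id" in
	b|d) batch="$batch $id" ;;
	esac
done

# One call for the whole batch, instead of one per missing blob.
fetch_batch $batch
```

The point is the call count: however many blobs turn out to be missing, only one pack download happens.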

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 fetch-object.c   | 27 +++
 fetch-object.h   |  5 +
 t/t5601-clone.sh | 52 
 unpack-trees.c   | 22 ++
 4 files changed, 102 insertions(+), 4 deletions(-)

diff --git a/fetch-object.c b/fetch-object.c
index 369b61c..21b4dfa 100644
--- a/fetch-object.c
+++ b/fetch-object.c
@@ -3,12 +3,12 @@
 #include "pkt-line.h"
 #include "strbuf.h"
 #include "transport.h"
+#include "fetch-object.h"
 
-void fetch_object(const char *remote_name, const unsigned char *sha1)
+static void fetch_refs(const char *remote_name, struct ref *ref)
 {
struct remote *remote;
struct transport *transport;
-   struct ref *ref;
int original_fetch_if_missing = fetch_if_missing;
 
fetch_if_missing = 0;
@@ -17,10 +17,29 @@ void fetch_object(const char *remote_name, const unsigned char *sha1)
die(_("Remote with no URL"));
transport = transport_get(remote, remote->url[0]);
 
-   ref = alloc_ref(sha1_to_hex(sha1));
-   hashcpy(ref->old_oid.hash, sha1);
transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
transport_set_option(transport, TRANS_OPT_NO_HAVES, "1");
transport_fetch_refs(transport, ref);
fetch_if_missing = original_fetch_if_missing;
 }
+
+void fetch_object(const char *remote_name, const unsigned char *sha1)
+{
+   struct ref *ref = alloc_ref(sha1_to_hex(sha1));
+   hashcpy(ref->old_oid.hash, sha1);
+   fetch_refs(remote_name, ref);
+}
+
+void fetch_objects(const char *remote_name, const struct oid_array *to_fetch)
+{
+   struct ref *ref = NULL;
+   int i;
+
+   for (i = 0; i < to_fetch->nr; i++) {
+   struct ref *new_ref = alloc_ref(oid_to_hex(&to_fetch->oid[i]));
+   oidcpy(&new_ref->old_oid, &to_fetch->oid[i]);
+   new_ref->next = ref;
+   ref = new_ref;
+   }
+   fetch_refs(remote_name, ref);
+}
diff --git a/fetch-object.h b/fetch-object.h
index f371300..4b269d0 100644
--- a/fetch-object.h
+++ b/fetch-object.h
@@ -1,6 +1,11 @@
 #ifndef FETCH_OBJECT_H
 #define FETCH_OBJECT_H
 
+#include "sha1-array.h"
+
 extern void fetch_object(const char *remote_name, const unsigned char *sha1);
 
+extern void fetch_objects(const char *remote_name,
+ const struct oid_array *to_fetch);
+
 #endif
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 567161e..3211f86 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -611,6 +611,58 @@ test_expect_success 'partial clone: warn if server does not support object filte
test_i18ngrep "filtering not recognized by server" err
 '
 
+test_expect_success 'batch missing blob request during checkout' '
+   rm -rf server client &&
+
+   test_create_repo server &&
+   echo a >server/a &&
+   echo b >server/b &&
+   git -C server add a b &&
+
+   git -C server commit -m x &&
+   echo aa >server/a &&
+   echo bb >server/b &&
+   git -C server add a b &&
+   git -C server commit -m x &&
+
+   test_config -C server uploadpack.allowfilter 1 &&
+   test_config -C server uploadpack.allowanysha1inwant 1 &&
+
+   git clone --filter=blobs:limit=0 "file://$(pwd)/server" client &&
+
+   # Ensure that there is only one negotiation by checking that there is
+   # only one "done" line sent. ("done" marks the end of negotiation.)
+   GIT_TRACE_PACKET="$(pwd)/trace" git -C client checkout HEAD^ &&
+   grep "git> done" trace >done_lines &&
+   test_line_count = 1 done_lines
+'
+
+test_expect_success 'batch missing blob request does not inadvertently try to 
fetch gitlinks' '
+   rm -rf server client &&
+
+   test_create_repo repo_for_submodule &&
+   test_commit -C repo_for_submodule x &&
+
+   test_create_repo server &&
+   echo a >server/a &&
+   echo b >server/b &&
+   git -C server add a b &&
+   git -C server commit -m x &&
+
+   echo aa >server/a &&
+   echo bb >server/b &&
+   # Also add a gitlink pointing to an arbitrary repository

[PATCH 14/14] index-pack: silently assume missing objects are promisor

2017-11-02 Thread Jeff Hostetler
From: Jeff Hostetler 

Teach index-pack to not complain about missing objects
when the --promisor flag is given.  The assumption is that
index-pack is currently building the idx and promisor data
and the is_promisor_object() query would fail anyway.

Signed-off-by: Jeff Hostetler 
---
 builtin/index-pack.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 31cd5ba..51693dc 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -82,6 +82,7 @@ static int verbose;
 static int show_resolving_progress;
 static int show_stat;
 static int check_self_contained_and_connected;
+static int arg_promisor_given;
 
 static struct progress *progress;
 
@@ -223,10 +224,11 @@ static unsigned check_object(struct object *obj)
unsigned long size;
	int type = sha1_object_info(obj->oid.hash, &size);
 
-   if (type <= 0) {
+   if (type <= 0 && arg_promisor_given) {
/*
-* TODO Use the promisor code to conditionally
-* try to fetch this object -or- assume it is ok.
+* Assume this missing object is promised.  We can't
+* confirm it because we are indexing the packfile
+* that omitted it.
 */
obj->flags |= FLAG_CHECKED;
return 0;
@@ -1717,8 +1719,10 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
keep_msg = arg + 7;
} else if (!strcmp(arg, "--promisor")) {
promisor_msg = "";
+   arg_promisor_given = 1;
} else if (starts_with(arg, "--promisor=")) {
promisor_msg = arg + strlen("--promisor=");
+   arg_promisor_given = 1;
} else if (starts_with(arg, "--threads=")) {
char *end;
	nr_threads = strtoul(arg+10, &end, 0);
-- 
2.9.3



[PATCH 11/14] t5500: more tests for partial clone and fetch

2017-11-02 Thread Jeff Hostetler
From: Jonathan Tan 

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 t/t5500-fetch-pack.sh | 60 +++
 1 file changed, 56 insertions(+), 4 deletions(-)

diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index 7c8339f..86cf653 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -782,7 +782,7 @@ test_expect_success 'filtering by size has no effect if support for it is not ad
test_i18ngrep "filtering not recognized by server" err
 '
 
-fetch_blob_max_bytes () {
+setup_blob_max_bytes () {
  SERVER="$1"
  URL="$2"
 
@@ -794,7 +794,11 @@ fetch_blob_max_bytes () {
git clone "$URL" client &&
test_config -C client extensions.partialcloneremote origin &&
 
-   test_commit -C "$SERVER" two &&
+   test_commit -C "$SERVER" two
+}
+
+do_blob_max_bytes() {
+   SERVER="$1" &&
 
git -C client fetch --filter=blobs:limit=0 origin HEAD:somewhere &&
 
@@ -805,14 +809,62 @@ fetch_blob_max_bytes () {
 }
 
 test_expect_success 'fetch with filtering' '
-fetch_blob_max_bytes server server
+   setup_blob_max_bytes server server &&
+   do_blob_max_bytes server
+'
+
+test_expect_success 'fetch respects configured filtering' '
+   setup_blob_max_bytes server server &&
+
+   test_config -C client extensions.partialclonefilter blobs:limit=0 &&
+
+   git -C client fetch origin HEAD:somewhere &&
+
+   # Ensure that commit is fetched, but blob is not
+   test_config -C client extensions.partialcloneremote "arbitrary string" &&
+   git -C client cat-file -e $(git -C server rev-parse two) &&
+   test_must_fail git -C client cat-file -e $(git hash-object server/two.t)
+'
+
+test_expect_success 'pull respects configured filtering' '
+   setup_blob_max_bytes server server &&
+
+   # Hide two.t from tip so that client does not load it upon the
+   # automatic checkout that pull performs
+   git -C server rm two.t &&
+   test_commit -C server three &&
+
+   test_config -C server uploadpack.allowanysha1inwant 1 &&
+   test_config -C client extensions.partialclonefilter blobs:limit=0 &&
+
+   git -C client pull origin &&
+
+   # Ensure that commit is fetched, but blob is not
+   test_config -C client extensions.partialcloneremote "arbitrary string" &&
+   git -C client cat-file -e $(git -C server rev-parse two) &&
+   test_must_fail git -C client cat-file -e $(git hash-object server/two.t)
+'
+
+test_expect_success 'clone configures filtering' '
+   rm -rf server client &&
+   test_create_repo server &&
+   test_commit -C server one &&
+   test_commit -C server two &&
+   test_config -C server uploadpack.allowanysha1inwant 1 &&
+
+   git clone --filter=blobs:limit=12345 server client &&
+
+   # Ensure that we can, for example, checkout HEAD^
+   rm -rf client/.git/objects/* &&
+   git -C client checkout HEAD^
 '
 
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 
 test_expect_success 'fetch with filtering and HTTP' '
-fetch_blob_max_bytes "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/smart/server"
+   setup_blob_max_bytes "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/smart/server" &&
+   do_blob_max_bytes "$HTTPD_DOCUMENT_ROOT_PATH/server"
 '
 
 stop_httpd
-- 
2.9.3



[PATCH 00/14] WIP Partial clone part 3: clone, fetch, fetch-pack, upload-pack, and tests

2017-11-02 Thread Jeff Hostetler
From: Jeff Hostetler 

This is part 3 of 3 for partial clone.
It assumes that part 1 [1] and part 2 [2] are in place.

Part 3 is concerned with the commands: clone, fetch, upload-pack, fetch-pack,
remote-curl, index-pack, and the pack-protocol.

Jonathan and I independently started on this task.  This is a first
pass at merging those efforts.  So there are several places that need
refactoring and cleanup.  In particular, the test cases should be
squashed and new tests added.

[1] https://public-inbox.org/git/20171102124445.fbffd43521cd35f6a71e1...@google.com/T/
[2] TODO


Jeff Hostetler (5):
  upload-pack: add object filtering for partial clone
  clone, fetch-pack, index-pack, transport: partial clone
  fetch: add object filtering for partial fetch
  remote-curl: add object filtering for partial clone
  index-pack: silently assume missing objects are promisor

Jonathan Tan (9):
  fetch: refactor calculation of remote list
  pack-objects: test support for blob filtering
  fetch-pack: test support excluding large blobs
  fetch: add from_promisor and exclude-promisor-objects parameters
  t5500: add fetch-pack tests for partial clone
  t5601: test for partial clone
  t5500: more tests for partial clone and fetch
  unpack-trees: batch fetching of missing blobs
  fetch-pack: restore save_commit_buffer after use

 Documentation/config.txt  |   4 +
 Documentation/gitremote-helpers.txt   |   4 +
 Documentation/technical/pack-protocol.txt |   8 ++
 Documentation/technical/protocol-capabilities.txt |   8 ++
 builtin/clone.c   |  24 -
 builtin/fetch-pack.c  |   4 +
 builtin/fetch.c   |  83 ++--
 builtin/index-pack.c  |  14 +++
 connected.c   |   3 +
 fetch-object.c|  27 -
 fetch-object.h|   5 +
 fetch-pack.c  |  17 
 fetch-pack.h  |   2 +
 remote-curl.c |  10 +-
 t/t5300-pack-object.sh|  45 +
 t/t5500-fetch-pack.sh | 115 ++
 t/t5601-clone.sh  | 101 +++
 t/test-lib-functions.sh   |  12 +++
 transport-helper.c|   5 +
 transport.c   |   4 +
 transport.h   |   5 +
 unpack-trees.c|  22 +
 upload-pack.c |  20 +++-
 23 files changed, 526 insertions(+), 16 deletions(-)

-- 
2.9.3



[PATCH 02/14] clone, fetch-pack, index-pack, transport: partial clone

2017-11-02 Thread Jeff Hostetler
From: Jeff Hostetler 

Signed-off-by: Jeff Hostetler 
---
 builtin/clone.c  |  9 +
 builtin/fetch-pack.c |  4 
 builtin/index-pack.c | 10 ++
 fetch-pack.c | 13 +
 fetch-pack.h |  2 ++
 transport-helper.c   |  5 +
 transport.c  |  4 
 transport.h  |  5 +
 8 files changed, 52 insertions(+)

diff --git a/builtin/clone.c b/builtin/clone.c
index dbddd98..fceb9e7 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -26,6 +26,7 @@
 #include "run-command.h"
 #include "connected.h"
 #include "packfile.h"
+#include "list-objects-filter-options.h"
 
 /*
  * Overall FIXMEs:
@@ -60,6 +61,7 @@ static struct string_list option_optional_reference = STRING_LIST_INIT_NODUP;
 static int option_dissociate;
 static int max_jobs = -1;
 static struct string_list option_recurse_submodules = STRING_LIST_INIT_NODUP;
+static struct list_objects_filter_options filter_options;
 
 static int recurse_submodules_cb(const struct option *opt,
 const char *arg, int unset)
@@ -135,6 +137,7 @@ static struct option builtin_clone_options[] = {
TRANSPORT_FAMILY_IPV4),
OPT_SET_INT('6', "ipv6", , N_("use IPv6 addresses only"),
TRANSPORT_FAMILY_IPV6),
+   OPT_PARSE_LIST_OBJECTS_FILTER(&filter_options),
OPT_END()
 };
 
@@ -1073,6 +1076,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
		warning(_("--shallow-since is ignored in local clones; use file:// instead."));
	if (option_not.nr)
		warning(_("--shallow-exclude is ignored in local clones; use file:// instead."));
+	if (filter_options.choice)
+		warning(_("--filter is ignored in local clones; use file:// instead."));
	if (!access(mkpath("%s/shallow", path), F_OK)) {
		if (option_local > 0)
			warning(_("source repository is shallow, ignoring --local"));
@@ -1104,6 +1109,10 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
transport_set_option(transport, TRANS_OPT_UPLOADPACK,
 option_upload_pack);
 
+   if (filter_options.choice)
+   transport_set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER,
+filter_options.raw_value);
+
	if (transport->smart_options && !deepen)
		transport->smart_options->check_self_contained_and_connected = 1;
 
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 9a7ebf6..d0fdaa8 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -153,6 +153,10 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
args.no_haves = 1;
continue;
}
+   if (skip_prefix(arg, ("--" CL_ARG__FILTER "="), &arg)) {
+   parse_list_objects_filter(&filter_options, arg);
+   continue;
+   }
usage(fetch_pack_usage);
}
if (deepen_not.nr)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index a0a35e6..31cd5ba 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -222,6 +222,16 @@ static unsigned check_object(struct object *obj)
if (!(obj->flags & FLAG_CHECKED)) {
unsigned long size;
	int type = sha1_object_info(obj->oid.hash, &size);
+
+   if (type <= 0) {
+   /*
+* TODO Use the promisor code to conditionally
+* try to fetch this object -or- assume it is ok.
+*/
+   obj->flags |= FLAG_CHECKED;
+   return 0;
+   }
+
if (type <= 0)
die(_("did not receive expected object %s"),
			    oid_to_hex(&obj->oid));
diff --git a/fetch-pack.c b/fetch-pack.c
index 4640b4e..895e8f9 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -29,6 +29,7 @@ static int deepen_not_ok;
 static int fetch_fsck_objects = -1;
 static int transfer_fsck_objects = -1;
 static int agent_supported;
+static int server_supports_filtering;
 static struct lock_file shallow_lock;
 static const char *alternate_shallow_file;
 
@@ -379,6 +380,8 @@ static int find_common(struct fetch_pack_args *args,
		if (deepen_not_ok)	strbuf_addstr(&c, " deepen-not");
		if (agent_supported)	strbuf_addf(&c, " agent=%s",
						    git_user_agent_sanitized());
+		if (args->filter_options.choice)
+			strbuf_addstr(&c, " filter");
		packet_buf_write(&req_buf, "want %s%s\n", remote_hex, c.buf);

[PATCH 04/14] fetch: add object filtering for partial fetch

2017-11-02 Thread Jeff Hostetler
From: Jeff Hostetler 

Teach fetch to use the list-objects filtering parameters
to allow a "partial fetch" following a "partial clone".

Signed-off-by: Jeff Hostetler 
---
 builtin/fetch.c | 66 -
 1 file changed, 65 insertions(+), 1 deletion(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 1b1f039..150ca0a 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -18,6 +18,7 @@
 #include "argv-array.h"
 #include "utf8.h"
 #include "packfile.h"
+#include "list-objects-filter-options.h"
 
 static const char * const builtin_fetch_usage[] = {
N_("git fetch [] [ [...]]"),
@@ -55,6 +56,7 @@ static int recurse_submodules_default = RECURSE_SUBMODULES_ON_DEMAND;
 static int shown_url = 0;
 static int refmap_alloc, refmap_nr;
 static const char **refmap_array;
+static struct list_objects_filter_options filter_options;
 
 static int git_fetch_config(const char *k, const char *v, void *cb)
 {
@@ -160,6 +162,7 @@ static struct option builtin_fetch_options[] = {
TRANSPORT_FAMILY_IPV4),
OPT_SET_INT('6', "ipv6", , N_("use IPv6 addresses only"),
TRANSPORT_FAMILY_IPV6),
+   OPT_PARSE_LIST_OBJECTS_FILTER(&filter_options),
OPT_END()
 };
 
@@ -754,6 +757,7 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
const char *filename = dry_run ? "/dev/null" : git_path_fetch_head();
int want_status;
int summary_width = transport_summary_width(ref_map);
+   struct check_connected_options opt = CHECK_CONNECTED_INIT;
 
fp = fopen(filename, "a");
if (!fp)
@@ -765,7 +769,7 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
url = xstrdup("foreign");
 
rm = ref_map;
-	if (check_connected(iterate_ref_map, &rm, NULL)) {
+	if (check_connected(iterate_ref_map, &rm, &opt)) {
rc = error(_("%s did not send all necessary objects\n"), url);
goto abort;
}
@@ -1044,6 +1048,9 @@ static struct transport *prepare_transport(struct remote *remote, int deepen)
set_option(transport, TRANS_OPT_DEEPEN_RELATIVE, "yes");
if (update_shallow)
set_option(transport, TRANS_OPT_UPDATE_SHALLOW, "yes");
+   if (filter_options.choice)
+   set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER,
+  filter_options.raw_value);
return transport;
 }
 
@@ -1242,6 +1249,20 @@ static int fetch_multiple(struct string_list *list)
int i, result = 0;
struct argv_array argv = ARGV_ARRAY_INIT;
 
+   if (filter_options.choice) {
+   /*
+* We currently only support partial-fetches to the remote
+* used for the partial-clone because we only support 1
+* promisor remote, so we DO NOT allow explicit command
+* line filter arguments.
+*
+* Note that the loop below will spawn background fetches
+* for each remote and one of them MAY INHERIT the proper
+* partial-fetch settings, so everything is consistent.
+*/
+   die(_("partial-fetch is not supported on multiple remotes"));
+   }
+
if (!append && !dry_run) {
int errcode = truncate_fetch_head();
if (errcode)
@@ -1267,6 +1288,46 @@ static int fetch_multiple(struct string_list *list)
return result;
 }
 
+static inline void partial_fetch_one_setup(struct remote *remote)
+{
+#if 0 /* TODO */
+   if (filter_options.choice) {
+   /*
+* A partial-fetch was explicitly requested.
+*
+* If this is the first partial-* command on
+* this repo, we must register the partial
+* settings in the repository extension.
+*
+* If this follows a previous partial-* command
+* we must ensure the args are consistent with
+* the existing registration (because we don't
+* currently support mixing-and-matching).
+*/
+   partial_clone_utils_register(&filter_options,
+remote->name, "fetch");
+   return;
+   }
+
+   if (is_partial_clone_registered() &&
+   !strcmp(remote->name, repository_format_partial_clone_remote)) {
+   /*
+* If a partial-* command has already been used on
+* this repo and it was to this remote, we should
+* inherit the filter settings used previously.
+* That is, if clone omitted very large blobs, then
+* fetch should too.
+*
+* Use the cached filter-spec and create the filter
+* settings.
+ 

[PATCH 08/14] fetch: add from_promisor and exclude-promisor-objects parameters

2017-11-02 Thread Jeff Hostetler
From: Jonathan Tan 

Teach fetch to use from-promisor and exclude-promisor-objects
parameters with sub-commands.  Initialize fetch_if_missing
global variable.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 builtin/fetch.c | 9 ++---
 connected.c | 3 +++
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 150ca0a..ab53df3 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -19,6 +19,7 @@
 #include "utf8.h"
 #include "packfile.h"
 #include "list-objects-filter-options.h"
+#include "partial-clone-utils.h"
 
 static const char * const builtin_fetch_usage[] = {
N_("git fetch [] [ [...]]"),
@@ -1048,9 +1049,11 @@ static struct transport *prepare_transport(struct remote *remote, int deepen)
set_option(transport, TRANS_OPT_DEEPEN_RELATIVE, "yes");
if (update_shallow)
set_option(transport, TRANS_OPT_UPDATE_SHALLOW, "yes");
-   if (filter_options.choice)
+   if (filter_options.choice) {
set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER,
   filter_options.raw_value);
+   set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
+   }
return transport;
 }
 
@@ -1290,7 +1293,6 @@ static int fetch_multiple(struct string_list *list)
 
 static inline void partial_fetch_one_setup(struct remote *remote)
 {
-#if 0 /* TODO */
if (filter_options.choice) {
/*
 * A partial-fetch was explicitly requested.
@@ -1325,7 +1327,6 @@ static inline void partial_fetch_one_setup(struct remote *remote)
		&filter_options,
repository_format_partial_clone_filter);
}
-#endif
 }
 
 static int fetch_one(struct remote *remote, int argc, const char **argv)
@@ -1392,6 +1393,8 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 
packet_trace_identity("fetch");
 
+   fetch_if_missing = 0;
+
/* Record the command line for the reflog */
strbuf_addstr(_rla, "fetch");
for (i = 1; i < argc; i++)
diff --git a/connected.c b/connected.c
index f416b05..6015316 100644
--- a/connected.c
+++ b/connected.c
@@ -4,6 +4,7 @@
 #include "connected.h"
 #include "transport.h"
 #include "packfile.h"
+#include "partial-clone-utils.h"
 
 /*
  * If we feed all the commits we want to verify to this command
@@ -56,6 +57,8 @@ int check_connected(sha1_iterate_fn fn, void *cb_data,
	argv_array_push(&rev_list.args, "rev-list");
	argv_array_push(&rev_list.args, "--objects");
	argv_array_push(&rev_list.args, "--stdin");
+	if (is_partial_clone_registered())
+		argv_array_push(&rev_list.args, "--exclude-promisor-objects");
	argv_array_push(&rev_list.args, "--not");
	argv_array_push(&rev_list.args, "--all");
	argv_array_push(&rev_list.args, "--quiet");
-- 
2.9.3
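For reference, the connectivity check that check_connected() spawns boils down to a rev-list invocation like the sketch below; in a registered partial clone, the connected.c hunk above inserts --exclude-promisor-objects into that command line (omitted here, since this throwaway repo has no promisor packs):

```shell
# Build a throwaway repo and run the same kind of walk that
# check_connected() spawns: feed the tip on stdin, exclude
# everything already reachable from the refs.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" -c user.name=t -c user.email=t@example.com \
	commit -q --allow-empty -m x
git -C "$repo" rev-parse HEAD |
	git -C "$repo" rev-list --objects --stdin --not --all --quiet
status=$?
rm -rf "$repo"
```

A zero exit status means every fed-in object is reachable from local objects; with --exclude-promisor-objects, objects promised by the partial-clone remote no longer count as breakage.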



[PATCH 10/14] t5601: test for partial clone

2017-11-02 Thread Jeff Hostetler
From: Jonathan Tan 

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 builtin/clone.c  | 17 ++---
 t/t5601-clone.sh | 49 +
 2 files changed, 63 insertions(+), 3 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index fceb9e7..08315d8 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -27,6 +27,7 @@
 #include "connected.h"
 #include "packfile.h"
 #include "list-objects-filter-options.h"
+#include "partial-clone-utils.h"
 
 /*
  * Overall FIXMEs:
@@ -889,6 +890,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
struct refspec *refspec;
const char *fetch_pattern;
 
+   fetch_if_missing = 0;
+
packet_trace_identity("clone");
argc = parse_options(argc, argv, prefix, builtin_clone_options,
 builtin_clone_usage, 0);
@@ -1109,11 +1112,13 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
transport_set_option(transport, TRANS_OPT_UPLOADPACK,
 option_upload_pack);
 
-   if (filter_options.choice)
+   if (filter_options.choice) {
transport_set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER,
 filter_options.raw_value);
+   transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
+   }
 
-	if (transport->smart_options && !deepen)
+	if (transport->smart_options && !deepen && !filter_options.choice)
		transport->smart_options->check_self_contained_and_connected = 1;
 
refs = transport_get_remote_refs(transport);
@@ -1173,13 +1178,18 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
write_refspec_config(src_ref_prefix, our_head_points_at,
			remote_head_points_at, &branch_top);
 
+   if (filter_options.choice)
+   partial_clone_utils_register(_options, "origin",
+"clone");
+
if (is_local)
clone_local(path, git_dir);
else if (refs && complete_refs_before_fetch)
transport_fetch_refs(transport, mapped_refs);
 
update_remote_refs(refs, mapped_refs, remote_head_points_at,
-			   branch_top.buf, reflog_msg.buf, transport, !is_local);
+  branch_top.buf, reflog_msg.buf, transport,
+  !is_local && !filter_options.choice);
 
update_head(our_head_points_at, remote_head, reflog_msg.buf);
 
@@ -1200,6 +1210,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
}
 
junk_mode = JUNK_LEAVE_REPO;
+   fetch_if_missing = 1;
err = checkout(submodule_progress);
 
strbuf_release(_msg);
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 9c56f77..567161e 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -571,4 +571,53 @@ test_expect_success 'GIT_TRACE_PACKFILE produces a usable 
pack' '
git -C replay.git index-pack -v --stdin  err &&
+
+   test_i18ngrep "filtering not recognized by server" err
+'
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'partial clone using HTTP' '
+partial_clone "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/smart/server"
+'
+
+stop_httpd
+
 test_done
-- 
2.9.3



[PATCH 05/14] remote-curl: add object filtering for partial clone

2017-11-02 Thread Jeff Hostetler
From: Jeff Hostetler 

Signed-off-by: Jeff Hostetler 
---
 Documentation/gitremote-helpers.txt |  4 
 remote-curl.c   | 10 --
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/Documentation/gitremote-helpers.txt b/Documentation/gitremote-helpers.txt
index 6da3f41..d46d561 100644
--- a/Documentation/gitremote-helpers.txt
+++ b/Documentation/gitremote-helpers.txt
@@ -468,6 +468,10 @@ set by Git if the remote helper has the 'option' capability.
 
 TODO document 'option from-promisor' and 'option no-haves' ?
 
+'option filter <args>'::
+   An object filter specification for partial clone or fetch
+   as described in rev-list.
+
 SEE ALSO
 
 linkgit:git-remote[1]
diff --git a/remote-curl.c b/remote-curl.c
index 41e8a42..840f3ce 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -13,6 +13,7 @@
 #include "credential.h"
 #include "sha1-array.h"
 #include "send-pack.h"
+#include "list-objects-filter-options.h"
 
 static struct remote *remote;
 /* always ends with a trailing slash */
@@ -22,6 +23,7 @@ struct options {
int verbosity;
unsigned long depth;
char *deepen_since;
+   char *partial_clone_filter;
struct string_list deepen_not;
struct string_list push_options;
unsigned progress : 1,
@@ -163,11 +165,12 @@ static int set_option(const char *name, const char *value)
} else if (!strcmp(name, "from-promisor")) {
options.from_promisor = 1;
return 0;
-
} else if (!strcmp(name, "no-haves")) {
options.no_haves = 1;
return 0;
-
+   } else if (!strcmp(name, "filter")) {
+   options.partial_clone_filter = xstrdup(value);
+   return 0;
} else {
return 1 /* unsupported */;
}
@@ -837,6 +840,9 @@ static int fetch_git(struct discovery *heads,
		argv_array_push(&args, "--from-promisor");
	if (options.no_haves)
		argv_array_push(&args, "--no-haves");
+	if (options.partial_clone_filter)
+		argv_array_pushf(&args, "--%s=%s",
+				 CL_ARG__FILTER, options.partial_clone_filter);
	argv_array_push(&args, url.buf);
 
for (i = 0; i < nr_heads; i++) {
-- 
2.9.3



Re: No log --no-decorate completion?

2017-11-02 Thread Max Rothman
No problem! Let me know if I've done something wrong with this patch;
I'm new to git's contributor process.

completion: add missing completions for log, diff, show

* Add bash completion for the missing --no-* options on git log
* Add bash completion for --textconv and --indent-heuristic families to
  git diff and all commands that use diff's options
* Add bash completion for --no-abbrev-commit, --expand-tabs, and
  --no-expand-tabs to git show
---
 contrib/completion/git-completion.bash | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/contrib/completion/git-completion.bash
b/contrib/completion/git-completion.bash
index 0e16f017a4144..252a6c8c0c80c 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -1412,6 +1412,8 @@ __git_diff_common_options="--stat --numstat --shortstat --summary
--dirstat-by-file= --cumulative
--diff-algorithm=
--submodule --submodule=
+   --indent-heuristic --no-indent-heuristic
+   --textconv --no-textconv
 "

 _git_diff ()
@@ -1752,6 +1754,10 @@ _git_log ()
	__gitcomp "$__git_diff_submodule_formats" "" "${cur##--submodule=}"
return
;;
+   --no-walk=*)
+   __gitcomp "sorted unsorted" "" "${cur##--no-walk=}"
+   return
+   ;;
--*)
__gitcomp "
$__git_log_common_options
@@ -1759,16 +1765,19 @@ _git_log ()
$__git_log_gitk_options
--root --topo-order --date-order --reverse
--follow --full-diff
-   --abbrev-commit --abbrev=
+   --abbrev-commit --no-abbrev-commit --abbrev=
--relative-date --date=
--pretty= --format= --oneline
--show-signature
--cherry-mark
--cherry-pick
--graph
-   --decorate --decorate=
+   --decorate --decorate= --no-decorate
--walk-reflogs
+   --no-walk --no-walk= --do-walk
--parents --children
+   --expand-tabs --expand-tabs= --no-expand-tabs
+   --patch
$merge
$__git_diff_common_options
--pickaxe-all --pickaxe-regex
@@ -2816,8 +2825,9 @@ _git_show ()
return
;;
--*)
-   __gitcomp "--pretty= --format= --abbrev-commit --oneline
-   --show-signature
+   __gitcomp "--pretty= --format= --abbrev-commit --no-abbrev-commit
+   --oneline --show-signature --patch
+   --expand-tabs --expand-tabs= --no-expand-tabs
$__git_diff_common_options
"
return

On Tue, Oct 24, 2017 at 11:32 AM, Stefan Beller  wrote:
> On Tue, Oct 24, 2017 at 8:15 AM, Max Rothman  wrote:
>> Just re-discovered this in my inbox. So is this worth fixing? I could
>> (probably) figure out a patch.
>>
>> Thanks,
>> Max
>>
>
> $ git log --<TAB>
> Display all 100 possibilities? (y or n)
>
> I guess adding one more option is no big deal, so go for it!
> We also have a couple --no-options already (as you said earlier),
> which I think makes it even more viable.
>
> Tangent:
> I wonder if we can cascade the completion by common
> prefixes, e.g. --no- or --ignore- occurs a couple of times,
> such that we could only show these --no- once and only
> if you try autocompleting thereafter you get all --no- options.
>
> Thanks for reviving this thread!
> Stefan
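
The prefix-cascading tangent above can be prototyped in a few lines of shell - collapse candidates to unique leading prefixes first, then expand a family only once its prefix has been typed. This is a rough illustration, not the actual completion code:

```shell
opts="--no-decorate --no-walk --no-abbrev-commit --decorate --graph --ignore-missing"

# First pass: show each --foo- family only once.
families=$(printf '%s\n' $opts | sed -n 's/^\(--[a-z]*-\).*/\1/p' | sort -u)

# Second pass: once the user has typed "--no-", expand that family.
cur="--no-"
expanded=$(printf '%s\n' $opts | grep -e "^$cur" | sort)
```

With the sample list, the first pass offers only --ignore- and --no-, and typing "--no-" expands to the three --no-* options.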


[PATCH 8/9] sha1_file: support lazily fetching missing objects

2017-11-02 Thread Jeff Hostetler
From: Jonathan Tan 

Teach sha1_file to fetch objects from the remote configured in
extensions.partialcloneremote whenever an object is requested but missing.

The fetching of objects can be suppressed through a global variable.
This is used by fsck and index-pack.

However, by default, such fetching is not suppressed. This is meant as a
temporary measure to ensure that all Git commands work in such a
situation. Future patches will update some commands to either tolerate
missing objects (without fetching them) or be more efficient in fetching
them.
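The gating described here boils down to a small retry loop around the object lookup. A stand-alone sketch (hypothetical names — `object_in_store`, `fetch_object_stub` and `lookup_object_info` are illustrative stubs, not the real object-store API):

```c
#include <assert.h>

/* Mirrors the global this patch introduces; the other two statics
 * stand in for the local object database and the promisor remote. */
static int fetch_if_missing = 1;
static int object_in_store;	/* pretend local object database */
static int fetch_calls;

static void fetch_object_stub(void)
{
	/* the promisor remote "delivers" the missing object */
	fetch_calls++;
	object_in_store = 1;
}

/*
 * Sketch of the retry this patch adds around object lookup: on a
 * miss, optionally fetch from the promisor remote once, then look
 * again before declaring the object missing.
 */
static int lookup_object_info(void)
{
	int tried_fetch = 0;

	for (;;) {
		if (object_in_store)
			return 0;	/* found */
		if (!fetch_if_missing || tried_fetch)
			return -1;	/* genuinely missing */
		fetch_object_stub();
		tried_fetch = 1;
	}
}
```

With `fetch_if_missing` cleared (as fsck and index-pack do), the miss is reported immediately and no network round-trip happens.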

In order to determine the code changes in sha1_file.c necessary, I
investigated the following:
 (1) functions in sha1_file.c that take in a hash, without the user
 regarding how the object is stored (loose or packed)
 (2) functions in packfile.c (because I need to check callers that know
 about the loose/packed distinction and operate on both differently,
 and ensure that they can handle the concept of objects that are
 neither loose nor packed)

(1) is handled by the modification to sha1_object_info_extended().

For (2), I looked at for_each_packed_object and others.  For
for_each_packed_object, the callers either already work or are fixed in
this patch:
 - reachable - only to find recent objects
 - builtin/fsck - already knows about missing objects
 - builtin/cat-file - warning message added in this commit

Callers of the other functions do not need to be changed:
 - parse_pack_index
   - http - indirectly from http_get_info_packs
   - find_pack_entry_one
 - this searches a single pack that is provided as an argument; the
   caller already knows (through other means) that the sought object
   is in a specific pack
 - find_sha1_pack
   - fast-import - appears to be an optimization to not store a file if
 it is already in a pack
   - http-walker - to search through a struct alt_base
   - http-push - to search through remote packs
 - has_sha1_pack
   - builtin/fsck - already knows about promisor objects
   - builtin/count-objects - informational purposes only (check if loose
 object is also packed)
   - builtin/prune-packed - check if object to be pruned is packed (if
 not, don't prune it)
   - revision - used to exclude packed objects if requested by user
   - diff - just for optimization

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 builtin/cat-file.c   |   3 +
 builtin/fetch-pack.c |   2 +
 builtin/fsck.c   |   3 +
 builtin/index-pack.c |   6 ++
 builtin/rev-list.c   |  35 +--
 cache.h  |   8 +++
 fetch-object.c   |   3 +
 list-objects.c   |   8 ++-
 object.c |   2 +-
 revision.c   |  32 +-
 revision.h   |   5 +-
 sha1_file.c  |  39 
 t/t0410-partial-clone.sh | 152 +++
 13 files changed, 277 insertions(+), 21 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index f5fa4fd..ba77b73 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -13,6 +13,7 @@
 #include "tree-walk.h"
 #include "sha1-array.h"
 #include "packfile.h"
+#include "partial-clone-utils.h"
 
 struct batch_options {
int enabled;
@@ -475,6 +476,8 @@ static int batch_objects(struct batch_options *opt)
 
	for_each_loose_object(batch_loose_object, &cb, 0);
	for_each_packed_object(batch_packed_object, &cb, 0);
+   if (is_partial_clone_registered())
+   warning("This repository has partial clone enabled. Some objects may not be loaded.");
 
	cb.opt = opt;
	cb.expand = &data;
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 9f303cf..9a7ebf6 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -53,6 +53,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
struct oid_array shallow = OID_ARRAY_INIT;
struct string_list deepen_not = STRING_LIST_INIT_DUP;
 
+   fetch_if_missing = 0;
+
packet_trace_identity("fetch-pack");
 
	memset(&args, 0, sizeof(args));
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 578a7c8..3b76c0e 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -678,6 +678,9 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
int i;
struct alternate_object_database *alt;
 
+   /* fsck knows how to handle missing promisor objects */
+   fetch_if_missing = 0;
+
errors_found = 0;
check_replace_refs = 0;
 
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 24c2f05..a0a35e6 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1657,6 +1657,12 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
unsigned foreign_nr = 1;/* zero is a "good" value, assume bad */
int report_end_of_input = 0;
 
+ 

[PATCH 1/9] extension.partialclone: introduce partial clone extension

2017-11-02 Thread Jeff Hostetler
From: Jeff Hostetler 

Introduce the ability to have missing objects in a repo.  This
functionality is guarded by new repository extension options:
`extensions.partialcloneremote` and
`extensions.partialclonefilter`.

See the update to Documentation/technical/repository-version.txt
in this patch for more information.
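For illustration only, a repository registered this way might end up with a config like the following — the remote name `origin` and the filter `blob:none` are placeholder values, not anything mandated by the extension:

```ini
[core]
	repositoryformatversion = 1
[extensions]
	partialcloneremote = origin
	partialclonefilter = blob:none
```

Because `core.repositoryformatversion` is 1, versions of Git that do not understand these extensions will refuse to operate on the repository rather than misbehave.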

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 Documentation/technical/repository-version.txt | 22 
 Makefile   |  1 +
 cache.h|  4 ++
 config.h   |  3 +
 environment.c  |  2 +
 partial-clone-utils.c  | 78 ++
 partial-clone-utils.h  | 34 +++
 setup.c| 15 +
 8 files changed, 159 insertions(+)
 create mode 100644 partial-clone-utils.c
 create mode 100644 partial-clone-utils.h

diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt
index 00ad379..9d488db 100644
--- a/Documentation/technical/repository-version.txt
+++ b/Documentation/technical/repository-version.txt
@@ -86,3 +86,25 @@ for testing format-1 compatibility.
 When the config key `extensions.preciousObjects` is set to `true`,
 objects in the repository MUST NOT be deleted (e.g., by `git-prune` or
 `git repack -d`).
+
+`partialcloneremote`
+~~~~~~~~~~~~~~~~~~~~
+
+When the config key `extensions.partialcloneremote` is set, it indicates
+that the repo was created with a partial clone (or later performed
+a partial fetch) and that the remote may have omitted sending
+certain unwanted objects.  Such a remote is called a "promisor remote"
+and it promises that all such omitted objects can be fetched from it
+in the future.
+
+The value of this key is the name of the promisor remote.
+
+`partialclonefilter`
+~~~~~~~~~~~~~~~~~~~~
+
+When the config key `extensions.partialclonefilter` is set, it gives
+the initial filter expression used to create the partial clone.
+This value becomes the default filter expression for subsequent
+fetches (called "partial fetches") from the promisor remote.  This
+value may also be set by the first explicit partial fetch following a
+normal clone.
diff --git a/Makefile b/Makefile
index ca378a4..12d141a 100644
--- a/Makefile
+++ b/Makefile
@@ -838,6 +838,7 @@ LIB_OBJS += pack-write.o
 LIB_OBJS += pager.o
 LIB_OBJS += parse-options.o
 LIB_OBJS += parse-options-cb.o
+LIB_OBJS += partial-clone-utils.o
 LIB_OBJS += patch-delta.o
 LIB_OBJS += patch-ids.o
 LIB_OBJS += path.o
diff --git a/cache.h b/cache.h
index 6440e2b..4b785c0 100644
--- a/cache.h
+++ b/cache.h
@@ -860,12 +860,16 @@ extern int grafts_replace_parents;
 #define GIT_REPO_VERSION 0
 #define GIT_REPO_VERSION_READ 1
 extern int repository_format_precious_objects;
+extern char *repository_format_partial_clone_remote;
+extern char *repository_format_partial_clone_filter;
 
 struct repository_format {
int version;
int precious_objects;
int is_bare;
char *work_tree;
+   char *partial_clone_remote; /* value of extensions.partialcloneremote */
+   char *partial_clone_filter; /* value of extensions.partialclonefilter */
struct string_list unknown_extensions;
 };
 
diff --git a/config.h b/config.h
index a49d264..90544ef 100644
--- a/config.h
+++ b/config.h
@@ -34,6 +34,9 @@ struct config_options {
const char *git_dir;
 };
 
+#define KEY_PARTIALCLONEREMOTE "partialcloneremote"
+#define KEY_PARTIALCLONEFILTER "partialclonefilter"
+
 typedef int (*config_fn_t)(const char *, const char *, void *);
 extern int git_default_config(const char *, const char *, void *);
 extern int git_config_from_file(config_fn_t fn, const char *, void *);
diff --git a/environment.c b/environment.c
index 8289c25..2fcf9bb 100644
--- a/environment.c
+++ b/environment.c
@@ -27,6 +27,8 @@ int warn_ambiguous_refs = 1;
 int warn_on_object_refname_ambiguity = 1;
 int ref_paranoia = -1;
 int repository_format_precious_objects;
+char *repository_format_partial_clone_remote;
+char *repository_format_partial_clone_filter;
 const char *git_commit_encoding;
 const char *git_log_output_encoding;
 const char *apply_default_whitespace;
diff --git a/partial-clone-utils.c b/partial-clone-utils.c
new file mode 100644
index 000..32cc20d
--- /dev/null
+++ b/partial-clone-utils.c
@@ -0,0 +1,78 @@
+#include "cache.h"
+#include "config.h"
+#include "partial-clone-utils.h"
+
+int is_partial_clone_registered(void)
+{
+   if (repository_format_partial_clone_remote ||
+   repository_format_partial_clone_filter)
+   return 1;
+
+   return 0;
+}
+
+void partial_clone_utils_register(
+   const struct list_objects_filter_options *filter_options,
+   const char *remote,
+   const char *cmd_name)
+{
+ 

[PATCH 6/9] index-pack: refactor writing of .keep files

2017-11-02 Thread Jeff Hostetler
From: Jonathan Tan 

In a subsequent commit, index-pack will be taught to write ".promisor"
files which are similar to the ".keep" files it knows how to write.
Refactor the writing of ".keep" files, so that the implementation of
writing ".promisor" files becomes easier.
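As a rough model of the helper being extracted, the suffix substitution can be sketched like this (simplified: a plain caller-provided buffer instead of a strbuf, and NULL returned where the real code would die()):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Simplified model of derive_filename(): replace a trailing ".pack"
 * with another suffix ("keep", "promisor", ...).  Illustrative only;
 * the real helper works on a strbuf and dies on malformed input.
 */
static const char *derive_filename(const char *pack_name, const char *suffix,
				   char *buf, size_t bufsz)
{
	const char *dot_pack = ".pack";
	size_t len = strlen(pack_name);
	size_t tail = strlen(dot_pack);

	if (len < tail || strcmp(pack_name + len - tail, dot_pack))
		return NULL;	/* name does not end with ".pack" */
	if (len - tail + 1 + strlen(suffix) + 1 > bufsz)
		return NULL;	/* result would not fit */
	snprintf(buf, bufsz, "%.*s.%s", (int)(len - tail), pack_name, suffix);
	return buf;
}
```

Once the suffix is a parameter, writing `pack-*.promisor` next to `pack-*.keep` is a one-line call.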

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 builtin/index-pack.c | 99 
 1 file changed, 53 insertions(+), 46 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 8ec459f..4f305a7 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1389,15 +1389,58 @@ static void fix_unresolved_deltas(struct sha1file *f)
free(sorted_by_pos);
 }
 
+static const char *derive_filename(const char *pack_name, const char *suffix,
+  struct strbuf *buf)
+{
+   size_t len;
+   if (!strip_suffix(pack_name, ".pack", &len))
+   die(_("packfile name '%s' does not end with '.pack'"),
+   pack_name);
+   strbuf_add(buf, pack_name, len);
+   strbuf_addch(buf, '.');
+   strbuf_addstr(buf, suffix);
+   return buf->buf;
+}
+
+static void write_special_file(const char *suffix, const char *msg,
+  const char *pack_name, const unsigned char *sha1,
+  const char **report)
+{
+   struct strbuf name_buf = STRBUF_INIT;
+   const char *filename;
+   int fd;
+   int msg_len = strlen(msg);
+
+   if (pack_name)
+   filename = derive_filename(pack_name, suffix, &name_buf);
+   else
+   filename = odb_pack_name(&name_buf, sha1, suffix);
+
+   fd = odb_pack_keep(filename);
+   if (fd < 0) {
+   if (errno != EEXIST)
+   die_errno(_("cannot write %s file '%s'"),
+ suffix, filename);
+   } else {
+   if (msg_len > 0) {
+   write_or_die(fd, msg, msg_len);
+   write_or_die(fd, "\n", 1);
+   }
+   if (close(fd) != 0)
+   die_errno(_("cannot close written %s file '%s'"),
+ suffix, filename);
+   *report = suffix;
+   }
+   strbuf_release(&name_buf);
+}
+
 static void final(const char *final_pack_name, const char *curr_pack_name,
  const char *final_index_name, const char *curr_index_name,
- const char *keep_name, const char *keep_msg,
- unsigned char *sha1)
+ const char *keep_msg, unsigned char *sha1)
 {
const char *report = "pack";
struct strbuf pack_name = STRBUF_INIT;
struct strbuf index_name = STRBUF_INIT;
-   struct strbuf keep_name_buf = STRBUF_INIT;
int err;
 
if (!from_stdin) {
@@ -1409,28 +1452,9 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
die_errno(_("error while closing pack file"));
}
 
-   if (keep_msg) {
-   int keep_fd, keep_msg_len = strlen(keep_msg);
-
-   if (!keep_name)
-   keep_name = odb_pack_name(&keep_name_buf, sha1, "keep");
-
-   keep_fd = odb_pack_keep(keep_name);
-   if (keep_fd < 0) {
-   if (errno != EEXIST)
-   die_errno(_("cannot write keep file '%s'"),
- keep_name);
-   } else {
-   if (keep_msg_len > 0) {
-   write_or_die(keep_fd, keep_msg, keep_msg_len);
-   write_or_die(keep_fd, "\n", 1);
-   }
-   if (close(keep_fd) != 0)
-   die_errno(_("cannot close written keep file '%s'"),
- keep_name);
-   report = "keep";
-   }
-   }
+   if (keep_msg)
+   write_special_file("keep", keep_msg, final_pack_name, sha1,
+  &report);
 
if (final_pack_name != curr_pack_name) {
if (!final_pack_name)
@@ -1472,7 +1496,6 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
 
	strbuf_release(&pack_name);
	strbuf_release(&index_name);
-   strbuf_release(&keep_name_buf);
 }
 
 static int git_index_pack_config(const char *k, const char *v, void *cb)
@@ -1615,26 +1638,13 @@ static void show_pack_info(int stat_only)
}
 }
 
-static const char *derive_filename(const char *pack_name, const char *suffix,
-  struct strbuf *buf)
-{
-   size_t len;
-   if (!strip_suffix(pack_name, ".pack", &len))
-   die(_("packfile name '%s' does not end with '.pack'"),
-   pack_name);
-   strbuf_add(buf, pack_name, 

[PATCH 3/9] fsck: support refs pointing to promisor objects

2017-11-02 Thread Jeff Hostetler
From: Jonathan Tan 

Teach fsck to not treat refs referring to missing promisor objects as an
error when extensions.partialclone is set.

For the purposes of warning about no default refs, such refs are still
treated as legitimate refs.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 builtin/fsck.c   |  8 
 t/t0410-partial-clone.sh | 24 
 2 files changed, 32 insertions(+)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 2934299..ee937bb 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -434,6 +434,14 @@ static int fsck_handle_ref(const char *refname, const struct object_id *oid,
 
obj = parse_object(oid);
if (!obj) {
+   if (is_promisor_object(oid)) {
+   /*
+* Increment default_refs anyway, because this is a
+* valid ref.
+*/
+   default_refs++;
+   return 0;
+   }
error("%s: invalid sha1 pointer %s", refname, oid_to_hex(oid));
errors_found |= ERROR_REACHABLE;
/* We'll continue with the rest despite the error.. */
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index 52347fb..5a03ead 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -13,6 +13,14 @@ pack_as_from_promisor () {
>repo/.git/objects/pack/pack-$HASH.promisor
 }
 
+promise_and_delete () {
+   HASH=$(git -C repo rev-parse "$1") &&
+   git -C repo tag -a -m message my_annotated_tag "$HASH" &&
+   git -C repo rev-parse my_annotated_tag | pack_as_from_promisor &&
+   git -C repo tag -d my_annotated_tag &&
+   delete_object repo "$HASH"
+}
+
test_expect_success 'missing reflog object, but promised by a commit, passes fsck' '
test_create_repo repo &&
test_commit -C repo my_commit &&
@@ -78,4 +86,20 @@ test_expect_success 'missing reflog object alone fails fsck, even with extension
test_must_fail git -C repo fsck
 '
 
+test_expect_success 'missing ref object, but promised, passes fsck' '
+   rm -rf repo &&
+   test_create_repo repo &&
+   test_commit -C repo my_commit &&
+
+   A=$(git -C repo commit-tree -m a HEAD^{tree}) &&
+
+   # Reference $A only from ref
+   git -C repo branch my_branch "$A" &&
+   promise_and_delete "$A" &&
+
+   git -C repo config core.repositoryformatversion 1 &&
+   git -C repo config extensions.partialcloneremote "arbitrary string" &&
+   git -C repo fsck
+'
+
 test_done
-- 
2.9.3



[PATCH 2/9] fsck: introduce partialclone extension

2017-11-02 Thread Jeff Hostetler
From: Jonathan Tan 

Currently, Git does not support repos with very large numbers of objects
or repos that wish to minimize manipulation of certain blobs (for
example, because they are very large) very well, even if the user
operates mostly on part of the repo, because Git is designed on the
assumption that every referenced object is available somewhere in the
repo storage. In such an arrangement, the full set of objects is usually
available in remote storage, ready to be lazily downloaded.

Introduce the ability to have missing objects in a repo.  This
functionality is guarded behind a new repository extension option
`extensions.partialcloneremote`.
See Documentation/technical/repository-version.txt for more information.

Teach fsck about the new state of affairs. In this commit, teach fsck
that missing promisor objects referenced from the reflog are not an
error case; in future commits, fsck will be taught about other cases.
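Conceptually, the promisor check that fsck now relies on is plain set membership: "was this object referenced from a *.promisor pack?". A toy stand-in (a fixed table of hex strings instead of the real set built by walking promisor packfiles — illustrative, not the actual implementation):

```c
#include <assert.h>
#include <string.h>

/* Pretend these objects were referenced from a promisor packfile. */
static const char *promised[] = {
	"1111111111111111111111111111111111111111",
	"2222222222222222222222222222222222222222",
};

/* Toy model of is_promisor_object(): linear scan of the table. */
static int is_promisor_object(const char *hex_oid)
{
	size_t i;

	for (i = 0; i < sizeof(promised) / sizeof(promised[0]); i++)
		if (!strcmp(promised[i], hex_oid))
			return 1;
	return 0;
}
```

fsck then downgrades "missing object" from an error to an expected state exactly when this predicate is true.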

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 builtin/fsck.c   |  2 +-
 cache.h  |  3 +-
 packfile.c   | 78 --
 packfile.h   | 13 
 setup.c  |  3 --
 t/t0410-partial-clone.sh | 81 
 6 files changed, 172 insertions(+), 8 deletions(-)
 create mode 100755 t/t0410-partial-clone.sh

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 56afe40..2934299 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -398,7 +398,7 @@ static void fsck_handle_reflog_oid(const char *refname, struct object_id *oid,
 		xstrfmt("%s@{%"PRItime"}", refname, timestamp));
obj->flags |= USED;
mark_object_reachable(obj);
-   } else {
+   } else if (!is_promisor_object(oid)) {
		error("%s: invalid reflog entry %s", refname, oid_to_hex(oid));
errors_found |= ERROR_REACHABLE;
}
diff --git a/cache.h b/cache.h
index 4b785c0..5f84103 100644
--- a/cache.h
+++ b/cache.h
@@ -1589,7 +1589,8 @@ extern struct packed_git {
unsigned pack_local:1,
 pack_keep:1,
 freshened:1,
-do_not_close:1;
+do_not_close:1,
+pack_promisor:1;
unsigned char sha1[20];
struct revindex_entry *revindex;
/* something like ".git/objects/pack/x.pack" */
diff --git a/packfile.c b/packfile.c
index 4a5fe7a..b015a54 100644
--- a/packfile.c
+++ b/packfile.c
@@ -8,6 +8,12 @@
 #include "list.h"
 #include "streaming.h"
 #include "sha1-lookup.h"
+#include "commit.h"
+#include "object.h"
+#include "tag.h"
+#include "tree-walk.h"
+#include "tree.h"
+#include "partial-clone-utils.h"
 
 char *odb_pack_name(struct strbuf *buf,
const unsigned char *sha1,
@@ -643,10 +649,10 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
return NULL;
 
/*
-* ".pack" is long enough to hold any suffix we're adding (and
+* ".promisor" is long enough to hold any suffix we're adding (and
 * the use xsnprintf double-checks that)
 */
-   alloc = st_add3(path_len, strlen(".pack"), 1);
+   alloc = st_add3(path_len, strlen(".promisor"), 1);
p = alloc_packed_git(alloc);
memcpy(p->pack_name, path, path_len);
 
@@ -654,6 +660,10 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
if (!access(p->pack_name, F_OK))
p->pack_keep = 1;
 
+   xsnprintf(p->pack_name + path_len, alloc - path_len, ".promisor");
+   if (!access(p->pack_name, F_OK))
+   p->pack_promisor = 1;
+
xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack");
	if (stat(p->pack_name, &st) || !S_ISREG(st.st_mode)) {
free(p);
@@ -781,7 +791,8 @@ static void prepare_packed_git_one(char *objdir, int local)
if (ends_with(de->d_name, ".idx") ||
ends_with(de->d_name, ".pack") ||
ends_with(de->d_name, ".bitmap") ||
-   ends_with(de->d_name, ".keep"))
+   ends_with(de->d_name, ".keep") ||
+   ends_with(de->d_name, ".promisor"))
	string_list_append(&garbage, path.buf);
else
report_garbage(PACKDIR_FILE_GARBAGE, path.buf);
@@ -1889,6 +1900,9 @@ int for_each_packed_object(each_packed_object_fn cb, void *data, unsigned flags)
for (p = packed_git; p; p = p->next) {
if ((flags & FOR_EACH_OBJECT_LOCAL_ONLY) && !p->pack_local)
continue;
+   if ((flags & FOR_EACH_OBJECT_PROMISOR_ONLY) &&
+   !p->pack_promisor)
+   continue;
  

[PATCH 9/9] gc: do not repack promisor packfiles

2017-11-02 Thread Jeff Hostetler
From: Jonathan Tan 

Teach gc to stop traversal at promisor objects, and to leave promisor
packfiles alone. This has the effect of only repacking non-promisor
packfiles, and preserves the distinction between promisor packfiles and
non-promisor packfiles.
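The repack policy described above amounts to a per-pack eligibility test. A hypothetical helper (the bitfield names mirror the `packed_git` bits this series touches, but the struct itself is a stand-in, not the real one):

```c
#include <assert.h>

/* Stand-in for the relevant bits of struct packed_git. */
struct pack_info {
	unsigned pack_local:1,
		 pack_keep:1,
		 pack_promisor:1;
};

/*
 * A pack may be repacked only if it is local, not marked .keep, and
 * not a promisor pack -- leaving promisor packs alone preserves the
 * promisor/non-promisor distinction across gc runs.
 */
static int eligible_for_repack(const struct pack_info *p)
{
	return p->pack_local && !p->pack_keep && !p->pack_promisor;
}
```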

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 Documentation/git-pack-objects.txt |  4 +++
 builtin/gc.c   |  4 +++
 builtin/pack-objects.c | 14 ++
 builtin/prune.c|  7 +
 builtin/repack.c   | 12 +++--
 t/t0410-partial-clone.sh   | 54 --
 6 files changed, 91 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index 6786351..ee462c6 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -246,6 +246,10 @@ So does `git bundle` (see linkgit:git-bundle[1]) when it creates a bundle.
Ignore missing objects without error.  This may be used with
	or without any of the above filtering.
 
+--exclude-promisor-objects::
+   Silently omit referenced but missing objects from the packfile.
+   This is used with partial clone.
+
 SEE ALSO
 
 linkgit:git-rev-list[1]
diff --git a/builtin/gc.c b/builtin/gc.c
index 3c5eae0..a17806a 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -20,6 +20,7 @@
 #include "argv-array.h"
 #include "commit.h"
 #include "packfile.h"
+#include "partial-clone-utils.h"
 
 #define FAILED_RUN "failed to run %s"
 
@@ -458,6 +459,9 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
	argv_array_push(&prune, prune_expire);
	if (quiet)
	argv_array_push(&prune, "--no-progress");
+   if (is_partial_clone_registered())
+   argv_array_push(&prune,
+   "--exclude-promisor-objects");
if (run_command_v_opt(prune.argv, RUN_GIT_CMD))
return error(FAILED_RUN, prune.argv[0]);
}
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index e16722f..957e459 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -83,6 +83,7 @@ static unsigned long window_memory_limit = 0;
 
 static struct list_objects_filter_options filter_options;
 static int arg_ignore_missing;
+static int arg_exclude_promisor_objects;
 
 /*
  * stats
@@ -2561,6 +2562,11 @@ static void show_object(struct object *obj, const char *name, void *data)
	if (arg_ignore_missing && !has_object_file(&obj->oid))
return;
 
+   if (arg_exclude_promisor_objects &&
+   !has_object_file(&obj->oid) &&
+   is_promisor_object(&obj->oid))
+   return;
+
add_preferred_base_object(name);
add_object_entry(obj->oid.hash, obj->type, name, 0);
obj->flags |= OBJECT_ADDED;
@@ -2972,6 +2978,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
	OPT_PARSE_LIST_OBJECTS_FILTER(&filter_options),
	OPT_BOOL(0, "filter-ignore-missing", &arg_ignore_missing,
		 N_("ignore and omit missing objects from packfile")),
+   OPT_BOOL(0, "exclude-promisor-objects", &arg_exclude_promisor_objects,
+N_("do not pack objects in promisor packfiles")),
OPT_END(),
};
 
@@ -3017,6 +3025,12 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
	argv_array_push(&rp, "--unpacked");
}
 
+   if (arg_exclude_promisor_objects) {
+   use_internal_rev_list = 1;
+   fetch_if_missing = 0;
+   argv_array_push(&rp, "--exclude-promisor-objects");
+   }
+
if (!reuse_object)
reuse_delta = 0;
if (pack_compression_level == -1)
diff --git a/builtin/prune.c b/builtin/prune.c
index cddabf2..be34645 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -101,12 +101,15 @@ int cmd_prune(int argc, const char **argv, const char *prefix)
 {
struct rev_info revs;
struct progress *progress = NULL;
+   int exclude_promisor_objects = 0;
const struct option options[] = {
	OPT__DRY_RUN(&show_only, N_("do not remove, show only")),
	OPT__VERBOSE(&verbose, N_("report pruned objects")),
	OPT_BOOL(0, "progress", &show_progress, N_("show progress")),
	OPT_EXPIRY_DATE(0, "expire", &expire,
		N_("expire objects older than <time>")),
+   OPT_BOOL(0, "exclude-promisor-objects", &exclude_promisor_objects,
+N_("limit traversal to objects outside promisor packfiles")),
OPT_END()
};
char *s;
@@ -139,6 +142,10 @@ int cmd_prune(int argc, const char **argv, const char 

[PATCH 5/9] fsck: support promisor objects as CLI argument

2017-11-02 Thread Jeff Hostetler
From: Jonathan Tan 

Teach fsck to not treat missing promisor objects provided on the CLI as
an error when extensions.partialcloneremote is set.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 builtin/fsck.c   |  2 ++
 t/t0410-partial-clone.sh | 13 +
 2 files changed, 15 insertions(+)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 4c2a56d..578a7c8 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -750,6 +750,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
struct object *obj = lookup_object(oid.hash);
 
if (!obj || !(obj->flags & HAS_OBJ)) {
+   if (is_promisor_object(&oid))
+   continue;
	error("%s: object missing", oid_to_hex(&oid));
errors_found |= ERROR_OBJECT;
continue;
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index b1d404e..002e071 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -125,4 +125,17 @@ test_expect_success 'missing object, but promised, passes fsck' '
git -C repo fsck
 '
 
+test_expect_success 'missing CLI object, but promised, passes fsck' '
+   rm -rf repo &&
+   test_create_repo repo &&
+   test_commit -C repo my_commit &&
+
+   A=$(git -C repo commit-tree -m a HEAD^{tree}) &&
+   promise_and_delete "$A" &&
+
+   git -C repo config core.repositoryformatversion 1 &&
+   git -C repo config extensions.partialcloneremote "arbitrary string" &&
+   git -C repo fsck "$A"
+'
+
 test_done
-- 
2.9.3



[PATCH 7/9] introduce fetch-object: fetch one promisor object

2017-11-02 Thread Jeff Hostetler
From: Jonathan Tan 

Introduce fetch-object, providing the ability to fetch one object from a
promisor remote.

This uses fetch-pack. To do this, the transport mechanism has been
updated with 2 flags, "from-promisor" to indicate that the resulting
pack comes from a promisor remote (and thus should be annotated as such
by index-pack), and "no-haves" to suppress the sending of "have" lines.

This will be tested in a subsequent commit.
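The two new flags enter through fetch-pack's hand-rolled argv loop; a minimal model of that parsing (the struct here is a sketch mirroring the fields this patch adds to `fetch_pack_args`, not the real definition):

```c
#include <assert.h>
#include <string.h>

/* Sketch of the two bits this patch adds to fetch_pack_args. */
struct fetch_pack_args_sketch {
	unsigned from_promisor:1;	/* mark resulting pack as promisor */
	unsigned no_haves:1;		/* suppress "have" lines */
};

/* Returns 1 if arg was recognized and recorded, 0 otherwise. */
static int parse_flag(struct fetch_pack_args_sketch *args, const char *arg)
{
	if (!strcmp(arg, "--from-promisor")) {
		args->from_promisor = 1;
		return 1;
	}
	if (!strcmp(arg, "--no-haves")) {
		args->no_haves = 1;
		return 1;
	}
	return 0;
}
```

With "no-haves" set, the client advertises no common commits at all, so a single-object fetch skips ancestor negotiation entirely.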

NEEDSWORK: update this when we have more information about protocol v2,
which should allow a way to suppress the ref advertisement and
officially allow any object type to be "want"-ed.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 Documentation/gitremote-helpers.txt |  2 ++
 Makefile|  1 +
 builtin/fetch-pack.c|  8 
 builtin/index-pack.c| 16 +---
 fetch-object.c  | 23 +++
 fetch-object.h  |  6 ++
 fetch-pack.c|  8 ++--
 fetch-pack.h|  2 ++
 remote-curl.c   | 17 -
 transport.c |  8 
 transport.h |  8 
 11 files changed, 93 insertions(+), 6 deletions(-)
 create mode 100644 fetch-object.c
 create mode 100644 fetch-object.h

diff --git a/Documentation/gitremote-helpers.txt b/Documentation/gitremote-helpers.txt
index 4a584f3..6da3f41 100644
--- a/Documentation/gitremote-helpers.txt
+++ b/Documentation/gitremote-helpers.txt
@@ -466,6 +466,8 @@ set by Git if the remote helper has the 'option' capability.
	Transmit <string> as a push option. As the push option
must not contain LF or NUL characters, the string is not encoded.
 
+TODO document 'option from-promisor' and 'option no-haves' ?
+
 SEE ALSO
 
 linkgit:git-remote[1]
diff --git a/Makefile b/Makefile
index 12d141a..7a0679a 100644
--- a/Makefile
+++ b/Makefile
@@ -792,6 +792,7 @@ LIB_OBJS += ewah/ewah_bitmap.o
 LIB_OBJS += ewah/ewah_io.o
 LIB_OBJS += ewah/ewah_rlw.o
 LIB_OBJS += exec_cmd.o
+LIB_OBJS += fetch-object.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
 LIB_OBJS += gettext.o
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 366b9d1..9f303cf 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -143,6 +143,14 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
args.update_shallow = 1;
continue;
}
+   if (!strcmp("--from-promisor", arg)) {
+   args.from_promisor = 1;
+   continue;
+   }
+   if (!strcmp("--no-haves", arg)) {
+   args.no_haves = 1;
+   continue;
+   }
usage(fetch_pack_usage);
}
if (deepen_not.nr)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 4f305a7..24c2f05 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1429,14 +1429,16 @@ static void write_special_file(const char *suffix, const char *msg,
if (close(fd) != 0)
die_errno(_("cannot close written %s file '%s'"),
  suffix, filename);
-   *report = suffix;
+   if (report)
+   *report = suffix;
}
	strbuf_release(&name_buf);
 }
 
 static void final(const char *final_pack_name, const char *curr_pack_name,
  const char *final_index_name, const char *curr_index_name,
- const char *keep_msg, unsigned char *sha1)
+ const char *keep_msg, const char *promisor_msg,
+ unsigned char *sha1)
 {
const char *report = "pack";
struct strbuf pack_name = STRBUF_INIT;
@@ -1455,6 +1457,9 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
if (keep_msg)
write_special_file("keep", keep_msg, final_pack_name, sha1,
		   &report);
+   if (promisor_msg)
+   write_special_file("promisor", promisor_msg, final_pack_name,
+  sha1, NULL);
 
if (final_pack_name != curr_pack_name) {
if (!final_pack_name)
@@ -1644,6 +1649,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
const char *curr_index;
const char *index_name = NULL, *pack_name = NULL;
const char *keep_msg = NULL;
+   const char *promisor_msg = NULL;
struct strbuf index_name_buf = STRBUF_INIT;
struct pack_idx_entry **idx_objects;
struct pack_idx_option opts;
@@ -1693,6 +1699,10 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
keep_msg = "";
} 

[PATCH 4/9] fsck: support referenced promisor objects

2017-11-02 Thread Jeff Hostetler
From: Jonathan Tan 

Teach fsck to not treat missing promisor objects indirectly pointed to
by refs as an error when extensions.partialcloneremote is set.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 builtin/fsck.c   | 11 +++
 t/t0410-partial-clone.sh | 23 +++
 2 files changed, 34 insertions(+)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index ee937bb..4c2a56d 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -149,6 +149,15 @@ static int mark_object(struct object *obj, int type, void *data, struct fsck_options *options)
if (obj->flags & REACHABLE)
return 0;
obj->flags |= REACHABLE;
+
+   if (is_promisor_object(&obj->oid))
+   /*
+* Further recursion does not need to be performed on this
+* object since it is a promisor object (so it does not need to
+* be added to "pending").
+*/
+   return 0;
+
if (!(obj->flags & HAS_OBJ)) {
	if (parent && !has_object_file(&obj->oid)) {
printf("broken link from %7s %s\n",
@@ -208,6 +217,8 @@ static void check_reachable_object(struct object *obj)
 * do a full fsck
 */
if (!(obj->flags & HAS_OBJ)) {
+   if (is_promisor_object(&obj->oid))
+   return;
if (has_sha1_pack(obj->oid.hash))
return; /* it is in pack - forget about it */
printf("missing %s %s\n", printable_type(obj),
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index 5a03ead..b1d404e 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -102,4 +102,27 @@ test_expect_success 'missing ref object, but promised, passes fsck' '
git -C repo fsck
 '
 
+test_expect_success 'missing object, but promised, passes fsck' '
+   rm -rf repo &&
+   test_create_repo repo &&
+   test_commit -C repo 1 &&
+   test_commit -C repo 2 &&
+   test_commit -C repo 3 &&
+   git -C repo tag -a annotated_tag -m "annotated tag" &&
+
+   C=$(git -C repo rev-parse 1) &&
+   T=$(git -C repo rev-parse 2^{tree}) &&
+   B=$(git hash-object repo/3.t) &&
+   AT=$(git -C repo rev-parse annotated_tag) &&
+
+   promise_and_delete "$C" &&
+   promise_and_delete "$T" &&
+   promise_and_delete "$B" &&
+   promise_and_delete "$AT" &&
+
+   git -C repo config core.repositoryformatversion 1 &&
+   git -C repo config extensions.partialcloneremote "arbitrary string" &&
+   git -C repo fsck
+'
+
 test_done
-- 
2.9.3



[PATCH 0/9] WIP Partial clone part 2: fsck and promisors

2017-11-02 Thread Jeff Hostetler
From: Jeff Hostetler 

This is part 2 of a proposed 3 part sequence for partial clone.
Part 2 assumes part 1 [1] is in place.

Part 2 is concerned with fsck, gc, initial support for dynamic
object fetching, and tracking promisor objects.  Jonathan Tan
originally developed this code.  I have moved it on top of [1].

[1] https://public-inbox.org/git/20171102124445.fbffd43521cd35f6a71e1...@google.com/T/
[2] https://public-inbox.org/git/cover.1506714999.git.jonathanta...@google.com/


Jeff Hostetler (1):
  extension.partialclone: introduce partial clone extension

Jonathan Tan (8):
  fsck: introduce partialclone extension
  fsck: support refs pointing to promisor objects
  fsck: support referenced promisor objects
  fsck: support promisor objects as CLI argument
  index-pack: refactor writing of .keep files
  introduce fetch-object: fetch one promisor object
  sha1_file: support lazily fetching missing objects
  gc: do not repack promisor packfiles

 Documentation/git-pack-objects.txt |   4 +
 Documentation/gitremote-helpers.txt|   2 +
 Documentation/technical/repository-version.txt |  22 ++
 Makefile   |   2 +
 builtin/cat-file.c |   3 +
 builtin/fetch-pack.c   |  10 +
 builtin/fsck.c |  26 +-
 builtin/gc.c   |   4 +
 builtin/index-pack.c   | 113 
 builtin/pack-objects.c |  14 +
 builtin/prune.c|   7 +
 builtin/repack.c   |  12 +-
 builtin/rev-list.c |  35 ++-
 cache.h|  15 +-
 config.h   |   3 +
 environment.c  |   2 +
 fetch-object.c |  26 ++
 fetch-object.h |   6 +
 fetch-pack.c   |   8 +-
 fetch-pack.h   |   2 +
 list-objects.c |   8 +-
 object.c   |   2 +-
 packfile.c |  78 +-
 packfile.h |  13 +
 partial-clone-utils.c  |  78 ++
 partial-clone-utils.h  |  34 +++
 remote-curl.c  |  17 +-
 revision.c |  32 ++-
 revision.h |   5 +-
 setup.c|  12 +
 sha1_file.c|  39 ++-
 t/t0410-partial-clone.sh   | 343 +
 transport.c|   8 +
 transport.h|   8 +
 34 files changed, 917 insertions(+), 76 deletions(-)
 create mode 100644 fetch-object.c
 create mode 100644 fetch-object.h
 create mode 100644 partial-clone-utils.c
 create mode 100644 partial-clone-utils.h
 create mode 100755 t/t0410-partial-clone.sh

-- 
2.9.3



Re: Git libsecret No Unlock Dialog Issue

2017-11-02 Thread Yaroslav Sapozhnyk
I've tested the code change locally and it seems to fix the issue.

Yaroslav

On Thu, Nov 2, 2017 at 2:55 PM, Dennis Kaarsemaker
 wrote:
> On Thu, 2017-11-02 at 11:35 -0700, Stefan Beller wrote:
>> On Thu, Nov 2, 2017 at 9:00 AM, Yaroslav Sapozhnyk
>>  wrote:
>> > When using Git on Fedora with locked password store
>> > credential-libsecret asks for username/password instead of displaying
>> > the unlock dialog.
>>
>> Git as packaged by Fedora or upstream Git (which version)?
>
> Looking at the code: current upstream git. Looking at the documentation
> for libsecret, this should fix it. I've not been able to test it
> though.
>
> diff --git a/contrib/credential/libsecret/git-credential-libsecret.c b/contrib/credential/libsecret/git-credential-libsecret.c
> index 4c56979d8a..b4750c9ee8 100644
> --- a/contrib/credential/libsecret/git-credential-libsecret.c
> +++ b/contrib/credential/libsecret/git-credential-libsecret.c
> @@ -104,7 +104,7 @@ static int keyring_get(struct credential *c)
> items = secret_service_search_sync(service,
>SECRET_SCHEMA_COMPAT_NETWORK,
>attributes,
> -  SECRET_SEARCH_LOAD_SECRETS,
> +  SECRET_SEARCH_LOAD_SECRETS | SECRET_SEARCH_UNLOCK,
>NULL,
>&error);
> g_hash_table_unref(attributes);



-- 
Regards,
Yaroslav Sapozhnyk


Re: [PATCH v2 0/6] Partial clone part 1: object filtering

2017-11-02 Thread Jonathan Tan
On Thu,  2 Nov 2017 17:50:07 +
Jeff Hostetler  wrote:

> From: Jeff Hostetler 
> 
> Here is V2 of the list-object filtering. It replaces [1]
> and reflect a refactoring and simplification of the original.

Thanks, overall this looks quite good. I reviewed patches 2-6 (skipping
1 since it's already in next), made my comments on 4, and don't have any
for the rest (besides what's below).

> I've added "--filter-ignore-missing" parameter to rev-list and
> pack-objects to ignore missing objects rather than error out.
> This allows this patch series to better stand on its own and eliminates
> the need in part 1 for "patch 9" from V1.
> 
> This is a brute force ignore all missing objects.  Later, in part
> 2 or part 3 when --exclude-promisor-objects is introduced, we will
> be able to ignore EXPECTED missing objects.

(This is regarding patches 5 and 6.) Is the intention to support both
flags? (That is, --ignore-missing to ignore without checking whether the
object being missing is not unexpected, and --exclude-promisor-objects
to check and ignore.)


[PATCHv3 4/7] builtin/describe.c: rename `oid` to avoid variable shadowing

2017-11-02 Thread Stefan Beller
The function `describe` has already a variable named `oid` declared at
the beginning of the function for an object id.  Do not shadow that
variable with a pointer to an object id.

Signed-off-by: Stefan Beller 
---
 builtin/describe.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index 29075dbd0f..fd61f463cf 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -381,9 +381,9 @@ static void describe(const char *arg, int last_one)
}
 
if (!match_cnt) {
-   struct object_id *oid = &cmit->object.oid;
+   struct object_id *cmit_oid = &cmit->object.oid;
if (always) {
-   printf("%s", find_unique_abbrev(oid->hash, abbrev));
+   printf("%s", find_unique_abbrev(cmit_oid->hash, abbrev));
if (suffix)
printf("%s", suffix);
printf("\n");
@@ -392,11 +392,11 @@ static void describe(const char *arg, int last_one)
if (unannotated_cnt)
die(_("No annotated tags can describe '%s'.\n"
"However, there were unannotated tags: try --tags."),
-   oid_to_hex(oid));
+   oid_to_hex(cmit_oid));
else
die(_("No tags can describe '%s'.\n"
"Try --always, or create some tags."),
-   oid_to_hex(oid));
+   oid_to_hex(cmit_oid));
}
 
QSORT(all_matches, match_cnt, compare_pt);
-- 
2.15.0.7.g980e40477f



[PATCHv3 1/7] t6120: fix typo in test name

2017-11-02 Thread Stefan Beller
Signed-off-by: Stefan Beller 
---
 t/t6120-describe.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 1c0e8659d9..c8b7ed82d9 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -304,7 +304,7 @@ test_expect_success 'describe chokes on severely broken submodules' '
mv .git/modules/sub1/ .git/modules/sub_moved &&
test_must_fail git describe --dirty
 '
-test_expect_success 'describe ignoring a borken submodule' '
+test_expect_success 'describe ignoring a broken submodule' '
git describe --broken >out &&
test_when_finished "mv .git/modules/sub_moved .git/modules/sub1" &&
grep broken out
-- 
2.15.0.7.g980e40477f



[PATCHv3 6/7] builtin/describe.c: factor out describe_commit

2017-11-02 Thread Stefan Beller
In the next patch we'll learn how to describe more than just commits,
so factor out describing commits into its own function.  That will make
the next patches easy as we still need to describe a commit as part of
describing blobs.

While factoring out the functionality to describe_commit, make sure
that any output to stdout is put into a strbuf, which we can print
afterwards, using puts which also adds the line ending.

Signed-off-by: Stefan Beller 
---
 builtin/describe.c | 63 --
 1 file changed, 37 insertions(+), 26 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index 3136efde31..9e9a5ed5d4 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -256,7 +256,7 @@ static unsigned long finish_depth_computation(
return seen_commits;
 }
 
-static void display_name(struct commit_name *n)
+static void append_name(struct commit_name *n, struct strbuf *dst)
 {
if (n->prio == 2 && !n->tag) {
n->tag = lookup_tag(&n->oid);
@@ -272,19 +272,18 @@ static void display_name(struct commit_name *n)
}
 
if (n->tag)
-   printf("%s", n->tag->tag);
+   strbuf_addstr(dst, n->tag->tag);
else
-   printf("%s", n->path);
+   strbuf_addstr(dst, n->path);
 }
 
-static void show_suffix(int depth, const struct object_id *oid)
+static void append_suffix(int depth, const struct object_id *oid, struct strbuf *dst)
{
-   printf("-%d-g%s", depth, find_unique_abbrev(oid->hash, abbrev));
+   strbuf_addf(dst, "-%d-g%s", depth, find_unique_abbrev(oid->hash, abbrev));
 }
 
-static void describe(const char *arg, int last_one)
+static void describe_commit(struct object_id *oid, struct strbuf *dst)
 {
-   struct object_id oid;
struct commit *cmit, *gave_up_on = NULL;
struct commit_list *list;
struct commit_name *n;
@@ -293,26 +292,18 @@ static void describe(const char *arg, int last_one)
unsigned long seen_commits = 0;
unsigned int unannotated_cnt = 0;
 
-   if (debug)
-   fprintf(stderr, _("describe %s\n"), arg);
-
-   if (get_oid(arg, &oid))
-   die(_("Not a valid object name %s"), arg);
-   cmit = lookup_commit_reference(&oid);
-   if (!cmit)
-   die(_("%s is not a valid '%s' object"), arg, commit_type);
+   cmit = lookup_commit_reference(oid);
 
n = find_commit_name(&cmit->object.oid);
if (n && (tags || all || n->prio == 2)) {
/*
 * Exact match to an existing ref.
 */
-   display_name(n);
+   append_name(n, dst);
if (longformat)
-   show_suffix(0, n->tag ? &n->tag->tagged->oid : &oid);
+   append_suffix(0, n->tag ? &n->tag->tagged->oid : oid, dst);
if (suffix)
-   printf("%s", suffix);
-   printf("\n");
+   strbuf_addstr(dst, suffix);
return;
}
 
@@ -386,10 +377,9 @@ static void describe(const char *arg, int last_one)
if (!match_cnt) {
struct object_id *cmit_oid = &cmit->object.oid;
if (always) {
-   printf("%s", find_unique_abbrev(cmit_oid->hash, abbrev));
+   strbuf_addstr(dst, find_unique_abbrev(cmit_oid->hash, abbrev));
if (suffix)
-   printf("%s", suffix);
-   printf("\n");
+   strbuf_addstr(dst, suffix);
return;
}
if (unannotated_cnt)
@@ -437,15 +427,36 @@ static void describe(const char *arg, int last_one)
}
}
 
-   display_name(all_matches[0].name);
+   append_name(all_matches[0].name, dst);
if (abbrev)
-   show_suffix(all_matches[0].depth, &cmit->object.oid);
+   append_suffix(all_matches[0].depth, &cmit->object.oid, dst);
if (suffix)
-   printf("%s", suffix);
-   printf("\n");
+   strbuf_addstr(dst, suffix);
+}
+
+static void describe(const char *arg, int last_one)
+{
+   struct object_id oid;
+   struct commit *cmit;
+   struct strbuf sb = STRBUF_INIT;
+
+   if (debug)
+   fprintf(stderr, _("describe %s\n"), arg);
+
+   if (get_oid(arg, &oid))
+   die(_("Not a valid object name %s"), arg);
+   cmit = lookup_commit_reference(&oid);
+   if (!cmit)
+   die(_("%s is not a valid '%s' object"), arg, commit_type);
+
+   describe_commit(&oid, &sb);
+
+   puts(sb.buf);
 
if (!last_one)
clear_commit_marks(cmit, -1);
+
+   strbuf_release(&sb);
 }
 
 int cmd_describe(int argc, const char **argv, const char *prefix)
-- 
2.15.0.7.g980e40477f



[PATCHv3 3/7] revision.h: introduce blob/tree walking in order of the commits

2017-11-02 Thread Stefan Beller
The functionality to list tree objects in the order they were seen
while traversing the commits will be used in the next commit,
where we teach `git describe` to describe not only commits, but
trees and blobs, too.

Signed-off-by: Stefan Beller 
---
 Documentation/rev-list-options.txt |  5 
 list-objects.c |  2 ++
 revision.c |  2 ++
 revision.h |  3 ++-
 t/t6100-rev-list-in-order.sh   | 47 ++
 5 files changed, 58 insertions(+), 1 deletion(-)
 create mode 100755 t/t6100-rev-list-in-order.sh

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 13501e1556..9066e0c777 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -686,6 +686,11 @@ ifdef::git-rev-list[]
all object IDs which I need to download if I have the commit
object _bar_ but not _foo_''.
 
+--in-commit-order::
+   Print tree and blob ids in order of the commits. The tree
+   and blob ids are printed after they are first referenced
+   by a commit.
+
 --objects-edge::
Similar to `--objects`, but also print the IDs of excluded
commits prefixed with a ``-'' character.  This is used by
diff --git a/list-objects.c b/list-objects.c
index 7c2ce9c4bd..07a92f35fe 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -239,6 +239,8 @@ void traverse_commit_list(struct rev_info *revs,
if (commit->tree)
add_pending_tree(revs, commit->tree);
show_commit(commit, data);
+   if (revs->tree_blobs_in_commit_order)
+   traverse_trees_and_blobs(revs, &csp, show_object, data);
}
traverse_trees_and_blobs(revs, &csp, show_object, data);
 
diff --git a/revision.c b/revision.c
index d167223e69..9329d4ebbf 100644
--- a/revision.c
+++ b/revision.c
@@ -1845,6 +1845,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
revs->dense = 0;
} else if (!strcmp(arg, "--show-all")) {
revs->show_all = 1;
+   } else if (!strcmp(arg, "--in-commit-order")) {
+   revs->tree_blobs_in_commit_order = 1;
} else if (!strcmp(arg, "--remove-empty")) {
revs->remove_empty_trees = 1;
} else if (!strcmp(arg, "--merges")) {
diff --git a/revision.h b/revision.h
index 54761200ad..86985d68aa 100644
--- a/revision.h
+++ b/revision.h
@@ -121,7 +121,8 @@ struct rev_info {
bisect:1,
ancestry_path:1,
first_parent_only:1,
-   line_level_traverse:1;
+   line_level_traverse:1,
+   tree_blobs_in_commit_order:1;
 
/* Diff flags */
unsigned intdiff:1,
diff --git a/t/t6100-rev-list-in-order.sh b/t/t6100-rev-list-in-order.sh
new file mode 100755
index 00..d4d539b0da
--- /dev/null
+++ b/t/t6100-rev-list-in-order.sh
@@ -0,0 +1,47 @@
+#!/bin/sh
+
+test_description='rev-list testing in-commit-order'
+
+. ./test-lib.sh
+
+test_expect_success 'rev-list --in-commit-order' '
+   for x in one two three four
+   do
+   echo $x >$x &&
+   git add $x &&
+   git commit -m "add file $x" ||
+   return 1
+   done &&
+   for x in four three
+   do
+   git rm $x &&
+   git commit -m "remove $x" ||
+   return 1
+   done &&
+   git rev-list --in-commit-order --objects HEAD >actual.raw &&
+   cut -c 1-40 >actual <actual.raw &&
+
+   git cat-file --batch-check="%(objectname)" >expect.raw <<-\EOF &&
+   HEAD^{commit}
+   HEAD^{tree}
+   HEAD^{tree}:one
+   HEAD^{tree}:two
+   HEAD~1^{commit}
+   HEAD~1^{tree}
+   HEAD~1^{tree}:three
+   HEAD~2^{commit}
+   HEAD~2^{tree}
+   HEAD~2^{tree}:four
+   HEAD~3^{commit}
+   # HEAD~3^{tree} skipped, same as HEAD~1^{tree}
+   HEAD~4^{commit}
+   # HEAD~4^{tree} skipped, same as HEAD^{tree}
+   HEAD~5^{commit}
+   HEAD~5^{tree}
+   EOF
+   grep -v "#" >expect <expect.raw &&

[PATCHv3 5/7] builtin/describe.c: print debug statements earlier

2017-11-02 Thread Stefan Beller
As an aid to debugging we'd want to print debug statements early, so
introduce a new line in the debug output that describes the whole
function, and then change the next debug output to describe why we
need to search. Conveniently drop the arg from the second line,
which will be useful in a follow-up commit that refactors the
describe function.

Signed-off-by: Stefan Beller 
---
 builtin/describe.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index fd61f463cf..3136efde31 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -293,6 +293,9 @@ static void describe(const char *arg, int last_one)
unsigned long seen_commits = 0;
unsigned int unannotated_cnt = 0;
 
+   if (debug)
+   fprintf(stderr, _("describe %s\n"), arg);
+
if (get_oid(arg, &oid))
die(_("Not a valid object name %s"), arg);
cmit = lookup_commit_reference(&oid);
@@ -316,7 +319,7 @@ static void describe(const char *arg, int last_one)
if (!max_candidates)
die(_("no tag exactly matches '%s'"), oid_to_hex(&cmit->object.oid));
if (debug)
-   fprintf(stderr, _("searching to describe %s\n"), arg);
+   fprintf(stderr, _("No exact match on refs or tags, searching to describe\n"));
 
if (!have_util) {
struct hashmap_iter iter;
-- 
2.15.0.7.g980e40477f



[PATCHv3 7/7] builtin/describe.c: describe a blob

2017-11-02 Thread Stefan Beller
Sometimes users are given a hash of an object and they want to
identify it further (ex.: Use verify-pack to find the largest blobs,
but what are these? or [1])

"This is an interesting endeavor, because describing things is hard."
  -- me, upon writing this patch.

When describing commits, we try to anchor them to tags or refs, as these
are conceptually on a higher level than the commit. And if there is no ref
or tag that matches exactly, we're out of luck.  So we employ a heuristic
to make up a name for the commit. These names are ambiguous, there might
be different tags or refs to anchor to, and there might be different
paths in the DAG to travel to arrive at the commit precisely.

When describing a blob, we want to describe the blob from a higher layer
as well, which is a tuple of (commit, deep/path) as the tree objects
involved are rather uninteresting.  The same blob can be referenced by
multiple commits, so how do we decide which commit to use?  This patch
implements a rather naive approach on this: As there are no back pointers
from blobs to commits in which the blob occurs, we'll start walking from
any tips available, listing the blobs in order of the commits, and once we
find the blob, we'll take the first commit that listed the blob.  For
source code this is likely not the first commit that introduced the blob,
but rather the latest commit that contained the blob.  For example:

  git describe v0.99:Makefile
  v0.99-5-gab6625e06a:Makefile

tells us the latest commit that contained the Makefile as it was in tag
v0.99 is commit v0.99-5-gab6625e06a (and at the same path), as the next
commit on top v0.99-6-gb1de9de2b9 ([PATCH] Bootstrap "make dist",
2005-07-11) touches the Makefile.

Let's see how this description turns out, if it is useful in day-to-day
use as I have the intuition that we'd rather want to see the *first*
commit that this blob was introduced to the repository (which can be
achieved easily by giving the `--reverse` flag in the describe_blob rev
walk).

[1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob

Signed-off-by: Stefan Beller 
---
 Documentation/git-describe.txt | 13 -
 builtin/describe.c | 65 ++
 t/t6120-describe.sh| 15 ++
 3 files changed, 87 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt
index c924c945ba..79ec0be62a 100644
--- a/Documentation/git-describe.txt
+++ b/Documentation/git-describe.txt
@@ -3,7 +3,7 @@ git-describe(1)
 
 NAME
 
-git-describe - Describe a commit using the most recent tag reachable from it
+git-describe - Describe a commit or blob using the graph relations
 
 
 SYNOPSIS
@@ -11,6 +11,7 @@ SYNOPSIS
 [verse]
'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] [<commit-ish>...]
'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] --dirty[=<mark>]
+'git describe' [<options>] <blob>
 
 DESCRIPTION
 ---
@@ -24,6 +25,16 @@ By default (without --all or --tags) `git describe` only shows
 annotated tags.  For more information about creating annotated tags
 see the -a and -s options to linkgit:git-tag[1].
 
+If the given object refers to a blob, it will be described
+as `<commit-ish>:<path>`, such that the blob can be found
+at `<path>` in the `<commit-ish>`. Note that the commit is likely
+not the commit that introduced the blob, but the one that was found
+first; to find the commit that introduced the blob, you need to find
+the commit that last touched the path, e.g.
+`git log <commit> -- <path>`.
+As blobs do not point at the commits they are contained in,
+describing blobs is slow as we have to walk the whole graph.
+
 OPTIONS
 ---
<commit-ish>...::
diff --git a/builtin/describe.c b/builtin/describe.c
index 9e9a5ed5d4..cf08bef344 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -3,6 +3,7 @@
 #include "lockfile.h"
 #include "commit.h"
 #include "tag.h"
+#include "blob.h"
 #include "refs.h"
 #include "builtin.h"
 #include "exec_cmd.h"
@@ -11,8 +12,9 @@
 #include "hashmap.h"
 #include "argv-array.h"
 #include "run-command.h"
+#include "revision.h"
+#include "list-objects.h"
 
-#define SEEN   (1u << 0)
 #define MAX_TAGS   (FLAG_BITS - 1)
 
 static const char * const describe_usage[] = {
@@ -434,6 +436,56 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
strbuf_addstr(dst, suffix);
 }
 
+struct process_commit_data {
+   struct object_id current_commit;
+   struct object_id looking_for;
+   struct strbuf *dst;
+   struct rev_info *revs;
+};
+
+static void process_commit(struct commit *commit, void *data)
+{
+   struct process_commit_data *pcd = data;
+   pcd->current_commit = commit->object.oid;
+}
+
+static void process_object(struct object *obj, const char *path, void *data)
+{
+   struct process_commit_data *pcd = data;
+
+   if (!oidcmp(&pcd->looking_for, &obj->oid) && !pcd->dst->len) {
+   reset_revision_walk();
+   

[PATCHv3 0/7] git describe blob

2017-11-02 Thread Stefan Beller
Thanks for the discussion on v2[1].

Interdiff is below, just fixing minor things.

We'll keep the original algorithm for now, deferring an improvement on 
the algorithm front towards a future developer.

Thanks,
Stefan

[1] https://public-inbox.org/git/20171031211852.13001-1-sbel...@google.com/

Stefan Beller (7):
  t6120: fix typo in test name
  list-objects.c: factor out traverse_trees_and_blobs
  revision.h: introduce blob/tree walking in order of the commits
  builtin/describe.c: rename `oid` to avoid variable shadowing
  builtin/describe.c: print debug statements earlier
  builtin/describe.c: factor out describe_commit
  builtin/describe.c: describe a blob

 Documentation/git-describe.txt |  13 +++-
 Documentation/rev-list-options.txt |   5 ++
 builtin/describe.c | 125 -
 list-objects.c |  52 +--
 revision.c |   2 +
 revision.h |   3 +-
 t/t6100-rev-list-in-order.sh   |  47 ++
 t/t6120-describe.sh|  17 -
 8 files changed, 214 insertions(+), 50 deletions(-)
 create mode 100755 t/t6100-rev-list-in-order.sh
diff --git c/Documentation/git-describe.txt w/Documentation/git-describe.txt
index 077c3c2193..79ec0be62a 100644
--- c/Documentation/git-describe.txt
+++ w/Documentation/git-describe.txt
@@ -11,7 +11,7 @@ SYNOPSIS
 [verse]
 'git describe' [--all] [--tags] [--contains] [--abbrev=] [...]
 'git describe' [--all] [--tags] [--contains] [--abbrev=] --dirty[=]
-'git describe' [] 
+'git describe' [] 
 
 DESCRIPTION
 ---
diff --git c/builtin/describe.c w/builtin/describe.c
index 382cbae908..cf08bef344 100644
--- c/builtin/describe.c
+++ w/builtin/describe.c
@@ -440,6 +440,7 @@ struct process_commit_data {
struct object_id current_commit;
struct object_id looking_for;
struct strbuf *dst;
+   struct rev_info *revs;
 };
 
 static void process_commit(struct commit *commit, void *data)
@@ -456,6 +457,7 @@ static void process_object(struct object *obj, const char *path, void *data)
reset_revision_walk();
describe_commit(&pcd->current_commit, pcd->dst);
strbuf_addf(pcd->dst, ":%s", path);
+   pcd->revs->max_count = 0;
}
 }
 
@@ -463,7 +465,7 @@ static void describe_blob(struct object_id oid, struct strbuf *dst)
 {
struct rev_info revs;
struct argv_array args = ARGV_ARRAY_INIT;
-   struct process_commit_data pcd = { null_oid, oid, dst};
+   struct process_commit_data pcd = { null_oid, oid, dst, &revs };
 
argv_array_pushl(&args, "internal: The first arg is not parsed",
"--all", "--reflog", /* as many starting points as possible */
@@ -497,14 +499,12 @@ static void describe(const char *arg, int last_one)
die(_("Not a valid object name %s"), arg);
cmit = lookup_commit_reference_gently(&oid, 1);
 
-   if (cmit) {
+   if (cmit)
describe_commit(&oid, &sb);
-   } else {
-   if (lookup_blob(&oid))
-   describe_blob(oid, &sb);
-   else
-   die(_("%s is neither a commit nor blob"), arg);
-   }
+   else if (lookup_blob(&oid))
+   describe_blob(oid, &sb);
+   else
+   die(_("%s is neither a commit nor blob"), arg);
 
puts(sb.buf);
 
diff --git c/list-objects.c w/list-objects.c
index 03438e5a8b..07a92f35fe 100644
--- c/list-objects.c
+++ w/list-objects.c
@@ -184,13 +184,13 @@ static void add_pending_tree(struct rev_info *revs, struct tree *tree)
 }
 
 static void traverse_trees_and_blobs(struct rev_info *revs,
-struct strbuf *base_path,
+struct strbuf *base,
 show_object_fn show_object,
 void *data)
 {
int i;
 
-   assert(base_path->len == 0);
+   assert(base->len == 0);
 
for (i = 0; i < revs->pending.nr; i++) {
struct object_array_entry *pending = revs->pending.objects + i;
@@ -208,12 +208,12 @@ static void traverse_trees_and_blobs(struct rev_info *revs,
path = "";
if (obj->type == OBJ_TREE) {
process_tree(revs, (struct tree *)obj, show_object,
-base_path, path, data);
+base, path, data);
continue;
}
if (obj->type == OBJ_BLOB) {
process_blob(revs, (struct blob *)obj, show_object,
-base_path, path, data);
+base, path, data);
continue;
}
die("unknown pending object %s (%s)",
diff --git c/t/t6100-rev-list-in-order.sh w/t/t6100-rev-list-in-order.sh
index 

[PATCHv3 2/7] list-objects.c: factor out traverse_trees_and_blobs

2017-11-02 Thread Stefan Beller
With traverse_trees_and_blobs factored out of the main traverse function,
the next patch can introduce an in-order revision walking with ease.

In the next patch we'll call `traverse_trees_and_blobs` from within the
loop walking the commits, such that we'll have one invocation of that
function per commit.  That is why we do not want to have memory allocations
in that function, such as we'd have if we were to use a strbuf locally.
Pass a strbuf from traverse_commit_list into the blob and tree traversing
function as a scratch pad that only needs to be allocated once.

Signed-off-by: Stefan Beller 
---
 list-objects.c | 50 +++---
 1 file changed, 31 insertions(+), 19 deletions(-)

diff --git a/list-objects.c b/list-objects.c
index b3931fa434..7c2ce9c4bd 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -183,25 +183,15 @@ static void add_pending_tree(struct rev_info *revs, struct tree *tree)
add_pending_object(revs, &tree->object, "");
 }
 
-void traverse_commit_list(struct rev_info *revs,
- show_commit_fn show_commit,
- show_object_fn show_object,
- void *data)
+static void traverse_trees_and_blobs(struct rev_info *revs,
+struct strbuf *base,
+show_object_fn show_object,
+void *data)
 {
int i;
-   struct commit *commit;
-   struct strbuf base;
 
-   strbuf_init(&base, PATH_MAX);
-   while ((commit = get_revision(revs)) != NULL) {
-   /*
-* an uninteresting boundary commit may not have its tree
-* parsed yet, but we are not going to show them anyway
-*/
-   if (commit->tree)
-   add_pending_tree(revs, commit->tree);
-   show_commit(commit, data);
-   }
+   assert(base->len == 0);
+
for (i = 0; i < revs->pending.nr; i++) {
struct object_array_entry *pending = revs->pending.objects + i;
struct object *obj = pending->item;
@@ -218,17 +208,39 @@ void traverse_commit_list(struct rev_info *revs,
path = "";
if (obj->type == OBJ_TREE) {
process_tree(revs, (struct tree *)obj, show_object,
-&base, path, data);
+base, path, data);
continue;
}
if (obj->type == OBJ_BLOB) {
process_blob(revs, (struct blob *)obj, show_object,
-&base, path, data);
+base, path, data);
continue;
}
die("unknown pending object %s (%s)",
oid_to_hex(>oid), name);
}
object_array_clear(>pending);
-   strbuf_release(&base);
+}
+
+void traverse_commit_list(struct rev_info *revs,
+ show_commit_fn show_commit,
+ show_object_fn show_object,
+ void *data)
+{
+   struct commit *commit;
+   struct strbuf csp; /* callee's scratch pad */
+   strbuf_init(&csp, PATH_MAX);
+
+   while ((commit = get_revision(revs)) != NULL) {
+   /*
+* an uninteresting boundary commit may not have its tree
+* parsed yet, but we are not going to show them anyway
+*/
+   if (commit->tree)
+   add_pending_tree(revs, commit->tree);
+   show_commit(commit, data);
+   }
+   traverse_trees_and_blobs(revs, &csp, show_object, data);
+
+   strbuf_release(&csp);
 }
-- 
2.15.0.7.g980e40477f



Re: [PATCH v2 4/6] list-objects: filter objects in traverse_commit_list

2017-11-02 Thread Jonathan Tan
On Thu,  2 Nov 2017 17:50:11 +
Jeff Hostetler  wrote:

> +int parse_list_objects_filter(struct list_objects_filter_options *filter_options,
> +   const char *arg)

Returning void is fine, I think. It seems that all your code paths
either return 0 or die.

> +{
> + struct object_context oc;
> + struct object_id sparse_oid;
> + const char *v0;
> + const char *v1;
> +
> + if (filter_options->choice)
> + die(_("multiple object filter types cannot be combined"));
> +
> + /*
> +  * TODO consider rejecting 'arg' if it contains any
> +  * TODO injection characters (since we might send this
> +  * TODO to a sub-command or to the server and we don't
> +  * TODO want to deal with legacy quoting/escaping for
> +  * TODO a new feature).
> +  */
> +
> + filter_options->raw_value = strdup(arg);
> +
> + if (skip_prefix(arg, "blob:", &v0) || skip_prefix(arg, "blobs:", &v0)) {
> + if (!strcmp(v0, "none")) {
> + filter_options->choice = LOFC_BLOB_NONE;
> + return 0;
> + }
> +
> + if (skip_prefix(v0, "limit=", &v1) &&
> + git_parse_ulong(v1, &filter_options->blob_limit_value)) {
> + filter_options->choice = LOFC_BLOB_LIMIT;
> + return 0;
> + }
> + }
> + else if (skip_prefix(arg, "sparse:", &v0)) {

Style: join the 2 lines above.

> + if (skip_prefix(v0, "oid=", &v1)) {
> + filter_options->choice = LOFC_SPARSE_OID;
> + if (!get_oid_with_context(v1, GET_OID_BLOB,
> +   &sparse_oid, &oc)) {
> + /*
> +  * We successfully converted the 
> +  * into an actual OID.  Rewrite the raw_value
> +  * in canonical form with just the OID.
> +  * (If we send this request to the server, we
> +  * want an absolute expression rather than a
> +  * local-ref-relative expression.)
> +  */

I think this would lead to confusing behavior - for example, a fetch
with "--filter=oid=mybranch:sparseconfig" would have different results
depending on whether "mybranch" refers to a valid object locally.

The way I see it, this should either (i) only accept full 40-character
OIDs or (ii) retain the raw string to be interpreted only when the
filtering is done. (i) is simpler and safer, but is not so useful. In
both cases, if the user really wants client-side interpretation, they
can still use "$(git rev-parse foo)" to make it explicit.

> + free((char *)filter_options->raw_value);
> + filter_options->raw_value =
> + xstrfmt("sparse:oid=%s",
> + oid_to_hex(&sparse_oid));
> + filter_options->sparse_oid_value =
> + oiddup(&sparse_oid);
> + } else {
> + /*
> +  * We could not turn the  into an
> +  * OID.  Leave the raw_value as is in case
> +  * the server can parse it.  (It may refer to
> +  * a branch, commit, or blob we don't have.)
> +  */
> + }
> + return 0;
> + }
> +
> + if (skip_prefix(v0, "path=", &v1)) {
> + filter_options->choice = LOFC_SPARSE_PATH;
> + filter_options->sparse_path_value = strdup(v1);
> + return 0;
> + }
> + }
> +
> + die(_("invalid filter expression '%s'"), arg);
> + return 0;
> +}

[snip]

> +void arg_format_list_objects_filter(
> + struct argv_array *argv_array,
> + const struct list_objects_filter_options *filter_options)

Is this function used anywhere (in this patch or subsequent patches)?

> diff --git a/list-objects-filter.c b/list-objects-filter.c
> +/* See object.h and revision.h */
> +#define FILTER_REVISIT (1<<25)

Looking later in the code, this flag indicates that a tree has been
SHOWN, so it might be better to just call this FILTER_SHOWN.

Also document this flag in object.h in the table above FLAG_BITS.

> +static enum list_objects_filter_result filter_blobs_limit(
> + enum list_objects_filter_type filter_type,
> + struct object *obj,
> + const char *pathname,
> + const char *filename,
> + void *filter_data_)
> +{
> + struct filter_blobs_limit_data *filter_data = filter_data_;
> + unsigned long object_length;
> + enum object_type t;
> + int is_special_filename;
> +
> + switch (filter_type) {
> + default:
> +  

Re: Git libsecret No Unlock Dialog Issue

2017-11-02 Thread Dennis Kaarsemaker
On Thu, 2017-11-02 at 11:35 -0700, Stefan Beller wrote:
> On Thu, Nov 2, 2017 at 9:00 AM, Yaroslav Sapozhnyk
>  wrote:
> > When using Git on Fedora with locked password store
> > credential-libsecret asks for username/password instead of displaying
> > the unlock dialog.
> 
> Git as packaged by Fedora or upstream Git (which version)?

Looking at the code: current upstream git. Looking at the documentation
for libsecret, this should fix it. I've not been able to test it
though.

diff --git a/contrib/credential/libsecret/git-credential-libsecret.c b/contrib/credential/libsecret/git-credential-libsecret.c
index 4c56979d8a..b4750c9ee8 100644
--- a/contrib/credential/libsecret/git-credential-libsecret.c
+++ b/contrib/credential/libsecret/git-credential-libsecret.c
@@ -104,7 +104,7 @@ static int keyring_get(struct credential *c)
items = secret_service_search_sync(service,
   SECRET_SCHEMA_COMPAT_NETWORK,
   attributes,
-  SECRET_SEARCH_LOAD_SECRETS,
+  SECRET_SEARCH_LOAD_SECRETS | SECRET_SEARCH_UNLOCK,
   NULL,
   &error);
g_hash_table_unref(attributes);


Re: Git libsecret No Unlock Dialog Issue

2017-11-02 Thread Yaroslav Sapozhnyk
Sorry, should have mentioned that. It's packaged by Fedora - 2.13.6.

Yaroslav

On Thu, Nov 2, 2017 at 2:35 PM, Stefan Beller  wrote:
> On Thu, Nov 2, 2017 at 9:00 AM, Yaroslav Sapozhnyk
>  wrote:
>> When using Git on Fedora with locked password store
>> credential-libsecret asks for username/password instead of displaying
>> the unlock dialog.
>
> Git as packaged by Fedora or upstream Git (which version)?
>
>> If the store is unlocked credential helper gets the credentials from
>> the store though.
>>
>> --
>> Regards,
>> Yaroslav Sapozhnyk



-- 
Regards,
Yaroslav Sapozhnyk


Re: [RFC PATCH v3 3/4] branch: introduce dont_fail parameter for branchname validation

2017-11-02 Thread Stefan Beller
On Thu, Nov 2, 2017 at 1:39 AM, Kaartic Sivaraam
 wrote:
> On Thursday 02 November 2017 12:24 PM, Kaartic Sivaraam wrote:
>>
>> Signed-off-by: Kaartic Sivaraam 
>> Signed-off-by: Kaartic Sivaraam 
>
>
> I just now saw this small glitch as a consequence of recently
> changing my email address. I would prefer to keep the second one
> but as the other patches have the first one it's better to keep
> the first one for now.

If you prefer one of them, you may have incentive to
add an entry into .mailmap file, otherwise I'd kindly ask you
to. :) (c.f. `git log --no-merges -- .mailmap`)

> But wait, it seems that this commit also has a different author
> identity (the email adress part). If this is a big enough issue,
> I'll fix that and send a v4 (possibly with any other suggested
> changes) else I'll leave it as it is.
>
> BTW, I added the Ccs to this mail which I forgot to do when
> sending the patches, hope it's not an issue.
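For readers unfamiliar with the mechanism Stefan refers to: a .mailmap entry maps a commit-time identity to the canonical one, and `git check-mailmap` shows the result of the mapping. A minimal sketch, with name and addresses invented for illustration:

```shell
# Demonstrate .mailmap resolution in a throwaway repository.
# "Jane Dev" and both addresses are made up for this example.
mkdir mailmap-demo && cd mailmap-demo && git init -q

# Each .mailmap line maps a commit identity (right) to the
# canonical identity (left); here we map by old email only.
cat >.mailmap <<'EOF'
Jane Dev <jane@new.example> <jane@old.example>
EOF

# Ask git which canonical identity the old address resolves to.
git check-mailmap "Jane Dev <jane@old.example>"
# prints: Jane Dev <jane@new.example>
```

The same mapping is what makes `git shortlog` and `git blame` attribute old commits to the new address.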


Re: Git libsecret No Unlock Dialog Issue

2017-11-02 Thread Stefan Beller
On Thu, Nov 2, 2017 at 9:00 AM, Yaroslav Sapozhnyk
 wrote:
> When using Git on Fedora with locked password store
> credential-libsecret asks for username/password instead of displaying
> the unlock dialog.

Git as packaged by Fedora or upstream Git (which version)?

> If the store is unlocked credential helper gets the credentials from
> the store though.
>
> --
> Regards,
> Yaroslav Sapozhnyk


Re: Regression[2.14.3->2.15]: Interactive rebase fails if submodule is modified

2017-11-02 Thread Stefan Beller
On Thu, Nov 2, 2017 at 1:30 AM, Orgad Shaneh  wrote:
> I can't reproduce this with a minimal example, but it happens in my project.
>
> What I tried to do for reproducing is:
> rm -rf super sub
> mkdir sub; cd sub; git init
> git commit --allow-empty -m 'Initial commit'
> mkdir ../super; cd ../super
> git init
> git submodule add ../sub
> touch foo; git add foo sub
> git commit -m 'Initial commit'
> touch a; git add a; git commit -m 'a'
> touch b; git add b; git commit -m 'b'
> cd sub; git commit --allow-empty -m 'New commit'; cd ..
> git rebase -i HEAD^^
>
> Then drop a.
>
> In my project I get:
> error: cannot rebase: You have unstaged changes.
>
> This works fine with 2.14.3.

  git log --oneline v2.14.3..v2.15.0 -- submodule.c
doesn't give any promising hints (i.e. I don't think one of a
submodule related series introduced this either by chance or
on purpose)

"rebase -i" was rewritten into C in 570676e011, though
that series was extensively tested by DScho, so I wouldn't
want to point fingers here quickly.

Would you be willing to bisect this behavior?
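The bisection being asked for can be rehearsed on a toy repository first. The sketch below uses an invented file "bug" as a stand-in for the real breakage; for the actual hunt you would bisect git.git between v2.14.3 and v2.15.0 with a script that builds git and replays the failing rebase:

```shell
# Toy demonstration of the `git bisect run` workflow: find the
# commit that introduced the file "bug" (a stand-in for whatever
# change broke "rebase -i").  All names here are made up.
mkdir bisect-demo && cd bisect-demo && git init -q
git config user.email you@example.com
git config user.name you
for i in 1 2 3 4 5; do
    if [ "$i" = 4 ]; then touch bug; fi   # commit 4 introduces the breakage
    git add -A
    git commit -q --allow-empty -m "commit $i"
done
expected=$(git rev-parse HEAD~1)          # commit 4, recorded for later
git bisect start HEAD HEAD~4              # HEAD is bad, HEAD~4 is good
git bisect run sh -c 'test ! -f bug'      # exit 0 = good, non-zero = bad
first_bad=$(git rev-parse refs/bisect/bad)
git bisect reset
```

`git bisect run` drives the whole search automatically and reports "<sha1> is the first bad commit" when done.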


Re: [PATCHv2 6/7] builtin/describe.c: describe a blob

2017-11-02 Thread Stefan Beller
On Thu, Nov 2, 2017 at 12:23 AM, Andreas Schwab  wrote:
> On Nov 01 2017, Johannes Schindelin  wrote:
>
>> Sure, but it is still a tricky thing. Imagine
>>
>> - A1 - B1 - A2 - B2 - B3
>>
>> where all the B* commits have the blob. Do you really want to report B1
>> rather than B2 as the commit introducing the blob? (I would prefer B2...)
>
> What if B3 renames or copies the blob?
>
> Andreas.

With the current proposed patch you'd find B3, and then use the diff machinery
to dig deeper from there (renames/copies ought to be easy to detect already?)

So with a copy B3 might be a better start than B1, as starting from B1 you
would not find B3 easily.

For a rename, I would think a reverse log/blame on B1:path may help.

With that said, I think I'll just reroll the series with the current logic
fixing the other minor issues that were brought up as B3 seems to
be the most versatile (though not optimal) answer for many use cases.

Thanks for that thought,
Stefan
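The brute-force version of the "which commits contain this blob" query discussed in this thread can be sketched in shell. It is fine for small repositories; the describe-a-blob series exists precisely because this does not scale:

```shell
# List every commit whose tree contains the given blob, newest
# first (so the first hit is roughly what describe-a-blob reports).
# A sketch only: O(commits * tree size), no rename/copy detection.
find_commits_with_blob () {
    blob=$1
    git rev-list --all |
    while read -r commit; do
        # ls-tree -r prints "<mode> <type> <oid>\t<path>"; field 3 is the oid
        if git ls-tree -r "$commit" | awk '{print $3}' | grep -qx "$blob"
        then
            echo "$commit"
        fi
    done
}
# usage: find_commits_with_blob <blob-oid>
```

From the resulting list, the newest entry answers "which commit last had this blob" and the oldest answers "which commit first introduced it".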


[PATCHv2] config: document blame configuration

2017-11-02 Thread Stefan Beller
The options are currently only referenced by the git-blame man page,
also explain them in git-config, which is the canonical page to
contain all config options.

Signed-off-by: Stefan Beller 
---

 * correct option to blame.showRoot
 * camelCased other options
 * use linkgit:git-[1] instead of `git-cmd` as that
   is correct, but maybe overused.
 * --date is `backticked` now.

 Documentation/config.txt | 17 +
 1 file changed, 17 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 1ac0ae6adb..ba0156b1e8 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -949,6 +949,23 @@ apply.whitespace::
Tells 'git apply' how to handle whitespaces, in the same way
as the `--whitespace` option. See linkgit:git-apply[1].
 
+blame.showRoot::
+   Do not treat root commits as boundaries in linkgit:git-blame[1].
+   This option defaults to false.
+
+blame.blankBoundary::
+   Show blank SHA-1 for boundary commits in linkgit:git-blame[1].
+   This option defaults to false.
+
+blame.showEmail::
+   Show the author email instead of author name in linkgit:git-blame[1].
+   This option defaults to false.
+
+blame.date::
+   Specifies the format used to output dates in linkgit:git-blame[1].
+   If unset the iso format is used. For supported values,
+   see the discussion of the `--date` option at linkgit:git-log[1].
+
 branch.autoSetupMerge::
Tells 'git branch' and 'git checkout' to set up new branches
so that linkgit:git-pull[1] will appropriately merge from the
-- 
2.15.0.7.g980e40477f
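The options documented above can be exercised quickly against a throwaway config file (file name invented; no repository is required to read or write config this way):

```shell
# Set two of the blame options described in the patch, writing to
# an explicit file instead of the repository or global config.
git config --file blame-demo.config blame.showEmail true
git config --file blame-demo.config blame.date relative

# Read a value back; keys are matched case-insensitively.
git config --file blame-demo.config --get blame.showEmail
# prints: true
```

In a real repository you would drop `--file` and the values would then change the default output of `git blame`.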



Re: [BUG] git clean -d is too greedy

2017-11-02 Thread Hermanni Suominen
I was a bit trigger happy posting this; after digging a bit more, this is
way more serious than I originally thought.



   1) .gitignore exists in nested repo (either tracked or untracked)
   2) .gitignore is excluded


This can be any file, including those commonly excluded such as *~.

Demonstrating this with a more likely use case, which could potentially
wipe out lots of uncommitted work:


# git init -q foo && cd foo
# git init -q bar && cd bar
# touch bar bar~
# git add bar && git commit -qm asd && cd ..
# git clean -e \*~ -dn
Would remove bar/bar

If there were more tracked files in the nested repo, they'd be removed as
well, as long as there is at least one excluded file in the nested repo.


--
Hermanni


Re: On the nature of cover letters [WAS Re: [PATCH 0/6] Create Git/Packet.pm]

2017-11-02 Thread Stefan Beller
On Thu, Nov 2, 2017 at 1:01 AM, Jeff King  wrote:
> On Thu, Nov 02, 2017 at 04:48:45PM +0900, Junio C Hamano wrote:
>
>> Jeff King  writes:
>>
>> > If the clutter is too much, it could also go into a git-notes ref
>> > (that's not already implemented, but it seems like it would be pretty
>> > easy to teach "git am" to do that).
>>
>> FWIW, that is what I use a notes ref "amlog" for.
>
> Right, I completely forgot that you were already doing that. So jumping
> to the thread for a commit is basically:
>
>   $browser $(git notes --ref=amlog show $commit |
>  perl -pe 's{Message-Id: <(.*)>}{https://public-inbox.org/git/$1/t}')
>
> (Or whichever archive you prefer to use). Thanks for the reminder.
>

Thanks from my side as well. I did not pay attention when this was
introduced and see the notes ref for the first time today.

I'll incorporate that into my digging repertoire.

Thanks,
Stefan


[PATCH v2 4/6] list-objects: filter objects in traverse_commit_list

2017-11-02 Thread Jeff Hostetler
From: Jeff Hostetler 

Create traverse_commit_list_filtered() and add filtering
interface to allow certain objects to be omitted from the
traversal.

Update traverse_commit_list() to be a wrapper for the above
with a null filter to minimize the number of callers that
needed to be changed.

Object filtering will be used in a future commit by rev-list
and pack-objects for partial clone and fetch to omit unwanted
objects from the result.

traverse_bitmap_commit_list() does not work with filtering.

If a packfile bitmap is present, it will not be used.

Signed-off-by: Jeff Hostetler 
---
 Makefile  |   2 +
 list-objects-filter-options.c | 119 
 list-objects-filter-options.h |  55 ++
 list-objects-filter.c | 408 ++
 list-objects-filter.h |  84 +
 list-objects.c|  95 --
 list-objects.h|   2 +-
 7 files changed, 748 insertions(+), 17 deletions(-)
 create mode 100644 list-objects-filter-options.c
 create mode 100644 list-objects-filter-options.h
 create mode 100644 list-objects-filter.c
 create mode 100644 list-objects-filter.h

diff --git a/Makefile b/Makefile
index cd75985..ca378a4 100644
--- a/Makefile
+++ b/Makefile
@@ -807,6 +807,8 @@ LIB_OBJS += levenshtein.o
 LIB_OBJS += line-log.o
 LIB_OBJS += line-range.o
 LIB_OBJS += list-objects.o
+LIB_OBJS += list-objects-filter.o
+LIB_OBJS += list-objects-filter-options.o
 LIB_OBJS += ll-merge.o
 LIB_OBJS += lockfile.o
 LIB_OBJS += log-tree.o
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
new file mode 100644
index 000..31255e7
--- /dev/null
+++ b/list-objects-filter-options.c
@@ -0,0 +1,119 @@
+#include "cache.h"
+#include "commit.h"
+#include "config.h"
+#include "revision.h"
+#include "argv-array.h"
+#include "list-objects.h"
+#include "list-objects-filter.h"
+#include "list-objects-filter-options.h"
+
+/*
+ * Parse value of the argument to the "filter" keyword.
+ * On the command line this looks like:
+ *   --filter=<filter-spec>
+ * and in the pack protocol as:
+ *   "filter" SP <filter-spec>
+ *
+ *  <filter-spec> ::= blob:none
+ *   blob:limit=<n>[kmg]
+ *   sparse:oid=<blob-ish>
+ *   sparse:path=<path>
+ */
+int parse_list_objects_filter(struct list_objects_filter_options *filter_options,
+   const char *arg)
+{
+   struct object_context oc;
+   struct object_id sparse_oid;
+   const char *v0;
+   const char *v1;
+
+   if (filter_options->choice)
+   die(_("multiple object filter types cannot be combined"));
+
+   /*
+* TODO consider rejecting 'arg' if it contains any
+* TODO injection characters (since we might send this
+* TODO to a sub-command or to the server and we don't
+* TODO want to deal with legacy quoting/escaping for
+* TODO a new feature).
+*/
+
+   filter_options->raw_value = strdup(arg);
+
+   if (skip_prefix(arg, "blob:", &v0) || skip_prefix(arg, "blobs:", &v0)) {
+   if (!strcmp(v0, "none")) {
+   filter_options->choice = LOFC_BLOB_NONE;
+   return 0;
+   }
+
+   if (skip_prefix(v0, "limit=", &v1) &&
+   git_parse_ulong(v1, &filter_options->blob_limit_value)) {
+   filter_options->choice = LOFC_BLOB_LIMIT;
+   return 0;
+   }
+   }
+   else if (skip_prefix(arg, "sparse:", &v0)) {
+   if (skip_prefix(v0, "oid=", &v1)) {
+   filter_options->choice = LOFC_SPARSE_OID;
+   if (!get_oid_with_context(v1, GET_OID_BLOB,
+ &sparse_oid, &oc)) {
+   /*
+* We successfully converted the <blob-ish>
+* into an actual OID.  Rewrite the raw_value
+* in canonical form with just the OID.
+* (If we send this request to the server, we
+* want an absolute expression rather than a
+* local-ref-relative expression.)
+*/
+   free((char *)filter_options->raw_value);
+   filter_options->raw_value =
+   xstrfmt("sparse:oid=%s",
+   oid_to_hex(&sparse_oid));
+   filter_options->sparse_oid_value =
+   oiddup(&sparse_oid);
+   } else {
+   /*
+* We could not turn the <blob-ish> into an
+* OID.  Leave the raw_value as is in case
+* the server can parse it.  (It may refer to
+   
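The <filter-spec> grammar sketched in the comment above can be mimicked in shell for illustration; the patch's C parser is authoritative, and this sketch mirrors only the surface syntax:

```shell
# Rough rendering of the filter-spec grammar: print the recognized
# filter type and its argument, or fail for anything else.
parse_filter_spec () {
    case $1 in
    blob:none)      echo "blob_none" ;;
    blob:limit=*)   echo "blob_limit ${1#blob:limit=}" ;;
    sparse:oid=*)   echo "sparse_oid ${1#sparse:oid=}" ;;
    sparse:path=*)  echo "sparse_path ${1#sparse:path=}" ;;
    *)              echo "invalid filter-spec: $1" >&2; return 1 ;;
    esac
}

parse_filter_spec blob:limit=1k
# prints: blob_limit 1k
```

Note that the real parser additionally validates the limit (via git_parse_ulong) and tries to resolve the oid expression, which this sketch does not attempt.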

[PATCH v2 5/6] rev-list: add list-objects filtering support

2017-11-02 Thread Jeff Hostetler
From: Jeff Hostetler 

Teach rev-list to use the filtering provided by the
traverse_commit_list_filtered() interface to omit
unwanted objects from the result.  This feature is
intended to help with partial clone.

Object filtering is only allowed when one of the "--objects*"
options are used.

When the "--filter-print-omitted" option is used, the omitted
objects are printed at the end.  These are marked with a "~".
This option can be combined with "--quiet" to get a list of
just the omitted objects.

Normally, rev-list will stop with an error when there are
missing objects.

When the "--filter-print-missing" option is used, rev-list
will print a list of any missing objects that should have
been included in the output (rather than stopping).
These are marked with a "?".

When the "--filter-ignore-missing" option is used, rev-list
will silently ignore any missing objects and continue without
error.

Add t6112 test.

Signed-off-by: Jeff Hostetler 
---
 Documentation/git-rev-list.txt  |   6 +-
 Documentation/rev-list-options.txt  |  34 ++
 builtin/rev-list.c  |  75 +++-
 t/t6112-rev-list-filters-objects.sh | 225 
 4 files changed, 337 insertions(+), 3 deletions(-)
 create mode 100755 t/t6112-rev-list-filters-objects.sh

diff --git a/Documentation/git-rev-list.txt b/Documentation/git-rev-list.txt
index ef22f17..b8a3a5b 100644
--- a/Documentation/git-rev-list.txt
+++ b/Documentation/git-rev-list.txt
@@ -47,7 +47,11 @@ SYNOPSIS
 [ --fixed-strings | -F ]
 [ --date=<format>]
 [ [ --objects | --objects-edge | --objects-edge-aggressive ]
-  [ --unpacked ] ]
+  [ --unpacked ]
+  [ --filter=<filter-spec> ] ]
+[ --filter-print-missing ]
+[ --filter-print-omitted ]
+[ --filter-ignore-missing ]
 [ --pretty | --header ]
 [ --bisect ]
 [ --bisect-vars ]
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 13501e1..9233134 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -706,6 +706,40 @@ ifdef::git-rev-list[]
 --unpacked::
Only useful with `--objects`; print the object IDs that are not
in packs.
+
+--filter=<filter-spec>::
+   Only useful with one of the `--objects*`; omits objects (usually
+   blobs) from the list of printed objects.  The '<filter-spec>'
+   may be one of the following:
++
+The form '--filter=blob:none' omits all blobs.
++
+The form '--filter=blob:limit=<n>[kmg]' omits blobs larger than n bytes
+or units.  The value may be zero.  Special files matching '.git*' are
+always included, regardless of size.
++
+The form '--filter=sparse:oid=<blob-ish>' uses a sparse-checkout
+specification contained in the object (or the object that the expression
+evaluates to) to omit blobs not required by the corresponding sparse
+checkout.
++
+The form '--filter=sparse:path=<path>' similarly uses a sparse-checkout
+specification contained in <path>.
+
+--filter-print-missing::
+   Prints a list of the missing objects for the requested traversal.
+   Object IDs are prefixed with a ``?'' character.  The object type
+   is printed after the ID.  This may be used with or without any of
+   the above filtering options.
+
+--filter-ignore-missing::
+   Ignores missing objects encountered during the requested traversal.
+   This may be used with or without any of the above filtering options.
+
+--filter-print-omitted::
+   Only useful with one of the above `--filter*`; prints a list
+   of the omitted objects.  Object IDs are prefixed with a ``~''
+   character.
 endif::git-rev-list[]
 
 --no-walk[=(sorted|unsorted)]::
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index c1c74d4..cc9fa40 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -4,6 +4,8 @@
 #include "diff.h"
 #include "revision.h"
 #include "list-objects.h"
+#include "list-objects-filter.h"
+#include "list-objects-filter-options.h"
 #include "pack.h"
 #include "pack-bitmap.h"
 #include "builtin.h"
@@ -12,6 +14,7 @@
 #include "bisect.h"
 #include "progress.h"
 #include "reflog-walk.h"
+#include "oidset.h"
 
 static const char rev_list_usage[] =
"git rev-list [OPTION] <commit-id>... [ -- paths... ]\n"
@@ -54,6 +57,15 @@ static const char rev_list_usage[] =
 
 static struct progress *progress;
 static unsigned progress_counter;
+static struct list_objects_filter_options filter_options;
+static struct oidset missing_objects;
+static struct oidset omitted_objects;
+static int arg_print_missing;
+static int arg_print_omitted;
+static int arg_ignore_missing;
+
+#define DEFAULT_OIDSET_SIZE (16*1024)
+
 
 static void finish_commit(struct commit *commit, void *data);
 static void show_commit(struct commit *commit, void *data)
@@ -181,8 +193,16 @@ static void finish_commit(struct commit *commit, void *data)
 static void finish_object(struct 

[PATCH v2 3/6] oidset: add iterator methods to oidset

2017-11-02 Thread Jeff Hostetler
From: Jeff Hostetler 

Add the usual iterator methods to oidset.
Add oidset_remove().

Signed-off-by: Jeff Hostetler 
---
 oidset.c | 10 ++
 oidset.h | 36 
 2 files changed, 46 insertions(+)

diff --git a/oidset.c b/oidset.c
index f1f874a..454c54f 100644
--- a/oidset.c
+++ b/oidset.c
@@ -24,6 +24,16 @@ int oidset_insert(struct oidset *set, const struct object_id *oid)
return 0;
 }
 
+int oidset_remove(struct oidset *set, const struct object_id *oid)
+{
+   struct oidmap_entry *entry;
+
+   entry = oidmap_remove(&set->map, oid);
+   free(entry);
+
+   return (entry != NULL);
+}
+
 void oidset_clear(struct oidset *set)
 {
oidmap_free(&set->map, 1);
diff --git a/oidset.h b/oidset.h
index f4c9e0f..783abce 100644
--- a/oidset.h
+++ b/oidset.h
@@ -24,6 +24,12 @@ struct oidset {
 
 #define OIDSET_INIT { OIDMAP_INIT }
 
+
+static inline void oidset_init(struct oidset *set, size_t initial_size)
+{
+   return oidmap_init(&set->map, initial_size);
+}
+
 /**
  * Returns true iff `set` contains `oid`.
  */
@@ -39,9 +45,39 @@ int oidset_contains(const struct oidset *set, const struct object_id *oid);
 int oidset_insert(struct oidset *set, const struct object_id *oid);
 
 /**
+ * Remove the oid from the set.
+ *
+ * Returns 1 if the oid was present in the set, 0 otherwise.
+ */
+int oidset_remove(struct oidset *set, const struct object_id *oid);
+
+/**
  * Remove all entries from the oidset, freeing any resources associated with
  * it.
  */
 void oidset_clear(struct oidset *set);
 
+struct oidset_iter {
+   struct oidmap_iter m_iter;
+};
+
+static inline void oidset_iter_init(struct oidset *set,
+   struct oidset_iter *iter)
+{
+   oidmap_iter_init(&set->map, &iter->m_iter);
+}
+
+static inline struct object_id *oidset_iter_next(struct oidset_iter *iter)
+{
+   struct oidmap_entry *e = oidmap_iter_next(&iter->m_iter);
+   return e ? &e->oid : NULL;
+}
+
+static inline struct object_id *oidset_iter_first(struct oidset *set,
+ struct oidset_iter *iter)
+{
+   oidset_iter_init(set, iter);
+   return oidset_iter_next(iter);
+}
+
 #endif /* OIDSET_H */
-- 
2.9.3



[PATCH v2 6/6] pack-objects: add list-objects filtering

2017-11-02 Thread Jeff Hostetler
From: Jeff Hostetler 

Teach pack-objects to use the filtering provided by the
traverse_commit_list_filtered() interface to omit unwanted
objects from the resulting packfile.

This feature is intended for partial clone/fetch.

Filtering requires the use of the "--stdout" option.

Add t5317 test.

Signed-off-by: Jeff Hostetler 
---
 Documentation/git-pack-objects.txt |  12 +-
 builtin/pack-objects.c |  28 ++-
 t/t5317-pack-objects-filter-objects.sh | 369 +
 3 files changed, 407 insertions(+), 2 deletions(-)
 create mode 100755 t/t5317-pack-objects-filter-objects.sh

diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index 473a161..6786351 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -12,7 +12,8 @@ SYNOPSIS
 'git pack-objects' [-q | --progress | --all-progress] [--all-progress-implied]
[--no-reuse-delta] [--delta-base-offset] [--non-empty]
[--local] [--incremental] [--window=] [--depth=]
-   [--revs [--unpacked | --all]] [--stdout | base-name]
+   [--revs [--unpacked | --all]]
+   [--stdout [--filter=<filter-spec>] | base-name]
[--shallow] [--keep-true-parents] < object-list
 
 
@@ -236,6 +237,15 @@ So does `git bundle` (see linkgit:git-bundle[1]) when it creates a bundle.
With this option, parents that are hidden by grafts are packed
nevertheless.
 
+--filter=<filter-spec>::
+   Requires `--stdout`.  Omits certain objects (usually blobs) from
+   the resulting packfile.  See linkgit:git-rev-list[1] for valid
+   `<filter-spec>` forms.
+
+--filter-ignore-missing::
+   Ignore missing objects without error.  This may be used with
+   or without any of the above filtering.
+
 SEE ALSO
 
 linkgit:git-rev-list[1]
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 6e77dfd..e16722f 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -15,6 +15,8 @@
 #include "diff.h"
 #include "revision.h"
 #include "list-objects.h"
+#include "list-objects-filter.h"
+#include "list-objects-filter-options.h"
 #include "pack-objects.h"
 #include "progress.h"
 #include "refs.h"
@@ -79,6 +81,9 @@ static unsigned long cache_max_small_delta_size = 1000;
 
 static unsigned long window_memory_limit = 0;
 
+static struct list_objects_filter_options filter_options;
+static int arg_ignore_missing;
+
 /*
  * stats
  */
@@ -2547,6 +2552,15 @@ static void show_commit(struct commit *commit, void *data)
 
 static void show_object(struct object *obj, const char *name, void *data)
 {
+   /*
+* Quietly ignore missing objects when they are expected.  This
+* avoids staging them and getting an odd error later.  If we are
+* not expecting them, stage it and let the normal error handling
+* deal with it.
+*/
+   if (arg_ignore_missing && !has_object_file(&obj->oid))
+   return;
+
add_preferred_base_object(name);
add_object_entry(obj->oid.hash, obj->type, name, 0);
obj->flags |= OBJECT_ADDED;
@@ -2816,7 +2830,10 @@ static void get_object_list(int ac, const char **av)
if (prepare_revision_walk(&revs))
die("revision walk setup failed");
mark_edges_uninteresting(&revs, show_edge);
-   traverse_commit_list(&revs, show_commit, show_object, NULL);
+
+   traverse_commit_list_filtered(&filter_options, &revs,
+ show_commit, show_object, NULL,
+ NULL);
 
if (unpack_unreachable_expiration) {
revs.ignore_missing_links = 1;
@@ -2952,6 +2969,9 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 N_("use a bitmap index if available to speed up counting objects")),
OPT_BOOL(0, "write-bitmap-index", &write_bitmap_index,
 N_("write a bitmap index together with the pack index")),
+   OPT_PARSE_LIST_OBJECTS_FILTER(&filter_options),
+   OPT_BOOL(0, "filter-ignore-missing", &arg_ignore_missing,
+N_("ignore and omit missing objects from packfile")),
OPT_END(),
};
 
@@ -3028,6 +3048,12 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
if (!rev_list_all || !rev_list_reflog || !rev_list_index)
unpack_unreachable_expiration = 0;
 
+   if (filter_options.choice) {
+   if (!pack_to_stdout)
+   die("cannot use filtering with an indexable pack.");
+   use_bitmap_index = 0;
+   }
+
/*
 * "soft" reasons not to use bitmaps - for on-disk repack by default we want
 *
 *
diff --git a/t/t5317-pack-objects-filter-objects.sh b/t/t5317-pack-objects-filter-objects.sh
new file mode 100755
index 000..4249557
--- /dev/null
+++ b/t/t5317-pack-objects-filter-objects.sh
@@ -0,0 +1,369 @@
+#!/bin/sh
+

[PATCH v2 1/6] dir: allow exclusions from blob in addition to file

2017-11-02 Thread Jeff Hostetler
From: Jeff Hostetler 

Refactor add_excludes() to separate the reading of the
exclude file into a buffer and the parsing of the buffer
into exclude_list items.

Add add_excludes_from_blob_to_list() to allow an exclude
file be specified with an OID without assuming a local
worktree or index exists.

Refactor read_skip_worktree_file_from_index() and add
do_read_blob() to eliminate duplication of preliminary
processing of blob contents.

Signed-off-by: Jeff Hostetler 
---
 dir.c | 132 ++
 dir.h |   3 ++
 2 files changed, 104 insertions(+), 31 deletions(-)

diff --git a/dir.c b/dir.c
index 1d17b80..1962374 100644
--- a/dir.c
+++ b/dir.c
@@ -220,6 +220,57 @@ int within_depth(const char *name, int namelen,
return 1;
 }
 
+/*
+ * Read the contents of the blob with the given OID into a buffer.
+ * Append a trailing LF to the end if the last line doesn't have one.
+ *
+ * Returns:
+ *-1 when the OID is invalid or unknown or does not refer to a blob.
+ * 0 when the blob is empty.
+ * 1 along with { data, size } of the (possibly augmented) buffer
+ *   when successful.
+ *
+ * Optionally updates the given sha1_stat with the given OID (when valid).
+ */
+static int do_read_blob(const struct object_id *oid,
+   struct sha1_stat *sha1_stat,
+   size_t *size_out,
+   char **data_out)
+{
+   enum object_type type;
+   unsigned long sz;
+   char *data;
+
+   *size_out = 0;
+   *data_out = NULL;
+
+   data = read_sha1_file(oid->hash, &type, &sz);
+   if (!data || type != OBJ_BLOB) {
+   free(data);
+   return -1;
+   }
+
+   if (sha1_stat) {
+   memset(&sha1_stat->stat, 0, sizeof(sha1_stat->stat));
+   hashcpy(sha1_stat->sha1, oid->hash);
+   }
+
+   if (sz == 0) {
+   free(data);
+   return 0;
+   }
+
+   if (data[sz - 1] != '\n') {
+   data = xrealloc(data, st_add(sz, 1));
+   data[sz++] = '\n';
+   }
+
+   *size_out = xsize_t(sz);
+   *data_out = data;
+
+   return 1;
+}
+
 #define DO_MATCH_EXCLUDE   (1<<0)
 #define DO_MATCH_DIRECTORY (1<<1)
 #define DO_MATCH_SUBMODULE (1<<2)
@@ -600,32 +651,22 @@ void add_exclude(const char *string, const char *base,
x->el = el;
 }
 
-static void *read_skip_worktree_file_from_index(const struct index_state *istate,
-   const char *path, size_t *size,
-   struct sha1_stat *sha1_stat)
-   struct sha1_stat *sha1_stat)
+static int read_skip_worktree_file_from_index(const struct index_state *istate,
+ const char *path,
+ size_t *size_out,
+ char **data_out,
+ struct sha1_stat *sha1_stat)
 {
int pos, len;
-   unsigned long sz;
-   enum object_type type;
-   void *data;
 
len = strlen(path);
pos = index_name_pos(istate, path, len);
if (pos < 0)
-   return NULL;
+   return -1;
if (!ce_skip_worktree(istate->cache[pos]))
-   return NULL;
-   data = read_sha1_file(istate->cache[pos]->oid.hash, &type, &sz);
-   if (!data || type != OBJ_BLOB) {
-   free(data);
-   return NULL;
-   }
-   *size = xsize_t(sz);
-   if (sha1_stat) {
-   memset(&sha1_stat->stat, 0, sizeof(sha1_stat->stat));
-   hashcpy(sha1_stat->sha1, istate->cache[pos]->oid.hash);
-   }
-   return data;
+   return -1;
+
+   return do_read_blob(&istate->cache[pos]->oid, sha1_stat, size_out, data_out);
 }
 
 /*
@@ -739,6 +780,10 @@ static void invalidate_directory(struct untracked_cache *uc,
dir->dirs[i]->recurse = 0;
 }
 
+static int add_excludes_from_buffer(char *buf, size_t size,
+   const char *base, int baselen,
+   struct exclude_list *el);
+
 /*
  * Given a file with name "fname", read it (either from disk, or from
  * an index if 'istate' is non-null), parse it and store the
@@ -754,9 +799,10 @@ static int add_excludes(const char *fname, const char *base, int baselen,
struct sha1_stat *sha1_stat)
 {
struct stat st;
-   int fd, i, lineno = 1;
+   int r;
+   int fd;
size_t size = 0;
-   char *buf, *entry;
+   char *buf;
 
fd = open(fname, O_RDONLY);
if (fd < 0 || fstat(fd, &st) < 0) {
@@ -764,17 +810,13 @@ static int add_excludes(const char *fname, const char *base, int baselen,
warn_on_fopen_errors(fname);
else
close(fd);
-   if (!istate ||
-   (buf = 

[PATCH v2 0/6] Partial clone part 1: object filtering

2017-11-02 Thread Jeff Hostetler
From: Jeff Hostetler 

Here is V2 of the list-object filtering. It replaces [1]
and reflects a refactoring and simplification of the original.

After much discussion on the "list-object-filter-map" I've replaced
it with a regular oidset -- the only need for the map was to store
the first observed pathname for each blob, but that itself was of
questionable value.

I've extended oidmap and oidset to have iterators.  These 2 commits
could be pulled out and applied on their own, but for now I need
them here.

There were also several comments on the layout of the filtering
API and the layout of the filter source code.  I've restructured
the filtering routines to put them in the same source file, and
made them all static.  These are now hidden behind a "factory-like"
function with a vtable.  This greatly simplifies the code in
traverse_commit_list_filtered().

I've added a "--filter-ignore-missing" parameter to rev-list and
pack-objects to ignore missing objects rather than error out.
This allows this patch series to better stand on its own and
eliminates the need in part 1 for "patch 9" from V1.

This is a brute-force "ignore all missing objects" option.  Later, in
part 2 or part 3 when --exclude-promisor-objects is introduced, we will
be able to ignore EXPECTED missing objects.

Finally, patch 1 in this series is the same as [2], which is currently
cooking in next.

[1] https://public-inbox.org/git/20171024185332.57261-1-...@jeffhostetler.com/

[2] * jh/dir-add-exclude-from-blob (2017-10-27) 1 commit
- dir: allow exclusions from blob in addition to file


Jeff Hostetler (6):
  dir: allow exclusions from blob in addition to file
  oidmap: add oidmap iterator methods
  oidset: add iterator methods to oidset
  list-objects: filter objects in traverse_commit_list
  rev-list: add list-objects filtering support
  pack-objects: add list-objects filtering

 Documentation/git-pack-objects.txt |  12 +-
 Documentation/git-rev-list.txt |   6 +-
 Documentation/rev-list-options.txt |  34 +++
 Makefile   |   2 +
 builtin/pack-objects.c |  28 ++-
 builtin/rev-list.c |  75 +-
 dir.c  | 132 ---
 dir.h  |   3 +
 list-objects-filter-options.c  | 119 ++
 list-objects-filter-options.h  |  55 +
 list-objects-filter.c  | 408 +
 list-objects-filter.h  |  84 +++
 list-objects.c |  95 ++--
 list-objects.h |   2 +-
 oidmap.h   |  22 ++
 oidset.c   |  10 +
 oidset.h   |  36 +++
 t/t5317-pack-objects-filter-objects.sh | 369 +
 t/t6112-rev-list-filters-objects.sh| 225 ++
 19 files changed, 1664 insertions(+), 53 deletions(-)
 create mode 100644 list-objects-filter-options.c
 create mode 100644 list-objects-filter-options.h
 create mode 100644 list-objects-filter.c
 create mode 100644 list-objects-filter.h
 create mode 100755 t/t5317-pack-objects-filter-objects.sh
 create mode 100755 t/t6112-rev-list-filters-objects.sh

-- 
2.9.3



[PATCH v2 2/6] oidmap: add oidmap iterator methods

2017-11-02 Thread Jeff Hostetler
From: Jeff Hostetler 

Add the usual map iterator functions to oidmap.

Signed-off-by: Jeff Hostetler 
---
 oidmap.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/oidmap.h b/oidmap.h
index 18f54cd..d3cd2bb 100644
--- a/oidmap.h
+++ b/oidmap.h
@@ -65,4 +65,26 @@ extern void *oidmap_put(struct oidmap *map, void *entry);
  */
 extern void *oidmap_remove(struct oidmap *map, const struct object_id *key);
 
+
+struct oidmap_iter {
+   struct hashmap_iter h_iter;
+};
+
+static inline void oidmap_iter_init(struct oidmap *map, struct oidmap_iter *iter)
+{
+   hashmap_iter_init(&map->map, &iter->h_iter);
+}
+
+static inline void *oidmap_iter_next(struct oidmap_iter *iter)
+{
+   return hashmap_iter_next(&iter->h_iter);
+}
+
+static inline void *oidmap_iter_first(struct oidmap *map,
+ struct oidmap_iter *iter)
+{
+   oidmap_iter_init(map, iter);
+   return oidmap_iter_next(iter);
+}
+
 #endif
-- 
2.9.3



Re: [PATCH] imap-send: handle NULL return of next_arg()

2017-11-02 Thread René Scharfe
On 2017-11-02 at 03:18, Junio C Hamano wrote:
> René Scharfe  writes:
> 
>> @@ -807,6 +816,8 @@ static int get_cmd_result(struct imap_store *ctx, struct imap_cmd *tcmd)
>>  if (cmdp->cb.cont || cmdp->cb.data)
>>  imap->literal_pending = 0;
>>  arg = next_arg(&cmd);
>> +if (!arg)
>> +arg = "";
> 
> This is being cute and needs reading of the post-context a bit.
> 
>   arg = next_arg(&cmd);
> +if (!arg)
> +arg = "";
>   if (!strcmp("OK", arg))
>   resp = DRV_OK;
>   else {
>   if (!strcmp("NO", arg))
>   resp = RESP_NO;
>   else /*if (!strcmp("BAD", arg))*/
>   resp = RESP_BAD;
>   fprintf(stderr, "IMAP command '%s' returned response (%s) - %s\n",
>   !starts_with(cmdp->cmd, "LOGIN") ?
>   cmdp->cmd : "LOGIN  ",
>   arg, cmd ? cmd : "");
>   }
>   if ((resp2 = parse_response_code(ctx, &cmdp->cb, cmd)) > resp)
>   resp = resp2;
> 
> 
> Because the existing code does not explicitly check for "BAD", we
> get RESP_BAD, which is what we want in the end, and the error
> message we give has "returned response (%s)" which is filled with
> this empty string.
> 
> I notice that when this change matters, i.e. we hit a premature end,
> next_arg() that yielded NULL would have cleared cmd.  That is OK for
> the fprintf() we see above, but wouldn't it hit parse_response_code()
> that is shared between good and bad cases?  The function starts like
> so:
> 
>  static int parse_response_code(struct imap_store *ctx, struct 
> imap_cmd_cb *cb,
> char *s)
>  {
>  struct imap *imap = ctx->imap;
>  char *arg, *p;
> 
>  if (*s != '[')
>  return RESP_OK;  /* no response code */
> 
> so we will get an immediate NULL dereference, it appears.

Good catch.

The NULL check in the fprintf() call (two lines above) makes it obvious
already that cmd can be NULL.  So parse_response_code() needs to learn
to handle NULL passed as its third parameter.  And since "no response
code" yields "RESP_OK" I guess that this should be an appropriate
reaction to s == NULL as well.

RFC 3501 seems to agree (response codes are optional):

   7.1.Server Responses - Status Responses

   Status responses are OK, NO, BAD, PREAUTH and BYE.  OK, NO, and BAD
   can be tagged or untagged.  PREAUTH and BYE are always untagged.

   Status responses MAY include an OPTIONAL "response code".  A response
   code consists of data inside square brackets in the form of an atom,
   possibly followed by a space and arguments.  The response code
   contains additional information or status codes for client software
   beyond the OK/NO/BAD condition, and are defined when there is a
   specific action that a client can take based upon the additional
   information.

The follow-up patch below makes sense in this light, but I wonder why
it hasn't been necessary so far.  Do most IMAP servers actually send
response codes?  Or at least some other string after a status response?
Are no users of git imap-send left?  I have no way to test any of that.
:-(

René

-- >8 --
Subject: [PATCH 2/1] imap-send: handle missing response codes gracefully

Response codes are optional.  Exit parse_response_code() early if it's
passed a NULL string, indicating that we reached the end of the reply.
This avoids dereferencing said NULL pointer.

Noticed-by: Junio C Hamano 
Signed-off-by: Rene Scharfe 
---
 imap-send.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/imap-send.c b/imap-send.c
index 0031566309..12cc4ea4c8 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -684,7 +684,7 @@ static int parse_response_code(struct imap_store *ctx, struct imap_cmd_cb *cb,
struct imap *imap = ctx->imap;
char *arg, *p;
 
-   if (*s != '[')
+   if (!s || *s != '[')
return RESP_OK; /* no response code */
s++;
if (!(p = strchr(s, ']'))) {
-- 
2.15.0


[BUG] git clean -d is too greedy

2017-11-02 Thread Hermanni Suominen

Hi all,

Since commit 6b1db4310 it is possible for git clean -d to remove
nested git repositories if


  1) .gitignore exists in nested repo (either tracked or untracked)
  2) .gitignore is excluded

Regarding 2), it doesn't matter whether .gitignore is excluded from 
(another) .gitignore or from the command line (but I assume they populate 
the same ignore list, so that's just stating the obvious).


To demonstrate this I can run the following commands:

# git init -q foo && cd foo
# git init -q bar && cd bar
# touch .gitignore bar
# git add bar && git commit -qm asd && cd ..
# git clean -e .gitignore -dn
Would remove bar/bar

Pre-6b1db4310, and if .gitignore isn't excluded, the nested repository is 
correctly skipped:


# git clean -dn
Would skip repository bar/ 



It probably isn't a very common use case to exclude .gitignore, but 
nevertheless this has been broken for a while, and to make things worse 
it can wipe out lots of uncommitted changes.


--
Hermanni


Re: [PATCH 3/3] mingw: document the experimental standard handle redirection

2017-11-02 Thread Johannes Schindelin
Hi Junio,

On Thu, 2 Nov 2017, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> >> > I feel this is the wrong way round. `>/dev/null` may sound very intuitive
> >> > to you, but this feature is Windows only. Guess three times how intuitive
> >> > it sounds to Windows developers to write `>/dev/null` if you want to
> >> > suppress output...
> >> 
> >> It would be just as intuitive to write '2>&1' for dup-redirection,
> >
> > No. You misunderstand. I was mainly concerned with the `/dev/null`. Every
> > Windows developer knows what `>file.txt` means, and many know what
> > `2>error.txt` means. But `/dev/null` is not Windows, period.
> 
> Actually I did know that much.  
> 
> If I was correct in assuming that "2>&1" is just as foreign as
> ">/dev/null", then we should be shunning "2>&1" just like we shun
> ">/dev/null".  That was all I meant to say.

Did you know that `2>&1` works in Powershell?

> Are you saying "2>&1" is just as likely to be known as ">file.txt"
> and my assumption of foreignness of "2>&1" was incorrect?
> 
>   Side note: would ">NUL" look more familiar, I wonder, and
>   can stand for ">/dev/null" for the target audience?
> 
> > ... It is so not
> > Windows that Git itself translates it to `NUL` (which you Linux die-hards
> > won't have a clue about, I would wager a bet).
> 
> Ah, you lost your bet.  When can I collect ;-)?

As soon as we meet in person again.

Ciao,
Dscho


Git libsecret No Unlock Dialog Issue

2017-11-02 Thread Yaroslav Sapozhnyk
When using Git on Fedora with a locked password store,
credential-libsecret asks for a username/password instead of displaying
the unlock dialog.

If the store is unlocked, the credential helper gets the credentials
from the store, though.

-- 
Regards,
Yaroslav Sapozhnyk



Re: [RFC PATCH v3 4/4] builtin/branch: give more useful error messages when renaming

2017-11-02 Thread Eric Sunshine
On Thu, Nov 2, 2017 at 2:54 AM, Kaartic Sivaraam
 wrote:
> From: Kaartic Sivaraam 
>
> When trying to rename a non-existent branch to a name that is already
> taken by another branch, the rename fails by reporting that the new
> branch name exists rather than reporting that the branch being renamed
> doesn't exist.
>
> $ git branch -m tset master
> fatal: A branch named 'master' already exists.
>
> It's conventional to report that 'tset' doesn't exist rather than
> reporting that 'master' exists, the same way the 'mv' command does.
>
> (hypothetical)
> $ git branch -m tset master
> fatal: branch 'tset' doesn't exist.
>
> That has the problem that the error about an existing branch is shown
> only after the user corrects the error about the non-existent branch.
>
> $ git branch -m test master
> fatal: A branch named 'master' already exists.
>
> This isn't useful either because the user would have corrected this error in
> a single go if he had been told this alongside the first error. So, give
> more useful error messages by giving errors about old branch name and new
> branch name at the same time. This is possible as the branch name validation
> functions now return the reason they were about to die, when requested.
>
> $ git branch -m tset master
> fatal: branch 'tset' doesn't exist, and branch 'master' already exists

Nicely explained; easily understood.

> Note: Thanks to the strbuf API that made it possible to easily construct
> the composite error message strings!

This may be a problem. See below...

> Signed-off-by: Kaartic Sivaraam 
> ---
> diff --git a/builtin/branch.c b/builtin/branch.c
> +static void get_error_msg(struct strbuf* error_msg, const char* oldname, 
> unsigned old_branch_exists,
> + const char* newname, enum branch_validation_result 
> res)
> +{
> +   const char* connector_string = _(", and ");
> +
> +   if (!old_branch_exists) {
> +   strbuf_addf(error_msg, _("branch '%s' doesn't exist"), 
> oldname);
> +   }
> +
> +   switch (res) {
> +   case BRANCH_EXISTS_NO_FORCE:
> +   strbuf_addf(error_msg, "%s", (!old_branch_exists) ? 
> connector_string : "");
> +   strbuf_addf(error_msg,_("branch '%s' already 
> exists"), newname);
> +   break;
> +   case CANNOT_FORCE_UPDATE_CURRENT_BRANCH:
> +   strbuf_addf(error_msg, "%s", (!old_branch_exists) ? 
> connector_string : "");
> +   strbuf_addstr(error_msg, _("cannot force update the 
> current branch"));
> +   break;
> +   case INVALID_BRANCH_NAME:
> +   strbuf_addf(error_msg, "%s", (!old_branch_exists) ? 
> connector_string : "");
> +   strbuf_addf(error_msg, _("branch name '%s' is 
> invalid"), newname);
> +   break;
> +   /* not necessary to handle success cases */
> +   case BRANCH_EXISTS:
> +   case BRANCH_DOESNT_EXIST:
> +   break;
> +   }
> +}

Translators can correct me, but this smells like "sentence lego"[1],
which we'd like to avoid. Translators lack full context when presented
with bits and pieces of a sentence like this, thus the translation may
be of poor quality; it may even be entirely incorrect since they don't
necessarily know how you will be composing the various pieces.

You _might_ be able to able to resolve this by dropping "and" from:

"foo is moo, and bar is boo"

to turn the error messages into independent clauses:

"foo is moo; bar is boo"

but I'm no translator, and even that may fail the lego sniff test.

A sure-fire way to avoid lego construction would be simply to emit
each message on its own line:

fatal: branch 'tset' doesn't exist
fatal: branch 'master' already exists

(though, you might consider that too ugly).

[1]: http://www.gnu.org/software/gettext/manual/gettext.html#Preparing-Strings


Re: [PATCH 1/3] bisect: fix memory leak and document `find_bisection()`

2017-11-02 Thread Martin Ågren
On 2 November 2017 at 05:54, Junio C Hamano  wrote:
> Martin Ågren  writes:
>
>> `find_bisection()` rebuilds the commit list it is given by reversing it
>> and skipping uninteresting commits. The uninteresting list entries are
>> leaked. Free them to fix the leak.
>>
>> While we're here and understand what's going on, document the function.
>> In particular, make sure to document that the original list should not
>> be examined by the caller.
>
> Good.  Thanks.
>
> I notice that this has only two callers and both of them do
>
> revs.commits = find_bisection(revs.commits, ...);
>
> I wonder if updating its calling convention to
>
> (void) find_bisection(&revs.commits, ...);
>
> makes sense.  This is obviously outside the scope of this patch.

I think it would. I considered it briefly but punted, though I don't
really understand why now. In hindsight, it would have saved me some
work, since I wouldn't have had to carefully document that the original
list mustn't be examined. I'll let this simmer for a few days in case
other comments turn up, then do a v2 with a preparatory patch that
switches the calling convention as you suggested.

Thanks.


Re: [PATCH] reduce_heads: fix memory leaks

2017-11-02 Thread Martin Ågren
On 2 November 2017 at 04:11, Junio C Hamano  wrote:
> Martin Ågren  writes:
>
>> diff --git a/builtin/merge-base.c b/builtin/merge-base.c
>> index 6dbd167d3..b1b7590c4 100644
>> --- a/builtin/merge-base.c
>> +++ b/builtin/merge-base.c
>> @@ -59,6 +59,8 @@ static int handle_independent(int count, const char **args)
>>   commit_list_insert(get_commit_reference(args[i]), &revs);
>>
>>   result = reduce_heads(revs);
>> + free_commit_list(revs);
>> +
>>   if (!result)
>>   return 1;
>
> The post-context of this hunk continues like so:
>
> while (result) {
> printf("%s\n", oid_to_hex(&result->item->object.oid));
> result = result->next;
> }
> return 0;
> }
>
> and we end up leaking "result".  This function is directly called from
> cmd_merge_base() and its value is returned to main(), so leaking it
> is not that a grave offence, but that excuse applies equally well to
> revs.

Good catch. I even have a patch to address the leak of `result`, except
I seem to have sorted it into another pile. For this series I just
grepped for "reduce_heads" and didn't stop to think about using UNLEAK,
nor about the leaking of `result`.

> I can see you are shooting for minimum change in this patch, but if
> we were writing this code in a codebase where reduce_heads_replace()
> is already available, I would imagine that we wouldn't use two separate
> variables, perhaps?

The way my other patch addresses the leaking of `result` is that it
rewrites the loop to avoid losing the original value of `result`, so
that it can be UNLEAK-ed at the very end. (That makes it obvious where
the leak happens, compared to adding an UNLEAK a few lines up.) If I do
`reduce_heads_replace()`, I'll need to touch the loop anyway, and
then I could probably just as well UNLEAK a little while at it.

I'll get to this within the next couple of days, then I'll see what it
looks like.

Thanks for your feedback.


  1   2   >