Re: [PATCH v2 1/9] perf/run: add '--config' option to the 'run' script

2017-11-16 Thread Christian Couder
On Mon, Oct 16, 2017 at 7:25 AM, Junio C Hamano  wrote:
> Christian Couder  writes:
>
>> It is error prone and tiring to use many long environment
>> variables to give parameters to the 'run' script.
>
> This topic has been sitting in the list archive without getting much
> reaction from list participants.  Is anybody happy with these
> patches?

(Sorry for not responding earlier to this.)

Hopefully AEvar CC'ed will take a look at this soon.


Re: [PATCH v4 00/15] Parial clone part 3: clone, fetch, fetch-pack, upload-pack, and tests

2017-11-16 Thread Junio C Hamano
Jeff Hostetler  writes:

> From: Jeff Hostetler 
>
> This part 3 of a 3 part sequence partial clone.  It assumes
> that part 1 and part 2 are in place.

I couldn't figure out why 'pu' fails with this topic at t5500 (and
others) so I dropped a merge of this before pushing the result out.

Thanks.


What's cooking in git.git (Nov 2017, #05; Fri, 17)

2017-11-16 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* ab/mediawiki-name-truncation (2017-11-01) 1 commit
  (merged to 'next' on 2017-11-07 at b30233d585)
 + remote-mediawiki: limit filenames to legal

 The remote-helper for talking to MediaWiki has been updated to
 truncate an overlong pagename so that ".mw" suffix can still be
 added.


* ab/mediawiki-namespace (2017-11-08) 7 commits
  (merged to 'next' on 2017-11-10 at 1cf9cfdfbc)
 + remote-mediawiki: show progress while fetching namespaces
 + remote-mediawiki: process namespaces in order
 + remote-mediawiki: support fetching from (Main) namespace
 + remote-mediawiki: skip virtual namespaces
 + remote-mediawiki: show known namespace choices on failure
 + remote-mediawiki: allow fetching namespaces with spaces
 + remote-mediawiki: add namespace support

 The remote-helper for talking to MediaWiki has been updated to
 work with mediawiki namespaces.


* ab/pcre-v2 (2017-11-13) 1 commit
  (merged to 'next' on 2017-11-13 at 66bf57f071)
 + grep: fix NO_LIBPCRE1_JIT to fully disable JIT

 Building with NO_LIBPCRE1_JIT did not disable it, which has been fixed.


* ad/rebase-i-serie-typofix (2017-11-09) 1 commit
  (merged to 'next' on 2017-11-13 at 199e79b29e)
 + rebase -i: fix comment typo

 Typofix.


* ao/merge-verbosity-getenv-just-once (2017-11-01) 1 commit
  (merged to 'next' on 2017-11-09 at e7cfb8dcec)
 + merge-recursive: check GIT_MERGE_VERBOSITY only once

 Code cleanup.


* bc/submitting-patches-in-asciidoc (2017-11-13) 2 commits
  (merged to 'next' on 2017-11-13 at 70f65b981a)
 + Documentation: convert SubmittingPatches to AsciiDoc
 + Documentation: enable compat-mode for Asciidoctor

 The SubmittingPatches document has been converted to produce an
 HTML version via AsciiDoc/Asciidoctor.


* bp/read-index-from-skip-verification (2017-11-08) 1 commit
  (merged to 'next' on 2017-11-10 at 3c3e32f1ed)
 + read_index_from(): speed index loading by skipping verification of the entry 
order

 Drop (perhaps overly cautious) sanity check before using the index
 read from the filesystem at runtime.


* bw/rebase-i-ignored-submodule-fix (2017-11-07) 1 commit
  (merged to 'next' on 2017-11-10 at a0a54103ed)
 + wt-status: actually ignore submodules when requested

 "git rebase -i" recently started misbehaving when a submodule that
 is configured with 'submodule..ignore' is dirty; this has
 been corrected.


* cb/t4201-robustify (2017-11-13) 1 commit
  (merged to 'next' on 2017-11-13 at b83957b8f2)
 + t4201: make use of abbreviation in the test more robust

 A test update.


* cc/git-packet-pm (2017-11-07) 8 commits
  (merged to 'next' on 2017-11-10 at b40bc2c0bb)
 + Git/Packet.pm: extract parts of t0021/rot13-filter.pl for reuse
 + t0021/rot13-filter: add capability functions
 + t0021/rot13-filter: refactor checking final lf
 + t0021/rot13-filter: add packet_initialize()
 + t0021/rot13-filter: improve error message
 + t0021/rot13-filter: improve 'if .. elsif .. else' style
 + t0021/rot13-filter: refactor packet reading functions
 + t0021/rot13-filter: fix list comparison

 Parts of a test to drive the long-running content filter interface
 has been split into its own module, hopefully to eventually become
 reusable.


* jk/info-alternates-fix (2017-11-13) 1 commit
  (merged to 'next' on 2017-11-13 at ac84a7580a)
 + link_alt_odb_entries: make empty input a noop

 We used to add an empty alternate object database to the system
 that does not help anything; it has been corrected.


* js/for-each-ref-remote-name-and-ref (2017-11-08) 3 commits
  (merged to 'next' on 2017-11-10 at 254af5d602)
 + for-each-ref: test :remotename and :remoteref
 + for-each-ref: let upstream/push report the remote ref name
 + for-each-ref: let upstream/push optionally report the remote name

 The "--format=..." option "git for-each-ref" takes learned to show
 the name of the 'remote' repository and the ref at the remote side
 that is affected for 'upstream' and 'push' via "%(push:remotename)"
 and friends.


* jt/submodule-tests-cleanup (2017-11-08) 1 commit
  (merged to 'next' on 2017-11-10 at c6cbcdeaa4)
 + Tests: clean up and document submodule helpers

 Test clean-up.


* ma/bisect-leakfix (2017-11-06) 4 commits
  (merged to 'next' on 2017-11-09 at c280d786f4)
 + bisect: fix memory leak when returning best element
 + bisect: fix off-by-one error in `best_bisection_sorted()`
 + bisect: fix memory leak in `find_bisection()`
 + bisect: change calling-convention of `find_bisection()`

 Leak fixes.


* ma/reduce-heads-leakfix (2017-11-08) 2 

Re: [PATCH 1/1] exec_cmd: RUNTIME_PREFIX on some POSIX systems

2017-11-16 Thread Junio C Hamano
Dan Jacques  writes:

> Enable Git to resolve its own binary location using a variety of
> OS-specific and generic methods, including:
>
> - procfs via "/proc/self/exe" (Linux)
> - _NSGetExecutablePath (Darwin)
> - argv0, if absolute (all, including Windows).
>
> This is used to enable RUNTIME_PREFIX support for non-Windows systems,
> notably Linux and Darwin. When configured with RUNTIME_PREFIX, Git will
> do a best-effort resolution of its executable path and automatically use
> this as its "exec_path" for relative helper and data lookups, unless
> explicitly overridden.

Yay.

> Git will also always export and consume its resolved "exec_path" using
> the EXEC_PATH_ENVIRONMENT regardless of whether the user has overridden
> it, simplifying future lookups and ensuring consistency in Git tooling
> execution.

The "regardless of whether the user has overridden it" part sounded
alarming and made me wince twice.  I think you meant

 - If the user already has GIT_EXEC_PATH in the environment pointing
   somewhere, when we end up calling git_set_exec_path() to export
   the variable, the value we have at hand to be exported is what
   originally came from the user, bypassing the auto-detection logic
   this patch adds.

 - If the user did not have GIT_EXEC_PATH, the auto-detection logic
   is exercised, and the result is exported when git_set_exec_path()
   is called.  Any program that is spawned by us as a subprocess
   will inherit the same GIT_EXEC_PATH we detected.

As I have multiple installations of various versions of Git, I find
the latter somewhat disturbing.  

Suppose that I designate one stable version of Git and install it at
$HOME/git-stable/{bin,libexec,...}/, and want that version to always
be used in my hooks and other helper scripts that are spawned by
Git, even when I am trying out a newer version of Git that is under
testing.

My hooks would be running $HOME/git-stable/bin/git subcommand" (or
more realistically, it would do "PATH=$HOME/git-stable/bin:$PATH"
upfront), but with this patch (and with runtime-prefix layout) they
would use GIT_EXEC_PATH that was discovered and exported by the
version of Git under testing that invoked the hook, leading to an
inconsistent and hard to debug behaviour, no?

On the other hand, as long as existing GIT_EXEC_PATH is passed-thru
(i.e. the first point above), I think that is a sensible thing to
do.

> When building with a runtime prefix, Git's PERL libraries are now
> installed to a consistently-named directory. This path is resolved and
> exported to Git's delegate PERL invocations using the GITPERLLIB
> environment variable. This enables Git's delegate PERL scripts to import
> Git's own PERL libraries from a path relative to the executable.

Sounds good.

> Small incidental formatting cleanup of "exec_cmd.c".

We usually frown upon these because they often gets distracting, but
I didn't find the ones in this patch too bad.

> diff --git a/Makefile b/Makefile
> index ee9d5eb11..80db01706 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -296,7 +296,8 @@ all::
>  # Define PERL_PATH to the path of your Perl binary (usually /usr/bin/perl).
>  #
>  # Define NO_PERL_MAKEMAKER if you cannot use Makefiles generated by perl's
> -# MakeMaker (e.g. using ActiveState under Cygwin).
> +# MakeMaker (e.g. using ActiveState under Cygwin, or building with a fixed
> +# runtime prefix).

Windows folks may want to comment (either positively or negatively)
as they are the only ones that are currently using runtime-prefix
layout, especially on this part:

> @@ -1547,6 +1547,14 @@ else
>  endif
>  ifdef RUNTIME_PREFIX
>   COMPAT_CFLAGS += -DRUNTIME_PREFIX
> +
> + # Control PERL library location so it can be referenced by relocatable
> + # code.
> + NO_PERL_MAKEMAKER = YesPlease
> +endif


> diff --git a/cache.h b/cache.h
> index cb7fb7c00..9ef59f1cc 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -445,6 +445,8 @@ static inline enum object_type object_type(unsigned int 
> mode)
>  #define GIT_ICASE_PATHSPECS_ENVIRONMENT "GIT_ICASE_PATHSPECS"
>  #define GIT_QUARANTINE_ENVIRONMENT "GIT_QUARANTINE_PATH"
>  #define GIT_OPTIONAL_LOCKS_ENVIRONMENT "GIT_OPTIONAL_LOCKS"
> +#define GIT_PERL_LIB_ENVIRONMENT "GITPERLLIB"
> +#define GIT_TEXT_DOMAIN_DIR_ENVIRONMENT "GIT_TEXTDOMAINDIR"

This is a nice touch.  The other day I noticed that we stopped
defining these when we start using a new enviornment variable, which
should be rectified (this is not within the scope of this patch--I
am just welcoming this change that fixes the existing issue a bit
without getting distracing).

> diff --git a/common-main.c b/common-main.c
> index 6a689007e..6516a1f89 100644
> --- a/common-main.c
> +++ b/common-main.c
> @@ -32,12 +32,12 @@ int main(int argc, const char **argv)
>*/
>   sanitize_stdfds();
>  
> + git_resolve_executable_dir(argv[0]);
> +
>   git_setup_gettext();
>  
>   attr_start();
>  
> - git_extract_argv0_path(argv[0]);
> -

I presume that 

Re: Documentation of post-receive hook

2017-11-16 Thread Junio C Hamano
Junio C Hamano  writes:

>> This hook is invoked by 'git-receive-pack' on the remote repository,
>> which happens when a 'git push' is done on a local repository and
>> successfully transfers at least 1 commit.
>
> I am not sure "at least 1 commit" is a good phrase to use here.
> There are transfer that sends objects but no commit object, and the
> above makes it sound as if such a transfer will not trigger the
> hook.  Would
>
>   This hook is run by 'git receive-pack' on the remote
>   repository, after it receives objects sent by 'git push'.
>
> be clear enough to teach readers that a no-op push that recieve-pack
> does not receive any object does not trigger the hook?

Actually I take this back.  Your original observation "only when at
least one commit is transferred" is not even correct.

You can try what I just tried to make sure:

$ git clone --no-local . ../victim
$ cat >../victim/.git/hooks/post-receive <<\EOF
#!/bin/sh
(
echo "post receive was here at $(date)"
cat 
) >>../STAMP
EOF
$ chmod +x ../victim/.git/hooks/post-receive
$ git push ../victim master:foo

The last "push" does not transfer any object (and obviously does not
satisfy your "at least 1 commit" requirement), but it does update
the STAMP file.  This is because it updates a ref and that is what
post-receive wants to react to, even if there is no new objects
placed in the receiving repository.

So an updated suggestion for the text would be:

 This hook is invoked by 'git-receive-pack' on the remote repository,
 which happens when a 'git push' is done on a local repository.

Oh, wait.  That is what we already have ;-).

Having said all that, there is one case that running 'git push' does
*NOT* cause 'receive-pack' to be invoked at the other end, and in
that scenario, obviously the hook cannot be run, simply because the
command that would run the hook is not run in the first place.

After the above sequence against the "victim" test repository, you
could try:

$ git push ../victim master:foo
Everything up-to-date

and observe that the STAMP file is not updated.

What is happening is that "git push" notices that there is nothing
gained by invoking receive-pack on the other side, because the
branch 'foo' already points at the commit at the tip of our
'master'.

So it might technically be an improvement to update the text to
mention that 'git push' does not necessarily always lead to
invocation of receive-pack, something like:

 This hook is invoked by 'git-receive-pack' on the remote
 repository, which may happen when a 'git push' is done on a
 local repository.

but then that introduces the need to make the reader understand what
"may happen" is trying to say, iow, when does a user 'push' and it
does not trigger receive-pack?

But I do not think teaching that (i.e. when does receive-pack run?)
is the job of this paragraph, whose primary objective is to teach
about this hook that is run when receive-pack is run.  So...


Re: [PATCH] branch doc: remove --set-upstream from synopsis

2017-11-16 Thread Junio C Hamano
Todd Zullinger  writes:

> Seeing that the error output when using it tells the user to "use
> '--track' or '--set-upstream-to' instead," should we perhaps also
> remove the --set-upstream entry entirely?  That's reads:
>
>--set-upstream::
>As this option had confusing syntax, it is no longer supported.
>Please use `--track` or `--set-upstream-to` instead.
>
> I don't have a strong opinion either way, but perhaps the error
> message is all that's needed now?  Only users who have a long memory
> or are reading old documentation will call --set-upstream.  I can
> imagine someone coming along in a few months suggesting to remove the
> remaining reference to --set-upstream from the git branch
> documentation for consistency.

Perhaps.  But that must happen after we can safely remove the hidden
option that is there only to issue an error message.  I suspect that
we may not quite ready yet (the entry is there to ensure that an
"branch --set-upstream $rest" coming from an existing script and
trained fingers does not silently use --set-upstream-to thanks to
the helpful parse-options UI).


Re: [PATCH v4 5/6] rev-list: add list-objects filtering support

2017-11-16 Thread Junio C Hamano
Jonathan Tan  writes:

> If it were up to me, I would remove all existing mentions of "partial
> clone" and explain the presence of the "--missing" argument as follows:
>
> In the future, we will introduce a "partial clone" mechanism wherein
> an object in a repo, obtained from a remote, may reference a missing
> object that can be dynamically fetched from that remote once needed.
> This "partial clone" mechanism will have a way, sometimes slow, of
> determining if a missing link is one of the links expected to be
> produced by this mechanism.
>
> This patch introduces handling of missing objects to help debugging
> and development of the "partial clone" mechanism, and once the
> mechanism is implemented, for a power user to perform operations
> that are missing-object-aware without incurring the cost of checking
> if a missing link is expected.

That sounds quite sensible.


Re: [PATCH v4 4/6] list-objects: filter objects in traverse_commit_list

2017-11-16 Thread Junio C Hamano
Jeff King  writes:

> Those encodings don't necessarily need to be the same, because they're
> about transport. Inside each process we'd have the raw bytes, and encode
> them as appropriate to whatever sub-program we're going to pass to (or
> not at all if we skip the shell for sub-processes, which is usually a
> good idea).

Yes, I share the same feeling.  It does not help that the series
defines its own notion of arg_needs_armor() and uses it to set a
field called requires_armor that is not yet used, the definition of
"armor"ing being each byte getting encoded as two hexadecimal digits
without any sign (which makes me wonder what a receiver of
"deadbeef" would do---did it receive an armored string or a plain
one???).  I do not understand why these strings are not passed as
opaque sequences of bytes and instead converted at this low a layer.







Re: [RFC 2/3] am: semi working --cover-at-tip

2017-11-16 Thread Junio C Hamano
Nicolas Morey-Chaisemartin  writes:

>> I thought about that.

>> Is there a use case for cover after the last patch works and
>> removes the need to touch am_next (can be done out of the loop in
>> am_run).
>
> Do you have an opinion on that ? It has quite a big impact on how things are 
> done !
> Single series only would mean a simple flush at the end.
> Multiple series makes things a whole lot complex.

I am not sure what you even mean.  Are you wondering what "am"
should do to a mbox with, say, these messages?

1: [PATCH 0/2] Cover for series A
2: [PATCH 1/2] patch 1 of series A
3: [PATCH 2/2] patch 2 of series A
4: [PATCH 0/3] Cover for series B
5: [PATCH 1/3] patch 1 of series B
6: [PATCH 2/3] patch 2 of series B
7: [PATCH 2/3] patch 3 of series B

Running "am" on the whole thing and expecting covers to become the
capping empty commit at the tip is crazy, I would think, for such a
mbox, as there is no way to tell the command (after it processes 1,
2 and 3, to create commits out of 1, 2 and then an empty one out of
0 to finish one topic off) that it must create a new branch to store
the next series, and building the second series on top of the
capping empty commit at the tip of first series would not make any
sense---the "tip empty commit" for the first series will no longer
be at the "tip".

What I was alluding to is a different case, in which additional
patches are sent as a follow-up later, ending up in a mbox like
this:

1: [PATCH 0/2] Cover for series A
2: [PATCH 1/2] patch 1 of series A
3: [PATCH 2/2] patch 2 of series A
4: [PATCH 3/2] patch 3 of series A

Naturally, the cover letter may not list 3/2 in its short-log
section, but the description for the overall goal and approach
of the series in it should still be valid even with patch 3/2.
The total number of messages mailsplit gives us would be 4, your
subject parser would read "2" as the number of patches, which would
make the number of messages for the series to be expected "3"
(i.e. "2" plus cover), but "am" would want to create commits for
patches 1, 2, and 3, and then cap it with the cover material.


Re: Documentation of post-receive hook

2017-11-16 Thread Junio C Hamano
Christoph Michelbach  writes:

> I think the documentation of the post-receive hook is misleading. When reading
> it, it appears as though the post-receive hook is executed even when no 
> commits
> are transferred by a git push because it isn't mentioned anywhere that this is
> necessary for its execution.

In other words, post-receive hook triggers only after it receives
objects.  A mere action of running receive-pack command does not.

> This can easily be fixed by changing
>
> This hook is invoked by 'git-receive-pack' on the remote repository,
> which happens when a 'git push' is done on a local repository.

So the existing description is technically correct (i.e. it does
correctly identify who invokes it) but lacks a more interesting and
relevant information (i.e. receive-pack invokes only after receiving
data).

> This hook is invoked by 'git-receive-pack' on the remote repository,
> which happens when a 'git push' is done on a local repository and
> successfully transfers at least 1 commit.

I am not sure "at least 1 commit" is a good phrase to use here.
There are transfer that sends objects but no commit object, and the
above makes it sound as if such a transfer will not trigger the
hook.  Would

This hook is run by 'git receive-pack' on the remote
repository, after it receives objects sent by 'git push'.

be clear enough to teach readers that a no-op push that recieve-pack
does not receive any object does not trigger the hook?



Re: [PATCH] branch doc: remove --set-upstream from synopsis

2017-11-16 Thread Todd Zullinger

Junio C Hamano wrote:

Todd Zullinger  writes:

Support for the --set-upstream option was removed in 52668846ea
(builtin/branch: stop supporting the "--set-upstream" option,
2017-08-17), after a long deprecation period.

Remove the option from the command synopsis for consistency.  Replace
another reference to it in the description of `--delete` with
`--set-upstream-to`.

Signed-off-by: Todd Zullinger 
---


Makes sense.  Even though we internally still carry (and have to
carry) code to notice and explicitly reject "--set-upstream", I do
not think that we need to suggest its presence to the end user.

The option parsing code marks it with the PARSE_OPT_HIDDEN bit
correctly and it would make sense to make the synopsis section
follow suit.


Seeing that the error output when using it tells the user to "use
'--track' or '--set-upstream-to' instead," should we perhaps also
remove the --set-upstream entry entirely?  That's reads:

   --set-upstream::
   As this option had confusing syntax, it is no longer supported.
   Please use `--track` or `--set-upstream-to` instead.

I don't have a strong opinion either way, but perhaps the error
message is all that's needed now?  Only users who have a long memory
or are reading old documentation will call --set-upstream.  I can
imagine someone coming along in a few months suggesting to remove the
remaining reference to --set-upstream from the git branch
documentation for consistency.

--
Todd
~~
...more people are driven insane through religious hysteria than by
drinking alcohol.
   -- W.C. Fields



Re: [PATCH v3 1/1] Introduce git add --renormalize .

2017-11-16 Thread Junio C Hamano
tbo...@web.de writes:

I'll retitle this to

Subject: add: introduce "--renormalize"

and will queue with s/$old/$new/ that you'll see below.

> From: Torsten Bögershausen 
>
> Make it safer to normalize the line endings in a repository:

s/:$/./;

> Files that had been commited with CRLF will be commited with LF.
>
> The old way to normalize a repo was like this:
>  # Make sure that there are not untracked files
>  $ echo "* text=auto" >.gitattributes
>  $ git read-tree --empty
>  $ git add .
>  $ git commit -m "Introduce end-of-line normalization"
>
> The user must make sure that there are no untracked files,
> otherwise they would have been added and tracked from now on.
>
> The new "add ..renormalize" does not add untracked files:

s/\.\./--/;

>  $ echo "* text=auto" >.gitattributes
>  $ git add --renormalize .
>  $ git commit -m "Introduce end-of-line normalization"
>
> Note that "git add --renormalize " is the short form for
> "git add -u --renormalize ".
>
> While add it, document that the same renormalization may be needed,

s/add it/at it/;

> whenever a clean filter is added or changed.
>
> Helped-By: Junio C Hamano 
> Signed-off-by: Torsten Bögershausen 
> ---

Thanks.

>
> Changes since V2:
>   Add line endings in t0025
>   Use the <<-\EOF pattern
>   Improve the documentation for "git add --renormalize"
>   
>
> Documentation/git-add.txt   |  9 -
>  Documentation/gitattributes.txt |  6 --
>  builtin/add.c   | 28 ++--
>  cache.h |  1 +
>  read-cache.c| 30 +++---
>  sha1_file.c | 16 ++--
>  t/t0025-crlf-renormalize.sh | 30 ++
>  7 files changed, 102 insertions(+), 18 deletions(-)
>  create mode 100755 t/t0025-crlf-renormalize.sh
>
> diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
> index b700beaff5..d50fa339dc 100644
> --- a/Documentation/git-add.txt
> +++ b/Documentation/git-add.txt
> @@ -10,7 +10,7 @@ SYNOPSIS
>  [verse]
>  'git add' [--verbose | -v] [--dry-run | -n] [--force | -f] [--interactive | 
> -i] [--patch | -p]
> [--edit | -e] [--[no-]all | --[no-]ignore-removal | [--update | -u]]
> -   [--intent-to-add | -N] [--refresh] [--ignore-errors] 
> [--ignore-missing]
> +   [--intent-to-add | -N] [--refresh] [--ignore-errors] 
> [--ignore-missing] [--renormalize]
> [--chmod=(+|-)x] [--] [...]
>  
>  DESCRIPTION
> @@ -175,6 +175,13 @@ for "git add --no-all ...", i.e. ignored 
> removed files.
>   warning (e.g., if you are manually performing operations on
>   submodules).
>  
> +--renormalize::
> + Apply the "clean" process freshly to all tracked files to
> + forcibly add them again to the index.  This is useful after
> + changing `core.autocrlf` configuration or the `text` attribute
> + in order to correct files added with wrong CRLF/LF line endings.
> + This option implies `-u`.
> +
>  --chmod=(+|-)x::
>   Override the executable bit of the added files.  The executable
>   bit is only changed in the index, the files on disk are left
> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
> index 4c68bc19d5..30687de81a 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -232,8 +232,7 @@ From a clean working directory:
>  
>  -
>  $ echo "* text=auto" >.gitattributes
> -$ git read-tree --empty   # Clean index, force re-scan of working directory
> -$ git add .
> +$ git add --renormalize .
>  $ git status# Show files that will be normalized
>  $ git commit -m "Introduce end-of-line normalization"
>  -
> @@ -328,6 +327,9 @@ You can declare that a filter turns a content that by 
> itself is unusable
>  into a usable content by setting the filter..required configuration
>  variable to `true`.
>  
> +Note: Whenever the clean filter is changed, the repo should be renormalized:
> +$ git add --renormalize .
> +
>  For example, in .gitattributes, you would assign the `filter`
>  attribute for paths.
>  
> diff --git a/builtin/add.c b/builtin/add.c
> index a648cf4c56..c42b50f857 100644
> --- a/builtin/add.c
> +++ b/builtin/add.c
> @@ -26,6 +26,7 @@ static const char * const builtin_add_usage[] = {
>  };
>  static int patch_interactive, add_interactive, edit_interactive;
>  static int take_worktree_changes;
> +static int add_renormalize;
>  
>  struct update_callback_data {
>   int flags;
> @@ -123,6 +124,25 @@ int add_files_to_cache(const char *prefix,
>   return !!data.add_errors;
>  }
>  
> +static int renormalize_tracked_files(const struct pathspec *pathspec, int 
> flags)
> +{
> + int i, retval = 0;
> +
> + for (i = 0; i < active_nr; i++) {
> + struct cache_entry *ce = active_cache[i];
> 

Re: [PATCH] branch doc: remove --set-upstream from synopsis

2017-11-16 Thread Junio C Hamano
Todd Zullinger  writes:

> Support for the --set-upstream option was removed in 52668846ea
> (builtin/branch: stop supporting the "--set-upstream" option,
> 2017-08-17), after a long deprecation period.
>
> Remove the option from the command synopsis for consistency.  Replace
> another reference to it in the description of `--delete` with
> `--set-upstream-to`.
>
> Signed-off-by: Todd Zullinger 
> ---

Makes sense.  Even though we internally still carry (and have to
carry) code to notice and explicitly reject "--set-upstream", I do
not think that we need to suggest its presence to the end user.

The option parsing code marks it with the PARSE_OPT_HIDDEN bit
correctly and it would make sense to make the synopsis section
follow suit.

Thanks.


Re: [PATCH 1/5] connect: split git:// setup into a separate function

2017-11-16 Thread Junio C Hamano
Jonathan Nieder  writes:

>> Which means the defaulting of git_connect::conn to _fork is now
>> unneeded.  One of the things that made the original cascade a bit
>> harder to follow than necessary, aside from the physical length of
>> the PROTO_GIT part, was that the case where conn remains to point at
>> no_fork looked very special and it was buried in that long PROTO_GIT
>> part.
>
> Good idea.  Here's what I'll include in the reroll.

Sounds good.


Re: [PATCH] launch_editor(): indicate that Git waits for user input

2017-11-16 Thread Junio C Hamano
Lars Schneider  writes:

>> On 16 Nov 2017, at 15:58, Junio C Hamano  wrote:
>> 
>> Lars Schneider  writes:
>> 
 On 16 Nov 2017, at 07:04, Junio C Hamano  wrote:
>>> 
>>> Wow. Thanks for the quick patch :-)
>> 
>> Heh, this is not exactly my itch, so if you are inclined to, can you
>> take it over from here on?
>
> Absolutely! What is the proper way to proceed here? Should I send a v2
> with the changes I suggested? How do I attribute you correctly? I
> assume I need to remove your 'Signed-off-by:'?

If you are making major changes over the original patch, you'd
probably want to take the authorship and mention me on Helped-by:
and/or add "This is based on a previous work by...", if you want to
credit me.

If you are not taking the authorship over, then doing something like

https://public-inbox.org/git/xmqqshdex0ff@gitster.mtv.corp.google.com/

would be sufficient, I would think.

Thanks.


Re: [PATCHv4 7/7] builtin/describe.c: describe a blob

2017-11-16 Thread Junio C Hamano
Junio C Hamano  writes:

> Stefan Beller  writes:
>
>> grep "fatal: test-blob-1 is neither a commit nor blob" actual
>
> OK, that might be somewhat unsatisfying from end-user's point of
> view (logically "test-blob-1" is already a name based on the 'graph
> relations' that is satisfactory).

... but that is correct and I think we need any further updates to
address it.  If a user starts with a tag object, the user already
has one usable human-readable name.  By "unsatisfying", I didn't
mean that it is something we need to spend more cycles.



Documentation of post-receive hook

2017-11-16 Thread Christoph Michelbach
Hi,

I think the documentation of the post-receive hook is misleading. When reading
it, it appears as though the post-receive hook is executed even when no commits
are transferred by a git push because it isn't mentioned anywhere that this is
necessary for its execution.

This can easily be fixed by changing

This hook is invoked by 'git-receive-pack' on the remote repository,
which happens when a 'git push' is done on a local repository.

to:

This hook is invoked by 'git-receive-pack' on the remote repository,
which happens when a 'git push' is done on a local repository and
successfully transfers at least 1 commit.

Alternatively,

This hook executes once for the receive operation.

can be changed to

This hook executes once for the receive operation, but only if at least
1 commit was successfully transferred.

Imho, the first option should be chosen as it informs the reader about this
behavior at the first convenient opportunity.

-- 
Christoph Michelbach 


Re: [PATCH v4 4/6] list-objects: filter objects in traverse_commit_list

2017-11-16 Thread Jeff King
On Thu, Nov 16, 2017 at 04:49:08PM -0500, Jeff Hostetler wrote:

> > First of all, about the injection problem, replying to your previous e-mail
> > [1]:
> > 
> > https://public-inbox.org/git/61855872-221b-0e97-abaa-24a011ad8...@jeffhostetler.com/
> > 
> > > I couldn't use quote.[ch] because it is more concerned with
> > > quoting pathnames because of LF and CR characters within
> > > them -- rather than semicolons and quotes and the like which
> > > I was concerned about.
> > 
> > sq_quote_buf() (or one of the other similarly-named functions) should
> > solve this problem, right? The single quotes around the argument takes
> > care of LF, CR, and semicolons, and things like backslashes and quotes
> > are taken care of as documented.
> > 
> > I don't think we need to invent another encoding to solve this.
> 
> I'll take another look, sq_quote_buf() looks like it might work.
> I was looking at quote_c_style() and that didn't seem right for
> my needs.  Thanks.

I admit I haven't been following this thread closely, but I couldn't
seem to find any indication of exactly which interfaces need quoting, or
who is expected to unquote (here or in the previous iterations).

It sounds like you're worried about shell injection, but shouldn't we
worry about that the actual shell boundary? Likewise, if these values
are being passed over the git protocol, shouldn't that part of the
protocol be designed to encode arbitrary bytes?

Those encodings don't necessarily need to be the same, because they're
about transport. Inside each process we'd have the raw bytes, and encode
them as appropriate to whatever sub-program we're going to pass to (or
not at all if we skip the shell for sub-processes, which is usually a
good idea).

I have the feeling I'm missing something.

-Peff


Re: [PATCH v4 00/10] Partial clone part 2: fsck and promisors

2017-11-16 Thread Jonathan Tan
I patched both this series and the first 9 patches of mine [1] on part 1
of the entire partial clone implementation [2], and then
diffed them. I'll review just the differences between the two.

You can see the entire diff below (minus means in my patch set but not
in Jeff's, plus means the contrary) - I did not trim it.

[1] https://public-inbox.org/git/cover.1506714999.git.jonathanta...@google.com/
[2] 
https://public-inbox.org/git/20171116180743.61353-1-...@jeffhostetler.com/T/#t

> diff --git a/Documentation/git-pack-objects.txt 
> b/Documentation/git-pack-objects.txt
> index 5fad696c5..33a824eed 100644
> --- a/Documentation/git-pack-objects.txt
> +++ b/Documentation/git-pack-objects.txt
> @@ -242,9 +242,19 @@ So does `git bundle` (see linkgit:git-bundle[1]) when it 
> creates a bundle.
>   the resulting packfile.  See linkgit:git-rev-list[1] for valid
>   `` forms.
>  
> ---missing=(error|allow-any):
> +--missing=(error|allow-any|allow-promisor):
>   Specifies how missing objects are handled.  This is useful, for
>   example, when there are missing objects from a prior partial clone.
> + This is stronger than `--missing=allow-promisor` because it limits
> + the traversal, rather than just silencing errors about missing
> + objects.

What is stronger than `--missing=allow-promisor`?

> +
> +--exclude-promisor-objects::
> + Omit objects that are known to be in the promisor remote". (This

Stray quote.

> + option has the purpose of operating only on locally created objects,
> + so that when we repack, we still maintain a distinction between
> + locally created objects [without .promisor] and objects from the
> + promisor remote [with .promisor].)  This is used with partial clone.
>  
>  SEE ALSO
>  
> diff --git a/Documentation/gitremote-helpers.txt 
> b/Documentation/gitremote-helpers.txt
> index 4a584f3c5..1ceab8944 100644
> --- a/Documentation/gitremote-helpers.txt
> +++ b/Documentation/gitremote-helpers.txt
> @@ -466,6 +466,12 @@ set by Git if the remote helper has the 'option' 
> capability.
>   Transmit  as a push option. As the push option
>   must not contain LF or NUL characters, the string is not encoded.
>  
> +'option from-promisor' {'true'|'false'}::
> + Indicate that these objects are being fetch by a promisor.

Should be: ...are being fetched from a promisor.

> +
> +'option no-haves' {'true'|'false'}::
> + Do not send "have" lines.
> +
>  SEE ALSO
>  
>  linkgit:git-remote[1]
> diff --git a/Documentation/rev-list-options.txt 
> b/Documentation/rev-list-options.txt
> index c84e46522..2beffe320 100644
> --- a/Documentation/rev-list-options.txt
> +++ b/Documentation/rev-list-options.txt
> @@ -730,7 +730,7 @@ specification contained in .
>   Only useful with `--filter=`; prints a list of the omitted objects.
>   Object IDs are prefixed with a ``~'' character.
>  
> ---missing=(error|allow-any|print)::
> +--missing=(error|allow-any|allow-promisor|print)::
>   Specifies how missing objects are handled.  The repository may
>   have missing objects after a partial clone, for example.
>  +
> @@ -741,10 +741,20 @@ The value 'allow-any' will allow object traversal to 
> continue if a
>  missing object is encountered.  Missing objects will silently be omitted
>  from the results.
>  +
> +The value 'allow-promisor' is like 'allow-any' in that it will allow
> +object traversal to continue, but only for EXPECTED missing objects.
> ++
>  The value 'print' is like 'allow-any', but will also print a list of the
>  missing objects.  Object IDs are prefixed with a ``?'' character.
>  endif::git-rev-list[]
>  
> +--exclude-promisor-objects::
> + (For internal use only.)  Prefilter object traversal at
> + promisor boundary.  This is used with partial clone.  This is
> + stronger than `--missing=allow-promisor` because it limits the
> + traversal, rather than just silencing errors about missing
> + objects.
> +
>  --no-walk[=(sorted|unsorted)]::
>   Only show the given commits, but do not traverse their ancestors.
>   This has no effect if a range is specified. If the argument
> diff --git a/Documentation/technical/repository-version.txt 
> b/Documentation/technical/repository-version.txt
> index 074ccb9e0..e03eacceb 100644
> --- a/Documentation/technical/repository-version.txt
> +++ b/Documentation/technical/repository-version.txt
> @@ -87,14 +87,14 @@ When the config key `extensions.preciousObjects` is set 
> to `true`,
>  objects in the repository MUST NOT be deleted (e.g., by `git-prune` or
>  `git repack -d`).
>  
> -`partialClone`
> +`partialclone`
>  ~~
>  
> -When the config key `extensions.partialClone` is set, a remote is
> -designated as a "promisor remote". Objects referenced by packed objects
> -obtained from that promisor remote do not need to be in the local repo.
> -Instead, the promisor remote promises that all such objects can be
> -fetched from it in the 

Re: [PATCH v4 4/6] list-objects: filter objects in traverse_commit_list

2017-11-16 Thread Jeff Hostetler



On 11/16/2017 3:21 PM, Jonathan Tan wrote:

On Thu, 16 Nov 2017 18:07:41 +
Jeff Hostetler  wrote:


+/*
+ * Return 1 if the given string needs armoring because of "special"
+ * characters that may cause injection problems when a command passes
+ * the argument to a subordinate command (such as when upload-pack
+ * launches pack-objects).
+ *
+ * The usual alphanumeric and key punctuation do not trigger it.
+ */
+static int arg_needs_armor(const char *arg)


First of all, about the injection problem, replying to your previous e-mail
[1]:

https://public-inbox.org/git/61855872-221b-0e97-abaa-24a011ad8...@jeffhostetler.com/


I couldn't use quote.[ch] because it is more concerned with
quoting pathnames because of LF and CR characters within
them -- rather than semicolons and quotes and the like which
I was concerned about.


sq_quote_buf() (or one of the other similarly-named functions) should
solve this problem, right? The single quotes around the argument takes
care of LF, CR, and semicolons, and things like backslashes and quotes
are taken care of as documented.

I don't think we need to invent another encoding to solve this.


I'll take another look, sq_quote_buf() looks like it might work.
I was looking at quote_c_style() and that didn't seem right for
my needs.  Thanks.





+{
+   const unsigned char *p;
+
+   for (p = (const unsigned char *)arg; *p; p++) {
+   if (*p >= 'a' && *p <= 'z')
+   continue;
+   if (*p >= 'A' && *p <= 'Z')
+   continue;
+   if (*p >= '0' && *p <= '9')
+   continue;
+   if (*p == '-' || *p == '_' || *p == '.' || *p == '/')
+   continue;


If we do take this approach, can ':' also be included?


Sure.  I just picked the common ones.




+   if (skip_prefix(arg, "sparse:oid=", )) {
+   struct object_context oc;
+   struct object_id sparse_oid;
+
+   /*
+* Try to parse  into an OID for the current
+* command, but DO NOT complain if we don't have the blob or
+* ref locally.
+*/
+   if (!get_oid_with_context(v0, GET_OID_BLOB,
+ _oid, ))
+   filter_options->sparse_oid_value = oiddup(_oid);
+   filter_options->choice = LOFC_SPARSE_OID;
+   if (arg_needs_armor(v0))
+   filter_options->requires_armor = v0 - arg;
+   return 0;
+   }


In your previous e-mail, you mentioned:


yes.  I always pass filter_options.raw_value over the wire.
The code above tries to parse it and put it in an OID for
private use by the current process -- just like the size limit
value in the blob:limit filter.


So I think this function should complain if you don't have the blob or
ref locally. (I envision that if a filter string is to be directly sent
to a server, it should be stored as a string, not processed by this
function first.)


The whole point was for clone to be able to ask the server to use
a known sparse-checkout spec, based on a branch name and a pathname
within the tree.  That's more usable.  My expectation was that users
could keep one or more sparse-checkout specs in the tree based upon
the area/subset of the tree they want to work on.  Then do a partial
clone based upon the desired subset.  Granted, we make them publish
OIDs (outside of git) and then make the user request the partial
clone by OID, but that's awkward.

There are 2 uses of the "--filter=..." parser -- one to actually use
the data within the current process (rev-list and pack-objects) and
one to capture the value and pass it on to another process (fetch-pack
and upload-pack).  The former want the fields fully decoded into
actual variables and the latter want the raw string to pass on.
The parser routine provides both values for its callers.  (In an
earlier draft, the parser would replace the raw_value of a sparse
spec with the actual OID when it had it locally, but I took that
out based on your earlier comments.  Now, the raw_value is always
passed by the second type of usages.)

Jeff



Re: [PATCH 12/30] directory rename detection: miscellaneous testcases to complete coverage

2017-11-16 Thread Elijah Newren
On Wed, Nov 15, 2017 at 12:03 PM, Stefan Beller  wrote:
> On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren  wrote:
>
>> +# Testcase 9d, N-fold transitive rename?
>> +#   (Related to testcase 9c...and 1c and 7e)
>> +#   Commit A: z/a, y/b, x/c, w/d, v/e, u/f
>> +#   Commit B:  y/{a,b},  w/{c,d},  u/{e,f}
>> +#   Commit C: z/{a,t}, x/{b,c}, v/{d,e}, u/f
>> +#   Expected: 
>> +#
>> +#   NOTE: z/ -> y/ (in commit B)
>> +# y/ -> x/ (in commit C)
>> +# x/ -> w/ (in commit B)
>> +# w/ -> v/ (in commit C)
>> +# v/ -> u/ (in commit B)
>> +# So, if we add a file to z, say z/t, where should it end up?  In u?
>> +# What if there's another file or directory named 't' in one of the
>> +# intervening directories and/or in u itself?  Also, shouldn't the
>> +# same logic that places 't' in u/ also move ALL other files to u/?
>> +# What if there are file or directory conflicts in any of them?  If
>> +# we attempted to do N-way (N-fold? N-ary? N-uple?) transitive 
>> renames
>> +# like this, would the user have any hope of understanding any
>> +# conflicts or how their working tree ended up?  I think not, so I'm
>> +# ruling out N-ary transitive renames for N>1.
>> +#
>> +#   Therefore our expected result is:
>> +# z/t, y/a, x/b, w/c, u/d, u/e, u/f
>> +#   The reason that v/d DOES get transitively renamed to u/d is that u/ 
>> isn't
>> +#   renamed somewhere.  A slightly sub-optimal result, but it uses fairly
>> +#   simple rules that are consistent with what we need for all the other
>> +#   testcases and simplifies things for the user.
>
> Does the merge order matter here?

No.

> If B and C were swapped, applying the same logic presented in the NOTE,
> one could argue that we expect:
>
> z/t y/a x/b w/c v/d v/e u/f
>
> I can make a strong point for y/a here, but the v/{d,e} also seem to deviate.

I don't understand; I thought my argument as presented was agnostic of
direction.  Perhaps I have an unstated assumption I'm not realizing or
something; could you explain how my logic above could lead to this
conclusion?

Also, let me try a different tack to see if it's clearer than the
above argument I made.  Looking at each path:

* z/t from commit C does not get renamed to y/t despite B's rename of
z/ -> y/ because C renamed y/ elsewhere.
* z/a from commit A was renamed to y/a in commit B.  We do not
transitively rename further from y/a to x/a (despite C's rename of y/
to x/) because B renamed x/ elsewhere.
* y/b from commit A was renamed to x/b in commit C.  We do not
transitively rename further from x/b to w/b (despite B's rename of x/
to w/) because C renamed w/ elsewhere.
* x/c from commit A was renamed to w/c in commit B.  We do not
transitively rename further from w/c to v/c (despite C's rename from
w/ to v/) because B renamed v/ elsewhere.
* w/d from commit A was renamed to v/d in commit C.  We DO
transitively rename from v/d to u/d because of B's rename of v/ to u/
and because C did not rename u/ to somewhere else.

(And, to complete the list, e and f are simple: v/e is renamed to u/e
in commit B, and there's no directory name on u on either side, so
there's no special logic needed at all.  u/f is even simpler; there's
no renames or directory renames or anything affecting it.)


>> +# Testcase 9e, N-to-1 whammo
>> +#   (Related to testcase 9c...and 1c and 7e)
>> +#   Commit A: dir1/{a,b}, dir2/{d,e}, dir3/{g,h}, dirN/{j,k}
>> +#   Commit B: dir1/{a,b,c,yo}, dir2/{d,e,f,yo}, dir3/{g,h,i,yo}, 
>> dirN/{j,k,l,yo}
>> +#   Commit C: combined/{a,b,d,e,g,h,j,k}
>> +#   Expected: combined/{a,b,c,d,e,f,g,h,i,j,k,l}, CONFLICT(Nto1) warnings,
>> +# dir1/yo, dir2/yo, dir3/yo, dirN/yo
>
> Very neat!

:-)

>> +# Testcase 9f, Renamed directory that only contained immediate subdirs
>> +#   (Related to testcases 1e & 9g)
>> +#   Commit A: goal/{a,b}/$more_files
>> +#   Commit B: priority/{a,b}/$more_files
>> +#   Commit C: goal/{a,b}/$more_files, goal/c
>> +#   Expected: priority/{a,b}/$more_files, priority/c
>
>> +# Testcase 9g, Renamed directory that only contained immediate subdirs, 
>> immediate subdirs renamed
>> +#   (Related to testcases 1e & 9f)
>> +#   Commit A: goal/{a,b}/$more_files
>> +#   Commit B: priority/{alpha,bravo}/$more_files
>> +#   Commit C: goal/{a,b}/$more_files, goal/c
>> +#   Expected: priority/{alpha,bravo}/$more_files, priority/c
>
> and if C also added goal/a/another_file, we'd expect it to
> become priority/alpha/another_file.

Yep!  I thought that was covered enough by other tests, but do you
feel I should add that to this testcase?

> What happens in moving dir hierarchies?
>
> A: root/node1/{leaf1, leaf2}, root/node2/{leaf3, leaf4}
> B: "Move node2 one layer down into node1"
> root/node1/{leaf1, leaf2, node2/{leaf3, leaf4}}
> C: "Add more leaves"
> root/node1/{leaf1, leaf2, leaf5}, root/node2/{leaf3, leaf4, leaf6}

Works just fine; 

Re: Git on Mac - Segmentation fault:11

2017-11-16 Thread Kevin
cc: mailinglist

On Thu, Nov 16, 2017 at 9:40 PM, Frank Burkitt  wrote:
> Kevin -
>
> Thank you for getting back to me.
>
> The version of Git is 2.15.0
> I used Brew to install it
> I am not getting any segfaults from other apps
> When I do a ‘git init’ I get a Segmentation fault: 11 whether I do it in or
> outside of a virtualenv
>
> Is there any additional info that would help?
>
> Frank
>
>
> Frank Burkitt
> Mobile: 307-699-1321
> Email: fburk...@burkitt.com
>
>
> "If you can't explain it simply, you don't understand it well enough" -
> Albert Einstein
>
>
>
>
>
> On Nov 16, 2017, at 1:34 PM, Kevin Daudt  wrote:
>
> On Wed, Nov 15, 2017 at 05:30:23PM -0700, Frank Burkitt wrote:
>
> I am using Git on a Macbook pro with MacOS High Sierra version 10.13.1
> (17B48).  I have been using it in a virtualenv with python 3.  I have
> begun to get "Segmentation fault: 11" with every git command.  I have
> been searching for a reason why this is occurring but have not been
> able to find a solution.  I have reinstalled the operating system,
> uninstalled and reinstalled Git, and a variety of other attempts at
> finding a solution.  Is this a know issue? Any suggestions would be
> appreciated.
>
>
> Hello Frank,
>
> Could you provide a bit more information?
>
> - What version of git are you using?
> - how did you install git,
> - Do you also get segfaults outside of the virtualenv?
>
> This sounds perhaps like it's a copattibility issue with a library.
>
> Kind regards, Kevin.
>
>


Re: [PATCH v4 5/6] rev-list: add list-objects filtering support

2017-11-16 Thread Jonathan Tan
On Thu, 16 Nov 2017 18:07:42 +
Jeff Hostetler  wrote:

> From: Jeff Hostetler 
> 
> Teach rev-list to use the filtering provided by the
> traverse_commit_list_filtered() interface to omit
> unwanted objects from the result.  This feature is
> intended to help with partial clone.
> 
> Object filtering is only allowed when one of the "--objects*"
> options are used.
> 
> When the "--filter-print-omitted" option is used, the omitted
> objects are printed at the end.  These are marked with a "~".
> This option can be combined with "--quiet" to get a list of
> just the omitted objects.
> 
> Added "--missing=(error|print|omit)" argument to specify how
> rev-list should behave when it encounters a missing object
> (presumably from a prior partial clone).
> 
> When "--missing=print" is used, rev-list will print a list of
> any missing objects that should have been included in the output.
> These are marked with a "?".
> 
> Add t6112 test.

The patch itself looks good, except that I have a nagging feeling about
the usage of the term "partial clone" in the commit message,
documentation, and test description. I feel that the usage here leads
one to believe that partial clones haphazardly leave repositories
without random objects (and at the point that this patch is merged,
there will not be any patch in the main repo contradicting this
viewpoint), contrary to the fact that we will have a tracking mechanism
to track which missing objects are expected to be missing. (If I'm the
only one feeling this way, though, then I'll just let it slide.)

If it were up to me, I would remove all existing mentions of "partial
clone" and explain the presence of the "--missing" argument as follows:

In the future, we will introduce a "partial clone" mechanism wherein
an object in a repo, obtained from a remote, may reference a missing
object that can be dynamically fetched from that remote once needed.
This "partial clone" mechanism will have a way, sometimes slow, of
determining if a missing link is one of the links expected to be
produced by this mechanism.

This patch introduces handling of missing objects to help debugging
and development of the "partial clone" mechanism, and once the
mechanism is implemented, for a power user to perform operations
that are missing-object-aware without incurring the cost of checking
if a missing link is expected.


Re: Git on Mac - Segmentation fault:11

2017-11-16 Thread Kevin Daudt
On Wed, Nov 15, 2017 at 05:30:23PM -0700, Frank Burkitt wrote:
> I am using Git on a Macbook pro with MacOS High Sierra version 10.13.1
> (17B48).  I have been using it in a virtualenv with python 3.  I have
> begun to get "Segmentation fault: 11" with every git command.  I have
> been searching for a reason why this is occurring but have not been
> able to find a solution.  I have reinstalled the operating system,
> uninstalled and reinstalled Git, and a variety of other attempts at
> finding a solution.  Is this a know issue? Any suggestions would be
> appreciated.

Hello Frank,

Could you provide a bit more information?

- What version of git are you using?
- how did you install git,
- Do you also get segfaults outside of the virtualenv?

This sounds perhaps like it's a copattibility issue with a library.

Kind regards, Kevin.



Re: [PATCH v4 4/6] list-objects: filter objects in traverse_commit_list

2017-11-16 Thread Jonathan Tan
On Thu, 16 Nov 2017 18:07:41 +
Jeff Hostetler  wrote:

> +/*
> + * Return 1 if the given string needs armoring because of "special"
> + * characters that may cause injection problems when a command passes
> + * the argument to a subordinate command (such as when upload-pack
> + * launches pack-objects).
> + *
> + * The usual alphanumeric and key punctuation do not trigger it.
> + */ 
> +static int arg_needs_armor(const char *arg)

First of all, about the injection problem, replying to your previous e-mail
[1]:

https://public-inbox.org/git/61855872-221b-0e97-abaa-24a011ad8...@jeffhostetler.com/

> I couldn't use quote.[ch] because it is more concerned with
> quoting pathnames because of LF and CR characters within
> them -- rather than semicolons and quotes and the like which
> I was concerned about.

sq_quote_buf() (or one of the other similarly-named functions) should
solve this problem, right? The single quotes around the argument takes
care of LF, CR, and semicolons, and things like backslashes and quotes
are taken care of as documented.

I don't think we need to invent another encoding to solve this.

> +{
> + const unsigned char *p;
> +
> + for (p = (const unsigned char *)arg; *p; p++) {
> + if (*p >= 'a' && *p <= 'z')
> + continue;
> + if (*p >= 'A' && *p <= 'Z')
> + continue;
> + if (*p >= '0' && *p <= '9')
> + continue;
> + if (*p == '-' || *p == '_' || *p == '.' || *p == '/')
> + continue;

If we do take this approach, can ':' also be included?

> + if (skip_prefix(arg, "sparse:oid=", )) {
> + struct object_context oc;
> + struct object_id sparse_oid;
> +
> + /*
> +  * Try to parse  into an OID for the current
> +  * command, but DO NOT complain if we don't have the blob or
> +  * ref locally.
> +  */
> + if (!get_oid_with_context(v0, GET_OID_BLOB,
> +   _oid, ))
> + filter_options->sparse_oid_value = oiddup(_oid);
> + filter_options->choice = LOFC_SPARSE_OID;
> + if (arg_needs_armor(v0))
> + filter_options->requires_armor = v0 - arg;
> + return 0;
> + }

In your previous e-mail, you mentioned:

> yes.  I always pass filter_options.raw_value over the wire.
> The code above tries to parse it and put it in an OID for
> private use by the current process -- just like the size limit
> value in the blob:limit filter.

So I think this function should complain if you don't have the blob or
ref locally. (I envision that if a filter string is to be directly sent
to a server, it should be stored as a string, not processed by this
function first.)


Re: [PATCH v4 07/10] introduce fetch-object: fetch one promisor object

2017-11-16 Thread Ramsay Jones


On 16/11/17 18:12, Jeff Hostetler wrote:
> From: Jonathan Tan 
> 
> Introduce fetch-object, providing the ability to fetch one object from a
> promisor remote.
> 
> This uses fetch-pack. To do this, the transport mechanism has been
> updated with 2 flags, "from-promisor" to indicate that the resulting
> pack comes from a promisor remote (and thus should be annotated as such
> by index-pack), and "no-haves" to suppress the sending of "have" lines.
> 
> This will be tested in a subsequent commit.
> 
> NEEDSWORK: update this when we have more information about protocol v2,
> which should allow a way to suppress the ref advertisement and
> officially allow any object type to be "want"-ed.
> 
> Signed-off-by: Jonathan Tan 
> ---
>  Documentation/gitremote-helpers.txt |  6 ++
>  Makefile|  1 +
>  builtin/fetch-pack.c|  8 
>  builtin/index-pack.c| 16 +---
>  fetch-object.c  | 23 +++
>  fetch-object.h  |  6 ++
>  fetch-pack.c|  8 ++--
>  fetch-pack.h|  2 ++
>  remote-curl.c   | 14 +-
>  transport.c |  8 
>  transport.h |  8 
>  11 files changed, 94 insertions(+), 6 deletions(-)
>  create mode 100644 fetch-object.c
>  create mode 100644 fetch-object.h
> 
[snip]
> diff --git a/fetch-object.c b/fetch-object.c
> new file mode 100644
> index 000..f89dbba
> --- /dev/null
> +++ b/fetch-object.c
> @@ -0,0 +1,23 @@
> +#include "cache.h"
> +#include "packfile.h"
> +#include "pkt-line.h"
> +#include "strbuf.h"
> +#include "transport.h"

I note that this still does not #include "fetch_object.h".
[If you recall, this suppresses a sparse warning].

> +
> +void fetch_object(const char *remote_name, const unsigned char *sha1)
> +{
> + struct remote *remote;
> + struct transport *transport;
> + struct ref *ref;
> +
> + remote = remote_get(remote_name);
> + if (!remote->url[0])
> + die(_("Remote with no URL"));
> + transport = transport_get(remote, remote->url[0]);
> +
> + ref = alloc_ref(sha1_to_hex(sha1));
> + hashcpy(ref->old_oid.hash, sha1);
> + transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
> + transport_set_option(transport, TRANS_OPT_NO_HAVES, "1");
> + transport_fetch_refs(transport, ref);
> +}
> diff --git a/fetch-object.h b/fetch-object.h
> new file mode 100644
> index 000..f371300
> --- /dev/null
> +++ b/fetch-object.h
> @@ -0,0 +1,6 @@
> +#ifndef FETCH_OBJECT_H
> +#define FETCH_OBJECT_H
> +
> +extern void fetch_object(const char *remote_name, const unsigned char *sha1);
> +
> +#endif

ATB,
Ramsay Jones




КЛИЕНТСКИЕ БАЗЫ!!! Подробнее: prodawez...@gmail.com Узнайте подробности!

2017-11-16 Thread uqbkcaqr...@vger.kernel.org
KLIENTSKIE BAZI!!! Podrobnee: prodawez...@gmail.com Uznaite podrobnosti!


Re: [PATCHv5 7/7] builtin/describe.c: describe a blob

2017-11-16 Thread Stefan Beller
On Wed, Nov 15, 2017 at 7:24 PM, Junio C Hamano  wrote:

> I am not sure if "And if there is ..." is adding much value here (I
> do not think it is even technically correct for that matter).  If
> there are more than one tag that point at the commit the user is
> interested in, we use one of the tags, as tags conceptually sit at a
> higher level.  And we use a heuristic to use one or the other tag to
> make up a name for the commit, so the same commit can have two valid
> names. ---So what?  Neither of these two valid names is "ambigous";
> the commit object the user wanted to name _is_ correctly identified
> (I would assume that we are not discussing a hash collision).
>
> Lucikly, if we remove "And if...precisely", the logic still flows
> nicely, if not more, to the next paragraph.

fixed.

>> When describing a blob, we want to describe the blob from a higher layer
>> as well, which is a tuple of (commit, deep/path) as the tree objects
>> involved are rather uninteresting.  The same blob can be referenced by
>> multiple commits, so how we decide which commit to use?  This patch
>> implements a rather naive approach on this: As there are no back pointers
>> from blobs to commits in which the blob occurs, we'll start walking from
>> any tips available, listing the blobs in-order of the commit and once we
>
> Is "any tips" still the case?  I was wondering why you start your
> traversal at HEAD and nothing else in this iteration.  There seems
> to be no mention of this design decision in the documentation and no
> justification in the log.

fixed the text. The design decision to reverse walk HEAD is tied to
your observation below:

> The reversing may improve the chance of an older commit to be chosen
> rather than the newer one, but it does not even guarantee to show the
> "introduction".

This is what I realized when we started walking all refs including reflog.
The introduction can only be found when we take the commit-parents
into account and look at the diffs. I noticed you started
origin/jc/diff-blobfind
which may be helpful to find the introduction correctly, which I'll play around
with that and see if I can get that working.

> What this guarantees is that a long history will be traversed fully
> before we start considering which commit has the blob of interest, I
> am afraid.  Is this a sensible trade-off?

I am not fully convinced all descriptions are in recent history, but I
tend to agree that most are, so probably the trade off is a wash.


[PATCH] apply: update line lengths for --inaccurate-eof

2017-11-16 Thread René Scharfe
Some diff implementations don't report missing newlines at the end of
files.  Applying such a patch can cause a newline character to be
added inadvertently.  The option --inaccurate-eof of git apply can be
used to remove trailing newlines if needed.

apply_one_fragment() cuts it off from the buffers for preimage and
postimage.  Before it does, it builds an array with the lengths of each
line for both.  Make sure to update the length of the last line in
these line info structures as well to keep them consistent with their
respective buffer.

Without this fix the added test fails; git apply dies and reports:

   fatal: BUG: caller miscounted postlen: asked 1, orig = 1, used = 2

That sanity check is only called if whitespace changes are ignored.

Reported-by: Mahmoud Al-Qudsi 
Signed-off-by: Rene Scharfe 
---
 apply.c|  2 ++
 t/t4107-apply-ignore-whitespace.sh | 14 ++
 2 files changed, 16 insertions(+)

diff --git a/apply.c b/apply.c
index b8087bd29c..321a9fa68d 100644
--- a/apply.c
+++ b/apply.c
@@ -2953,6 +2953,8 @@ static int apply_one_fragment(struct apply_state *state,
newlines.len > 0 && newlines.buf[newlines.len - 1] == '\n') {
old--;
strbuf_setlen(, newlines.len - 1);
+   preimage.line_allocated[preimage.nr - 1].len--;
+   postimage.line_allocated[postimage.nr - 1].len--;
}
 
leading = frag->leading;
diff --git a/t/t4107-apply-ignore-whitespace.sh 
b/t/t4107-apply-ignore-whitespace.sh
index 9e29b5262d..ac72eeaf27 100755
--- a/t/t4107-apply-ignore-whitespace.sh
+++ b/t/t4107-apply-ignore-whitespace.sh
@@ -178,4 +178,18 @@ test_expect_success 'patch5 fails 
(--no-ignore-whitespace)' '
test_must_fail git apply --no-ignore-whitespace patch5.patch
 '
 
+test_expect_success 'apply --ignore-space-change --inaccurate-eof' '
+   echo 1 >file &&
+   git apply --ignore-space-change --inaccurate-eof <<-\EOF &&
+   diff --git a/file b/file
+   --- a/file
+   +++ b/file
+   @@ -1 +1 @@
+   -1
+   +2
+   EOF
+   printf 2 >expect &&
+   test_cmp expect file
+'
+
 test_done
-- 
2.15.0


[PATCH v4 06/15] pack-objects: test support for blob filtering

2017-11-16 Thread Jeff Hostetler
From: Jonathan Tan 

As part of an effort to improve Git support for very large repositories
in which clients typically have only a subset of all version-controlled
blobs, test pack-objects support for --filter=blob:limit=, packing only
blobs not exceeding that size unless the blob corresponds to a file
whose name starts with ".git". upload-pack will eventually be taught to
use this new parameter if needed to exclude certain blobs during a fetch
or clone, potentially drastically reducing network consumption when
serving these very large repositories.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 t/t5300-pack-object.sh  | 26 ++
 t/test-lib-functions.sh | 12 
 2 files changed, 38 insertions(+)

diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
index 9c68b99..8e3db12 100755
--- a/t/t5300-pack-object.sh
+++ b/t/t5300-pack-object.sh
@@ -457,6 +457,32 @@ test_expect_success !PTHREADS,C_LOCALE_OUTPUT 
'pack-objects --threads=N or pack.
grep -F "no threads support, ignoring pack.threads" err
 '
 
+lcut () {
+   perl -e '$/ = undef; $_ = <>; s/^.{'$1'}//s; print $_'
+}
+
+test_expect_success 'filtering by size works with multiple excluded' '
+   rm -rf server &&
+   git init server &&
+   printf a > server/a &&
+   printf b > server/b &&
+   printf c-very-long-file > server/c &&
+   printf d-very-long-file > server/d &&
+   git -C server add a b c d &&
+   git -C server commit -m x &&
+
+   git -C server rev-parse HEAD >objects &&
+   git -C server pack-objects --revs --stdout --filter=blob:limit=10 
my.pack &&
+
+   # Ensure that only the small blobs are in the packfile
+   git index-pack my.pack &&
+   git verify-pack -v my.idx >objectlist &&
+   grep $(git hash-object server/a) objectlist &&
+   grep $(git hash-object server/b) objectlist &&
+   ! grep $(git hash-object server/c) objectlist &&
+   ! grep $(git hash-object server/d) objectlist
+'
+
 #
 # WARNING!
 #
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 1701fe2..07b79c7 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1020,3 +1020,15 @@ nongit () {
"$@"
)
 }
+
+# Converts big-endian pairs of hexadecimal digits into bytes. For example,
+# "printf 61620d0a | hex_pack" results in "ab\r\n".
+hex_pack () {
+   perl -e '$/ = undef; $input = <>; print pack("H*", $input)'
+}
+
+# Converts bytes into big-endian pairs of hexadecimal digits. For example,
+# "printf 'ab\r\n' | hex_unpack" results in "61620d0a".
+hex_unpack () {
+   perl -e '$/ = undef; $input = <>; print unpack("H2" x length($input), 
$input)'
+}
-- 
2.9.3



[PATCH v4 14/15] unpack-trees: batch fetching of missing blobs

2017-11-16 Thread Jeff Hostetler
From: Jonathan Tan 

When running checkout, first prefetch all blobs that are to be updated
but are missing. This means that only one pack is downloaded during such
operations, instead of one per missing blob.

This operates only on the blob level - if a repository has a missing
tree, they are still fetched one at a time.

This does not use the delayed checkout mechanism introduced in commit
2841e8f ("convert: add "status=delayed" to filter process protocol",
2017-06-30) due to significant conceptual differences - in particular,
for partial clones, we already know what needs to be fetched based on
the contents of the local repo alone, whereas for status=delayed, it is
the filter process that tells us what needs to be checked in the end.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 fetch-object.c   | 27 +++
 fetch-object.h   |  5 +
 t/t5601-clone.sh | 52 
 unpack-trees.c   | 22 ++
 4 files changed, 102 insertions(+), 4 deletions(-)

diff --git a/fetch-object.c b/fetch-object.c
index 369b61c..21b4dfa 100644
--- a/fetch-object.c
+++ b/fetch-object.c
@@ -3,12 +3,12 @@
 #include "pkt-line.h"
 #include "strbuf.h"
 #include "transport.h"
+#include "fetch-object.h"
 
-void fetch_object(const char *remote_name, const unsigned char *sha1)
+static void fetch_refs(const char *remote_name, struct ref *ref)
 {
struct remote *remote;
struct transport *transport;
-   struct ref *ref;
int original_fetch_if_missing = fetch_if_missing;
 
fetch_if_missing = 0;
@@ -17,10 +17,29 @@ void fetch_object(const char *remote_name, const unsigned 
char *sha1)
die(_("Remote with no URL"));
transport = transport_get(remote, remote->url[0]);
 
-   ref = alloc_ref(sha1_to_hex(sha1));
-   hashcpy(ref->old_oid.hash, sha1);
transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
transport_set_option(transport, TRANS_OPT_NO_HAVES, "1");
transport_fetch_refs(transport, ref);
fetch_if_missing = original_fetch_if_missing;
 }
+
+void fetch_object(const char *remote_name, const unsigned char *sha1)
+{
+   struct ref *ref = alloc_ref(sha1_to_hex(sha1));
+   hashcpy(ref->old_oid.hash, sha1);
+   fetch_refs(remote_name, ref);
+}
+
+void fetch_objects(const char *remote_name, const struct oid_array *to_fetch)
+{
+   struct ref *ref = NULL;
+   int i;
+
+   for (i = 0; i < to_fetch->nr; i++) {
+   struct ref *new_ref = alloc_ref(oid_to_hex(_fetch->oid[i]));
+   oidcpy(_ref->old_oid, _fetch->oid[i]);
+   new_ref->next = ref;
+   ref = new_ref;
+   }
+   fetch_refs(remote_name, ref);
+}
diff --git a/fetch-object.h b/fetch-object.h
index f371300..4b269d0 100644
--- a/fetch-object.h
+++ b/fetch-object.h
@@ -1,6 +1,11 @@
 #ifndef FETCH_OBJECT_H
 #define FETCH_OBJECT_H
 
+#include "sha1-array.h"
+
 extern void fetch_object(const char *remote_name, const unsigned char *sha1);
 
+extern void fetch_objects(const char *remote_name,
+ const struct oid_array *to_fetch);
+
 #endif
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 6d37c6d..13610b7 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -611,6 +611,58 @@ test_expect_success 'partial clone: warn if server does 
not support object filte
test_i18ngrep "filtering not recognized by server" err
 '
 
+test_expect_success 'batch missing blob request during checkout' '
+   rm -rf server client &&
+
+   test_create_repo server &&
+   echo a >server/a &&
+   echo b >server/b &&
+   git -C server add a b &&
+
+   git -C server commit -m x &&
+   echo aa >server/a &&
+   echo bb >server/b &&
+   git -C server add a b &&
+   git -C server commit -m x &&
+
+   test_config -C server uploadpack.allowfilter 1 &&
+   test_config -C server uploadpack.allowanysha1inwant 1 &&
+
+   git clone --filter=blob:limit=0 "file://$(pwd)/server" client &&
+
+   # Ensure that there is only one negotiation by checking that there is
+   # only "done" line sent. ("done" marks the end of negotiation.)
+   GIT_TRACE_PACKET="$(pwd)/trace" git -C client checkout HEAD^ &&
+   grep "git> done" trace >done_lines &&
+   test_line_count = 1 done_lines
+'
+
+test_expect_success 'batch missing blob request does not inadvertently try to 
fetch gitlinks' '
+   rm -rf server client &&
+
+   test_create_repo repo_for_submodule &&
+   test_commit -C repo_for_submodule x &&
+
+   test_create_repo server &&
+   echo a >server/a &&
+   echo b >server/b &&
+   git -C server add a b &&
+   git -C server commit -m x &&
+
+   echo aa >server/a &&
+   echo bb >server/b &&
+   # Also add a gitlink pointing to an arbitrary repository

[PATCH v4 12/15] t5601: test for partial clone

2017-11-16 Thread Jeff Hostetler
From: Jonathan Tan 

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 builtin/clone.c  | 15 ---
 t/t5601-clone.sh | 49 +
 2 files changed, 61 insertions(+), 3 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index fceb9e7..5719690 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -889,6 +889,8 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
struct refspec *refspec;
const char *fetch_pattern;
 
+   fetch_if_missing = 0;
+
packet_trace_identity("clone");
argc = parse_options(argc, argv, prefix, builtin_clone_options,
 builtin_clone_usage, 0);
@@ -1109,11 +,13 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
transport_set_option(transport, TRANS_OPT_UPLOADPACK,
 option_upload_pack);
 
-   if (filter_options.choice)
+   if (filter_options.choice) {
transport_set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER,
 filter_options.raw_value);
+   transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
+   }
 
-   if (transport->smart_options && !deepen)
+   if (transport->smart_options && !deepen && !filter_options.choice)
transport->smart_options->check_self_contained_and_connected = 
1;
 
refs = transport_get_remote_refs(transport);
@@ -1173,13 +1177,17 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
write_refspec_config(src_ref_prefix, our_head_points_at,
remote_head_points_at, _top);
 
+   if (filter_options.choice)
+   partial_clone_register("origin", _options);
+
if (is_local)
clone_local(path, git_dir);
else if (refs && complete_refs_before_fetch)
transport_fetch_refs(transport, mapped_refs);
 
update_remote_refs(refs, mapped_refs, remote_head_points_at,
-  branch_top.buf, reflog_msg.buf, transport, 
!is_local);
+  branch_top.buf, reflog_msg.buf, transport,
+  !is_local && !filter_options.choice);
 
update_head(our_head_points_at, remote_head, reflog_msg.buf);
 
@@ -1200,6 +1208,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
}
 
junk_mode = JUNK_LEAVE_REPO;
+   fetch_if_missing = 1;
err = checkout(submodule_progress);
 
strbuf_release(_msg);
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 9c56f77..6d37c6d 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -571,4 +571,53 @@ test_expect_success 'GIT_TRACE_PACKFILE produces a usable 
pack' '
git -C replay.git index-pack -v --stdin  err &&
+
+   test_i18ngrep "filtering not recognized by server" err
+'
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'partial clone using HTTP' '
+   partial_clone "$HTTPD_DOCUMENT_ROOT_PATH/server" 
"$HTTPD_URL/smart/server"
+'
+
+stop_httpd
+
 test_done
-- 
2.9.3



[PATCH v4 15/15] fetch-pack: restore save_commit_buffer after use

2017-11-16 Thread Jeff Hostetler
From: Jonathan Tan 

In fetch-pack, the global variable save_commit_buffer is set to 0, but
not restored to its original value after use.

In particular, if show_log() (in log-tree.c) is invoked after
fetch_pack() in the same process, show_log() will return before printing
out the commit message (because the invocation to
get_cached_commit_buffer() returns NULL, because the commit buffer was
not saved). I discovered this when attempting to run "git log -S" in a
partial clone, triggering the case where revision walking lazily loads
missing objects.

Therefore, restore save_commit_buffer to its original value after use.

An alternative to solve the problem I had is to replace
get_cached_commit_buffer() with get_commit_buffer(). That invocation was
introduced in commit a97934d ("use get_cached_commit_buffer where
appropriate", 2014-06-13) to replace "commit->buffer" introduced in
commit 3131b71 ("Add "--show-all" revision walker flag for debugging",
2008-02-13). In the latter commit, the commit author seems to be
deciding between not showing an unparsed commit at all and showing an
unparsed commit without the message (which is what the commit does), and
did not mention parsing the unparsed commit, so I prefer to preserve the
existing behavior.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 fetch-pack.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/fetch-pack.c b/fetch-pack.c
index 895e8f9..121f03e 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -717,6 +717,7 @@ static int everything_local(struct fetch_pack_args *args,
 {
struct ref *ref;
int retval;
+   int old_save_commit_buffer = save_commit_buffer;
timestamp_t cutoff = 0;
 
save_commit_buffer = 0;
@@ -784,6 +785,9 @@ static int everything_local(struct fetch_pack_args *args,
print_verbose(args, _("already have %s (%s)"), 
oid_to_hex(remote),
  ref->name);
}
+
+   save_commit_buffer = old_save_commit_buffer;
+
return retval;
 }
 
-- 
2.9.3



[PATCH v4 09/15] fetch-pack: test support excluding large blobs

2017-11-16 Thread Jeff Hostetler
From: Jonathan Tan 

Created tests to verify fetch-pack and upload-pack support
for excluding large blobs using --filter=blob:limit=
parameter.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 t/t5500-fetch-pack.sh | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index 80a1a32..c57916b 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -755,4 +755,31 @@ test_expect_success 'fetching deepen' '
)
 '
 
+test_expect_success 'filtering by size' '
+   rm -rf server client &&
+   test_create_repo server &&
+   test_commit -C server one &&
+   test_config -C server uploadpack.allowfilter 1 &&
+
+   test_create_repo client &&
+   git -C client fetch-pack --filter=blob:limit=0 ../server HEAD &&
+
+   # Ensure that object is not inadvertently fetched
+   test_must_fail git -C client cat-file -e $(git hash-object server/one.t)
+'
+
+test_expect_success 'filtering by size has no effect if support for it is not 
advertised' '
+   rm -rf server client &&
+   test_create_repo server &&
+   test_commit -C server one &&
+
+   test_create_repo client &&
+   git -C client fetch-pack --filter=blob:limit=0 ../server HEAD 2> err &&
+
+   # Ensure that object is fetched
+   git -C client cat-file -e $(git hash-object server/one.t) &&
+
+   test_i18ngrep "filtering not recognized by server" err
+'
+
 test_done
-- 
2.9.3



[PATCH v4 08/15] partial-clone: define partial clone settings in config

2017-11-16 Thread Jeff Hostetler
From: Jeff Hostetler 

Create get and set routines for partial clone settings in
the config.  These will be used by partial clone and fetch
to remember the promisor remote and the default filter-spec.

Signed-off-by: Jeff Hostetler 
---
 t/t5500-fetch-pack.sh | 27 ---
 1 file changed, 27 deletions(-)

diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index c57916b..80a1a32 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -755,31 +755,4 @@ test_expect_success 'fetching deepen' '
)
 '
 
-test_expect_success 'filtering by size' '
-   rm -rf server client &&
-   test_create_repo server &&
-   test_commit -C server one &&
-   test_config -C server uploadpack.allowfilter 1 &&
-
-   test_create_repo client &&
-   git -C client fetch-pack --filter=blob:limit=0 ../server HEAD &&
-
-   # Ensure that object is not inadvertently fetched
-   test_must_fail git -C client cat-file -e $(git hash-object server/one.t)
-'
-
-test_expect_success 'filtering by size has no effect if support for it is not 
advertised' '
-   rm -rf server client &&
-   test_create_repo server &&
-   test_commit -C server one &&
-
-   test_create_repo client &&
-   git -C client fetch-pack --filter=blob:limit=0 ../server HEAD 2> err &&
-
-   # Ensure that object is fetched
-   git -C client cat-file -e $(git hash-object server/one.t) &&
-
-   test_i18ngrep "filtering not recognized by server" err
-'
-
 test_done
-- 
2.9.3



[PATCH v4 13/15] t5500: more tests for partial clone and fetch

2017-11-16 Thread Jeff Hostetler
From: Jonathan Tan 

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 t/t5500-fetch-pack.sh | 60 +++
 1 file changed, 56 insertions(+), 4 deletions(-)

diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index 23702b5..c95bb7b 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -782,7 +782,7 @@ test_expect_success 'filtering by size has no effect if 
support for it is not ad
test_i18ngrep "filtering not recognized by server" err
 '
 
-fetch_blob_max_bytes () {
+setup_blob_max_bytes () {
  SERVER="$1"
  URL="$2"
 
@@ -794,7 +794,11 @@ fetch_blob_max_bytes () {
git clone "$URL" client &&
test_config -C client extensions.partialclone origin &&
 
-   test_commit -C "$SERVER" two &&
+   test_commit -C "$SERVER" two
+}
+
+do_blob_max_bytes() {
+   SERVER="$1" &&
 
git -C client fetch --filter=blob:limit=0 origin HEAD:somewhere &&
 
@@ -805,14 +809,62 @@ fetch_blob_max_bytes () {
 }
 
 test_expect_success 'fetch with filtering' '
-fetch_blob_max_bytes server server
+   setup_blob_max_bytes server server &&
+   do_blob_max_bytes server
+'
+
+test_expect_success 'fetch respects configured filtering' '
+   setup_blob_max_bytes server server &&
+
+   test_config -C client core.partialclonefilter blob:limit=0 &&
+
+   git -C client fetch origin HEAD:somewhere &&
+
+   # Ensure that commit is fetched, but blob is not
+   test_config -C client extensions.partialclone "arbitrary string" &&
+   git -C client cat-file -e $(git -C server rev-parse two) &&
+   test_must_fail git -C client cat-file -e $(git hash-object server/two.t)
+'
+
+test_expect_success 'pull respects configured filtering' '
+   setup_blob_max_bytes server server &&
+
+   # Hide two.t from tip so that client does not load it upon the
+   # automatic checkout that pull performs
+   git -C server rm two.t &&
+   test_commit -C server three &&
+
+   test_config -C server uploadpack.allowanysha1inwant 1 &&
+   test_config -C client core.partialclonefilter blob:limit=0 &&
+
+   git -C client pull origin &&
+
+   # Ensure that commit is fetched, but blob is not
+   test_config -C client extensions.partialclone "arbitrary string" &&
+   git -C client cat-file -e $(git -C server rev-parse two) &&
+   test_must_fail git -C client cat-file -e $(git hash-object server/two.t)
+'
+
+test_expect_success 'clone configures filtering' '
+   rm -rf server client &&
+   test_create_repo server &&
+   test_commit -C server one &&
+   test_commit -C server two &&
+   test_config -C server uploadpack.allowanysha1inwant 1 &&
+
+   git clone --filter=blob:limit=12345 server client &&
+
+   # Ensure that we can, for example, checkout HEAD^
+   rm -rf client/.git/objects/* &&
+   git -C client checkout HEAD^
 '
 
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 
 test_expect_success 'fetch with filtering and HTTP' '
-fetch_blob_max_bytes "$HTTPD_DOCUMENT_ROOT_PATH/server" 
"$HTTPD_URL/smart/server"
+   setup_blob_max_bytes "$HTTPD_DOCUMENT_ROOT_PATH/server" 
"$HTTPD_URL/smart/server" &&
+   do_blob_max_bytes "$HTTPD_DOCUMENT_ROOT_PATH/server"
 '
 
 stop_httpd
-- 
2.9.3



[PATCH v4 11/15] t5500: add fetch-pack tests for partial clone

2017-11-16 Thread Jeff Hostetler
From: Jonathan Tan 

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 t/t5500-fetch-pack.sh | 36 
 1 file changed, 36 insertions(+)

diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index c57916b..23702b5 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -782,4 +782,40 @@ test_expect_success 'filtering by size has no effect if 
support for it is not ad
test_i18ngrep "filtering not recognized by server" err
 '
 
+fetch_blob_max_bytes () {
+ SERVER="$1"
+ URL="$2"
+
+   rm -rf "$SERVER" client &&
+   test_create_repo "$SERVER" &&
+   test_commit -C "$SERVER" one &&
+   test_config -C "$SERVER" uploadpack.allowfilter 1 &&
+
+   git clone "$URL" client &&
+   test_config -C client extensions.partialclone origin &&
+
+   test_commit -C "$SERVER" two &&
+
+   git -C client fetch --filter=blob:limit=0 origin HEAD:somewhere &&
+
+   # Ensure that commit is fetched, but blob is not
+   test_config -C client extensions.partialclone "arbitrary string" &&
+   git -C client cat-file -e $(git -C "$SERVER" rev-parse two) &&
+   test_must_fail git -C client cat-file -e $(git hash-object 
"$SERVER/two.t")
+}
+
+test_expect_success 'fetch with filtering' '
+fetch_blob_max_bytes server server
+'
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'fetch with filtering and HTTP' '
+fetch_blob_max_bytes "$HTTPD_DOCUMENT_ROOT_PATH/server" 
"$HTTPD_URL/smart/server"
+'
+
+stop_httpd
+
+
 test_done
-- 
2.9.3



[PATCH v4 10/15] fetch: add from_promisor and exclude-promisor-objects parameters

2017-11-16 Thread Jeff Hostetler
From: Jonathan Tan 

Teach fetch to use from-promisor and exclude-promisor-objects
parameters with sub-commands.  Initialize fetch_if_missing
global variable.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 builtin/fetch.c | 61 -
 connected.c |  2 ++
 2 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index fb9af7c..d3cf423 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1047,9 +1047,11 @@ static struct transport *prepare_transport(struct remote 
*remote, int deepen)
set_option(transport, TRANS_OPT_DEEPEN_RELATIVE, "yes");
if (update_shallow)
set_option(transport, TRANS_OPT_UPDATE_SHALLOW, "yes");
-   if (filter_options.choice)
+   if (filter_options.choice) {
set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER,
   filter_options.raw_value);
+   set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
+   }
return transport;
 }
 
@@ -1287,6 +1289,59 @@ static int fetch_multiple(struct string_list *list)
return result;
 }
 
+static inline void fetch_one__setup_partial(struct remote *remote)
+{
+   int ppc, neq;
+
+   /* Do we have a prior partial clone/fetch? */
+   ppc = (repository_format_partial_clone &&
+  *repository_format_partial_clone);
+
+   /*
+* If no prior partial clone/fetch and partial fetch was NOT
+* requested now, do a normal fetch.
+*/
+   if (!ppc && !filter_options.choice)
+   return;
+
+   /*
+* If this is the FIRST partial fetch request, we enable partial
+* on this repo and remember the given filter-spec as the default
+* for subsequent fetches to this remote.
+*/
+   if (!ppc && filter_options.choice) {
+   partial_clone_register(remote->name, _options);
+   return;
+   }
+
+   /*
+* We are currently limited to only ONE promisor remote.  That is,
+* we only allow partial fetches back to the original partial clone
+* remote (or the first partial fetch remote).  Disallow explicit
+* partial fetches to a different remote.
+*
+* Normal (non-partial) fetch commands should still be allowed to
+* other remotes.
+*/
+   neq = (strcmp(remote->name, repository_format_partial_clone));
+   if (neq && filter_options.choice)
+   die(_("unsupported partial-fetch to a different remote"));
+
+   if (neq && !filter_options.choice)
+   return;
+
+   /*
+* When fetching from the promisor remote, we either use the
+* explicitly given filter-spec or inherit the filter-spec from
+* the clone.
+*/
+   if (filter_options.choice)
+   return;
+
+   partial_clone_get_default_filter_spec(_options);
+   return;
+}
+
 static int fetch_one(struct remote *remote, int argc, const char **argv)
 {
static const char **refs = NULL;
@@ -1298,6 +1353,8 @@ static int fetch_one(struct remote *remote, int argc, 
const char **argv)
die(_("No remote repository specified.  Please, specify either 
a URL or a\n"
"remote name from which new revisions should be fetched."));
 
+   fetch_one__setup_partial(remote);
+
gtransport = prepare_transport(remote, 1);
 
if (prune < 0) {
@@ -1348,6 +1405,8 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
 
packet_trace_identity("fetch");
 
+   fetch_if_missing = 0;
+
/* Record the command line for the reflog */
strbuf_addstr(_rla, "fetch");
for (i = 1; i < argc; i++)
diff --git a/connected.c b/connected.c
index f416b05..3a5bd67 100644
--- a/connected.c
+++ b/connected.c
@@ -56,6 +56,8 @@ int check_connected(sha1_iterate_fn fn, void *cb_data,
argv_array_push(_list.args,"rev-list");
argv_array_push(_list.args, "--objects");
argv_array_push(_list.args, "--stdin");
+   if (repository_format_partial_clone)
+   argv_array_push(_list.args, "--exclude-promisor-objects");
argv_array_push(_list.args, "--not");
argv_array_push(_list.args, "--all");
argv_array_push(_list.args, "--quiet");
-- 
2.9.3



[PATCH v4 03/15] fetch: refactor calculation of remote list

2017-11-16 Thread Jeff Hostetler
From: Jonathan Tan 

Separate out the calculation of remotes to be fetched from and the
actual fetching. This will allow us to include an additional step before
the actual fetching in a subsequent commit.

Signed-off-by: Jonathan Tan 
---
 builtin/fetch.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 225c734..1b1f039 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1322,7 +1322,7 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
 {
int i;
struct string_list list = STRING_LIST_INIT_DUP;
-   struct remote *remote;
+   struct remote *remote = NULL;
int result = 0;
struct argv_array argv_gc_auto = ARGV_ARRAY_INIT;
 
@@ -1367,17 +1367,14 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
else if (argc > 1)
die(_("fetch --all does not make sense with refspecs"));
(void) for_each_remote(get_one_remote_for_fetch, );
-   result = fetch_multiple();
} else if (argc == 0) {
/* No arguments -- use default remote */
remote = remote_get(NULL);
-   result = fetch_one(remote, argc, argv);
} else if (multiple) {
/* All arguments are assumed to be remotes or groups */
for (i = 0; i < argc; i++)
if (!add_remote_or_group(argv[i], ))
die(_("No such remote or remote group: %s"), 
argv[i]);
-   result = fetch_multiple();
} else {
/* Single remote or group */
(void) add_remote_or_group(argv[0], );
@@ -1385,14 +1382,19 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
/* More than one remote */
if (argc > 1)
die(_("Fetching a group and specifying refspecs 
does not make sense"));
-   result = fetch_multiple();
} else {
/* Zero or one remotes */
remote = remote_get(argv[0]);
-   result = fetch_one(remote, argc-1, argv+1);
+   argc--;
+   argv++;
}
}
 
+   if (remote)
+   result = fetch_one(remote, argc, argv);
+   else
+   result = fetch_multiple();
+
if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
struct argv_array options = ARGV_ARRAY_INIT;
 
-- 
2.9.3



[PATCH v4 00/15] Parial clone part 3: clone, fetch, fetch-pack, upload-pack, and tests

2017-11-16 Thread Jeff Hostetler
From: Jeff Hostetler 

This part 3 of a 3 part sequence partial clone.  It assumes
that part 1 and part 2 are in place.

This patch series is labeled as V4 to keep it in sync with
the corresponding V4 versions of parts 1 and 2.  There was
not a V3 version of this patch series.

Jonathan and I independently started on this task.  This is another
pass at merging those efforts.  So there are several places that may
need refactoring and cleanup, but fewer than in the previous submission.
In particular, the test cases should be squashed and new tests added.

And I think we need more end-to-end tests.  I'll work on those next.

Jeff Hostetler (5):
  upload-pack: add object filtering for partial clone
  clone, fetch-pack, index-pack, transport: partial clone
  fetch: add object filtering for partial fetch
  remote-curl: add object filtering for partial clone
  partial-clone: define partial clone settings in config

Jonathan Tan (10):
  fetch: refactor calculation of remote list
  pack-objects: test support for blob filtering
  fetch-pack: test support excluding large blobs
  fetch-pack: test support excluding large blobs
  fetch: add from_promisor and exclude-promisor-objects parameters
  t5500: add fetch-pack tests for partial clone
  t5601: test for partial clone
  t5500: more tests for partial clone and fetch
  unpack-trees: batch fetching of missing blobs
  fetch-pack: restore save_commit_buffer after use

 Documentation/config.txt  |   4 +
 Documentation/gitremote-helpers.txt   |   4 +
 Documentation/technical/pack-protocol.txt |   8 ++
 Documentation/technical/protocol-capabilities.txt |   8 ++
 builtin/clone.c   |  22 -
 builtin/fetch-pack.c  |   4 +
 builtin/fetch.c   |  93 +++--
 cache.h   |   1 +
 config.c  |   5 +
 connected.c   |   2 +
 environment.c |   1 +
 fetch-object.c|  27 -
 fetch-object.h|   5 +
 fetch-pack.c  |  17 
 fetch-pack.h  |   2 +
 list-objects-filter-options.c | 110 +++--
 list-objects-filter-options.h |  12 +++
 remote-curl.c |  11 +++
 t/t5300-pack-object.sh|  26 +
 t/t5500-fetch-pack.sh | 115 ++
 t/t5601-clone.sh  | 101 +++
 t/test-lib-functions.sh   |  12 +++
 transport-helper.c|   5 +
 transport.c   |   4 +
 transport.h   |   5 +
 unpack-trees.c|  22 +
 upload-pack.c |  22 -
 27 files changed, 628 insertions(+), 20 deletions(-)

-- 
2.9.3



[PATCH v4 04/15] fetch: add object filtering for partial fetch

2017-11-16 Thread Jeff Hostetler
From: Jeff Hostetler 

Teach fetch to use the list-objects filtering parameters
to allow a "partial fetch" following a "partial clone".

Signed-off-by: Jeff Hostetler 
---
 builtin/fetch.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 1b1f039..fb9af7c 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -18,6 +18,7 @@
 #include "argv-array.h"
 #include "utf8.h"
 #include "packfile.h"
+#include "list-objects-filter-options.h"
 
 static const char * const builtin_fetch_usage[] = {
N_("git fetch [] [ [...]]"),
@@ -55,6 +56,7 @@ static int recurse_submodules_default = 
RECURSE_SUBMODULES_ON_DEMAND;
 static int shown_url = 0;
 static int refmap_alloc, refmap_nr;
 static const char **refmap_array;
+static struct list_objects_filter_options filter_options;
 
 static int git_fetch_config(const char *k, const char *v, void *cb)
 {
@@ -160,6 +162,7 @@ static struct option builtin_fetch_options[] = {
TRANSPORT_FAMILY_IPV4),
OPT_SET_INT('6', "ipv6", , N_("use IPv6 addresses only"),
TRANSPORT_FAMILY_IPV6),
+   OPT_PARSE_LIST_OBJECTS_FILTER(_options),
OPT_END()
 };
 
@@ -1044,6 +1047,9 @@ static struct transport *prepare_transport(struct remote 
*remote, int deepen)
set_option(transport, TRANS_OPT_DEEPEN_RELATIVE, "yes");
if (update_shallow)
set_option(transport, TRANS_OPT_UPDATE_SHALLOW, "yes");
+   if (filter_options.choice)
+   set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER,
+  filter_options.raw_value);
return transport;
 }
 
@@ -1242,6 +1248,20 @@ static int fetch_multiple(struct string_list *list)
int i, result = 0;
struct argv_array argv = ARGV_ARRAY_INIT;
 
+   if (filter_options.choice) {
+   /*
+* We currently only support partial-fetches to the remote
+* used for the partial-clone because we only support 1
+* promisor remote, so we DO NOT allow explicit command
+* line filter arguments for multi-fetches.
+*
+* Note that the loop below will spawn background fetches
+* for each remote and one of them MAY INHERIT the proper
+* partial-fetch settings, so everything is consistent.
+*/
+   die(_("partial-fetch is not supported on multiple remotes"));
+   }
+
if (!append && !dry_run) {
int errcode = truncate_fetch_head();
if (errcode)
-- 
2.9.3



[PATCH v4 05/15] remote-curl: add object filtering for partial clone

2017-11-16 Thread Jeff Hostetler
From: Jeff Hostetler 

Signed-off-by: Jeff Hostetler 
---
 Documentation/gitremote-helpers.txt |  4 
 remote-curl.c   | 11 +++
 2 files changed, 15 insertions(+)

diff --git a/Documentation/gitremote-helpers.txt 
b/Documentation/gitremote-helpers.txt
index 1ceab89..4bb11bf 100644
--- a/Documentation/gitremote-helpers.txt
+++ b/Documentation/gitremote-helpers.txt
@@ -472,6 +472,10 @@ set by Git if the remote helper has the 'option' 
capability.
 'option no-haves' {'true'|'false'}::
Do not send "have" lines.
 
+'option filter '::
+   An object filter specification for partial clone or fetch
+   as described in rev-list.
+
 SEE ALSO
 
 linkgit:git-remote[1]
diff --git a/remote-curl.c b/remote-curl.c
index 34a81b8..c2f28ab 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -13,6 +13,7 @@
 #include "credential.h"
 #include "sha1-array.h"
 #include "send-pack.h"
+#include "list-objects-filter-options.h"
 
 static struct remote *remote;
 /* always ends with a trailing slash */
@@ -22,6 +23,7 @@ struct options {
int verbosity;
unsigned long depth;
char *deepen_since;
+   char *partial_clone_filter;
struct string_list deepen_not;
struct string_list push_options;
unsigned progress : 1,
@@ -165,6 +167,9 @@ static int set_option(const char *name, const char *value)
} else if (!strcmp(name, "no-haves")) {
options.no_haves = 1;
return 0;
+   } else if (!strcmp(name, "filter")) {
+   options.partial_clone_filter = xstrdup(value);
+   return 0;
} else {
return 1 /* unsupported */;
}
@@ -834,6 +839,12 @@ static int fetch_git(struct discovery *heads,
argv_array_push(, "--from-promisor");
if (options.no_haves)
argv_array_push(, "--no-haves");
+   if (options.partial_clone_filter) {
+   struct list_objects_filter_options filter_options;
+   parse_list_objects_filter(_options,
+ options.partial_clone_filter);
+   list_objects_filter_push_arg(, _options);
+   }
argv_array_push(, url.buf);
 
for (i = 0; i < nr_heads; i++) {
-- 
2.9.3



[PATCH v4 07/15] fetch-pack: test support excluding large blobs

2017-11-16 Thread Jeff Hostetler
From: Jonathan Tan 

Created tests to verify fetch-pack and upload-pack support
for excluding large blobs using --filter=blob:limit=
parameter.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 cache.h   |  1 +
 config.c  |  5 +++
 environment.c |  1 +
 list-objects-filter-options.c | 84 +++
 list-objects-filter-options.h |  6 
 t/t5500-fetch-pack.sh | 27 ++
 6 files changed, 117 insertions(+), 7 deletions(-)

diff --git a/cache.h b/cache.h
index 6980072..bccc510 100644
--- a/cache.h
+++ b/cache.h
@@ -861,6 +861,7 @@ extern int grafts_replace_parents;
 #define GIT_REPO_VERSION_READ 1
 extern int repository_format_precious_objects;
 extern char *repository_format_partial_clone;
+extern const char *core_partial_clone_filter_default;
 
 struct repository_format {
int version;
diff --git a/config.c b/config.c
index adb7d7a..adeee04 100644
--- a/config.c
+++ b/config.c
@@ -1241,6 +1241,11 @@ static int git_default_core_config(const char *var, 
const char *value)
return 0;
}
 
+   if (!strcmp(var, "core.partialclonefilter")) {
+   return git_config_string(_partial_clone_filter_default,
+var, value);
+   }
+
/* Add other config variables here and to Documentation/config.txt. */
return 0;
 }
diff --git a/environment.c b/environment.c
index e52aab3..7537565 100644
--- a/environment.c
+++ b/environment.c
@@ -28,6 +28,7 @@ int warn_on_object_refname_ambiguity = 1;
 int ref_paranoia = -1;
 int repository_format_precious_objects;
 char *repository_format_partial_clone;
+const char *core_partial_clone_filter_default;
 const char *git_commit_encoding;
 const char *git_log_output_encoding;
 const char *apply_default_whitespace;
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index f1fb57b..76a6579 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -75,13 +75,22 @@ int armor_decode_arg(struct strbuf *buf, const char *arg)
  * subordinate commands when necessary.  We also "intern" the arg for
  * the convenience of the current command.
  */
-int parse_list_objects_filter(struct list_objects_filter_options 
*filter_options,
- const char *arg)
+static int gently_parse_list_objects_filter(
+   struct list_objects_filter_options *filter_options,
+   const char *arg,
+   struct strbuf *errbuf)
 {
const char *v0;
 
-   if (filter_options->choice)
-   die(_("multiple object filter types cannot be combined"));
+   if (filter_options->choice) {
+   if (errbuf) {
+   strbuf_init(errbuf, 0);
+   strbuf_addstr(
+   errbuf,
+   _("multiple filter-specs cannot be combined"));
+   }
+   return 1;
+   }
 
filter_options->raw_value = strdup(arg);
 
@@ -92,7 +101,7 @@ int parse_list_objects_filter(struct 
list_objects_filter_options *filter_options
 
if (skip_prefix(arg, "blob:limit=", )) {
if (!git_parse_ulong(v0, _options->blob_limit_value))
-   die(_("invalid filter-spec expression '%s'"), arg);
+   goto invalid_expression;
filter_options->choice = LOFC_BLOB_LIMIT;
return 0;
}
@@ -127,13 +136,27 @@ int parse_list_objects_filter(struct 
list_objects_filter_options *filter_options
int r;
struct strbuf buf = STRBUF_INIT;
if (armor_decode_arg(, v0) < 0)
-   die(_("invalid filter-spec expression '%s'"), arg);
+   goto invalid_expression;
r = parse_list_objects_filter(filter_options, buf.buf);
strbuf_release();
return r;
}
 
-   die(_("invalid filter-spec expression '%s'"), arg);
+invalid_expression:
+   if (errbuf) {
+   strbuf_init(errbuf, 0);
+   strbuf_addf(errbuf, "invalid filter-spec '%s'", arg);
+   }
+   memset(filter_options, 0, sizeof(*filter_options));
+   return 1;
+}
+
+int parse_list_objects_filter(struct list_objects_filter_options 
*filter_options,
+ const char *arg)
+{
+   struct strbuf buf = STRBUF_INIT;
+   if (gently_parse_list_objects_filter(filter_options, arg, ))
+   die("%s", buf.buf);
return 0;
 }
 
@@ -173,3 +196,50 @@ void list_objects_filter_push_arg(
}
 }
 
+void partial_clone_register(
+   const char *remote,
+   const struct list_objects_filter_options *filter_options)
+{
+   /*
+* Record the name of the partial clone remote in the
+* config and in the 

[PATCH v4 02/15] clone, fetch-pack, index-pack, transport: partial clone

2017-11-16 Thread Jeff Hostetler
From: Jeff Hostetler 

Signed-off-by: Jeff Hostetler 
---
 builtin/clone.c  |  9 +
 builtin/fetch-pack.c |  4 
 fetch-pack.c | 13 +
 fetch-pack.h |  2 ++
 transport-helper.c   |  5 +
 transport.c  |  4 
 transport.h  |  5 +
 7 files changed, 42 insertions(+)

diff --git a/builtin/clone.c b/builtin/clone.c
index dbddd98..fceb9e7 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -26,6 +26,7 @@
 #include "run-command.h"
 #include "connected.h"
 #include "packfile.h"
+#include "list-objects-filter-options.h"
 
 /*
  * Overall FIXMEs:
@@ -60,6 +61,7 @@ static struct string_list option_optional_reference = 
STRING_LIST_INIT_NODUP;
 static int option_dissociate;
 static int max_jobs = -1;
 static struct string_list option_recurse_submodules = STRING_LIST_INIT_NODUP;
+static struct list_objects_filter_options filter_options;
 
 static int recurse_submodules_cb(const struct option *opt,
 const char *arg, int unset)
@@ -135,6 +137,7 @@ static struct option builtin_clone_options[] = {
TRANSPORT_FAMILY_IPV4),
OPT_SET_INT('6', "ipv6", , N_("use IPv6 addresses only"),
TRANSPORT_FAMILY_IPV6),
+   OPT_PARSE_LIST_OBJECTS_FILTER(_options),
OPT_END()
 };
 
@@ -1073,6 +1076,8 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
warning(_("--shallow-since is ignored in local clones; 
use file:// instead."));
if (option_not.nr)
warning(_("--shallow-exclude is ignored in local 
clones; use file:// instead."));
+   if (filter_options.choice)
+   warning(_("--filter is ignored in local clones; use 
file:// instead."));
if (!access(mkpath("%s/shallow", path), F_OK)) {
if (option_local > 0)
warning(_("source repository is shallow, 
ignoring --local"));
@@ -1104,6 +1109,10 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
transport_set_option(transport, TRANS_OPT_UPLOADPACK,
 option_upload_pack);
 
+   if (filter_options.choice)
+   transport_set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER,
+filter_options.raw_value);
+
if (transport->smart_options && !deepen)
transport->smart_options->check_self_contained_and_connected = 
1;
 
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 9a7ebf6..d0fdaa8 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -153,6 +153,10 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
args.no_haves = 1;
continue;
}
+   if (skip_prefix(arg, ("--" CL_ARG__FILTER "="), )) {
+   parse_list_objects_filter(_options, arg);
+   continue;
+   }
usage(fetch_pack_usage);
}
if (deepen_not.nr)
diff --git a/fetch-pack.c b/fetch-pack.c
index 4640b4e..895e8f9 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -29,6 +29,7 @@ static int deepen_not_ok;
 static int fetch_fsck_objects = -1;
 static int transfer_fsck_objects = -1;
 static int agent_supported;
+static int server_supports_filtering;
 static struct lock_file shallow_lock;
 static const char *alternate_shallow_file;
 
@@ -379,6 +380,8 @@ static int find_common(struct fetch_pack_args *args,
if (deepen_not_ok)  strbuf_addstr(, " 
deepen-not");
if (agent_supported)strbuf_addf(, " agent=%s",

git_user_agent_sanitized());
+   if (args->filter_options.choice)
+   strbuf_addstr(, " filter");
packet_buf_write(_buf, "want %s%s\n", remote_hex, 
c.buf);
strbuf_release();
} else
@@ -407,6 +410,9 @@ static int find_common(struct fetch_pack_args *args,
packet_buf_write(_buf, "deepen-not %s", s->string);
}
}
+   if (server_supports_filtering && args->filter_options.choice)
+   packet_buf_write(_buf, "filter %s",
+args->filter_options.raw_value);
packet_buf_flush(_buf);
state_len = req_buf.len;
 
@@ -967,6 +973,13 @@ static struct ref *do_fetch_pack(struct fetch_pack_args 
*args,
else
prefer_ofs_delta = 0;
 
+   if (server_supports("filter")) {
+   server_supports_filtering = 1;
+   print_verbose(args, _("Server supports filter"));
+   } else if (args->filter_options.choice) {
+   warning("filtering not recognized by server, ignoring");
+  

[PATCH v4 01/15] upload-pack: add object filtering for partial clone

2017-11-16 Thread Jeff Hostetler
From: Jeff Hostetler 

Teach upload-pack to negotiate object filtering over the protocol and
to send filter parameters to pack-objects.  This is intended for partial
clone and fetch.

The idea to make upload-pack configurable using uploadpack.allowFilter
comes from Jonathan Tan's work in [1].

[1] 
https://public-inbox.org/git/f211093280b422c32cc1b7034130072f35c5ed51.1506714999.git.jonathanta...@google.com/

Signed-off-by: Jeff Hostetler 
---
 Documentation/config.txt  |  4 
 Documentation/technical/pack-protocol.txt |  8 +++
 Documentation/technical/protocol-capabilities.txt |  8 +++
 list-objects-filter-options.c | 26 +++
 list-objects-filter-options.h |  6 ++
 upload-pack.c | 22 ++-
 6 files changed, 73 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 1ac0ae6..e528210 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -3268,6 +3268,10 @@ uploadpack.packObjectsHook::
was run. I.e., `upload-pack` will feed input intended for
`pack-objects` to the hook, and expects a completed packfile on
stdout.
+
+uploadpack.allowFilter::
+   If this option is set, `upload-pack` will advertise partial
+   clone and partial fetch object filtering.
 +
 Note that this configuration variable is ignored if it is seen in the
 repository-level config (this is a safety measure against fetching from
diff --git a/Documentation/technical/pack-protocol.txt 
b/Documentation/technical/pack-protocol.txt
index ed1eae8..a43a113 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -212,6 +212,7 @@ out of what the server said it could do with the first 
'want' line.
   upload-request=  want-list
   *shallow-line
   *1depth-request
+  [filter-request]
   flush-pkt
 
   want-list =  first-want
@@ -227,6 +228,8 @@ out of what the server said it could do with the first 
'want' line.
   additional-want   =  PKT-LINE("want" SP obj-id)
 
   depth =  1*DIGIT
+
+  filter-request=  PKT-LINE("filter" SP filter-spec)
 
 
 Clients MUST send all the obj-ids it wants from the reference
@@ -249,6 +252,11 @@ complete those commits. Commits whose parents are not 
received as a
 result are defined as shallow and marked as such in the server. This
 information is sent back to the client in the next step.
 
+The client can optionally request that pack-objects omit various
+objects from the packfile using one of several filtering techniques.
+These are intended for use with partial clone and partial fetch
+operations.  See `rev-list` for possible "filter-spec" values.
+
 Once all the 'want's and 'shallow's (and optional 'deepen') are
 transferred, clients MUST send a flush-pkt, to tell the server side
 that it is done sending the list.
diff --git a/Documentation/technical/protocol-capabilities.txt 
b/Documentation/technical/protocol-capabilities.txt
index 26dcc6f..332d209 100644
--- a/Documentation/technical/protocol-capabilities.txt
+++ b/Documentation/technical/protocol-capabilities.txt
@@ -309,3 +309,11 @@ to accept a signed push certificate, and asks the  
to be
 included in the push certificate.  A send-pack client MUST NOT
 send a push-cert packet unless the receive-pack server advertises
 this capability.
+
+filter
+--
+
+If the upload-pack server advertises the 'filter' capability,
+fetch-pack may send "filter" commands to request a partial clone
+or partial fetch and request that the server omit various objects
+from the packfile.
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index a9298fd..f1fb57b 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -147,3 +147,29 @@ int opt_parse_list_objects_filter(const struct option *opt,
 
return parse_list_objects_filter(filter_options, arg);
 }
+
+/*
+ * The caller wants to pass the value of filter_options->raw_value
+ * to a subordinate program.  Encode the value if necessary to guard
+ * against injection attacks.
+ */
+void list_objects_filter_push_arg(
+   struct argv_array *args,
+   const struct list_objects_filter_options *filter_options)
+{
+   if (!filter_options->choice)
+   return;
+   if (!filter_options->raw_value || !*filter_options->raw_value)
+   return;
+
+   if (filter_options->requires_armor) {
+   struct strbuf buf = STRBUF_INIT;
+   armor_encode_arg(, filter_options->raw_value);
+   argv_array_pushf(args, "--%s=%s", CL_ARG__FILTER, buf.buf);
+   strbuf_release();
+   } else {
+   argv_array_pushf(args, "--%s=%s", CL_ARG__FILTER,
+ 

[PATCH v4 01/10] extension.partialclone: introduce partial clone extension

2017-11-16 Thread Jeff Hostetler
From: Jonathan Tan 

Introduce new repository extension option:
`extensions.partialclone`

See the update to Documentation/technical/repository-version.txt
in this patch for more information.

Signed-off-by: Jonathan Tan 
---
 Documentation/technical/repository-version.txt | 12 
 cache.h|  2 ++
 environment.c  |  1 +
 setup.c|  7 ++-
 4 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/Documentation/technical/repository-version.txt 
b/Documentation/technical/repository-version.txt
index 00ad379..e03eacc 100644
--- a/Documentation/technical/repository-version.txt
+++ b/Documentation/technical/repository-version.txt
@@ -86,3 +86,15 @@ for testing format-1 compatibility.
 When the config key `extensions.preciousObjects` is set to `true`,
 objects in the repository MUST NOT be deleted (e.g., by `git-prune` or
 `git repack -d`).
+
+`partialclone`
+~~
+
+When the config key `extensions.partialclone` is set, it indicates
+that the repo was created with a partial clone (or later performed
+a partial fetch) and that the remote may have omitted sending
+certain unwanted objects.  Such a remote is called a "promisor remote"
+and it promises that all such omitted objects can be fetched from it
+in the future.
+
+The value of this key is the name of the promisor remote.
diff --git a/cache.h b/cache.h
index 6440e2b..35e3f5e 100644
--- a/cache.h
+++ b/cache.h
@@ -860,10 +860,12 @@ extern int grafts_replace_parents;
 #define GIT_REPO_VERSION 0
 #define GIT_REPO_VERSION_READ 1
 extern int repository_format_precious_objects;
+extern char *repository_format_partial_clone;
 
 struct repository_format {
int version;
int precious_objects;
+   char *partial_clone; /* value of extensions.partialclone */
int is_bare;
char *work_tree;
struct string_list unknown_extensions;
diff --git a/environment.c b/environment.c
index 8289c25..e52aab3 100644
--- a/environment.c
+++ b/environment.c
@@ -27,6 +27,7 @@ int warn_ambiguous_refs = 1;
 int warn_on_object_refname_ambiguity = 1;
 int ref_paranoia = -1;
 int repository_format_precious_objects;
+char *repository_format_partial_clone;
 const char *git_commit_encoding;
 const char *git_log_output_encoding;
 const char *apply_default_whitespace;
diff --git a/setup.c b/setup.c
index 03f51e0..58536bd 100644
--- a/setup.c
+++ b/setup.c
@@ -420,7 +420,11 @@ static int check_repo_format(const char *var, const char 
*value, void *vdata)
;
else if (!strcmp(ext, "preciousobjects"))
data->precious_objects = git_config_bool(var, value);
-   else
+   else if (!strcmp(ext, "partialclone")) {
+   if (!value)
+   return config_error_nonbool(var);
+   data->partial_clone = xstrdup(value);
+   } else
string_list_append(>unknown_extensions, ext);
} else if (strcmp(var, "core.bare") == 0) {
data->is_bare = git_config_bool(var, value);
@@ -463,6 +467,7 @@ static int check_repository_format_gently(const char 
*gitdir, int *nongit_ok)
}
 
repository_format_precious_objects = candidate.precious_objects;
+   repository_format_partial_clone = candidate.partial_clone;
string_list_clear(_extensions, 0);
if (!has_common) {
if (candidate.is_bare != -1) {
-- 
2.9.3



[PATCH v4 00/10] Partial clone part 2: fsck and promisors

2017-11-16 Thread Jeff Hostetler
From: Jeff Hostetler 

This is part 2 of a 3 part sequence for partial clone.
Part 2 assumes part 1 is in place.

This patch series is labeled V4 to keep it in sync with the
V4 version of part 1.  (There was no V3 of this part.)

Part 2 is concerned with fsck, gc, initial support for dynamic
object fetching, and tracking promisor objects.  Jonathan Tan
originally developed this code.  I have moved it on top of
part 1 and updated it slightly.

Jonathan Tan (10):
  extension.partialclone: introduce partial clone extension
  fsck: introduce partialclone extension
  fsck: support refs pointing to promisor objects
  fsck: support referenced promisor objects
  fsck: support promisor objects as CLI argument
  index-pack: refactor writing of .keep files
  introduce fetch-object: fetch one promisor object
  sha1_file: support lazily fetching missing objects
  rev-list: support termination at promisor objects
  gc: do not repack promisor packfiles

 Documentation/git-pack-objects.txt |  12 +-
 Documentation/gitremote-helpers.txt|   6 +
 Documentation/rev-list-options.txt |  12 +-
 Documentation/technical/repository-version.txt |  12 +
 Makefile   |   1 +
 builtin/cat-file.c |   2 +
 builtin/fetch-pack.c   |  10 +
 builtin/fsck.c |  26 +-
 builtin/gc.c   |   3 +
 builtin/index-pack.c   | 113 
 builtin/pack-objects.c |  36 +++
 builtin/prune.c|   7 +
 builtin/repack.c   |   8 +-
 builtin/rev-list.c |  74 +-
 cache.h|  13 +-
 environment.c  |   1 +
 fetch-object.c |  26 ++
 fetch-object.h |   6 +
 fetch-pack.c   |   8 +-
 fetch-pack.h   |   2 +
 list-objects.c |  29 ++-
 object.c   |   2 +-
 packfile.c |  77 +-
 packfile.h |  13 +
 remote-curl.c  |  14 +-
 revision.c |  33 ++-
 revision.h |   5 +-
 setup.c|   7 +-
 sha1_file.c|  38 ++-
 t/t0410-partial-clone.sh   | 343 +
 transport.c|   8 +
 transport.h|   8 +
 32 files changed, 872 insertions(+), 83 deletions(-)
 create mode 100644 fetch-object.c
 create mode 100644 fetch-object.h
 create mode 100755 t/t0410-partial-clone.sh

-- 
2.9.3



[PATCH v4 09/10] rev-list: support termination at promisor objects

2017-11-16 Thread Jeff Hostetler
From: Jonathan Tan 

Teach rev-list to support termination of an object traversal at any
object from a promisor remote (whether one that the local repo also has,
or one that the local repo knows about because it has another promisor
object that references it).

This will be used subsequently in gc and in the connectivity check used
by fetch.

For efficiency, if an object is referenced by a promisor object, and is
in the local repo only as a non-promisor object, object traversal will
not stop there. This is to avoid building the list of promisor object
references.

(In list-objects.c, the case where obj is NULL in process_blob() and
process_tree() do not need to be changed because those happen only when
there is a conflict between the expected type and the existing object.
If the object doesn't exist, an object will be synthesized, which is
fine.)

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 Documentation/rev-list-options.txt |  12 -
 builtin/rev-list.c |  74 ---
 list-objects.c |  29 ++-
 object.c   |   2 +-
 revision.c |  33 +++-
 revision.h |   5 +-
 t/t0410-partial-clone.sh   | 101 +
 7 files changed, 243 insertions(+), 13 deletions(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index c84e465..2beffe3 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -730,7 +730,7 @@ specification contained in .
Only useful with `--filter=`; prints a list of the omitted objects.
Object IDs are prefixed with a ``~'' character.
 
---missing=(error|allow-any|print)::
+--missing=(error|allow-any|allow-promisor|print)::
Specifies how missing objects are handled.  The repository may
have missing objects after a partial clone, for example.
 +
@@ -741,10 +741,20 @@ The value 'allow-any' will allow object traversal to 
continue if a
 missing object is encountered.  Missing objects will silently be omitted
 from the results.
 +
+The value 'allow-promisor' is like 'allow-any' in that it will allow
+object traversal to continue, but only for EXPECTED missing objects.
++
 The value 'print' is like 'allow-any', but will also print a list of the
 missing objects.  Object IDs are prefixed with a ``?'' character.
 endif::git-rev-list[]
 
+--exclude-promisor-objects::
+   (For internal use only.)  Prefilter object traversal at
+   promisor boundary.  This is used with partial clone.  This is
+   stronger than `--missing=allow-promisor` because it limits the
+   traversal, rather than just silencing errors about missing
+   objects.
+
 --no-walk[=(sorted|unsorted)]::
Only show the given commits, but do not traverse their ancestors.
This has no effect if a range is specified. If the argument
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index da4a39b..d144d66 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -15,6 +15,7 @@
 #include "progress.h"
 #include "reflog-walk.h"
 #include "oidset.h"
+#include "packfile.h"
 
 static const char rev_list_usage[] =
 "git rev-list [OPTION] ... [ -- paths... ]\n"
@@ -67,6 +68,7 @@ enum missing_action {
MA_ERROR = 0,/* fail if any missing objects are encountered */
MA_ALLOW_ANY,/* silently allow ALL missing objects */
MA_PRINT,/* print ALL missing objects in special section */
+   MA_ALLOW_PROMISOR, /* silently allow all missing PROMISOR objects */
 };
 static enum missing_action arg_missing_action;
 
@@ -197,6 +199,12 @@ static void finish_commit(struct commit *commit, void 
*data)
 
 static inline void finish_object__ma(struct object *obj)
 {
+   /*
+* Whether or not we try to dynamically fetch missing objects
+* from the server, we currently DO NOT have the object.  We
+* can either print, allow (ignore), or conditionally allow
+* (ignore) them.
+*/
switch (arg_missing_action) {
case MA_ERROR:
die("missing blob object '%s'", oid_to_hex(>oid));
@@ -209,25 +217,36 @@ static inline void finish_object__ma(struct object *obj)
oidset_insert(_objects, >oid);
return;
 
+   case MA_ALLOW_PROMISOR:
+   if (is_promisor_object(>oid))
+   return;
+   die("unexpected missing blob object '%s'",
+   oid_to_hex(>oid));
+   return;
+
default:
BUG("unhandled missing_action");
return;
}
 }
 
-static void finish_object(struct object *obj, const char *name, void *cb_data)
+static int finish_object(struct object *obj, const char *name, void *cb_data)
 {
struct rev_list_info *info = cb_data;

[PATCH v4 08/10] sha1_file: support lazily fetching missing objects

2017-11-16 Thread Jeff Hostetler
From: Jonathan Tan 

Teach sha1_file to fetch objects from the remote configured in
extensions.partialclone whenever an object is requested but missing.

The fetching of objects can be suppressed through a global variable.
This is used by fsck and index-pack.

However, by default, such fetching is not suppressed. This is meant as a
temporary measure to ensure that all Git commands work in such a
situation. Future patches will update some commands to either tolerate
missing objects (without fetching them) or be more efficient in fetching
them.

In order to determine the code changes in sha1_file.c necessary, I
investigated the following:
 (1) functions in sha1_file.c that take in a hash, without the user
 regarding how the object is stored (loose or packed)
 (2) functions in packfile.c (because I need to check callers that know
 about the loose/packed distinction and operate on both differently,
 and ensure that they can handle the concept of objects that are
 neither loose nor packed)

(1) is handled by the modification to sha1_object_info_extended().

For (2), I looked at for_each_packed_object and others.  For
for_each_packed_object, the callers either already work or are fixed in
this patch:
 - reachable - only to find recent objects
 - builtin/fsck - already knows about missing objects
 - builtin/cat-file - warning message added in this commit

Callers of the other functions do not need to be changed:
 - parse_pack_index
   - http - indirectly from http_get_info_packs
   - find_pack_entry_one
 - this searches a single pack that is provided as an argument; the
   caller already knows (through other means) that the sought object
   is in a specific pack
 - find_sha1_pack
   - fast-import - appears to be an optimization to not store a file if
 it is already in a pack
   - http-walker - to search through a struct alt_base
   - http-push - to search through remote packs
 - has_sha1_pack
   - builtin/fsck - already knows about promisor objects
   - builtin/count-objects - informational purposes only (check if loose
 object is also packed)
   - builtin/prune-packed - check if object to be pruned is packed (if
 not, don't prune it)
   - revision - used to exclude packed objects if requested by user
   - diff - just for optimization

Signed-off-by: Jonathan Tan 
---
 builtin/cat-file.c   |  2 ++
 builtin/fetch-pack.c |  2 ++
 builtin/fsck.c   |  3 +++
 builtin/index-pack.c |  6 ++
 cache.h  |  8 
 fetch-object.c   |  3 +++
 sha1_file.c  | 38 
 t/t0410-partial-clone.sh | 51 
 8 files changed, 100 insertions(+), 13 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index f5fa4fd..cf9ea5c 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -475,6 +475,8 @@ static int batch_objects(struct batch_options *opt)
 
for_each_loose_object(batch_loose_object, , 0);
for_each_packed_object(batch_packed_object, , 0);
+   if (repository_format_partial_clone)
+   warning("This repository has extensions.partialClone 
set. Some objects may not be loaded.");
 
cb.opt = opt;
cb.expand = 
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 9f303cf..9a7ebf6 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -53,6 +53,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
struct oid_array shallow = OID_ARRAY_INIT;
struct string_list deepen_not = STRING_LIST_INIT_DUP;
 
+   fetch_if_missing = 0;
+
packet_trace_identity("fetch-pack");
 
memset(, 0, sizeof(args));
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 578a7c8..3b76c0e 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -678,6 +678,9 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
int i;
struct alternate_object_database *alt;
 
+   /* fsck knows how to handle missing promisor objects */
+   fetch_if_missing = 0;
+
errors_found = 0;
check_replace_refs = 0;
 
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 24c2f05..a0a35e6 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1657,6 +1657,12 @@ int cmd_index_pack(int argc, const char **argv, const 
char *prefix)
unsigned foreign_nr = 1;/* zero is a "good" value, assume bad */
int report_end_of_input = 0;
 
+   /*
+* index-pack never needs to fetch missing objects, since it only
+* accesses the repo to do hash collision checks
+*/
+   fetch_if_missing = 0;
+
if (argc == 2 && !strcmp(argv[1], "-h"))
usage(index_pack_usage);
 
diff --git a/cache.h b/cache.h
index c76f2e9..6980072 100644
--- a/cache.h
+++ b/cache.h
@@ -1727,6 +1727,14 

[PATCH v4 04/10] fsck: support referenced promisor objects

2017-11-16 Thread Jeff Hostetler
From: Jonathan Tan 

Teach fsck to not treat missing promisor objects indirectly pointed to
by refs as an error when extensions.partialclone is set.

Signed-off-by: Jonathan Tan 
---
 builtin/fsck.c   | 11 +++
 t/t0410-partial-clone.sh | 23 +++
 2 files changed, 34 insertions(+)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index ee937bb..4c2a56d 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -149,6 +149,15 @@ static int mark_object(struct object *obj, int type, void 
*data, struct fsck_opt
if (obj->flags & REACHABLE)
return 0;
obj->flags |= REACHABLE;
+
+   if (is_promisor_object(>oid))
+   /*
+* Further recursion does not need to be performed on this
+* object since it is a promisor object (so it does not need to
+* be added to "pending").
+*/
+   return 0;
+
if (!(obj->flags & HAS_OBJ)) {
if (parent && !has_object_file(>oid)) {
printf("broken link from %7s %s\n",
@@ -208,6 +217,8 @@ static void check_reachable_object(struct object *obj)
 * do a full fsck
 */
if (!(obj->flags & HAS_OBJ)) {
+   if (is_promisor_object(>oid))
+   return;
if (has_sha1_pack(obj->oid.hash))
return; /* it is in pack - forget about it */
printf("missing %s %s\n", printable_type(obj),
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index bf75162..4f9931f 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -102,4 +102,27 @@ test_expect_success 'missing ref object, but promised, 
passes fsck' '
git -C repo fsck
 '
 
+test_expect_success 'missing object, but promised, passes fsck' '
+   rm -rf repo &&
+   test_create_repo repo &&
+   test_commit -C repo 1 &&
+   test_commit -C repo 2 &&
+   test_commit -C repo 3 &&
+   git -C repo tag -a annotated_tag -m "annotated tag" &&
+
+   C=$(git -C repo rev-parse 1) &&
+   T=$(git -C repo rev-parse 2^{tree}) &&
+   B=$(git hash-object repo/3.t) &&
+   AT=$(git -C repo rev-parse annotated_tag) &&
+
+   promise_and_delete "$C" &&
+   promise_and_delete "$T" &&
+   promise_and_delete "$B" &&
+   promise_and_delete "$AT" &&
+
+   git -C repo config core.repositoryformatversion 1 &&
+   git -C repo config extensions.partialclone "arbitrary string" &&
+   git -C repo fsck
+'
+
 test_done
-- 
2.9.3



[PATCH v4 10/10] gc: do not repack promisor packfiles

2017-11-16 Thread Jeff Hostetler
From: Jonathan Tan 

Teach gc to stop traversal at promisor objects, and to leave promisor
packfiles alone. This has the effect of only repacking non-promisor
packfiles, and preserves the distinction between promisor packfiles and
non-promisor packfiles.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 Documentation/git-pack-objects.txt | 12 -
 builtin/gc.c   |  3 +++
 builtin/pack-objects.c | 36 ++
 builtin/prune.c|  7 +
 builtin/repack.c   |  8 --
 t/t0410-partial-clone.sh   | 52 +-
 6 files changed, 114 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-pack-objects.txt 
b/Documentation/git-pack-objects.txt
index 5fad696..33a824e 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -242,9 +242,19 @@ So does `git bundle` (see linkgit:git-bundle[1]) when it 
creates a bundle.
the resulting packfile.  See linkgit:git-rev-list[1] for valid
`` forms.
 
---missing=(error|allow-any):
+--missing=(error|allow-any|allow-promisor):
Specifies how missing objects are handled.  This is useful, for
example, when there are missing objects from a prior partial clone.
+   This is stronger than `--missing=allow-promisor` because it limits
+   the traversal, rather than just silencing errors about missing
+   objects.
+
+--exclude-promisor-objects::
+   Omit objects that are known to be in the promisor remote". (This
+   option has the purpose of operating only on locally created objects,
+   so that when we repack, we still maintain a distinction between
+   locally created objects [without .promisor] and objects from the
+   promisor remote [with .promisor].)  This is used with partial clone.
 
 SEE ALSO
 
diff --git a/builtin/gc.c b/builtin/gc.c
index 3c5eae0..77fa720 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -458,6 +458,9 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
argv_array_push(, prune_expire);
if (quiet)
argv_array_push(, "--no-progress");
+   if (repository_format_partial_clone)
+   argv_array_push(,
+   "--exclude-promisor-objects");
if (run_command_v_opt(prune.argv, RUN_GIT_CMD))
return error(FAILED_RUN, prune.argv[0]);
}
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 45ad35d..4534209 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -75,6 +75,8 @@ static int use_bitmap_index = -1;
 static int write_bitmap_index;
 static uint16_t write_bitmap_options;
 
+static int exclude_promisor_objects;
+
 static unsigned long delta_cache_size = 0;
 static unsigned long max_delta_cache_size = 256 * 1024 * 1024;
 static unsigned long cache_max_small_delta_size = 1000;
@@ -86,6 +88,7 @@ static struct list_objects_filter_options filter_options;
 enum missing_action {
MA_ERROR = 0,/* fail if any missing objects are encountered */
MA_ALLOW_ANY,/* silently allow ALL missing objects */
+   MA_ALLOW_PROMISOR, /* silently allow all missing PROMISOR objects */
 };
 static enum missing_action arg_missing_action;
 static show_object_fn fn_show_object;
@@ -2577,6 +2580,20 @@ static void show_object__ma_allow_any(struct object 
*obj, const char *name, void
show_object(obj, name, data);
 }
 
+static void show_object__ma_allow_promisor(struct object *obj, const char 
*name, void *data)
+{
+   assert(arg_missing_action == MA_ALLOW_PROMISOR);
+
+   /*
+* Quietly ignore EXPECTED missing objects.  This avoids problems with
+* staging them now and getting an odd error later.
+*/
+   if (!has_object_file(>oid) && is_promisor_object(>oid))
+   return;
+
+   show_object(obj, name, data);
+}
+
 static int option_parse_missing_action(const struct option *opt,
   const char *arg, int unset)
 {
@@ -2591,10 +2608,18 @@ static int option_parse_missing_action(const struct 
option *opt,
 
if (!strcmp(arg, "allow-any")) {
arg_missing_action = MA_ALLOW_ANY;
+   fetch_if_missing = 0;
fn_show_object = show_object__ma_allow_any;
return 0;
}
 
+   if (!strcmp(arg, "allow-promisor")) {
+   arg_missing_action = MA_ALLOW_PROMISOR;
+   fetch_if_missing = 0;
+   fn_show_object = show_object__ma_allow_promisor;
+   return 0;
+   }
+
die(_("invalid value for --missing"));
return 0;
 }
@@ -3008,6 +3033,8 @@ int cmd_pack_objects(int argc, 

[PATCH v4 07/10] introduce fetch-object: fetch one promisor object

2017-11-16 Thread Jeff Hostetler
From: Jonathan Tan 

Introduce fetch-object, providing the ability to fetch one object from a
promisor remote.

This uses fetch-pack. To do this, the transport mechanism has been
updated with 2 flags, "from-promisor" to indicate that the resulting
pack comes from a promisor remote (and thus should be annotated as such
by index-pack), and "no-haves" to suppress the sending of "have" lines.

This will be tested in a subsequent commit.

NEEDSWORK: update this when we have more information about protocol v2,
which should allow a way to suppress the ref advertisement and
officially allow any object type to be "want"-ed.

Signed-off-by: Jonathan Tan 
---
 Documentation/gitremote-helpers.txt |  6 ++
 Makefile|  1 +
 builtin/fetch-pack.c|  8 
 builtin/index-pack.c| 16 +---
 fetch-object.c  | 23 +++
 fetch-object.h  |  6 ++
 fetch-pack.c|  8 ++--
 fetch-pack.h|  2 ++
 remote-curl.c   | 14 +-
 transport.c |  8 
 transport.h |  8 
 11 files changed, 94 insertions(+), 6 deletions(-)
 create mode 100644 fetch-object.c
 create mode 100644 fetch-object.h

diff --git a/Documentation/gitremote-helpers.txt 
b/Documentation/gitremote-helpers.txt
index 4a584f3..1ceab89 100644
--- a/Documentation/gitremote-helpers.txt
+++ b/Documentation/gitremote-helpers.txt
@@ -466,6 +466,12 @@ set by Git if the remote helper has the 'option' 
capability.
Transmit  as a push option. As the push option
must not contain LF or NUL characters, the string is not encoded.
 
+'option from-promisor' {'true'|'false'}::
+   Indicate that these objects are being fetch by a promisor.
+
+'option no-haves' {'true'|'false'}::
+   Do not send "have" lines.
+
 SEE ALSO
 
 linkgit:git-remote[1]
diff --git a/Makefile b/Makefile
index ca378a4..795e0c7 100644
--- a/Makefile
+++ b/Makefile
@@ -792,6 +792,7 @@ LIB_OBJS += ewah/ewah_bitmap.o
 LIB_OBJS += ewah/ewah_io.o
 LIB_OBJS += ewah/ewah_rlw.o
 LIB_OBJS += exec_cmd.o
+LIB_OBJS += fetch-object.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
 LIB_OBJS += gettext.o
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 366b9d1..9f303cf 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -143,6 +143,14 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
args.update_shallow = 1;
continue;
}
+   if (!strcmp("--from-promisor", arg)) {
+   args.from_promisor = 1;
+   continue;
+   }
+   if (!strcmp("--no-haves", arg)) {
+   args.no_haves = 1;
+   continue;
+   }
usage(fetch_pack_usage);
}
if (deepen_not.nr)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 4f305a7..24c2f05 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1429,14 +1429,16 @@ static void write_special_file(const char *suffix, 
const char *msg,
if (close(fd) != 0)
die_errno(_("cannot close written %s file '%s'"),
  suffix, filename);
-   *report = suffix;
+   if (report)
+   *report = suffix;
}
strbuf_release(_buf);
 }
 
 static void final(const char *final_pack_name, const char *curr_pack_name,
  const char *final_index_name, const char *curr_index_name,
- const char *keep_msg, unsigned char *sha1)
+ const char *keep_msg, const char *promisor_msg,
+ unsigned char *sha1)
 {
const char *report = "pack";
struct strbuf pack_name = STRBUF_INIT;
@@ -1455,6 +1457,9 @@ static void final(const char *final_pack_name, const char 
*curr_pack_name,
if (keep_msg)
write_special_file("keep", keep_msg, final_pack_name, sha1,
   );
+   if (promisor_msg)
+   write_special_file("promisor", promisor_msg, final_pack_name,
+  sha1, NULL);
 
if (final_pack_name != curr_pack_name) {
if (!final_pack_name)
@@ -1644,6 +1649,7 @@ int cmd_index_pack(int argc, const char **argv, const 
char *prefix)
const char *curr_index;
const char *index_name = NULL, *pack_name = NULL;
const char *keep_msg = NULL;
+   const char *promisor_msg = NULL;
struct strbuf index_name_buf = STRBUF_INIT;
struct pack_idx_entry **idx_objects;
struct pack_idx_option opts;
@@ -1693,6 +1699,10 @@ int cmd_index_pack(int argc, const char **argv, const 
char *prefix)
 

[PATCH v4 02/10] fsck: introduce partialclone extension

2017-11-16 Thread Jeff Hostetler
From: Jonathan Tan 

Currently, Git does not support repos with very large numbers of objects
or repos that wish to minimize manipulation of certain blobs (for
example, because they are very large) very well, even if the user
operates mostly on part of the repo, because Git is designed on the
assumption that every referenced object is available somewhere in the
repo storage. In such an arrangement, the full set of objects is usually
available in remote storage, ready to be lazily downloaded.

Teach fsck about the new state of affairs. In this commit, teach fsck
that missing promisor objects referenced from the reflog are not an
error case; in future commits, fsck will be taught about other cases.

Signed-off-by: Jonathan Tan 
---
 builtin/fsck.c   |  2 +-
 cache.h  |  3 +-
 packfile.c   | 77 +++--
 packfile.h   | 13 
 t/t0410-partial-clone.sh | 81 
 5 files changed, 171 insertions(+), 5 deletions(-)
 create mode 100755 t/t0410-partial-clone.sh

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 56afe40..2934299 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -398,7 +398,7 @@ static void fsck_handle_reflog_oid(const char *refname, 
struct object_id *oid,
xstrfmt("%s@{%"PRItime"}", refname, 
timestamp));
obj->flags |= USED;
mark_object_reachable(obj);
-   } else {
+   } else if (!is_promisor_object(oid)) {
error("%s: invalid reflog entry %s", refname, 
oid_to_hex(oid));
errors_found |= ERROR_REACHABLE;
}
diff --git a/cache.h b/cache.h
index 35e3f5e..c76f2e9 100644
--- a/cache.h
+++ b/cache.h
@@ -1587,7 +1587,8 @@ extern struct packed_git {
unsigned pack_local:1,
 pack_keep:1,
 freshened:1,
-do_not_close:1;
+do_not_close:1,
+pack_promisor:1;
unsigned char sha1[20];
struct revindex_entry *revindex;
/* something like ".git/objects/pack/x.pack" */
diff --git a/packfile.c b/packfile.c
index 4a5fe7a..234797c 100644
--- a/packfile.c
+++ b/packfile.c
@@ -8,6 +8,11 @@
 #include "list.h"
 #include "streaming.h"
 #include "sha1-lookup.h"
+#include "commit.h"
+#include "object.h"
+#include "tag.h"
+#include "tree-walk.h"
+#include "tree.h"
 
 char *odb_pack_name(struct strbuf *buf,
const unsigned char *sha1,
@@ -643,10 +648,10 @@ struct packed_git *add_packed_git(const char *path, 
size_t path_len, int local)
return NULL;
 
/*
-* ".pack" is long enough to hold any suffix we're adding (and
+* ".promisor" is long enough to hold any suffix we're adding (and
 * the use xsnprintf double-checks that)
 */
-   alloc = st_add3(path_len, strlen(".pack"), 1);
+   alloc = st_add3(path_len, strlen(".promisor"), 1);
p = alloc_packed_git(alloc);
memcpy(p->pack_name, path, path_len);
 
@@ -654,6 +659,10 @@ struct packed_git *add_packed_git(const char *path, size_t 
path_len, int local)
if (!access(p->pack_name, F_OK))
p->pack_keep = 1;
 
+   xsnprintf(p->pack_name + path_len, alloc - path_len, ".promisor");
+   if (!access(p->pack_name, F_OK))
+   p->pack_promisor = 1;
+
xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack");
if (stat(p->pack_name, ) || !S_ISREG(st.st_mode)) {
free(p);
@@ -781,7 +790,8 @@ static void prepare_packed_git_one(char *objdir, int local)
if (ends_with(de->d_name, ".idx") ||
ends_with(de->d_name, ".pack") ||
ends_with(de->d_name, ".bitmap") ||
-   ends_with(de->d_name, ".keep"))
+   ends_with(de->d_name, ".keep") ||
+   ends_with(de->d_name, ".promisor"))
string_list_append(, path.buf);
else
report_garbage(PACKDIR_FILE_GARBAGE, path.buf);
@@ -1889,6 +1899,9 @@ int for_each_packed_object(each_packed_object_fn cb, void 
*data, unsigned flags)
for (p = packed_git; p; p = p->next) {
if ((flags & FOR_EACH_OBJECT_LOCAL_ONLY) && !p->pack_local)
continue;
+   if ((flags & FOR_EACH_OBJECT_PROMISOR_ONLY) &&
+   !p->pack_promisor)
+   continue;
if (open_pack_index(p)) {
pack_errors = 1;
continue;
@@ -1899,3 +1912,61 @@ int for_each_packed_object(each_packed_object_fn cb, 
void *data, unsigned flags)
}
return r ? r : pack_errors;
 }
+
+static int add_promisor_object(const struct object_id *oid,
+   

[PATCH v4 03/10] fsck: support refs pointing to promisor objects

2017-11-16 Thread Jeff Hostetler
From: Jonathan Tan 

Teach fsck to not treat refs referring to missing promisor objects as an
error when extensions.partialclone is set.

For the purposes of warning about no default refs, such refs are still
treated as legitimate refs.

Signed-off-by: Jonathan Tan 
---
 builtin/fsck.c   |  8 
 t/t0410-partial-clone.sh | 24 
 2 files changed, 32 insertions(+)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 2934299..ee937bb 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -434,6 +434,14 @@ static int fsck_handle_ref(const char *refname, const 
struct object_id *oid,
 
obj = parse_object(oid);
if (!obj) {
+   if (is_promisor_object(oid)) {
+   /*
+* Increment default_refs anyway, because this is a
+* valid ref.
+*/
+default_refs++;
+return 0;
+   }
error("%s: invalid sha1 pointer %s", refname, oid_to_hex(oid));
errors_found |= ERROR_REACHABLE;
/* We'll continue with the rest despite the error.. */
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index 3ddb3b9..bf75162 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -13,6 +13,14 @@ pack_as_from_promisor () {
>repo/.git/objects/pack/pack-$HASH.promisor
 }
 
+promise_and_delete () {
+   HASH=$(git -C repo rev-parse "$1") &&
+   git -C repo tag -a -m message my_annotated_tag "$HASH" &&
+   git -C repo rev-parse my_annotated_tag | pack_as_from_promisor &&
+   git -C repo tag -d my_annotated_tag &&
+   delete_object repo "$HASH"
+}
+
 test_expect_success 'missing reflog object, but promised by a commit, passes 
fsck' '
test_create_repo repo &&
test_commit -C repo my_commit &&
@@ -78,4 +86,20 @@ test_expect_success 'missing reflog object alone fails fsck, 
even with extension
test_must_fail git -C repo fsck
 '
 
+test_expect_success 'missing ref object, but promised, passes fsck' '
+   rm -rf repo &&
+   test_create_repo repo &&
+   test_commit -C repo my_commit &&
+
+   A=$(git -C repo commit-tree -m a HEAD^{tree}) &&
+
+   # Reference $A only from ref
+   git -C repo branch my_branch "$A" &&
+   promise_and_delete "$A" &&
+
+   git -C repo config core.repositoryformatversion 1 &&
+   git -C repo config extensions.partialclone "arbitrary string" &&
+   git -C repo fsck
+'
+
 test_done
-- 
2.9.3



[PATCH v4 05/10] fsck: support promisor objects as CLI argument

2017-11-16 Thread Jeff Hostetler
From: Jonathan Tan 

Teach fsck to not treat missing promisor objects provided on the CLI as
an error when extensions.partialclone is set.

Signed-off-by: Jonathan Tan 
---
 builtin/fsck.c   |  2 ++
 t/t0410-partial-clone.sh | 13 +
 2 files changed, 15 insertions(+)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 4c2a56d..578a7c8 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -750,6 +750,8 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
struct object *obj = lookup_object(oid.hash);
 
if (!obj || !(obj->flags & HAS_OBJ)) {
+   if (is_promisor_object())
+   continue;
error("%s: object missing", oid_to_hex());
errors_found |= ERROR_OBJECT;
continue;
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index 4f9931f..e96f436 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -125,4 +125,17 @@ test_expect_success 'missing object, but promised, passes 
fsck' '
git -C repo fsck
 '
 
+test_expect_success 'missing CLI object, but promised, passes fsck' '
+   rm -rf repo &&
+   test_create_repo repo &&
+   test_commit -C repo my_commit &&
+
+   A=$(git -C repo commit-tree -m a HEAD^{tree}) &&
+   promise_and_delete "$A" &&
+
+   git -C repo config core.repositoryformatversion 1 &&
+   git -C repo config extensions.partialclone "arbitrary string" &&
+   git -C repo fsck "$A"
+'
+
 test_done
-- 
2.9.3



[PATCH v4 06/10] index-pack: refactor writing of .keep files

2017-11-16 Thread Jeff Hostetler
From: Jonathan Tan 

In a subsequent commit, index-pack will be taught to write ".promisor"
files which are similar to the ".keep" files it knows how to write.
Refactor the writing of ".keep" files, so that the implementation of
writing ".promisor" files becomes easier.

Signed-off-by: Jonathan Tan 
---
 builtin/index-pack.c | 99 
 1 file changed, 53 insertions(+), 46 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 8ec459f..4f305a7 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1389,15 +1389,58 @@ static void fix_unresolved_deltas(struct sha1file *f)
free(sorted_by_pos);
 }
 
+static const char *derive_filename(const char *pack_name, const char *suffix,
+  struct strbuf *buf)
+{
+   size_t len;
+   if (!strip_suffix(pack_name, ".pack", ))
+   die(_("packfile name '%s' does not end with '.pack'"),
+   pack_name);
+   strbuf_add(buf, pack_name, len);
+   strbuf_addch(buf, '.');
+   strbuf_addstr(buf, suffix);
+   return buf->buf;
+}
+
+static void write_special_file(const char *suffix, const char *msg,
+  const char *pack_name, const unsigned char *sha1,
+  const char **report)
+{
+   struct strbuf name_buf = STRBUF_INIT;
+   const char *filename;
+   int fd;
+   int msg_len = strlen(msg);
+
+   if (pack_name)
+   filename = derive_filename(pack_name, suffix, _buf);
+   else
+   filename = odb_pack_name(_buf, sha1, suffix);
+
+   fd = odb_pack_keep(filename);
+   if (fd < 0) {
+   if (errno != EEXIST)
+   die_errno(_("cannot write %s file '%s'"),
+ suffix, filename);
+   } else {
+   if (msg_len > 0) {
+   write_or_die(fd, msg, msg_len);
+   write_or_die(fd, "\n", 1);
+   }
+   if (close(fd) != 0)
+   die_errno(_("cannot close written %s file '%s'"),
+ suffix, filename);
+   *report = suffix;
+   }
+   strbuf_release(_buf);
+}
+
 static void final(const char *final_pack_name, const char *curr_pack_name,
  const char *final_index_name, const char *curr_index_name,
- const char *keep_name, const char *keep_msg,
- unsigned char *sha1)
+ const char *keep_msg, unsigned char *sha1)
 {
const char *report = "pack";
struct strbuf pack_name = STRBUF_INIT;
struct strbuf index_name = STRBUF_INIT;
-   struct strbuf keep_name_buf = STRBUF_INIT;
int err;
 
if (!from_stdin) {
@@ -1409,28 +1452,9 @@ static void final(const char *final_pack_name, const 
char *curr_pack_name,
die_errno(_("error while closing pack file"));
}
 
-   if (keep_msg) {
-   int keep_fd, keep_msg_len = strlen(keep_msg);
-
-   if (!keep_name)
-   keep_name = odb_pack_name(_name_buf, sha1, "keep");
-
-   keep_fd = odb_pack_keep(keep_name);
-   if (keep_fd < 0) {
-   if (errno != EEXIST)
-   die_errno(_("cannot write keep file '%s'"),
- keep_name);
-   } else {
-   if (keep_msg_len > 0) {
-   write_or_die(keep_fd, keep_msg, keep_msg_len);
-   write_or_die(keep_fd, "\n", 1);
-   }
-   if (close(keep_fd) != 0)
-   die_errno(_("cannot close written keep file 
'%s'"),
- keep_name);
-   report = "keep";
-   }
-   }
+   if (keep_msg)
+   write_special_file("keep", keep_msg, final_pack_name, sha1,
+  );
 
if (final_pack_name != curr_pack_name) {
if (!final_pack_name)
@@ -1472,7 +1496,6 @@ static void final(const char *final_pack_name, const char 
*curr_pack_name,
 
strbuf_release(_name);
strbuf_release(_name);
-   strbuf_release(_name_buf);
 }
 
 static int git_index_pack_config(const char *k, const char *v, void *cb)
@@ -1615,26 +1638,13 @@ static void show_pack_info(int stat_only)
}
 }
 
-static const char *derive_filename(const char *pack_name, const char *suffix,
-  struct strbuf *buf)
-{
-   size_t len;
-   if (!strip_suffix(pack_name, ".pack", ))
-   die(_("packfile name '%s' does not end with '.pack'"),
-   pack_name);
-   strbuf_add(buf, pack_name, len);
-   strbuf_addstr(buf, suffix);
-   

[PATCH v4 4/6] list-objects: filter objects in traverse_commit_list

2017-11-16 Thread Jeff Hostetler
From: Jeff Hostetler 

Create traverse_commit_list_filtered() and add filtering
interface to allow certain objects to be omitted from the
traversal.

Update traverse_commit_list() to be a wrapper for the above
with a null filter to minimize the number of callers that
needed to be changed.

Object filtering will be used in a future commit by rev-list
and pack-objects for partial clone and fetch to omit unwanted
objects from the result.

traverse_bitmap_commit_list() does not work with filtering.
If a packfile bitmap is present, it will not be used.  It
should be possible to extend such support in the future (at
least to simple filters that do not require object pathnames),
but that is beyond the scope of this patch series.

Signed-off-by: Jeff Hostetler 
---
 Makefile  |   2 +
 list-objects-filter-options.c | 149 
 list-objects-filter-options.h |  57 ++
 list-objects-filter.c | 401 ++
 list-objects-filter.h |  77 
 list-objects.c|  95 --
 list-objects.h|  13 +-
 object.h  |   1 +
 8 files changed, 778 insertions(+), 17 deletions(-)
 create mode 100644 list-objects-filter-options.c
 create mode 100644 list-objects-filter-options.h
 create mode 100644 list-objects-filter.c
 create mode 100644 list-objects-filter.h

diff --git a/Makefile b/Makefile
index cd75985..ca378a4 100644
--- a/Makefile
+++ b/Makefile
@@ -807,6 +807,8 @@ LIB_OBJS += levenshtein.o
 LIB_OBJS += line-log.o
 LIB_OBJS += line-range.o
 LIB_OBJS += list-objects.o
+LIB_OBJS += list-objects-filter.o
+LIB_OBJS += list-objects-filter-options.o
 LIB_OBJS += ll-merge.o
 LIB_OBJS += lockfile.o
 LIB_OBJS += log-tree.o
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
new file mode 100644
index 000..a9298fd
--- /dev/null
+++ b/list-objects-filter-options.c
@@ -0,0 +1,149 @@
+#include "cache.h"
+#include "commit.h"
+#include "config.h"
+#include "revision.h"
+#include "argv-array.h"
+#include "list-objects.h"
+#include "list-objects-filter.h"
+#include "list-objects-filter-options.h"
+
+/*
+ * Return 1 if the given string needs armoring because of "special"
+ * characters that may cause injection problems when a command passes
+ * the argument to a subordinate command (such as when upload-pack
+ * launches pack-objects).
+ *
+ * The usual alphanumeric and key punctuation do not trigger it.
+ */ 
+static int arg_needs_armor(const char *arg)
+{
+   const unsigned char *p;
+
+   for (p = (const unsigned char *)arg; *p; p++) {
+   if (*p >= 'a' && *p <= 'z')
+   continue;
+   if (*p >= 'A' && *p <= 'Z')
+   continue;
+   if (*p >= '0' && *p <= '9')
+   continue;
+   if (*p == '-' || *p == '_' || *p == '.' || *p == '/')
+   continue;
+
+   return 1;
+   }
+   return 0;
+}
+
+void armor_encode_arg(struct strbuf *buf, const char *arg)
+{
+   static const char hex[] = "0123456789abcdef";
+   const unsigned char *p;
+
+   for (p = (const unsigned char *)arg; *p; p++) {
+   unsigned int val = *p;
+   strbuf_addch(buf, hex[val >> 4]);
+   strbuf_addch(buf, hex[val & 0xf]);
+   }
+}
+
+int armor_decode_arg(struct strbuf *buf, const char *arg)
+{
+   const char *p;
+
+   for (p = arg; *p; p += 2) {
+   int val = hex2chr(p);
+   unsigned char ch;
+   if (val < 0)
+   return -1;
+   ch = val;
+   strbuf_addch(buf, ch);
+   }
+   return 0;
+}
+
+/*
+ * Parse value of the argument to the "filter" keword.
+ * On the command line this looks like:
+ *   --filter=
+ * and in the pack protocol as:
+ *   "filter" SP 
+ *
+ * The filter keyword will be used by many commands.
+ * See Documentation/rev-list-options.txt for allowed values for .
+ *
+ * Capture the given arg as the "raw_value".  This can be forwarded to
+ * subordinate commands when necessary.  We also "intern" the arg for
+ * the convenience of the current command.
+ */
+int parse_list_objects_filter(struct list_objects_filter_options 
*filter_options,
+ const char *arg)
+{
+   const char *v0;
+
+   if (filter_options->choice)
+   die(_("multiple object filter types cannot be combined"));
+
+   filter_options->raw_value = strdup(arg);
+
+   if (!strcmp(arg, "blob:none")) {
+   filter_options->choice = LOFC_BLOB_NONE;
+   return 0;
+   }
+
+   if (skip_prefix(arg, "blob:limit=", )) {
+   if (!git_parse_ulong(v0, _options->blob_limit_value))
+   die(_("invalid filter-spec expression '%s'"), arg);
+   filter_options->choice = LOFC_BLOB_LIMIT;
+ 

[PATCH v4 0/6] Partial clone part 1: object filtering

2017-11-16 Thread Jeff Hostetler
From: Jeff Hostetler 

Here is V4 of the list-object filtering, rev-list, and pack-objects.

This version addresses comments on the V3 version series.

This version replaces the code to scan and reject the filter-spec
for injection characters with a new hex-encoding technique.  The
purpose of this is only to guard against injection attacks containing
characters like semicolon, quotes, spaces, and etc. when a filter-spec
is handed to a subordinate command.  It does not eliminate the need
for the recipient to validate the contents.

This version also combines the various command line flags for
handling missing objects into a single --missing={error,print,allow-any}
flag.


Jeff Hostetler (6):
  dir: allow exclusions from blob in addition to file
  oidmap: add oidmap iterator methods
  oidset: add iterator methods to oidset
  list-objects: filter objects in traverse_commit_list
  rev-list: add list-objects filtering support
  pack-objects: add list-objects filtering

 Documentation/git-pack-objects.txt |  12 +-
 Documentation/git-rev-list.txt |   4 +-
 Documentation/rev-list-options.txt |  37 +++
 Makefile   |   2 +
 builtin/pack-objects.c |  64 +-
 builtin/rev-list.c | 108 -
 dir.c  | 132 ---
 dir.h  |   3 +
 list-objects-filter-options.c  | 149 
 list-objects-filter-options.h  |  57 +
 list-objects-filter.c  | 401 +
 list-objects-filter.h  |  77 +++
 list-objects.c |  95 ++--
 list-objects.h |  13 +-
 object.h   |   1 +
 oidmap.h   |  22 ++
 oidset.c   |  10 +
 oidset.h   |  36 +++
 t/t5317-pack-objects-filter-objects.sh | 375 ++
 t/t6112-rev-list-filters-objects.sh| 225 ++
 20 files changed, 1770 insertions(+), 53 deletions(-)
 create mode 100644 list-objects-filter-options.c
 create mode 100644 list-objects-filter-options.h
 create mode 100644 list-objects-filter.c
 create mode 100644 list-objects-filter.h
 create mode 100755 t/t5317-pack-objects-filter-objects.sh
 create mode 100755 t/t6112-rev-list-filters-objects.sh

-- 
2.9.3



[PATCH v4 6/6] pack-objects: add list-objects filtering

2017-11-16 Thread Jeff Hostetler
From: Jeff Hostetler 

Teach pack-objects to use the filtering provided by the
traverse_commit_list_filtered() interface to omit unwanted
objects from the resulting packfile.

This feature is intended for partial clone/fetch.

Filtering requires the use of the "--stdout" option.

Add t5317 test.

Signed-off-by: Jeff Hostetler 
---
 Documentation/git-pack-objects.txt |  12 +-
 builtin/pack-objects.c |  64 +-
 t/t5317-pack-objects-filter-objects.sh | 375 +
 3 files changed, 449 insertions(+), 2 deletions(-)
 create mode 100755 t/t5317-pack-objects-filter-objects.sh

diff --git a/Documentation/git-pack-objects.txt 
b/Documentation/git-pack-objects.txt
index 473a161..5fad696 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -12,7 +12,8 @@ SYNOPSIS
 'git pack-objects' [-q | --progress | --all-progress] [--all-progress-implied]
[--no-reuse-delta] [--delta-base-offset] [--non-empty]
[--local] [--incremental] [--window=] [--depth=]
-   [--revs [--unpacked | --all]] [--stdout | base-name]
+   [--revs [--unpacked | --all]]
+   [--stdout [--filter=] | base-name]
[--shallow] [--keep-true-parents] < object-list
 
 
@@ -236,6 +237,15 @@ So does `git bundle` (see linkgit:git-bundle[1]) when it 
creates a bundle.
With this option, parents that are hidden by grafts are packed
nevertheless.
 
+--filter=::
+   Requires `--stdout`.  Omits certain objects (usually blobs) from
+   the resulting packfile.  See linkgit:git-rev-list[1] for valid
+   `` forms.
+
+--missing=(error|allow-any):
+   Specifies how missing objects are handled.  This is useful, for
+   example, when there are missing objects from a prior partial clone.
+
 SEE ALSO
 
 linkgit:git-rev-list[1]
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 6e77dfd..45ad35d 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -15,6 +15,8 @@
 #include "diff.h"
 #include "revision.h"
 #include "list-objects.h"
+#include "list-objects-filter.h"
+#include "list-objects-filter-options.h"
 #include "pack-objects.h"
 #include "progress.h"
 #include "refs.h"
@@ -79,6 +81,15 @@ static unsigned long cache_max_small_delta_size = 1000;
 
 static unsigned long window_memory_limit = 0;
 
+static struct list_objects_filter_options filter_options;
+
+enum missing_action {
+   MA_ERROR = 0,/* fail if any missing objects are encountered */
+   MA_ALLOW_ANY,/* silently allow ALL missing objects */
+};
+static enum missing_action arg_missing_action;
+static show_object_fn fn_show_object;
+
 /*
  * stats
  */
@@ -2552,6 +2563,42 @@ static void show_object(struct object *obj, const char 
*name, void *data)
obj->flags |= OBJECT_ADDED;
 }
 
+static void show_object__ma_allow_any(struct object *obj, const char *name, 
void *data)
+{
+   assert(arg_missing_action == MA_ALLOW_ANY);
+
+   /*
+* Quietly ignore ALL missing objects.  This avoids problems with
+* staging them now and getting an odd error later.
+*/
+   if (!has_object_file(>oid))
+   return;
+
+   show_object(obj, name, data);
+}
+
+static int option_parse_missing_action(const struct option *opt,
+  const char *arg, int unset)
+{
+   assert(arg);
+   assert(!unset);
+
+   if (!strcmp(arg, "error")) {
+   arg_missing_action = MA_ERROR;
+   fn_show_object = show_object;
+   return 0;
+   }
+
+   if (!strcmp(arg, "allow-any")) {
+   arg_missing_action = MA_ALLOW_ANY;
+   fn_show_object = show_object__ma_allow_any;
+   return 0;
+   }
+
+   die(_("invalid value for --missing"));
+   return 0;
+}
+
 static void show_edge(struct commit *commit)
 {
add_preferred_base(commit->object.oid.hash);
@@ -2816,7 +2863,12 @@ static void get_object_list(int ac, const char **av)
if (prepare_revision_walk())
die("revision walk setup failed");
mark_edges_uninteresting(, show_edge);
-   traverse_commit_list(, show_commit, show_object, NULL);
+
+   if (!fn_show_object)
+   fn_show_object = show_object;
+   traverse_commit_list_filtered(_options, ,
+ show_commit, fn_show_object, NULL,
+ NULL);
 
if (unpack_unreachable_expiration) {
revs.ignore_missing_links = 1;
@@ -2952,6 +3004,10 @@ int cmd_pack_objects(int argc, const char **argv, const 
char *prefix)
 N_("use a bitmap index if available to speed up 
counting objects")),
OPT_BOOL(0, "write-bitmap-index", _bitmap_index,
 N_("write a bitmap index together with the pack 
index")),
+   

[PATCH v4 3/6] oidset: add iterator methods to oidset

2017-11-16 Thread Jeff Hostetler
From: Jeff Hostetler 

Add the usual iterator methods to oidset.
Add oidset_remove().

Signed-off-by: Jeff Hostetler 
---
 oidset.c | 10 ++
 oidset.h | 36 
 2 files changed, 46 insertions(+)

diff --git a/oidset.c b/oidset.c
index f1f874a..454c54f 100644
--- a/oidset.c
+++ b/oidset.c
@@ -24,6 +24,16 @@ int oidset_insert(struct oidset *set, const struct object_id 
*oid)
return 0;
 }
 
+int oidset_remove(struct oidset *set, const struct object_id *oid)
+{
+   struct oidmap_entry *entry;
+
+   entry = oidmap_remove(>map, oid);
+   free(entry);
+
+   return (entry != NULL);
+}
+
 void oidset_clear(struct oidset *set)
 {
oidmap_free(>map, 1);
diff --git a/oidset.h b/oidset.h
index f4c9e0f..783abce 100644
--- a/oidset.h
+++ b/oidset.h
@@ -24,6 +24,12 @@ struct oidset {
 
 #define OIDSET_INIT { OIDMAP_INIT }
 
+
+static inline void oidset_init(struct oidset *set, size_t initial_size)
+{
+   return oidmap_init(>map, initial_size);
+}
+
 /**
  * Returns true iff `set` contains `oid`.
  */
@@ -39,9 +45,39 @@ int oidset_contains(const struct oidset *set, const struct 
object_id *oid);
 int oidset_insert(struct oidset *set, const struct object_id *oid);
 
 /**
+ * Remove the oid from the set.
+ *
+ * Returns 1 if the oid was present in the set, 0 otherwise.
+ */
+int oidset_remove(struct oidset *set, const struct object_id *oid);
+
+/**
  * Remove all entries from the oidset, freeing any resources associated with
  * it.
  */
 void oidset_clear(struct oidset *set);
 
+struct oidset_iter {
+   struct oidmap_iter m_iter;
+};
+
+static inline void oidset_iter_init(struct oidset *set,
+   struct oidset_iter *iter)
+{
+   oidmap_iter_init(>map, >m_iter);
+}
+
+static inline struct object_id *oidset_iter_next(struct oidset_iter *iter)
+{
+   struct oidmap_entry *e = oidmap_iter_next(>m_iter);
+   return e ? >oid : NULL;
+}
+
+static inline struct object_id *oidset_iter_first(struct oidset *set,
+ struct oidset_iter *iter)
+{
+   oidset_iter_init(set, iter);
+   return oidset_iter_next(iter);
+}
+
 #endif /* OIDSET_H */
-- 
2.9.3



[PATCH v4 1/6] dir: allow exclusions from blob in addition to file

2017-11-16 Thread Jeff Hostetler
From: Jeff Hostetler 

Refactor add_excludes() to separate the reading of the
exclude file into a buffer and the parsing of the buffer
into exclude_list items.

Add add_excludes_from_blob_to_list() to allow an exclude
file be specified with an OID without assuming a local
worktree or index exists.

Refactor read_skip_worktree_file_from_index() and add
do_read_blob() to eliminate duplication of preliminary
processing of blob contents.

Signed-off-by: Jeff Hostetler 
---
 dir.c | 132 ++
 dir.h |   3 ++
 2 files changed, 104 insertions(+), 31 deletions(-)

diff --git a/dir.c b/dir.c
index 1d17b80..1962374 100644
--- a/dir.c
+++ b/dir.c
@@ -220,6 +220,57 @@ int within_depth(const char *name, int namelen,
return 1;
 }
 
+/*
+ * Read the contents of the blob with the given OID into a buffer.
+ * Append a trailing LF to the end if the last line doesn't have one.
+ *
+ * Returns:
+ *-1 when the OID is invalid or unknown or does not refer to a blob.
+ * 0 when the blob is empty.
+ * 1 along with { data, size } of the (possibly augmented) buffer
+ *   when successful.
+ *
+ * Optionally updates the given sha1_stat with the given OID (when valid).
+ */
+static int do_read_blob(const struct object_id *oid,
+   struct sha1_stat *sha1_stat,
+   size_t *size_out,
+   char **data_out)
+{
+   enum object_type type;
+   unsigned long sz;
+   char *data;
+
+   *size_out = 0;
+   *data_out = NULL;
+
+   data = read_sha1_file(oid->hash, , );
+   if (!data || type != OBJ_BLOB) {
+   free(data);
+   return -1;
+   }
+
+   if (sha1_stat) {
+   memset(_stat->stat, 0, sizeof(sha1_stat->stat));
+   hashcpy(sha1_stat->sha1, oid->hash);
+   }
+
+   if (sz == 0) {
+   free(data);
+   return 0;
+   }
+
+   if (data[sz - 1] != '\n') {
+   data = xrealloc(data, st_add(sz, 1));
+   data[sz++] = '\n';
+   }
+
+   *size_out = xsize_t(sz);
+   *data_out = data;
+
+   return 1;
+}
+
 #define DO_MATCH_EXCLUDE   (1<<0)
 #define DO_MATCH_DIRECTORY (1<<1)
 #define DO_MATCH_SUBMODULE (1<<2)
@@ -600,32 +651,22 @@ void add_exclude(const char *string, const char *base,
x->el = el;
 }
 
-static void *read_skip_worktree_file_from_index(const struct index_state 
*istate,
-   const char *path, size_t *size,
-   struct sha1_stat *sha1_stat)
+static int read_skip_worktree_file_from_index(const struct index_state *istate,
+ const char *path,
+ size_t *size_out,
+ char **data_out,
+ struct sha1_stat *sha1_stat)
 {
int pos, len;
-   unsigned long sz;
-   enum object_type type;
-   void *data;
 
len = strlen(path);
pos = index_name_pos(istate, path, len);
if (pos < 0)
-   return NULL;
+   return -1;
if (!ce_skip_worktree(istate->cache[pos]))
-   return NULL;
-   data = read_sha1_file(istate->cache[pos]->oid.hash, , );
-   if (!data || type != OBJ_BLOB) {
-   free(data);
-   return NULL;
-   }
-   *size = xsize_t(sz);
-   if (sha1_stat) {
-   memset(_stat->stat, 0, sizeof(sha1_stat->stat));
-   hashcpy(sha1_stat->sha1, istate->cache[pos]->oid.hash);
-   }
-   return data;
+   return -1;
+
+   return do_read_blob(>cache[pos]->oid, sha1_stat, size_out, 
data_out);
 }
 
 /*
@@ -739,6 +780,10 @@ static void invalidate_directory(struct untracked_cache 
*uc,
dir->dirs[i]->recurse = 0;
 }
 
+static int add_excludes_from_buffer(char *buf, size_t size,
+   const char *base, int baselen,
+   struct exclude_list *el);
+
 /*
  * Given a file with name "fname", read it (either from disk, or from
  * an index if 'istate' is non-null), parse it and store the
@@ -754,9 +799,10 @@ static int add_excludes(const char *fname, const char 
*base, int baselen,
struct sha1_stat *sha1_stat)
 {
struct stat st;
-   int fd, i, lineno = 1;
+   int r;
+   int fd;
size_t size = 0;
-   char *buf, *entry;
+   char *buf;
 
fd = open(fname, O_RDONLY);
if (fd < 0 || fstat(fd, ) < 0) {
@@ -764,17 +810,13 @@ static int add_excludes(const char *fname, const char 
*base, int baselen,
warn_on_fopen_errors(fname);
else
close(fd);
-   if (!istate ||
-   (buf = 

[PATCH v4 2/6] oidmap: add oidmap iterator methods

2017-11-16 Thread Jeff Hostetler
From: Jeff Hostetler 

Add the usual map iterator functions to oidmap.

Signed-off-by: Jeff Hostetler 
---
 oidmap.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/oidmap.h b/oidmap.h
index 18f54cd..d3cd2bb 100644
--- a/oidmap.h
+++ b/oidmap.h
@@ -65,4 +65,26 @@ extern void *oidmap_put(struct oidmap *map, void *entry);
  */
 extern void *oidmap_remove(struct oidmap *map, const struct object_id *key);
 
+
+struct oidmap_iter {
+   struct hashmap_iter h_iter;
+};
+
+static inline void oidmap_iter_init(struct oidmap *map, struct oidmap_iter 
*iter)
+{
+   hashmap_iter_init(>map, >h_iter);
+}
+
+static inline void *oidmap_iter_next(struct oidmap_iter *iter)
+{
+   return hashmap_iter_next(>h_iter);
+}
+
+static inline void *oidmap_iter_first(struct oidmap *map,
+ struct oidmap_iter *iter)
+{
+   oidmap_iter_init(map, iter);
+   return oidmap_iter_next(iter);
+}
+
 #endif
-- 
2.9.3



[PATCH v4 5/6] rev-list: add list-objects filtering support

2017-11-16 Thread Jeff Hostetler
From: Jeff Hostetler 

Teach rev-list to use the filtering provided by the
traverse_commit_list_filtered() interface to omit
unwanted objects from the result.  This feature is
intended to help with partial clone.

Object filtering is only allowed when one of the "--objects*"
options are used.

When the "--filter-print-omitted" option is used, the omitted
objects are printed at the end.  These are marked with a "~".
This option can be combined with "--quiet" to get a list of
just the omitted objects.

Added "--missing=(error|print|omit)" argument to specify how
rev-list should behave when it encounters a missing object
(presumably from a prior partial clone).

When "--missing=print" is used, rev-list will print a list of
any missing objects that should have been included in the output.
These are marked with a "?".

Add t6112 test.

Signed-off-by: Jeff Hostetler 
---
 Documentation/git-rev-list.txt  |   4 +-
 Documentation/rev-list-options.txt  |  37 ++
 builtin/rev-list.c  | 108 -
 t/t6112-rev-list-filters-objects.sh | 225 
 4 files changed, 371 insertions(+), 3 deletions(-)
 create mode 100755 t/t6112-rev-list-filters-objects.sh

diff --git a/Documentation/git-rev-list.txt b/Documentation/git-rev-list.txt
index ef22f17..397a0dd 100644
--- a/Documentation/git-rev-list.txt
+++ b/Documentation/git-rev-list.txt
@@ -47,7 +47,9 @@ SYNOPSIS
 [ --fixed-strings | -F ]
 [ --date=]
 [ [ --objects | --objects-edge | --objects-edge-aggressive ]
-  [ --unpacked ] ]
+  [ --unpacked ]
+  [ --filter= [ --filter-print-omitted ] ] ]
+[ --missing=(error|allow-any|print) ]
 [ --pretty | --header ]
 [ --bisect ]
 [ --bisect-vars ]
diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 13501e1..c84e465 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -706,6 +706,43 @@ ifdef::git-rev-list[]
 --unpacked::
Only useful with `--objects`; print the object IDs that are not
in packs.
+
+--filter=::
+   Only useful with one of the `--objects*`; omits objects (usually
+   blobs) from the list of printed objects.  The ''
+   may be one of the following:
++
+The form '--filter=blob:none' omits all blobs.
++
+The form '--filter=blob:limit=[kmg]' omits blobs larger than n bytes
+or units.  The value may be zero.  Special files matching '.git*' are
+alwayse included, regardless of size.
++
+The form '--filter=sparse:oid=' uses a sparse-checkout
+specification contained in the object (or the object that the expression
+evaluates to) to omit blobs not required by the corresponding sparse
+checkout.
++
+The form '--filter=sparse:path=' similarly uses a sparse-checkout
+specification contained in .
+
+--filter-print-omitted::
+   Only useful with `--filter=`; prints a list of the omitted objects.
+   Object IDs are prefixed with a ``~'' character.
+
+--missing=(error|allow-any|print)::
+   Specifies how missing objects are handled.  The repository may
+   have missing objects after a partial clone, for example.
++
+The value 'error' requests that rev-list stop with an error if a missing
+object is encountered.  This is the default action.
++
+The value 'allow-any' will allow object traversal to continue if a
+missing object is encountered.  Missing objects will silently be omitted
+from the results.
++
+The value 'print' is like 'allow-any', but will also print a list of the
+missing objects.  Object IDs are prefixed with a ``?'' character.
 endif::git-rev-list[]
 
 --no-walk[=(sorted|unsorted)]::
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index c1c74d4..da4a39b 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -4,6 +4,8 @@
 #include "diff.h"
 #include "revision.h"
 #include "list-objects.h"
+#include "list-objects-filter.h"
+#include "list-objects-filter-options.h"
 #include "pack.h"
 #include "pack-bitmap.h"
 #include "builtin.h"
@@ -12,6 +14,7 @@
 #include "bisect.h"
 #include "progress.h"
 #include "reflog-walk.h"
+#include "oidset.h"
 
 static const char rev_list_usage[] =
 "git rev-list [OPTION] ... [ -- paths... ]\n"
@@ -55,6 +58,20 @@ static const char rev_list_usage[] =
 static struct progress *progress;
 static unsigned progress_counter;
 
+static struct list_objects_filter_options filter_options;
+static struct oidset omitted_objects;
+static int arg_print_omitted; /* print objects omitted by filter */
+
+static struct oidset missing_objects;
+enum missing_action {
+   MA_ERROR = 0,/* fail if any missing objects are encountered */
+   MA_ALLOW_ANY,/* silently allow ALL missing objects */
+   MA_PRINT,/* print ALL missing objects in special section */
+};
+static enum missing_action arg_missing_action;
+
+#define 

Re: [PATCH 04/14] fetch: add object filtering for partial fetch

2017-11-16 Thread Jeff Hostetler



On 11/3/2017 4:38 PM, Jonathan Tan wrote:

@@ -1242,6 +1249,20 @@ static int fetch_multiple(struct string_list *list)
int i, result = 0;
struct argv_array argv = ARGV_ARRAY_INIT;
  
+	if (filter_options.choice) {

+   /*
+* We currently only support partial-fetches to the remote
+* used for the partial-clone because we only support 1
+* promisor remote, so we DO NOT allow explicit command
+* line filter arguments.
+*
+* Note that the loop below will spawn background fetches
+* for each remote and one of them MAY INHERIT the proper
+* partial-fetch settings, so everything is consistent.
+*/
+   die(_("partial-fetch is not supported on multiple remotes"));
+   }
+
if (!append && !dry_run) {
int errcode = truncate_fetch_head();
if (errcode)


My intention in doing the "fetch: refactor calculation of remote list"
patch is so that the interaction between the provided list of remotes
and the specification of the filter can be handled using the following
diff:

 -  if (remote)
 +  if (remote) {
 +  if (filter_options.choice &&
 +  strcmp(remote->name, 
repository_format_partial_clone_remote))
 +  die(_("--blob-max-bytes can only be used with the remote 
configured in core.partialClone"));
result = fetch_one(remote, argc, argv);
 -  else
 +  } else {
 +  if (filter_options.choice)
 +  die(_("--blob-max-bytes can only be used with the remote 
configured in core.partialClone"));
result = fetch_multiple();
 +  }

(Ignore the "blob-max-bytes" in the error message - that needs to be
updated.)

The GitHub link I provided above has this diff, and it seems to work.



I put the filter_options.choice tests inside the fetch_{one,multiple}
routines because the former needs to be able to register partial clone
with the config and/or inherit the default filter-spec for the
promisor remote and that took more code that what can neatly fit inline
here.  This will be more apparent in my next patch series.

Jeff


Re: [PATCH 02/14] clone, fetch-pack, index-pack, transport: partial clone

2017-11-16 Thread Jeff Hostetler



On 11/8/2017 1:01 PM, Adam Dinwoodie wrote:

On Friday 03 November 2017 at 01:32 pm -0700, Jonathan Tan wrote:

On Thu,  2 Nov 2017 20:31:17 +
Jeff Hostetler  wrote:

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index a0a35e6..31cd5ba 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -222,6 +222,16 @@ static unsigned check_object(struct object *obj)
if (!(obj->flags & FLAG_CHECKED)) {
unsigned long size;
int type = sha1_object_info(obj->oid.hash, );
+
+   if (type <= 0) {
+   /*
+* TODO Use the promisor code to conditionally
+* try to fetch this object -or- assume it is ok.
+*/
+   obj->flags |= FLAG_CHECKED;
+   return 0;
+   }
+
if (type <= 0)
die(_("did not receive expected object %s"),
  oid_to_hex(>oid));


This causes some repo corruption tests to fail.


Confirmed: I see this patch, or at least f7e0dbc38 ("clone, fetch-pack,
index-pack, transport: partial clone", 2017-11-02), causing t5300.26 to
fail on 64-bit Cygwin.

For the sake of anyone trying to reproduce this, I needed to cherry pick
66d4c7a58 ("fixup! upload-pack: add object filtering for partial clone",
2017-11-08) onto that commit before I was able to get it to compile.

Adam



Thanks.  I've removed this from my next version.  I think it was
left over from a pre-promisor version.

Jeff


Re: [PATCH v3 4/6] list-objects: filter objects in traverse_commit_list

2017-11-16 Thread Jeff Hostetler



On 11/8/2017 12:01 AM, Junio C Hamano wrote:

Jonathan Tan  writes:


Having said that, though, it might be safer to still introduce one, and
relax it later if necessary - it is much easier to relax a constraint
than to increase one.


It would also be more error prone to have such a long switch ()
statement, each of whose case arm needs to be carefully looked at.

While protection against attacks over the wire against the process
that receives the request is necessary and doing the quoting right
at this layer is one valuable component of it, we would need to be
careful about what features we allow the other side to request.

For example, an innocent-looking use of get_oid_with_context() can
trigger an expensive operation, e.g. "master^{/sekritCodeName}", may
not just waste resources but also may reveal the presence of an
object that we might not want to leak to a stranger.  Limiting such
an abuse must sit at a lot higher layer than a byte-by-byte check
over the request like the code does.



Right.  I could see adding another server-side variable in the
spirit of the existing "uploadpack.allow*" variables.

My main concern at this point has been avoiding injections.

Jeff



Re: [PATCH 1/9] extension.partialclone: introduce partial clone extension

2017-11-16 Thread Jeff Hostetler



On 11/8/2017 4:51 PM, Jonathan Tan wrote:

On Wed, 8 Nov 2017 15:32:21 -0500
Jeff Hostetler  wrote:


Thanks Jonathan.

I moved my version of part 2 on top of yesterday's part 1.
There are a few changes between my version and yours. Could
you take a quick look at them and see if they make sense?
(I'll spare the mailing list another patch series until after
I attend to the feed back on part 1.)

https://github.com/jeffhostetler/git/commits/core/pc3_p2


Thanks - the differences are quite minor, and they generally make sense.
The main one is that finish_object() in builtin/rev-list.c now returns
int instead of void, but that makes sense.

Other than that:

  - I think you accidentally squashed the rev-list commit into
"sha1_file: support lazily fetching missing objects".


fixed. thanks.


  - The documentation for --exclude-promisor-objects in
git-pack-objects.txt should be "Omit objects that are known to be in
the promisor remote". (This option has the purpose of operating only
on locally created objects, so that when we repack, we still maintain
a distinction between locally created objects [without .promisor] and
objects from the promisor remote [with .promisor].)



  - The transport options in gitremote-helpers.txt could have the same
documentation as in transport.h.


fixed. thanks.
 


Re: [PATCH v3 4/6] list-objects: filter objects in traverse_commit_list

2017-11-16 Thread Jeff Hostetler



On 11/7/2017 6:20 PM, Jonathan Tan wrote:

On Tue,  7 Nov 2017 19:35:44 +
Jeff Hostetler  wrote:


+/*
+ * Reject the arg if it contains any characters that might
+ * require quoting or escaping when handing to a sub-command.
+ */
+static int reject_injection_chars(const char *arg)
+{

[snip]

+}


Someone pointed me to quote.{c,h}, which is probably sufficient to
ensure shell safety if we do invoke subcommands through the shell. If
that is so, we probably don't need a blacklist.

Having said that, though, it might be safer to still introduce one, and
relax it later if necessary - it is much easier to relax a constraint
than to increase one.


I couldn't use quote.[ch] because it is more concerned with
quoting pathnames because of LF and CR characters within
them -- rather than semicolons and quotes and the like which
I was concerned about.

Anyway, in my next patch series I've replaced all of the
injection code from my last series with something a little
stronger and not restricting.




+   } else if (skip_prefix(arg, "sparse:", )) {
+
+   if (skip_prefix(v0, "oid=", )) {
+   struct object_context oc;
+   struct object_id sparse_oid;
+   filter_options->choice = LOFC_SPARSE_OID;
+   if (!get_oid_with_context(v1, GET_OID_BLOB,
+ _oid, ))
+   filter_options->sparse_oid_value =
+   oiddup(_oid);
+   return 0;
+   }


In your recent e-mail [1], you said that you will change it to always pass
the original expression - is that still the plan?

[1] 
https://public-inbox.org/git/f698d5a8-bf31-cea1-a8da-88b755b0b...@jeffhostetler.com/


yes.  I always pass filter_options.raw_value over the wire.
The code above tries to parse it and put it in an OID for
private use by the current process -- just like the size limit
value in the blob:limit filter.


+/* Remember to update object flag allocation in object.h */


You probably can delete this line.


Every other place that defined flag bits included this comment,
so I did too.  (It really made it easier to find the other
random places that define bits, actually.)




+/*
+ * FILTER_SHOWN_BUT_REVISIT -- we set this bit on tree objects
+ * that have been shown, but should be revisited if they appear
+ * in the traversal (until we mark it SEEN).  This is a way to
+ * let us silently de-dup calls to show() in the caller.


This is unclear to me at first reading. Maybe something like:

   FILTER_SHOWN_BUT_REVISIT -- we set this bit on tree objects that have
   been shown, but should not be skipped over if they reappear in the
   traversal. This ensures that the tree's descendants are re-processed
   if the tree reappears subsequently, and that the tree is not shown
   twice.


+ * This
+ * is subtly different from the "revision.h:SHOWN" and the
+ * "sha1_name.c:ONELINE_SEEN" bits.  And also different from
+ * the non-de-dup usage in pack-bitmap.c
+ */


Optional: I'm not sure if this comparison is useful. (Maybe it is useful
to others, though.)


I was thinking the first comment about my FILTER_SHOWN field
would be to ask why I wasn't just using the existing SHOWN bit.
There are subtle differences between the bits and I wanted to
point out that I was not just duplicating the usage of an existing
bit.
 



+/*
+ * A filter driven by a sparse-checkout specification to only
+ * include blobs that a sparse checkout would populate.
+ *
+ * The sparse-checkout spec can be loaded from a blob with the
+ * given OID or from a local pathname.  We allow an OID because
+ * the repo may be bare or we may be doing the filtering on the
+ * server.
+ */
+struct frame {
+   /*
+* defval is the usual default include/exclude value that
+* should be inherited as we recurse into directories based
+* upon pattern matching of the directory itself or of a
+* containing directory.
+*/
+   int defval;


Can this be an "unsigned defval : 1" as well? In the function below, I
see that you assign to an "int val" first (which can take -1, 0, and 1)
before assigning to this, so that is fine.

Also, maybe a better name would be "exclude", with the documentation:

   1 if the directory is excluded, 0 otherwise. Excluded directories will
   still be recursed through, because an "include" rule for an object
   might override an "exclude" rule for one of its ancestors.



The name "defval" is used unpack-trees.c during the clear_ce_flags()
recursion while looking at the exclusion list.  I was just trying to
match that behavior.

Thanks
Jeff


[PATCH] completion: add '--copy' option to 'git branch'

2017-11-16 Thread Todd Zullinger
In 52d59cc645 (branch: add a --copy (-c) option to go with --move (-m),
2017-06-18), `git branch` learned a `--copy` option.  Include it when
providing command completions.

Signed-off-by: Todd Zullinger 
---
 contrib/completion/git-completion.bash | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index f07f16b28f..89eb65d280 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -1204,7 +1204,7 @@ _git_branch ()
--color --no-color --verbose --abbrev= --no-abbrev
--track --no-track --contains --no-contains --merged 
--no-merged
--set-upstream-to= --edit-description --list
-   --unset-upstream --delete --move --remotes
+   --unset-upstream --delete --move --copy --remotes
--column --no-column --sort= --points-at
"
;;
-- 
2.15.0



[PATCH 1/1] exec_cmd: RUNTIME_PREFIX on some POSIX systems

2017-11-16 Thread Dan Jacques
Enable Git to resolve its own binary location using a variety of
OS-specific and generic methods, including:

- procfs via "/proc/self/exe" (Linux)
- _NSGetExecutablePath (Darwin)
- argv0, if absolute (all, including Windows).

This is used to enable RUNTIME_PREFIX support for non-Windows systems,
notably Linux and Darwin. When configured with RUNTIME_PREFIX, Git will
do a best-effort resolution of its executable path and automatically use
this as its "exec_path" for relative helper and data lookups, unless
explicitly overridden.

Git will also always export and consume its resolved "exec_path" using
the EXEC_PATH_ENVIRONMENT regardless of whether the user has overridden
it, simplifying future lookups and ensuring consistency in Git tooling
execution.

When building with a runtime prefix, Git's PERL libraries are now
installed to a consistently-named directory. This path is resolved and
exported to Git's delegate PERL invocations using the GITPERLLIB
environment variable. This enables Git's delegate PERL scripts to import
Git's own PERL libraries from a path relative to the executable.

Small incidental formatting cleanup of "exec_cmd.c".

Signed-off-by: Dan Jacques 
---
 Makefile   |  29 ++--
 builtin/receive-pack.c |   2 +-
 cache.h|   2 +
 common-main.c  |   4 +-
 config.mak.uname   |   4 ++
 exec_cmd.c | 189 +++--
 exec_cmd.h |   6 +-
 gettext.c  |   7 +-
 git.c  |   7 +-
 http-backend.c |   2 +-
 shell.c|   2 +-
 upload-pack.c  |   2 +-
 12 files changed, 217 insertions(+), 39 deletions(-)

diff --git a/Makefile b/Makefile
index ee9d5eb11..80db01706 100644
--- a/Makefile
+++ b/Makefile
@@ -296,7 +296,8 @@ all::
 # Define PERL_PATH to the path of your Perl binary (usually /usr/bin/perl).
 #
 # Define NO_PERL_MAKEMAKER if you cannot use Makefiles generated by perl's
-# MakeMaker (e.g. using ActiveState under Cygwin).
+# MakeMaker (e.g. using ActiveState under Cygwin, or building with a fixed
+# runtime prefix).
 #
 # Define NO_PERL if you do not want Perl scripts or libraries at all.
 #
@@ -462,6 +463,7 @@ ARFLAGS = rcs
 #   mandir
 #   infodir
 #   htmldir
+#   localedir
 # This can help installing the suite in a relocatable way.
 
 prefix = $(HOME)
@@ -485,6 +487,7 @@ pathsep = :
 mandir_relative = $(patsubst $(prefix)/%,%,$(mandir))
 infodir_relative = $(patsubst $(prefix)/%,%,$(infodir))
 htmldir_relative = $(patsubst $(prefix)/%,%,$(htmldir))
+localedir_relative = $(patsubst $(prefix)/%,%,$(localedir))
 
 export prefix bindir sharedir sysconfdir gitwebdir localedir
 
@@ -1522,9 +1525,6 @@ ifdef SHA1_MAX_BLOCK_SIZE
LIB_OBJS += compat/sha1-chunked.o
BASIC_CFLAGS += -DSHA1_MAX_BLOCK_SIZE="$(SHA1_MAX_BLOCK_SIZE)"
 endif
-ifdef NO_PERL_MAKEMAKER
-   export NO_PERL_MAKEMAKER
-endif
 ifdef NO_HSTRERROR
COMPAT_CFLAGS += -DNO_HSTRERROR
COMPAT_OBJS += compat/hstrerror.o
@@ -1547,6 +1547,14 @@ else
 endif
 ifdef RUNTIME_PREFIX
COMPAT_CFLAGS += -DRUNTIME_PREFIX
+
+   # Control PERL library location so it can be referenced by relocatable
+   # code.
+   NO_PERL_MAKEMAKER = YesPlease
+endif
+
+ifdef NO_PERL_MAKEMAKER
+   export NO_PERL_MAKEMAKER
 endif
 
 ifdef NO_PTHREADS
@@ -1632,6 +1640,15 @@ ifdef HAVE_GETDELIM
BASIC_CFLAGS += -DHAVE_GETDELIM
 endif
 
+ifneq ($(PROCFS_EXECUTABLE_PATH),)
+   procfs_executable_path_SQ = $(subst ','\'',$(PROCFS_EXECUTABLE_PATH))
+   BASIC_CFLAGS += 
'-DPROCFS_EXECUTABLE_PATH="$(procfs_executable_path_SQ)"'
+endif
+
+ifdef HAVE_NS_GET_EXECUTABLE_PATH
+   BASIC_CFLAGS += -DHAVE_NS_GET_EXECUTABLE_PATH
+endif
+
 ifeq ($(TCLTK_PATH),)
 NO_TCLTK = NoThanks
 endif
@@ -1714,6 +1731,7 @@ bindir_relative_SQ = $(subst ','\'',$(bindir_relative))
 mandir_relative_SQ = $(subst ','\'',$(mandir_relative))
 infodir_relative_SQ = $(subst ','\'',$(infodir_relative))
 localedir_SQ = $(subst ','\'',$(localedir))
+localedir_relative_SQ = $(subst ','\'',$(localedir_relative))
 gitexecdir_SQ = $(subst ','\'',$(gitexecdir))
 template_dir_SQ = $(subst ','\'',$(template_dir))
 htmldir_relative_SQ = $(subst ','\'',$(htmldir_relative))
@@ -2130,6 +2148,7 @@ endif
 exec_cmd.sp exec_cmd.s exec_cmd.o: GIT-PREFIX
 exec_cmd.sp exec_cmd.s exec_cmd.o: EXTRA_CPPFLAGS = \
'-DGIT_EXEC_PATH="$(gitexecdir_SQ)"' \
+   '-DGIT_LOCALE_PATH="$(localedir_relative_SQ)"' \
'-DBINDIR="$(bindir_relative_SQ)"' \
'-DPREFIX="$(prefix_SQ)"'
 
@@ -2147,7 +2166,7 @@ attr.sp attr.s attr.o: EXTRA_CPPFLAGS = \
 
 gettext.sp gettext.s gettext.o: GIT-PREFIX
 gettext.sp gettext.s gettext.o: EXTRA_CPPFLAGS = \
-   -DGIT_LOCALE_PATH='"$(localedir_SQ)"'
+   -DGIT_LOCALE_PATH='"$(localedir_relative_SQ)"'
 
 http-push.sp http.sp http-walker.sp remote-curl.sp imap-send.sp: SPARSE_FLAGS 
+= \
-DCURL_DISABLE_TYPECHECK
diff --git 

[PATCH 0/1] RUNTIME_PREFIX on POSIX systems.

2017-11-16 Thread Dan Jacques
Hello! This would be my first contribution to the Git project, so please
accept my apology in advance for any mistakes and let me know what I can
do better.

This patch expands support for the RUNTIME_PREFIX configuration flag,
currently only used on Windows builds, to include Linux, Darwin, and
FreeBSD. When Git is built with RUNTIME_PREFIX enabled, it resolves its
ancillary paths relative to the runtime location of its executable
rather than hard-coding them at compile-time, allowing a Git
installation to be deployed to a path other than the one in which it
was installed.

It was useful to create Git distribution bundles that could unpack
fully-functional Git deployments to arbitrary locations in support of the
Chromium project. Chromium has been using Git bundles built with a patch
similar to this one on its Linux and Mac continuous integration fleet (plus
some developer systems) for almost a year now.

RUNTIME_PREFIX remains an optional configuration flag, so standard Git
builds will not see any changes. However, with this patch applied,
Linux, Darwin, and FreeBSD users can now optionally use "config.mak" to
enable RUNTIME_PREFIX and build relocatable Git distributions. An
example "config.mak" that builds relocatable Git binaries for Linux/Mac
is:

# BEGIN: config.mak
RUNTIME_PREFIX = YesPlease
gitexecdir = libexec/git-core
template_dir = share/git-core/templates
sysconfdir = etc
# END: config.mak

Implementation notes:

It is unfortunately not straightforward to resolve the full absolute path
of the currently-running binary. On some operating systems, notably
Windows, this path is executively supplied as argv[0]. On other
operating systems, however, argv[0] is supplied by the invoker (shell,
script, kernel, etc.), and is not a reliable source of information about
the running Git binary.

The specific method that this patch employs for binary directory resolution
varies depending on the operating system. On Linux and FreeBSD,
Git resolves "/proc/self/exe" and "/proc/curproc/file" respectively. On
Darwin, Git uses the "_NSGetExecutablePath" function. On all operating
systems, notably Windows, Git continues to fall back to resolution
against argv[0] when it is an absolute path.

When RUNTIME_PREFIX is enabled, the resolved runtime path needs to be
passed to ancillary Git tools for their own resolution requirements:

- C-source Git programs will use the EXEC_PATH_ENVIRONMENT environment
  variable that Git already exports, ensuring that any launched tools use
  the same runtime prefix as the entry point.

- PERL tooling needs to know how to locate Git's support libraries. When
  RUNTIME_PREFIX is configured, Git now exports the GITPERLLIB environment
  variable, a mechanism that Git's PERL tooling supports that appears to be
  built for testing. PERL scripts installed using MakeMaker incorporate the
  builder system's PERL version into their installation path, making
  it inconsistent to hard-code; consequently, this patch opts to disable
  MakeMaker for RUNTIME_PREFIX builds in order to deterministically control
  the destination of Git's support libraries.

- Git also exports the GIT_TEXTDOMAINDIR environment variable when
  RUNTIME_PREFIX is set so that its locale configuration can be leveraged
  by Git tooling gettext().

Please note that this patch affects Windows Git builds, since the Windows
Git project uses RUNTIME_PREFIX to support arbitrary installation paths.
Notably, PERL scripts are now always installed without MakeMaker (if they
weren't before), and EXEC_PATH_ENVIRONMENT is preferred by tools instead of
re-resolving argv[0]. Chromium uses the stock redistributable Windows Git
package, so I haven't had an opportunity to test this patch on that
platform.

Please take a look and let me know what you think. Thanks!
-Dan

Dan Jacques (1):
  exec_cmd: RUNTIME_PREFIX on some POSIX systems

 Makefile   |  29 ++--
 builtin/receive-pack.c |   2 +-
 cache.h|   2 +
 common-main.c  |   4 +-
 config.mak.uname   |   4 ++
 exec_cmd.c | 189 +++--
 exec_cmd.h |   6 +-
 gettext.c  |   7 +-
 git.c  |   7 +-
 http-backend.c |   2 +-
 shell.c|   2 +-
 upload-pack.c  |   2 +-
 12 files changed, 217 insertions(+), 39 deletions(-)

-- 
2.15.0.448.gf294e3d99a-goog



Re: [PATCH 3/4] branch: correctly reject refs/heads/{-dash,HEAD}

2017-11-16 Thread Kaartic Sivaraam

On Thursday 16 November 2017 08:27 PM, Junio C Hamano wrote:

Kaartic Sivaraam  writes:


I guess this series is not yet ready for 'next'. When I tried to apply
this patch it doesn't seem to be applying cleanly. I get some
conflicts in 'sha1_name.c' possibly as a consequence of the changes to
the file that aren't accounted by the patch.


Oh, it is totally expected that this patch (and others) may not
apply on 'next' without conflict resolution, as this topic, as all
the other topics, are designed to apply cleanly to either 'master'
or 'maint' or one of the older 'maint-*' series, if it is a bugfix
topic.  A patch series that only applies cleanly to 'next' would be
useless---it would mean all the topics that are already in 'next'
that interacts with it must graduate first before it can go in.


Thanks for the explanation. Seems I still have to become accustomed to 
the workflow.




Make it a habit to build on 'master' or older and then merge to
higher integration branches to make sure it fits with others.



Though I don't clearly understand how to do that, I'll let experience 
teach me the same (if possible). :-)




What you could do is to inspect origin/pu branch after you fetch,
and look at the commit that merges this topic to learn how the
conflicts are resolved (the contrib/rerere-train.sh script may help
this process).



Sure thing.



+   if (*name == '-' ||
+   !strcmp(sb->buf, "refs/heads/HEAD"))


I guess this check should be made more consistent. Possibly either of,


Among these two, this one


if (starts_with(sb->buf, "refs/heads/-") ||
!strcmp(sb->buf, "refs/heads/HEAD"))


has more chance to be correct.  Also, if we were to check the result
of expansion in sb->buf[], I would think that we should keep a
separate variable that points at >buf[11] and compare the
remainder against fixed strings, as we already know sb->buf[] starts
with "refs/heads/" due to our splicing in the fixed string.

Because the point of using strbuf_branchname() is to allow us expand
funny notations like @{-1} to refs/heads/foo, and the result of
expansion is what eventually matters, checking name[] is wrong, I
would think, even though I haven't thought things through.

In any case, I would say thinking this part through should be left
as a theme for a follow-on patch, and not within the scope of this
one.  After all, checking *name against '-' was part of the original
code, so it is safer to keep the thing we are not touching bug-to-bug
compatible and fixing things one step at a time (the one fix we made
with this patch is to make sure we store refs/heads/-dash in sb when
we reject name=="-dash").



Good point. This is better for a follow-up patch as there's a 
possibility that we might be introducing new bugs which weren't present 
previously as a consequence of changing that conditional (bug-to-bug 
compatibility, good term that (possibly) summarizes my long-winded 
explanation ;-)


Re: [PATCH] branch doc: remove --set-upstream from synopsis

2017-11-16 Thread Todd Zullinger

Martin Ågren wrote:

On 16 November 2017 at 08:46, Todd Zullinger  wrote:
I noticed that --set-upstream was still in the synopsis for git branch.  I 
don't think it was left there intentionally.  I looked through the thread where 
support for the option was removed and didn't notice any comments suggesting 
otherwise[1].  With luck, I didn't miss the obvious while reading the thread.


[1] 
https://public-inbox.org/git/20170807143938.5127-1-kaarticsivaraam91...@gmail.com/


Actually, the first version of the series did remove it from the 
synopsis [2]. That hunk was later dropped. Kaartic mentioned it [3] and 
I thought out loud about it [4].


Oh my.  I'll have to hope no prospective employer finds this thread if 
I ever apply for a job as a proofreader.  ;)


Thanks for pointing out the relevant parts of the discussion, of 
course.


I get the same initial thought now as then: It's a bit odd that we 
pique the interest of the reader, but that when they try it out or 
read up on it, we say "nope, this is not what you are looking for".


Indeed.  If we do want to keep the option in the synopsis, it should 
at least be moved to the same entry as --set-upstream-to, since that's 
it's effect now.


I don't think we should do that, but it would at least be accurate 
there.  Using it like it's described in the synopsis now is an error.


# git branch [--set-upstream | --track | --no-track] [-l] [-f]  
[]

$ git branch --set-upstream mybranch master
fatal: the '--set-upstream' option is no longer supported. Please use 
'--track' or '--set-upstream-to' instead.


$ git branch --track mybranch master
Branch 'mybranch' set up to track local branch 'master'.

Kaartic Sivaraam wrote:
I didn't remove it as there wasn't a "strong" consensus that this 
should go off the "Synopsis" at that time. If removing it from the 
synopsis seems to be better than leaving it, then lets do it.


Indeed, I'm sorry I missed that you'd removed it earlier in the patch 
history.  I'm obviously in favor of removing it now. :)


Further, I think we should make this some kind of "guideline" in 
this project to remain consistent. Something like,


   * If you deprecate an option of a command to an extent that it's not 
 usable at all, remove that option from the "Synopsis" of the 
 concerned "Documentation".


possibly to "Documentation/SubmittingPatches" or at least keep this as 
some form of undocumented guideline if this might make 
"Documentation/SubmittingPatches" unnecessarily clumsy. I dunno, was 
just thinking out loud.


Yeah, it's probably tough to cover the many varied situations like 
this to SubmittingPatches.  In general, I would agree with the above 
guideline.  It's best to not steer people toward options which we 
would prefer them not to use.


Where that line falls with regard to documentation can certainly vary 
a lot.  It's often hard for long-time git users to step into the shoes 
of new users who may be reading the documentation for the first time. 
The right balance between enough information and not too much is 
always tricky.


[Incidentally, I ran into this because I had made some changes on a 
master branch of a repository.  Later, I wanted to move them to a 
topic branch so I could do some other comparisons between master and 
upstream.  I wondered if I could setup the remote tracking using git 
branch on the new branch in one step.  The recent branch --copy option 
does this perfectly (and easily).]


Thanks to both of you.

--
Todd
~~
Whenever I feel blue, I start breathing again.



Re: [PATCH] git-send-email: fix get_maintainer.pl regression

2017-11-16 Thread Alex Bennée

Alex Bennée  writes:

> Getting rid of Mail::Address regressed behaviour with common
> get_maintainer scripts such as the Linux kernel. Fix the missed corner
> case and add a test for it.
>

> diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
> index 4d261c2a9..0bcd7ab96 100755
> --- a/t/t9001-send-email.sh
> +++ b/t/t9001-send-email.sh
> @@ -172,6 +172,27 @@ test_expect_success $PREREQ 'cc trailer with various 
> syntax' '
>   test_cmp expected-cc commandline1
>  '
>
> +test_expect_success $PREREQ 'setup get_mainter script for cc trailer' "
> +cat >expected-cc-script.sh <<-EOF && chmod +x expected-cc-script.sh
> +#!/bin/sh
> +echo 'One Person  (supporter:THIS (FOO/bar))'
> +echo 'Two Person  (maintainer:THIS THING)'
> +echo 'Third List  (moderated list:THIS THING (FOO/bar))'
> +echo ' (moderated list:FOR THING)'
> +echo 'f...@example.com (open list:FOR THING (FOO/bar))'
> +echo 's...@example.com (open list)'
> +EOF
> +"
> +
> +test_expect_success $PREREQ 'cc trailer with get_maintainer output' '
> + test_commit cc-trailer &&
> + clean_fake_sendmail &&
> + git send-email -1 --to=recipi...@example.com \
> + --cc-cmd="$(pwd)/expected-cc-script.sh" \
> + --smtp-server="$(pwd)/fake.sendmail" &&
> + test_cmp expected-cc commandline1
> +'
> +

OK I'm afraid I don't fully understand the test harness as this breaks a
bunch of other tests. If anyone can offer some pointers on how to fix
I'd be grateful.

In the meantime I know the core change works because I tested with:

#+name: send-patches-dry-run
#+begin_src sh :results output
# temp workaround
export PERL5LIB=/home/alex/src/git.git/perl/
git send-email --confirm=never --dry-run --quiet ${mailto} ${series}.patches/*
#+end_src

When I sent my last set of kernel patches to the list (the workflow that
was broken before by the cc9075067776ebd34cc08f31bf78bb05f12fd879 change
landing via my git stable PPA).

--
Alex Bennée


[PATCH v3 1/1] Introduce git add --renormalize .

2017-11-16 Thread tboegi
From: Torsten Bögershausen 

Make it safer to normalize the line endings in a repository:
Files that had been commited with CRLF will be commited with LF.

The old way to normalize a repo was like this:
 # Make sure that there are not untracked files
 $ echo "* text=auto" >.gitattributes
 $ git read-tree --empty
 $ git add .
 $ git commit -m "Introduce end-of-line normalization"

The user must make sure that there are no untracked files,
otherwise they would have been added and tracked from now on.

The new "add ..renormalize" does not add untracked files:
 $ echo "* text=auto" >.gitattributes
 $ git add --renormalize .
 $ git commit -m "Introduce end-of-line normalization"

Note that "git add --renormalize " is the short form for
"git add -u --renormalize ".

While add it, document that the same renormalization may be needed,
whenever a clean filter is added or changed.

Helped-By: Junio C Hamano 
Signed-off-by: Torsten Bögershausen 
---

Changes since V2:
  Add line endings in t0025
  Use the <<-\EOF pattern
  Improve the documentation for "git add --renormalize"
  

Documentation/git-add.txt   |  9 -
 Documentation/gitattributes.txt |  6 --
 builtin/add.c   | 28 ++--
 cache.h |  1 +
 read-cache.c| 30 +++---
 sha1_file.c | 16 ++--
 t/t0025-crlf-renormalize.sh | 30 ++
 7 files changed, 102 insertions(+), 18 deletions(-)
 create mode 100755 t/t0025-crlf-renormalize.sh

diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
index b700beaff5..d50fa339dc 100644
--- a/Documentation/git-add.txt
+++ b/Documentation/git-add.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 [verse]
 'git add' [--verbose | -v] [--dry-run | -n] [--force | -f] [--interactive | 
-i] [--patch | -p]
  [--edit | -e] [--[no-]all | --[no-]ignore-removal | [--update | -u]]
- [--intent-to-add | -N] [--refresh] [--ignore-errors] 
[--ignore-missing]
+ [--intent-to-add | -N] [--refresh] [--ignore-errors] 
[--ignore-missing] [--renormalize]
  [--chmod=(+|-)x] [--] [...]
 
 DESCRIPTION
@@ -175,6 +175,13 @@ for "git add --no-all ...", i.e. ignored removed 
files.
warning (e.g., if you are manually performing operations on
submodules).
 
+--renormalize::
+   Apply the "clean" process freshly to all tracked files to
+   forcibly add them again to the index.  This is useful after
+   changing `core.autocrlf` configuration or the `text` attribute
+   in order to correct files added with wrong CRLF/LF line endings.
+   This option implies `-u`.
+
 --chmod=(+|-)x::
Override the executable bit of the added files.  The executable
bit is only changed in the index, the files on disk are left
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 4c68bc19d5..30687de81a 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -232,8 +232,7 @@ From a clean working directory:
 
 -
 $ echo "* text=auto" >.gitattributes
-$ git read-tree --empty   # Clean index, force re-scan of working directory
-$ git add .
+$ git add --renormalize .
 $ git status# Show files that will be normalized
 $ git commit -m "Introduce end-of-line normalization"
 -
@@ -328,6 +327,9 @@ You can declare that a filter turns a content that by 
itself is unusable
 into a usable content by setting the filter..required configuration
 variable to `true`.
 
+Note: Whenever the clean filter is changed, the repo should be renormalized:
+$ git add --renormalize .
+
 For example, in .gitattributes, you would assign the `filter`
 attribute for paths.
 
diff --git a/builtin/add.c b/builtin/add.c
index a648cf4c56..c42b50f857 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -26,6 +26,7 @@ static const char * const builtin_add_usage[] = {
 };
 static int patch_interactive, add_interactive, edit_interactive;
 static int take_worktree_changes;
+static int add_renormalize;
 
 struct update_callback_data {
int flags;
@@ -123,6 +124,25 @@ int add_files_to_cache(const char *prefix,
return !!data.add_errors;
 }
 
+static int renormalize_tracked_files(const struct pathspec *pathspec, int 
flags)
+{
+   int i, retval = 0;
+
+   for (i = 0; i < active_nr; i++) {
+   struct cache_entry *ce = active_cache[i];
+
+   if (ce_stage(ce))
+   continue; /* do not touch unmerged paths */
+   if (!S_ISREG(ce->ce_mode) && !S_ISLNK(ce->ce_mode))
+   continue; /* do not touch non blobs */
+   if (pathspec && !ce_path_match(ce, pathspec, NULL))
+   continue;
+   retval |= add_file_to_cache(ce->name, flags | 

Re: [RFC 2/3] am: semi working --cover-at-tip

2017-11-16 Thread Nicolas Morey-Chaisemartin


Le 14/11/2017 à 10:17, Nicolas Morey-Chaisemartin a écrit :
>
> Le 14/11/2017 à 07:00, Junio C Hamano a écrit :
>> Nicolas Morey-Chaisemartin  writes:
>>
>> By the way, don't we want to sanity check state->last (which we
>> learn by running "git mailsplit" that splits the incoming mbox into
>> pieces and counts the number of messages) against state->series_len?
>> Sometimes people send [PATCH 0-6/6], a 6-patch series with a cover
>> letter, and then follow-up with [PATCH 7/6].  For somebody like me,
>> it would be more convenient if the above code (more-or-less) ignored
>> series_len and called do_apply_cover() after applying the last patch
>> (which would be [PATCH 7/6]) based on what state->last says.
> I thought about that.
> Is there a use case for cover after the last patch works and removes the need 
> to touch am_next (can be done out of the loop in am_run).

Do you have an opinion on that ? It has quite a big impact on how things are 
done !
Single series only would mean a simple flush at the end.
Multiple series makes things a whole lot complex.
We do not know the series_id of the next patch until it's parsed by parse_mail.
Which would mean interrupting parse_mail when detecting a new series to call 
parse_mail on the cover_id plus an extra detection at the end of the loop.

Nicolas


Re: Changing encoding of a file : What should happen to CRLF in file ?

2017-11-16 Thread Torsten Bögershausen
On Thu, Nov 16, 2017 at 12:35:33AM +0530, Ashish Negi wrote:
> On windows :
> > git --version
> git version 2.14.2.windows.2
> 
> On linux :
> > git --version
> git version 2.7.4
> 
> I would like to understand the solution :
> If i understood it correctly : it removes file_name.txt from index, so
> git forgets about it.
> we then add the file again after changing encoding. This time, git
> takes it as utf-8 file and converts crlf to lf when storing it
> internally.
> Right ?

Yes, exactly.
(In a coming release of Git there will be a "git add --renormalize " )

> 
> Thank you for the support.
> 

Thanks for a clean bug report.
Actually it is a bug, I put it on my to do list




[PATCH] git-send-email: fix get_maintainer.pl regression

2017-11-16 Thread Alex Bennée
Getting rid of Mail::Address regressed behaviour with common
get_maintainer scripts such as the Linux kernel. Fix the missed corner
case and add a test for it.

Fixes: cc9075067776ebd34cc08f31bf78bb05f12fd879

Signed-off-by: Alex Bennée 
---
 perl/Git.pm   |  3 +++
 t/t9000/test.pl   |  4 +++-
 t/t9001-send-email.sh | 21 +
 3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/perl/Git.pm b/perl/Git.pm
index ffa09ace9..9b17de1cc 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -936,6 +936,9 @@ sub parse_mailboxes {
$end_of_addr_seen = 0;
} elsif ($token =~ /^\(/) {
push @comment, $token;
+   } elsif ($token =~ /^\)/) {
+   my $nested_comment = pop @comment;
+   push @comment, "$nested_comment$token";
} elsif ($token eq "<") {
push @phrase, (splice @address), (splice @buffer);
} elsif ($token eq ">") {
diff --git a/t/t9000/test.pl b/t/t9000/test.pl
index dfeaa9c65..f10be50cd 100755
--- a/t/t9000/test.pl
+++ b/t/t9000/test.pl
@@ -35,7 +35,9 @@ my @success_list = (q[Jane],
q['Jane 'Doe' ],
q[Jane@:;\.,()<>Doe ],
q[Jane  Doe],
-   q[ Jane Doe]);
+   q[ Jane Doe],
+   q[j...@example.com (open list:for thing (foo/bar))],
+);
 
 my @known_failure_list = (q[Jane\ Doe ],
q["Doe, Ja"ne ],
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index 4d261c2a9..0bcd7ab96 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -172,6 +172,27 @@ test_expect_success $PREREQ 'cc trailer with various 
syntax' '
test_cmp expected-cc commandline1
 '
 
+test_expect_success $PREREQ 'setup get_mainter script for cc trailer' "
+cat >expected-cc-script.sh <<-EOF && chmod +x expected-cc-script.sh
+#!/bin/sh
+echo 'One Person  (supporter:THIS (FOO/bar))'
+echo 'Two Person  (maintainer:THIS THING)'
+echo 'Third List  (moderated list:THIS THING (FOO/bar))'
+echo ' (moderated list:FOR THING)'
+echo 'f...@example.com (open list:FOR THING (FOO/bar))'
+echo 's...@example.com (open list)'
+EOF
+"
+
+test_expect_success $PREREQ 'cc trailer with get_maintainer output' '
+   test_commit cc-trailer &&
+   clean_fake_sendmail &&
+   git send-email -1 --to=recipi...@example.com \
+   --cc-cmd="$(pwd)/expected-cc-script.sh" \
+   --smtp-server="$(pwd)/fake.sendmail" &&
+   test_cmp expected-cc commandline1
+'
+
 test_expect_success $PREREQ 'setup expect' "
 cat >expected-show-all-headers <<\EOF
 0001-Second.patch
-- 
2.15.0



Re: [PATCH] launch_editor(): indicate that Git waits for user input

2017-11-16 Thread Lars Schneider

> On 16 Nov 2017, at 15:58, Junio C Hamano  wrote:
> 
> Lars Schneider  writes:
> 
>>> On 16 Nov 2017, at 07:04, Junio C Hamano  wrote:
>> 
>> Wow. Thanks for the quick patch :-)
> 
> Heh, this is not exactly my itch, so if you are inclined to, can you
> take it over from here on?

Absolutely! What is the proper way to proceed here? Should I send a v2
with the changes I suggested? How do I attribute you correctly? I
assume I need to remove your 'Signed-off-by:'?

Thanks,
Lars


Re: [PATCH] launch_editor(): indicate that Git waits for user input

2017-11-16 Thread Junio C Hamano
Lars Schneider  writes:

>> On 16 Nov 2017, at 07:04, Junio C Hamano  wrote:
>
> Wow. Thanks for the quick patch :-)

Heh, this is not exactly my itch, so if you are inclined to, can you
take it over from here on?

Thanks.


Re: [PATCH 3/4] branch: correctly reject refs/heads/{-dash,HEAD}

2017-11-16 Thread Junio C Hamano
Kaartic Sivaraam  writes:

> I guess this series is not yet ready for 'next'. When I tried to apply
> this patch it doesn't seem to be applying cleanly. I get some
> conflicts in 'sha1_name.c' possibly as a consequence of the changes to
> the file that aren't accounted by the patch.

Oh, it is totally expected that this patch (and others) may not
apply on 'next' without conflict resolution, as this topic, as all
the other topics, are designed to apply cleanly to either 'master'
or 'maint' or one of the older 'maint-*' series, if it is a bugfix
topic.  A patch series that only applies cleanly to 'next' would be
useless---it would mean all the topics that are already in 'next'
that interacts with it must graduate first before it can go in.
Make it a habit to build on 'master' or older and then merge to
higher integration branches to make sure it fits with others.

What you could do is to inspect origin/pu branch after you fetch,
and look at the commit that merges this topic to learn how the
conflicts are resolved (the contrib/rerere-train.sh script may help
this process).

Your inability to resolve merge conflicts does not have much to do
with the readiness of a topic, as long as somebody else can resolve
them ;-)

>> +if (*name == '-' ||
>> +!strcmp(sb->buf, "refs/heads/HEAD"))
>
> I guess this check should be made more consistent. Possibly either of,

Among these two, this one

>   if (starts_with(sb->buf, "refs/heads/-") ||
>   !strcmp(sb->buf, "refs/heads/HEAD"))

has more chance to be correct.  Also, if we were to check the result
of expansion in sb->buf[], I would think that we should keep a
separate variable that points at >buf[11] and compare the
remainder against fixed strings, as we already know sb->buf[] starts
with "refs/heads/" due to our splicing in the fixed string.

Because the point of using strbuf_branchname() is to allow us expand
funny notations like @{-1} to refs/heads/foo, and the result of
expansion is what eventually matters, checking name[] is wrong, I
would think, even though I haven't thought things through.  

In any case, I would say thinking this part through should be left
as a theme for a follow-on patch, and not within the scope of this
one.  After all, checking *name against '-' was part of the original
code, so it is safer to keep the thing we are not touching bug-to-bug
compatible and fixing things one step at a time (the one fix we made
with this patch is to make sure we store refs/heads/-dash in sb when
we reject name=="-dash").



КЛИЕНТСКИЕ БАЗЫ!!! Подробнее: prodawez...@gmail.com Узнайте подробности!

2017-11-16 Thread lqerjanc...@vger.kernel.org
KLIENTSKIE BAZI!!! Podrobnee: prodawez...@gmail.com Uznaite podrobnosti!


Re: [PATCH v2] sequencer: reschedule pick if index can't be locked

2017-11-16 Thread Phillip Wood
On 16/11/17 05:22, Junio C Hamano wrote:
> From: Phillip Wood 
> Date: Wed, 15 Nov 2017 10:41:25 +
> 
> If the index cannot be locked in do_recursive_merge(), issue an
> error message and go on to the error recovery codepath, instead of
> dying.  When the commit cannot be picked, it needs to be rescheduled
> when performing an interactive rebase, but just dying there won't
> allow that to happen, and when the user runs 'git rebase --continue'
> rather than 'git rebase --abort', the commit gets silently dropped.
> 
> Signed-off-by: Phillip Wood 
> ---
> 
>  * I've queue this, taking input from responses by Dscho and Martin.

That's great thanks, sorry for the bug in the original

Best Wishes

Phillip

>  sequencer.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/sequencer.c b/sequencer.c
> index 332a383b03..10924ffd49 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -438,7 +438,8 @@ static int do_recursive_merge(struct commit *base, struct 
> commit *next,
>   char **xopt;
>   static struct lock_file index_lock;
>  
> - hold_locked_index(_lock, LOCK_DIE_ON_ERROR);
> + if (hold_locked_index(_lock, LOCK_REPORT_ON_ERROR) < 0)
> + return -1;
>  
>   read_cache();
>  
> 



Re: [PATCH 3/4] branch: correctly reject refs/heads/{-dash,HEAD}

2017-11-16 Thread Kaartic Sivaraam

On Thursday 16 November 2017 03:44 AM, Junio C Hamano wrote:

 Kaartic Sivaraam  writes:

 >> Are these two patches follow-up fixes (replacement of 3/3 plus an
 >> extra patch) to jc/branch-name-sanity topic?
 >
 > Yes, that's right.
 >
 >> Thanks for working on these.
 >
 > You're welcome. Please do be sure I haven't broken anything in
 > v2. These patches should cleanly apply on 'next', if they don't let me
 > know.

 OK, so here is a replacement for your replacement, based on an
 additional analysis I did while I was reviewing your changes.
 The final 4/4 is what you sent as [v2 2/2] (which was meant to
 be [v2 4/3]).  I think with these updates, the resulting 4-patch
 series is good for 'next'.



I guess this series is not yet ready for 'next'. When I tried to apply 
this patch it doesn't seem to be applying cleanly. I get some conflicts 
in 'sha1_name.c' possibly as a consequence of the changes to the file 
that aren't accounted by the patch. As to which change,


$ git whatchanged  jch/jc/branch-name-sanity..origin/next sha1_name.c

lists at least 5 of them, so there's possibly a lot of change that 
hasn't been taken into account by this patch. Particularly, the function 
'strbuf_check_branch_ref' itself is found at line 1435 in the version 
found in 'next' though this patch expects it to be near line 1332, I guess.


Further comment inline.


  sha1_name.c | 14 --
  t/t1430-bad-ref-name.sh | 43 +++
  2 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index c7c5ab376c..67961d6e47 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -1332,9 +1332,19 @@ void strbuf_branchname(struct strbuf *sb, const char 
*name, unsigned allowed)
  int strbuf_check_branch_ref(struct strbuf *sb, const char *name)
  {
strbuf_branchname(sb, name, INTERPRET_BRANCH_LOCAL);
-   if (name[0] == '-')
-   return -1;
+
+   /*
+* This splice must be done even if we end up rejecting the
+* name; builtin/branch.c::copy_or_rename_branch() still wants
+* to see what the name expanded to so that "branch -m" can be
+* used as a tool to correct earlier mistakes.
+*/
strbuf_splice(sb, 0, 0, "refs/heads/", 11);
+
+   if (*name == '-' ||
+   !strcmp(sb->buf, "refs/heads/HEAD"))


I guess this check should be made more consistent. Possibly either of,

if (starts_with(sb->buf, "refs/heads/-") ||
!strcmp(sb->buf, "refs/heads/HEAD"))

or,

if (*name == '-' ||
!strcmp(name, "HEAD"))


might make them consistent (at least from my perspective).


I tried to reproduce this patch manually and other than the above this 
one LGTM. Though I can't be very sure as I couldn't apply it (I did it 
"manually" to some extent, you see ;-)




+   return -1;
+
return check_refname_format(sb->buf, 0);
  }
  
diff --git a/t/t1430-bad-ref-name.sh b/t/t1430-bad-ref-name.sh

index e88349c8a0..c7878a60ed 100755
--- a/t/t1430-bad-ref-name.sh
+++ b/t/t1430-bad-ref-name.sh
@@ -331,4 +331,47 @@ test_expect_success 'update-ref --stdin -z fails delete 
with bad ref name' '
grep "fatal: invalid ref format: ~a" err
  '
  
+test_expect_success 'branch rejects HEAD as a branch name' '

+   test_must_fail git branch HEAD HEAD^ &&
+   test_must_fail git show-ref refs/heads/HEAD
+'
+
+test_expect_success 'checkout -b rejects HEAD as a branch name' '
+   test_must_fail git checkout -B HEAD HEAD^ &&
+   test_must_fail git show-ref refs/heads/HEAD
+'
+
+test_expect_success 'update-ref can operate on refs/heads/HEAD' '
+   git update-ref refs/heads/HEAD HEAD^ &&
+   git show-ref refs/heads/HEAD &&
+   git update-ref -d refs/heads/HEAD &&
+   test_must_fail git show-ref refs/heads/HEAD
+'
+
+test_expect_success 'branch -d can remove refs/heads/HEAD' '
+   git update-ref refs/heads/HEAD HEAD^ &&
+   git branch -d HEAD &&
+   test_must_fail git show-ref refs/heads/HEAD
+'
+
+test_expect_success 'branch -m can rename refs/heads/HEAD' '
+   git update-ref refs/heads/HEAD HEAD^ &&
+   git branch -m HEAD tail &&
+   test_must_fail git show-ref refs/heads/HEAD &&
+   git show-ref refs/heads/tail
+'
+
+test_expect_success 'branch -d can remove refs/heads/-dash' '
+   git update-ref refs/heads/-dash HEAD^ &&
+   git branch -d -- -dash &&
+   test_must_fail git show-ref refs/heads/-dash
+'
+
+test_expect_success 'branch -m can rename refs/heads/-dash' '
+   git update-ref refs/heads/-dash HEAD^ &&
+   git branch -m -- -dash dash &&
+   test_must_fail git show-ref refs/heads/-dash &&
+   git show-ref refs/heads/dash
+'
+
  test_done





RE: [Feature- Request] Option to commit after checking out branch command is made

2017-11-16 Thread Ninivaggi Mattia
I like the --autostash flag. 
But actually you just could write an alias co="git stash && git checkout" and 
use co dev for the same purpose, rendering the change irrelevant.

-Original Message-
From: Junio C Hamano [mailto:gits...@pobox.com] 
Sent: Thursday, November 16, 2017 1:20 AM
To: Ninivaggi Mattia
Cc: git@vger.kernel.org
Subject: Re: [Feature- Request] Option to commit after checking out branch 
command is made

Ninivaggi Mattia  writes:

> 1. I checkout a branch, without having commited first
> > git checkout dev
> 2. I get this error message:
> > error: Your local changes to the following files would be overwritten 
> by checkout:
> > // List of files
> > // ..
> > //
> > Please commit your changes or stash them before you switch branches.
>
> But I would rather prefer a scenario like this:
> ...
> 1. I checkout a branch, without having commited first
> > git checkout dev
> 2. I get a message like this:
> > Your local changes to the following files would be overwritten by 
> checkout:
> > // List of files
> > // ..
> > //
> > Would you want to commit first? (y/n))
>
> IF y --> prompt for commit message and commit automatically

I do not think you want to do this for a few reasons.

 * The "please commit or stash" is merely a suggestion whose primary
   purpose is to silence clueless newbies who would have complained
   "Git said 'error: ... overwritten by checkout' and I do not know
   what to do next; the error message is so unhelpful" otherwise.
   Majority of the time when I see this message, it is because I
   forgot that I was in the middle of doing something (meaning: I am
   far from finished with the changes I was working on), and I would
   not be ready to commit.  I'd rather keep working to get the work
   into a more reasonable shape before committing, or stash the
   changes first if the task I wanted to do on that "dev" branch is
   more urgent and what I was in the middle of doing is lower
   priority.  

   Because of this, I would expect many users (including the ones
   who are right now newbies but will have gained experience to
   become experts in the future) to appreciate "stash before switch"
   a lot more than "commit first before switch".

 * People write scripts that use "git checkout" to switch branches,
   and they rely on the command to fail in this situation, instead
   of going interactive and gets stuck waiting for an input (which
   may never come).  Because of this, the updated behaviour you
   propose must never be the default, and at least must be protected
   behind a flag, something like "git checkout --autostash dev" (or
   "--autocommit", if you insist).  With that, you could do

[alias]
co = checkout --autostash

   and train your fingers to say "git co dev".

Of course, you can have a "git-co" script on your $PATH, which runs "git 
checkout $1", lets it fail just like it does now, and then does "git commit", 
if you really want the behaviour.  Again, you can then use "git co dev" and you 
do not have to worry about breaking people's scripts that depends on "git 
checkout" to fail without going interactive.


I have a PRPSL for you, kindly contact me on j11...@outlook.kr for details

2017-11-16 Thread Nunez, Daniela




Daniela  Nunez
Adjunct Faculty
daniela.nun...@tamiu.edu
Tel. (956) 326-2647
Fax (956) 326-
Department Of Humanities
Texas A International University
5201 University Blvd
Laredo, Texas 78041

As pursuant to Texas A International University rule 33.04.99.L2 concerning 
the Use and Disposition of Electronic Communications, this email is a mechanism 
for official communication of the University. Electronic mail (e-mail) should 
be used only for legitimate academic or state business. Official email 
communications are intended only to meet the academic and administrative needs 
of the campus community.


Re: [PATCH] branch doc: remove --set-upstream from synopsis

2017-11-16 Thread Kaartic Sivaraam

On Thursday 16 November 2017 04:19 PM, Martin Ågren wrote:

On 16 November 2017 at 08:46, Todd Zullinger  wrote:

diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt
index d6587c5e96..159ca388f1 100644
--- a/Documentation/git-branch.txt
+++ b/Documentation/git-branch.txt
@@ -14,7 +14,7 @@ SYNOPSIS
 [(--merged | --no-merged) []]
 [--contains [

Re: [PATCH] sequencer: reschedule pick if index can't be locked

2017-11-16 Thread Martin Ågren
On 16 November 2017 at 11:43, Phillip Wood  wrote:
> On 15/11/17 18:44, Martin Ågren wrote:
>>
>> On 15 November 2017 at 11:41, Phillip Wood 
>> wrote:
>>
>>  From the commit message, I would have expected the flags to be zero. This
>> patch
>> does not only turn off the die-ing, it also tells the lockfile-API to
>> print an
>> error message before returning. I don't have an opinion on whether that
>> extra
>> verboseness is good or bad, but if it's wanted, I think the commit message
>> should mention this change.
>
>
> Hi Martin, thanks for your comments. LOCK_DIE_ON_ERROR also prints the same
> warning so that behavior is unchanged by this patch, though mentioning it in
> the commit message would be no bad thing.

Argh, you're right of course. Sorry for this.

Martin


Re: [PATCH] branch doc: remove --set-upstream from synopsis

2017-11-16 Thread Martin Ågren
On 16 November 2017 at 08:46, Todd Zullinger  wrote:
> Support for the --set-upstream option was removed in 52668846ea
> (builtin/branch: stop supporting the "--set-upstream" option,
> 2017-08-17), after a long deprecation period.
>
> Remove the option from the command synopsis for consistency.  Replace
> another reference to it in the description of `--delete` with
> `--set-upstream-to`.
>
> Signed-off-by: Todd Zullinger 
> ---
>
> I noticed that --set-upstream was still in the synopsis for git branch.  I
> don't think it was left there intentionally.  I looked through the thread 
> where
> support for the option was removed and didn't notice any comments suggesting
> otherwise[1].  With luck, I didn't miss the obvious while reading the thread.
>
> [1] 
> https://public-inbox.org/git/20170807143938.5127-1-kaarticsivaraam91...@gmail.com/

Actually, the first version of the series did remove it from the
synopsis [2]. That hunk was later dropped. Kaartic mentioned it [3] and
I thought out loud about it [4].

I get the same initial thought now as then: It's a bit odd that we pique
the interest of the reader, but that when they try it out or read up on
it, we say "nope, this is not what you are looking for".

> diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt
> index d6587c5e96..159ca388f1 100644
> --- a/Documentation/git-branch.txt
> +++ b/Documentation/git-branch.txt
> @@ -14,7 +14,7 @@ SYNOPSIS
> [(--merged | --no-merged) []]
> [--contains [ [--points-at ] [--format=] [...]
> -'git branch' [--set-upstream | --track | --no-track] [-l] [-f]  
> []
> +'git branch' [--track | --no-track] [-l] [-f]  []

Personally, I think this is an improvement.

>  'git branch' (--set-upstream-to= | -u ) []
>  'git branch' --unset-upstream []
>  'git branch' (-m | -M) [] 
> @@ -86,7 +86,7 @@ OPTIONS
>  --delete::
> Delete a branch. The branch must be fully merged in its
> upstream branch, or in `HEAD` if no upstream was set with
> -   `--track` or `--set-upstream`.
> +   `--track` or `--set-upstream-to`.

Good catch.

Martin

[2] 
https://public-inbox.org/git/20170807143938.5127-2-kaarticsivaraam91...@gmail.com/

[3] 
https://public-inbox.org/git/20170817025425.6647-2-kaarticsivaraam91...@gmail.com/

[4] 
https://public-inbox.org/git/CAN0heSquaXk421sR6Ry59C+er8n26nC93=3kg1wd0xnxzku...@mail.gmail.com/


Re: [PATCH] sequencer: reschedule pick if index can't be locked

2017-11-16 Thread Phillip Wood

On 15/11/17 18:44, Martin Ågren wrote:

On 15 November 2017 at 11:41, Phillip Wood  wrote:

From: Phillip Wood 

Return an error instead of dying if the index cannot be locked in
do_recursive_merge() as if the commit cannot be picked it needs to be
rescheduled when performing an interactive rebase. If the pick is not
rescheduled and the user runs 'git rebase --continue' rather than 'git
rebase --abort' then the commit gets silently dropped.


Makes sense. (Your analysis, not the current behavior. ;-) )


Signed-off-by: Phillip Wood 
---
  sequencer.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/sequencer.c b/sequencer.c
index 
6d027b06c8d8dc69b14d05752637a65aa121ab24..8c10442b84068d3fb7ec809ef1faa0203cb83e60
 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -438,7 +438,8 @@ static int do_recursive_merge(struct commit *base, struct 
commit *next,
 char **xopt;
 static struct lock_file index_lock;

-   hold_locked_index(_lock, LOCK_DIE_ON_ERROR);
+   if (hold_locked_index(_lock, LOCK_REPORT_ON_ERROR))
+   return -1;

 read_cache();


 From the commit message, I would have expected the flags to be zero. This patch
does not only turn off the die-ing, it also tells the lockfile-API to print an
error message before returning. I don't have an opinion on whether that extra
verboseness is good or bad, but if it's wanted, I think the commit message
should mention this change.


Hi Martin, thanks for your comments. LOCK_DIE_ON_ERROR also prints the 
same warning so that behavior is unchanged by this patch, though 
mentioning it in the commit message would be no bad thing.




Also, don't you want to check "< 0" rather than "!= 0"? If all goes
well, the return value will be a file descriptor. I think that it will
always be non-zero, so I think you'll always return -1 here.


Yes you're right, thanks. I thought I had tested this but I now realise 
my so called test just fast-forwarded so didn't touch this code path


Best Wishes

Phillip


Martin





Re: Improved error handling (Was: [PATCH 1/2] sequencer: factor out rewrite_file())

2017-11-16 Thread Simon Ruderich
On Mon, Nov 06, 2017 at 05:13:15PM +0100, Simon Ruderich wrote:
> On Sat, Nov 04, 2017 at 10:07:00PM -0400, Jeff King wrote:
>> Yes, I think what you've written here (and below) is quite close to the
>> error_context patches I linked elsewhere in the thread. In other
>> words, I think it's a sane approach.
>
> In contrast to error_context I'd like to keep all exiting
> behavior (die, ignore, etc.) in the hand of the caller and not
> use any callbacks as that makes the control flow much harder to
> follow.
>
>> I agree it might be nice for the error context to have a positive "there
>> was an error" flag. It's probably worth making it redundant with the
>> return code, though, so callers can use whichever style is most
>> convenient for them.
>
> Agreed.
>
> Regarding the API, should it be allowed to pass NULL as error
> pointer to request no additional error handling or should the
> error functions panic on NULL? Allowing NULL makes partial
> conversions possible (e.g. for write_in_full) where old callers
> just pass NULL and check the return values and converted callers
> can use the error struct.
>
> How should translations get handled? Appending ": %s" for
> strerror(errno) might be problematic. Same goes for "outer
> message: inner message" where the helper function just inserts ":
> " between the messages. Is _("%s: %s") (with appropriate
> translator comments) enough to handle these cases?
>
> Suggestions how to name the struct and the corresponding
> functions? My initial idea was struct error and to use error_ as
> prefix, but I'm not sure if struct error is too broad and may
> introduce conflicts with system headers. Also error_ is a little
> long and could be shorted to just err_ but I don't know if that's
> clear enough. The error_ prefix doesn't conflict with many git
> functions, but there are some in usage.c (error_errno, error,
> error_routine).
>
> And as general question, is this approach to error handling
> something we should pursue or are there objections? If there's
> consensus that this might be a good idea I'll look into
> converting some parts of the git code (maybe refs.c) to see how
> it pans out.

Any comments?

Regards
Simon
-- 
+ privacy is necessary
+ using gnupg http://gnupg.org
+ public key id: 0x92FEFDB7E44C32F9


Re: [PATCH] launch_editor(): indicate that Git waits for user input

2017-11-16 Thread Lars Schneider

> On 16 Nov 2017, at 07:04, Junio C Hamano  wrote:

Wow. Thanks for the quick patch :-)


> When a graphical GIT_EDITOR is spawned by a Git command that opens
> and waits for it for the user input (e.g. "git rebase -i") pops its
> window elsewhere that is obscure, the user may be left staring the
> original terminal window s/he invoked the Git command from, without
> even realizing that now s/he needs to interact with another window
> the editor is waiting in, before Git can proceed.

Maybe:

When a graphical GIT_EDITOR is spawned by a Git command that opens
and waits for user input (e.g. "git rebase -i"), then the editor window
might be obscured by other windows. The user may be left staring at the 
original Git terminal window without even realizing that s/he needs to 
interact with another window before Git can proceed. To this user Git 
appears hanging.


> Show a message to the original terminal, and get rid of it when the
s/to/in/ ?

> editor returns.



> Signed-off-by: Junio C Hamano 
> ---
> 
> * I tried this with "emacsclient" but it turns out that Emacs folks
>   have already thought about this issue, and the program says
>   "Waiting for Emacs..." while it does its thing, so from that
>   point of view, perhaps as Stefan said originally, the editor Lars
>   had trouble with is what is at fault and needs fixing?  I dunno.

Based on my (not too extensive) testing, Emacs is probably the only 
editor with this clever behavior.


>   Also, emacsclient seems to conclude its message, once the editing
>   is done, by emitting LF _before_ we have a chance to do the "go
>   back to the beginning and clear" dance, so it probably is not a
>   very good example to emulate the situation Lars had trouble with.
>   You cannot observe the nuking of the "launched, waiting..." with
>   it.
> 
> editor.c | 25 +
> 1 file changed, 25 insertions(+)
> 
> diff --git a/editor.c b/editor.c
> index 7519edecdc..1321944716 100644
> --- a/editor.c
> +++ b/editor.c
> @@ -40,6 +40,28 @@ int launch_editor(const char *path, struct strbuf *buffer, 
> const char *const *en
>   const char *args[] = { editor, real_path(path), NULL };
>   struct child_process p = CHILD_PROCESS_INIT;
>   int ret, sig;
> + static const char *close_notice = NULL;
> +
> + if (isatty(2) && !close_notice) {
> + char *term = getenv("TERM");
> +
> + if (term && strcmp(term, "dumb"))
> + /*
> +  * go back to the beginning and erase the
> +  * entire line if the terminal is capable
> +  * to do so, to avoid wasting the vertical
> +  * space.
> +  */
> + close_notice = "\r\033[K";
> + else
> + /* otherwise, complete and waste the line */
> + close_notice = "done.\n";
> + }
> +
> + if (close_notice) {
> + fprintf(stderr, "Launched your editor, waiting...");

I also noticed that some people don't get that they need to save and close the
file to continue. Plus, we should tell them that Git is waiting for *them* and
not anything else. Maybe we should also tell them the editor name, but that 
might
be too verbose. I dunno. How about this?

fprintf(stderr, "Launched your editor ('%s'). Adjust, save, and close 
the file to continue. Waiting for you input... ", editor);


- Lars

Hello Friend

2017-11-16 Thread Wang Jianlin



--
I intend to give you a portion of my wealth as a free-will financial 
donation to you, Respond to partake.

Wang Jianlin
Wanda Group