[PATCH 1/2] name-rev: allow passing multiple patterns using --refs

2016-12-06 Thread Jacob Keller
From: Jacob Keller 

Allow git name-rev to take a string list of patterns instead of only
a single pattern. All patterns are matched inclusively, meaning that
a ref only needs to match one pattern to be included.

Add tests and documentation for this change.

Signed-off-by: Jacob Keller 
---
 Documentation/git-name-rev.txt   |  4 +++-
 builtin/name-rev.c   | 41 +---
 t/t6007-rev-list-cherry-pick-file.sh | 30 ++
 3 files changed, 62 insertions(+), 13 deletions(-)

diff --git a/Documentation/git-name-rev.txt b/Documentation/git-name-rev.txt
index ca28fb8e2a07..7433627db12d 100644
--- a/Documentation/git-name-rev.txt
+++ b/Documentation/git-name-rev.txt
@@ -26,7 +26,9 @@ OPTIONS
 
 --refs=::
Only use refs whose names match a given shell pattern.  The pattern
-   can be one of branch name, tag name or fully qualified ref name.
+   can be one of branch name, tag name or fully qualified ref name. If
+   given multiple times, use refs whose names match any of the given shell
+   patterns. Use `--no-refs` to clear any previous ref patterns given.
 
 --all::
List all commits reachable from all refs
diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index cd89d48b65e8..000a2a700ed3 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -108,7 +108,7 @@ static const char *name_ref_abbrev(const char *refname, int 
shorten_unambiguous)
 struct name_ref_data {
int tags_only;
int name_only;
-   const char *ref_filter;
+   struct string_list ref_filters;
 };
 
 static struct tip_table {
@@ -150,16 +150,33 @@ static int name_ref(const char *path, const struct 
object_id *oid, int flags, vo
if (data->tags_only && !starts_with(path, "refs/tags/"))
return 0;
 
-   if (data->ref_filter) {
-   switch (subpath_matches(path, data->ref_filter)) {
-   case -1: /* did not match */
-   return 0;
-   case 0:  /* matched fully */
-   break;
-   default: /* matched subpath */
-   can_abbreviate_output = 1;
-   break;
+   if (data->ref_filters.nr) {
+   struct string_list_item *item;
+   int matched = 0;
+
+   /* See if any of the patterns match. */
+   for_each_string_list_item(item, >ref_filters) {
+   /*
+* We want to check every pattern even if we already
+* found a match, just in case one of the later
+* patterns could abbreviate the output.
+*/
+   switch (subpath_matches(path, item->string)) {
+   case -1: /* did not match */
+   break;
+   case 0: /* matched fully */
+   matched = 1;
+   break;
+   default: /* matched subpath */
+   matched = 1;
+   can_abbreviate_output = 1;
+   break;
+   }
}
+
+   /* If none of the patterns matched, stop now */
+   if (!matched)
+   return 0;
}
 
add_to_tip_table(oid->hash, path, can_abbreviate_output);
@@ -306,11 +323,11 @@ int cmd_name_rev(int argc, const char **argv, const char 
*prefix)
 {
struct object_array revs = OBJECT_ARRAY_INIT;
int all = 0, transform_stdin = 0, allow_undefined = 1, always = 0, 
peel_tag = 0;
-   struct name_ref_data data = { 0, 0, NULL };
+   struct name_ref_data data = { 0, 0, STRING_LIST_INIT_NODUP };
struct option opts[] = {
OPT_BOOL(0, "name-only", _only, N_("print only names 
(no SHA-1)")),
OPT_BOOL(0, "tags", _only, N_("only use tags to name 
the commits")),
-   OPT_STRING(0, "refs", _filter, N_("pattern"),
+   OPT_STRING_LIST(0, "refs", _filters, N_("pattern"),
   N_("only use refs matching ")),
OPT_GROUP(""),
OPT_BOOL(0, "all", , N_("list all commits reachable from 
all refs")),
diff --git a/t/t6007-rev-list-cherry-pick-file.sh 
b/t/t6007-rev-list-cherry-pick-file.sh
index 1408b608eb03..d072ec43b016 100755
--- a/t/t6007-rev-list-cherry-pick-file.sh
+++ b/t/t6007-rev-list-cherry-pick-file.sh
@@ -99,6 +99,36 @@ test_expect_success '--cherry-pick bar does not come up 
empty (II)' '
test_cmp actual.named expect
 '
 
+test_expect_success 'name-rev multiple --refs combine inclusive' '
+   git rev-list --left-right --cherry-pick F...E -- bar > actual &&
+   git name-rev --stdin --name-only --refs="*tags/F" --refs="*tags/E" \
+   < actual > 

[PATCH 2/2] describe: add support for multiple match patterns

2016-12-06 Thread Jacob Keller
From: Jacob Keller 

Convert "--match" into a string list that accumulates patterns. If any
patterns are given, then include all tags that match at least one
pattern. This allows the user to construct multiple small patterns and
compose them. If desired, a future patch could expand functionality by
adding a new option to determine the style of combining multiple
patterns.

The primary use of this feature can be seen when trying to find which
release tag a given commit made it into. Suppose that you version all
your official releases such as "v1.2", "v1.3", "v1.4", "v2.1" and so on.
Now, you also have other tags which represent -rc releases and other
such tags. If you want to find the first major release that contains
a given commit you might try

git describe --contains --match="v?.?" 

This will work as long as you have only single digits. But if you start
adding multiple digits, the pattern becomes not enough to match all the
tags you wanted while excluding the ones you didn't.

By implementing multiple --match invocations, this can be avoided by
simply passing multiple match patterns to the command:

git describe --contains --match="v[0-9].[0-9]" --match="v[0-9].[0-9][0-9]" ...

The end result is the ability to easily search tag space for which
tag included a given commit, without including -rc and other tags which
aren't interesting to you.

Signed-off-by: Jacob Keller 
---
 Documentation/git-describe.txt |  6 +-
 builtin/describe.c | 31 ---
 t/t6120-describe.sh| 19 +++
 3 files changed, 48 insertions(+), 8 deletions(-)

diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt
index e4ac448ff565..c85f2811ce28 100644
--- a/Documentation/git-describe.txt
+++ b/Documentation/git-describe.txt
@@ -83,7 +83,11 @@ OPTIONS
 --match ::
Only consider tags matching the given `glob(7)` pattern,
excluding the "refs/tags/" prefix.  This can be used to avoid
-   leaking private tags from the repository.
+   leaking private tags from the repository, or to shrink the scope of
+   searched tags to avoid -rc tags or similar. If given multiple a list
+   of patterns will be accumulated, and tags matching any of the patterns
+   will be considered. Use `--no-match` to clear and reset the list of
+   patterns.
 
 --always::
Show uniquely abbreviated commit object as fallback.
diff --git a/builtin/describe.c b/builtin/describe.c
index 01490a157efc..e3ceab65e273 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -28,7 +28,7 @@ static int abbrev = -1; /* unspecified */
 static int max_candidates = 10;
 static struct hashmap names;
 static int have_util;
-static const char *pattern;
+static struct string_list patterns = STRING_LIST_INIT_NODUP;
 static int always;
 static const char *dirty;
 
@@ -129,9 +129,25 @@ static int get_name(const char *path, const struct 
object_id *oid, int flag, voi
if (!all && !is_tag)
return 0;
 
-   /* Accept only tags that match the pattern, if given */
-   if (pattern && (!is_tag || wildmatch(pattern, path + 10, 0, NULL)))
-   return 0;
+   /*
+* If we're given patterns, accept only tags which match at least one
+* pattern.
+*/
+   if (patterns.nr) {
+   struct string_list_item *item;
+
+   if (!is_tag)
+   return 0;
+
+   for_each_string_list_item(item, ) {
+   if (!wildmatch(item->string, path + 10, 0, NULL)) {
+   break;
+   }
+
+   /* If we get here, no pattern matched. */
+   return 0;
+   }
+   }
 
/* Is it annotated? */
if (!peel_ref(path, peeled.hash)) {
@@ -404,7 +420,7 @@ int cmd_describe(int argc, const char **argv, const char 
*prefix)
N_("only output exact matches"), 0),
OPT_INTEGER(0, "candidates", _candidates,
N_("consider  most recent tags (default: 10)")),
-   OPT_STRING(0, "match",   , N_("pattern"),
+   OPT_STRING_LIST(0, "match", , N_("pattern"),
   N_("only consider tags matching ")),
OPT_BOOL(0, "always",,
N_("show abbreviated commit object as fallback")),
@@ -430,6 +446,7 @@ int cmd_describe(int argc, const char **argv, const char 
*prefix)
die(_("--long is incompatible with --abbrev=0"));
 
if (contains) {
+   struct string_list_item *item;
struct argv_array args;
 
argv_array_init();
@@ -440,8 +457,8 @@ int cmd_describe(int argc, const char **argv, const char 
*prefix)
argv_array_push(, "--always");
if (!all) {

git add -p with new file

2016-12-06 Thread Ariel


If you do git add -p new_file it says:

No changes.

Which is a rather confusing message. I would expect it to show me the 
content of the file in patch form, in the normal way that -p works, let me 
edit it, etc.


(Note: I am aware I can do -N first, but when I specifically enter the 
name of a new file I feel it should figure out what I mean.)


-Ariel


Re: [PATCH] real_path: make real_path thread-safe

2016-12-06 Thread Ramsay Jones


On 07/12/16 00:10, Brandon Williams wrote:
> On 12/06, Junio C Hamano wrote:
>> POSIX cares about treating "//" at the very beginning of the path
>> specially.  Is that supposed to be handled here, or by a lot higher
>> level up in the callchain?
> 
> What exactly does "//" mean in this context? (I'm just naive in this
> area)

This refers to a UNC path (ie Universal Naming Convention) which
is used to refer to servers, printers and other 'network resources'.
Although this started (many moons ago) in unix, it isn't used too
much outside of windows networks! (where it is usually denoted by
\\servername\path).

You can see the relics of unix UNC paths if you look at the wording
for basename() in the POSIX standard. Note the special treatment of
the path which 'is exactly "//"', see 
http://pubs.opengroup.org/onlinepubs/009695399/functions/basename.html

ATB,
Ramsay Jones


[no subject]

2016-12-06 Thread jbh
unsubscribe


Re: [PATCH v4 1/3] update-unicode.sh: automatically download newer definition files

2016-12-06 Thread Beat Bolli
On 05.12.16 21:31, Junio C Hamano wrote:
> Torsten Bögershausen  writes:
> 
>> On Sat, Dec 03, 2016 at 10:00:47PM +0100, Beat Bolli wrote:
>>> Checking just for the unicode data files' existence is not sufficient;
>>> we should also download them if a newer version exists on the Unicode
>>> consortium's servers. Option -N of wget does this nicely for us.
>>>
>>> Reviewed-by: Torsten Boegershausen 
>>
>> Minor remark (Not sure if this motivates v5, may be Junio can fix it 
>> locally?)
>> s/oe/ö/
>>
>> Beside this: Thanks again (and I learned about the -N option of wget)
> 
> Will fix up while queuing (only 1/3 needs it, 2/3 has it right).
> 
> Also, I'll do s/update-unicode.sh/update_unicode.sh/ on the title
> and the message to match the reality.  At some point we might want
> to fix the reality to match people's expectations, though.

Thanks, Junio. This was a bit sloppy of me.

I really appreciate your regard for the small things!

Cheers, Beat


Re: [PATCH] real_path: make real_path thread-safe

2016-12-06 Thread Brandon Williams
On 12/06, Junio C Hamano wrote:
> Brandon Williams  writes:
> 
> > +/* removes the last path component from 'path' except if 'path' is root */
> > +static void strip_last_component(struct strbuf *path)
> > +{
> > +   if (path->len > 1) {
> > +   char *last_slash = find_last_dir_sep(path->buf);
> > +   strbuf_setlen(path, last_slash - path->buf);
> > +   }
> > +}
> 
> You use find_last_dir_sep() which takes care of "Windows uses
> backslash" issue.  Is this function expected to be fed something
> like "C:\My Files\foo.txt" and more importantly "C:\My Files"?  Or
> is that handled by a lot higher level up in the callchain?  I am
> reacting the comparison of path->len and 1 here.

This function should accept both absolute and relative paths, which
means it should probably accept "C:\My Files".  I wasn't thinking about
windows 100% of the time while writing this so I'm hoping that a windows
expert will point things like this out to me :).  So in reality this
check shouldn't be (path->len > 1) but rather some function is_only_root
which would check if the strbuf only contains a string which looks like
root.

> 
> Also is the input expected to be normalized?  Are we expected to be
> fed something like "/a//b/./c/../d/e" and react sensibly, or is that
> handled by a lot higher level up in the callchain?

Yes it should handle paths like that sensibly.

> > +/* gets the next component in 'remaining' and places it in 'next' */
> > +static void get_next_component(struct strbuf *next, struct strbuf 
> > *remaining)
> > +{
> > +   char *start = NULL;
> > +   char *end = NULL;
> > +
> > +   strbuf_reset(next);
> > +
> > +   /* look for the next component */
> > +   /* Skip sequences of multiple path-separators */
> > +   for (start = remaining->buf; is_dir_sep(*start); start++)
> > +   /* nothing */;
> 
> Style:
>   ; /* nothing */

k will fix.

> 
> > +   /* Find end of the path component */
> > +   for (end = start; *end && !is_dir_sep(*end); end++)
> > +   /* nothing */;
> > +
> > +   strbuf_add(next, start, end - start);
> 
> OK, so this was given "///foo/bar" in "remaining" and appended
> 'foo/' to "next".  I.e. deduping of slashes is handled here.
> 
> POSIX cares about treating "//" at the very beginning of the path
> specially.  Is that supposed to be handled here, or by a lot higher
> level up in the callchain?

What exactly does "//" mean in this context? (I'm just naive in this
area)

> 
> > +   /* remove the component from 'remaining' */
> > +   strbuf_remove(remaining, 0, end - remaining->buf);
> > +}
> > +
> >  /* We allow "recursive" symbolic links. Only within reason, though. */
> > -#define MAXDEPTH 5
> > +#define MAXSYMLINKS 5
> >  
> >  /*
> >   * Return the real path (i.e., absolute path, with symlinks resolved
> > @@ -21,7 +51,6 @@ int is_directory(const char *path)
> >   * absolute_path().)  The return value is a pointer to a static
> >   * buffer.
> >   *
> >   * The directory part of path (i.e., everything up to the last
> >   * dir_sep) must denote a valid, existing directory, but the last
> >   * component need not exist.  If die_on_error is set, then die with an
> > @@ -33,22 +62,16 @@ int is_directory(const char *path)
> >   */
> >  static const char *real_path_internal(const char *path, int die_on_error)
> >  {
> > +   static struct strbuf resolved = STRBUF_INIT;
> 
> This being 'static' would probably mean that this is not reentrant,
> which goes against the title of the patch.

Yes I mentioned in the cover letter that this was the one last thing
that needed to be changed to make it reentrant.  I wanted to get this
out for review sooner since this is the biggest change (getting rid of
the chdir() calls) and dropping the static here would just require
making the callers own the memory which should hopefully be an easy
refactor.  It just would have taken a bit more time to actually do that
refactor now.  I'm slowly working on it while this is being reviewed and
could be ready to send out when rerolling this patch.  Sorry for having
a slightly misleading patch title :)

-- 
Brandon Williams


Re: Merging .gitmodules files

2016-12-06 Thread Junio C Hamano
David Turner  writes:

> I could set my .gitattributes for the .gitmodules file to use a
> custom merge driver. But: (a) I don't see an off-the-shelf one
> that does what I want ("union" happens to work in the add/add
> case, but not in the add/remove case or other cases) and (b) I
> would have to rewrite my whole history in order to have the
> .gitmodules file exist at every commit (or find a way to get
> .git/info/attributes into each of my users' clones) and (c) this
> should work correctly without customization; Git already treats
> the .gitmodules file as special for commands like "status";
> there's no reason it shouldn't do so for merge and rebase.
> ... if I did have time, do others agree that it would be
> reasonable to special-case this file?  (Naturally, before doing
> the merge, we would check that the file was in fact parseable as a
> git config file; merging two changed gitmodules files of which
> either is unparseable would fall back to merging as text).

I do not see a fundamental reason why Git shouldn't know what
.gitmodules file is and how it should merge.

Our philosophy has always been "give users enough flexibility so
that they can experiment and come up with a solution that is general
enough (i.e. you can do it with custom merge driver), and then fold
it back into the core if it is an issue that is general enough and
it makes sense for the core to care about (i.e. my "why not?"
above).  If you already have a custom merge driver, then you have
already cleared the first step ;-)



Re: [PATCH v15 19/27] bisect--helper: `bisect_state` & `bisect_head` shell function in C

2016-12-06 Thread Stephan Beyer
Hi Pranit,

On 12/06/2016 11:40 PM, Pranit Bauva wrote:
> On Tue, Nov 22, 2016 at 5:42 AM, Stephan Beyer  wrote:
>> On 10/14/2016 04:14 PM, Pranit Bauva wrote:
>>> +static int bisect_state(struct bisect_terms *terms, const char **argv,
>>> + int argc)
>>> +{
>>> + const char *state = argv[0];
>>> +
>>> + get_terms(terms);
>>> + if (check_and_set_terms(terms, state))
>>> + return -1;
>>> +
>>> + if (!argc)
>>> + die(_("Please call `--bisect-state` with at least one 
>>> argument"));
>>
>> I think this check should move to cmd_bisect__helper. There are also the
>> other argument number checks.
> 
> Not really. After the whole conversion, the cmdmode will cease to
> exists while bisect_state will be called directly, thus it is
> important to check it here.

Okay, that's a point.
In that case, you should probably use "return error()" again and the
message (mentioning "--bisect-state") does not make sense when
--bisect-state ceases to exist.

>>> +
>>> + if (argc == 1 && one_of(state, terms->term_good,
>>> + terms->term_bad, "skip", NULL)) {
>>> + const char *bisected_head = xstrdup(bisect_head());
>>> + const char *hex[1];
>>
>> Maybe:
>> const char *hex;
>>
>>> + unsigned char sha1[20];
>>> +
>>> + if (get_sha1(bisected_head, sha1))
>>> + die(_("Bad rev input: %s"), bisected_head);
>>
>> And instead of...
>>
>>> + if (bisect_write(state, sha1_to_hex(sha1), terms, 0))
>>> + return -1;
>>> +
>>> + *hex = xstrdup(sha1_to_hex(sha1));
>>> + if (check_expected_revs(hex, 1))
>>> + return -1;
>>
>> ... simply:
>>
>> hex = xstrdup(sha1_to_hex(sha1));
>> if (set_state(terms, state, hex)) {
>> free(hex);
>> return -1;
>> }
>> free(hex);
>>
>> where:
> 
> Yes I am planning to convert all places with hex rather than the sha1
> but not yet, maybe in an another patch series because currently a lot
> of things revolve around sha1 and changing its behaviour wouldn't
> really be a part of a porting patch series.

I was not suggesting a change of behavior, I was suggesting a simple
helper function set_state() to get rid of code duplication above and
some lines below:

>> ... And replace this:
>>
>>> + const char **hex_string = (const char **) 
>>> [i].string;
>>> + if(bisect_write(state, *hex_string, terms, 0)) {
>>> + string_list_clear(, 0);
>>> + return -1;
>>> + }
>>> + if (check_expected_revs(hex_string, 1)) {
>>> + string_list_clear(, 0);
>>> + return -1;
>>> + }
>>
>> by:
>>
>> const char *hex_str = hex.items[i].string;
>> if (set_state(terms, state, hex_string)) {
>> string_list_clear(, 0);
>> return -1;
>> }

See?

>>> @@ -184,8 +137,8 @@ exit code \$res from '\$command' is < 0 or >= 128" >&2
>>>   state="$TERM_GOOD"
>>>   fi
>>>
>>> - # We have to use a subshell because "bisect_state" can exit.
>>> - ( bisect_state $state >"$GIT_DIR/BISECT_RUN" )
>>> + # We have to use a subshell because "--bisect-state" can exit.
>>> + ( git bisect--helper --bisect-state $state 
>>> >"$GIT_DIR/BISECT_RUN" )
>>
>> The new comment is funny, but you don't need a subshell here any
>> longer.
> 
> True, but right now I didn't want to modify that part of the source
> code to remove the comment. I will remove the comment all together
> when I port bisect_run()
For most of the patches, I was commenting on the current state, not on
the big picture.

Still I think that it is better to remove the comment and the subshell
instead of making the comment weird ("yes the builtin can exit, but why
do we need a subshell?" vs "yes the shell function calls exit, so we
need a subshell because we do not want to exit this shell script")

~Stephan


What's cooking in git.git (Dec 2016, #01; Tue, 6)

2016-12-06 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

The 'next' branch has been rewound, 'maint' now is for maintenance
fixes post v2.11 release.

The description of many new topics in this issue of the report is
probably sketchier than it should be.  Refinements are very much
appreciated.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* jk/common-main (2016-11-29) 1 commit
  (merged to 'next' on 2016-11-29 at 2985e7efba)
 + common-main: stop munging argv[0] path

 Fix for a small regression in a topic already in 'master'.

--
[New Topics]

* ak/lazy-prereq-mktemp (2016-11-29) 1 commit
 - t7610: clean up foo.XX tmpdir

 Test code clean-up.

 Will merge to 'next'.


* da/mergetool-trust-exit-code (2016-11-29) 2 commits
 - mergetools/vimdiff: trust Vim's exit code
 - mergetool: honor mergetool.$tool.trustExitCode for built-in tools

 mergetool..trustExitCode configuration variable did not apply
 to built-in tools, but now it does.

 Will merge to 'next'.


* jb/diff-no-index-no-abbrev (2016-11-29) 1 commit
 - diff: handle --no-abbrev outside of repository

 "git diff --no-index" did not take "--no-abbrev" option.

 Will merge to 'next'.


* vk/p4-submit-shelve (2016-11-29) 1 commit
 - git-p4: allow submit to create shelved changelists.
 (this branch is used by ld/p4-update-shelve.)

 Will merge to 'next'.


* jk/http-walker-limit-redirect-2.9 (2016-12-06) 5 commits
 - http: treat http-alternates like redirects
 - http: make redirects more obvious
 - remote-curl: rename shadowed options variable
 - http: always update the base URL for redirects
 - http: simplify update_url_from_redirect
 (this branch is used by jk/http-walker-limit-redirect.)

 Transport with dumb http can be fooled into following foreign URLs
 that the end user does not intend to, especially with the server
 side redirects and http-alternates mechanism, which can lead to
 security issues.  Tighten the redirection and make it more obvious
 to the end user when it happens.

 Will merge to 'next'.


* jk/http-walker-limit-redirect (2016-12-06) 2 commits
 - http-walker: complain about non-404 loose object errors
 - Merge branch 'ew/http-walker' into jk/http-walker-limit-redirect
 (this branch uses jk/http-walker-limit-redirect-2.9.)

 Update the error messages from the dumb-http client when it fails
 to obtain loose objects; we used to give sensible error message
 only upon 404 but we now forbid unexpected redirects that needs to
 be reported with something sensible.

 Will merge to 'next'.


* ah/grammos (2016-12-05) 3 commits
 - clone,fetch: explain the shallow-clone option a little more clearly
 - receive-pack: improve English grammar of denyCurrentBranch message
 - bisect: improve English grammar of not-ancestors message

 A few messages have been fixed for their grammatical errors.

 Will merge to 'next'.


* ak/commit-only-allow-empty (2016-12-05) 1 commit
 - commit: make --only --allow-empty work without paths

 "git commit --allow-empty --only" (no pathspec) with dirty index
 ought to be an acceptable way to create a new commit that does not
 change any paths, but it was forbidden (perhaps because nobody
 needed it).

 Will merge to 'next'.


* bb/unicode-9.0 (2016-12-05) 3 commits
 - unicode_width.h: update the tables to Unicode 9.0
 - update_unicode.sh: strip the plane offsets from the double_width[] table
 - update_unicode.sh: automatically download newer definition files

 The character width table has been updated to match Unicode 9.0

 Will merge to 'next'.


* ld/p4-update-shelve (2016-12-05) 1 commit
 - git-p4: support updating an existing shelved changelist
 (this branch uses vk/p4-submit-shelve.)

 Will merge to 'next'.


* ld/p4-worktree (2016-12-05) 1 commit
 - git-p4: support secondary working trees managed by "git worktree"

 Iffy.
 cf. <20161202224319.5385-2-l...@diamand.org>


* ls/p4-empty-file-on-lfs (2016-12-05) 1 commit
 - git-p4: fix empty file processing for large file system backend GitLFS

 "git p4" LFS support was broken when LFS stores an empty blob.

 Will merge to 'next'.


* ls/p4-retry-thrice (2016-12-05) 1 commit
 - git-p4: add config to retry p4 commands; retry 3 times by default

 Will merge to 'next'.


* ls/t0021-fixup (2016-12-05) 1 commit
 - t0021: minor filter process test cleanup

 Will merge to 'next'.


* ls/travis-update-p4-and-lfs (2016-12-05) 1 commit
 - travis-ci: update P4 to 16.2 and GitLFS to 1.5.2 in Linux build

 The default Travis-CI configuration specifies newer P4 and GitLFS.

 Will merge to 'next'.


* sb/t3600-cleanup (2016-12-05) 1 

Merging .gitmodules files

2016-12-06 Thread David Turner
Consider two commits: one adds file A, and the other adds file B.  These 
commits don't conflict; you can merge them with no problem.

But if the two commits instead add submodules A and B, and you try to merge, 
you'll likely get a conflict in .gitmodules.  This seems wrong; .gitmodules 
happens to be a plain text file as an implementation detail, but in terms of 
interpretation, it is more like a map of maps (name1 -> {path -> "...", url -> 
"..."}, name2 -> ...).  

We (Two Sigma) keep our .gitmodules file in alphabetical order (so we don't use 
git submodule add -- our .gitmodules file is instead generated by some more 
complicated offline process).  But even for ordinary .gitmodules files, order 
is not important so long as it's consistent.

I could set my .gitattributes for the .gitmodules file to use a custom merge 
driver. But: (a) I don't see an off-the-shelf one that does what I want 
("union" happens to work in the add/add case, but not in the add/remove case or 
other cases) and (b) I would have to rewrite my whole history in order to have 
the .gitmodules file exist at every commit (or find a way to get 
.git/info/attributes into each of my users' clones) and (c) this should work 
correctly without customization; Git already treats the .gitmodules file as 
special for commands like "status"; there's no reason it shouldn't do so for 
merge and rebase.

I'm not sure I'll necessarily have time to implement this -- for our use case ( 
http://github.com/twosigma/git-meta ), we might be able to get away with doing 
it in JS, and using something like 
https://github.com/mirkokiefer/diff-merge-patch#sets .  But if I did have time, 
do others agree that it would be reasonable to special-case this file?  
(Naturally, before doing the merge, we would check that the file was in fact 
parseable as a git config file; merging two changed gitmodules files of which 
either is unparseable would fall back to merging as text). 


Re: [PATCH] real_path: make real_path thread-safe

2016-12-06 Thread Junio C Hamano
Brandon Williams  writes:

> +/* removes the last path component from 'path' except if 'path' is root */
> +static void strip_last_component(struct strbuf *path)
> +{
> + if (path->len > 1) {
> + char *last_slash = find_last_dir_sep(path->buf);
> + strbuf_setlen(path, last_slash - path->buf);
> + }
> +}

You use find_last_dir_sep() which takes care of "Windows uses
backslash" issue.  Is this function expected to be fed something
like "C:\My Files\foo.txt" and more importantly "C:\My Files"?  Or
is that handled by a lot higher level up in the callchain?  I am
reacting the comparison of path->len and 1 here.

Also is the input expected to be normalized?  Are we expected to be
fed something like "/a//b/./c/../d/e" and react sensibly, or is that
handled by a lot higher level up in the callchain?

> +/* gets the next component in 'remaining' and places it in 'next' */
> +static void get_next_component(struct strbuf *next, struct strbuf *remaining)
> +{
> + char *start = NULL;
> + char *end = NULL;
> +
> + strbuf_reset(next);
> +
> + /* look for the next component */
> + /* Skip sequences of multiple path-separators */
> + for (start = remaining->buf; is_dir_sep(*start); start++)
> + /* nothing */;

Style:
; /* nothing */

> + /* Find end of the path component */
> + for (end = start; *end && !is_dir_sep(*end); end++)
> + /* nothing */;
> +
> + strbuf_add(next, start, end - start);

OK, so this was given "///foo/bar" in "remaining" and appended
'foo/' to "next".  I.e. deduping of slashes is handled here.

POSIX cares about treating "//" at the very beginning of the path
specially.  Is that supposed to be handled here, or by a lot higher
level up in the callchain?

> + /* remove the component from 'remaining' */
> + strbuf_remove(remaining, 0, end - remaining->buf);
> +}
> +
>  /* We allow "recursive" symbolic links. Only within reason, though. */
> -#define MAXDEPTH 5
> +#define MAXSYMLINKS 5
>  
>  /*
>   * Return the real path (i.e., absolute path, with symlinks resolved
> @@ -21,7 +51,6 @@ int is_directory(const char *path)
>   * absolute_path().)  The return value is a pointer to a static
>   * buffer.
>   *
>   * The directory part of path (i.e., everything up to the last
>   * dir_sep) must denote a valid, existing directory, but the last
>   * component need not exist.  If die_on_error is set, then die with an
> @@ -33,22 +62,16 @@ int is_directory(const char *path)
>   */
>  static const char *real_path_internal(const char *path, int die_on_error)
>  {
> + static struct strbuf resolved = STRBUF_INIT;

This being 'static' would probably mean that this is not reentrant,
which goes against the title of the patch.


Re: [PATCH v15 23/27] bisect--helper: `bisect_replay` shell function in C

2016-12-06 Thread Stephan Beyer
Hi Pranit and Christian and Lars ;)

On 12/07/2016 12:02 AM, Pranit Bauva wrote:
> On Tue, Nov 22, 2016 at 6:19 AM, Stephan Beyer  wrote:
>> Okay Pranit,
>>
>> this is the last patch for me to review in this series.
>>
>> Now that I have a coarse overview of what you did, I have the general
>> remark that imho the "terms" variable should simply be global instead of
>> being passed around all the time.
> 
> In a personal conversation with my mentors, we thought it is the best
> fit to keep it in a struct and pass it around so that it is easier in
> libification.

I guess you had that conversation at the beginning of the project and I
think that at that stage I would have recommended it that way, too.

However, looking at the v15 bisect--helper.c, it looks like it is rather
a pain to do it that way. I mean, you use the terms like they were
global variables: you initialize them in cmd_bisect__helper() and then
you always pass them around; you update them "globally", never using
restricted scope, never ever use a second instantiation besides the one
of cmd_bisect__helper(). These are all signs that *in the current code*
using (static) global variables is the better way (making the code simpler).

The libification argument may be worth considering, though. I am not
sure if there is really a use-case where you need two different
instantiations of the terms, *except* if the lib allows bisecting with
different terms at the same time (in different repos, of course). Is the
lib "aware" of such use-cases like doing stuff in different repos at the
same time? If that's the case, not using global variables is the right
thing to do. In any other case I can imagine, I'd suggest to use global
variables.

~Stephan


Re: [PATCH v15 23/27] bisect--helper: `bisect_replay` shell function in C

2016-12-06 Thread Stephan Beyer
Hey Pranit,

On 12/07/2016 12:02 AM, Pranit Bauva wrote:
>>> +static int bisect_replay(struct bisect_terms *terms, const char *filename)
>>> +{
>>> + struct strbuf line = STRBUF_INIT;
>>> + struct strbuf word = STRBUF_INIT;
>>> + FILE *fp = NULL;
>>
>> (The initialization is not necessary here.)
> 
> Um. I think it is. Otherwise if it goes to the finish block before you
> try to operate on fp, it will cause a seg fault.

You are right, thanks!

>>> + while (strbuf_getline(, fp) != EOF) {
>>> + int pos = 0;
>>> + while (pos < line.len) {
>>> + pos = get_next_word(line.buf, pos, );
>>> +
>>> + if (!strcmp(word.buf, "git")) {
>>> + continue;
>>> + } else if (!strcmp(word.buf, "git-bisect")) {
>>> + continue;
>>> + } else if (!strcmp(word.buf, "bisect")) {
>>> + continue;
>>> + } else if (!strcmp(word.buf, "#")) {
>>> + break;
>>
>> Maybe it is more robust to check whether word.buf begins with #
> 
> Assuming that you meant "# ", yes.

No, if I get it right "# " can never occur because the word.buf never
contains a space.
What I meant was that you are currently ignoring everything after a
"# ", so comments like

# foo

are ignored.
However, imagine a user changes the file by hand (he probably should not
do it but, hey, it's git: unixy, hacky ... and he thinks he knows what
he does) and then we have in the file something like

#foo

which makes perfectly sense when you are used to programming languages
with # as comment-till-eol marker. The problem is that your current code
does expect "#" as a single word and would hence not recognize #foo as a
comment.

I hope I made it clear why I suggested to test if the word *begins* with
"#" (not "# ").

~Stephan


Re: [PATCH v15 12/27] bisect--helper: `get_terms` & `bisect_terms` shell function in C

2016-12-06 Thread Stephan Beyer
Hey Pranit,

On 12/06/2016 10:14 PM, Pranit Bauva wrote:
>>> +
>>> + if (argc == 0) {
>>> + printf(_("Your current terms are %s for the old state\nand "
>>> +"%s for the new state.\n"), terms->term_good,
>>> +terms->term_bad);
>>
>> Very minor: It improves the readability if you'd split the string after
>> the \n and put the "and "in the next line.
> 
> Ah. This is because of the message. If I do the other way, then it
> won't match the output in one of the tests in t/t6030 thus, I am
> keeping it that way in order to avoid modifying the file t/t6030.

I think I was unclear here. I was referring to the coding/layouting
style, not to the string. I mean like writing:

printf(_("Your current terms are %s for the old state\n"
 "and "%s for the new state.\n"),
   terms->term_good, terms->term_bad);

The string fed to _() is the same, but it is split in a different (imho
more readable) way: after the "\n", not after the "and ".


>>> + die(_("invalid argument %s for 'git bisect "
>>> +   "terms'.\nSupported options are: "
>>> +   "--term-good|--term-old and "
>>> +   "--term-bad|--term-new."), argv[i]);
>>
>> Hm, "return error(...)" and "die(...)" seems to be quasi-equivalent in
>> this case. Because I am always looking from a library perspective, I'd
>> prefer "return error(...)".
> 
> I should use return error()

When you reroll your patches, please also check if you always put _()
around your error()s ;) (Hmmm... On the other hand, it might be arguable
if translations are useful for errors that only occur when people hack
git-bisect or use the bisect--helper directly... This makes me feel like
all those errors should be prefixed by some "BUG: " marker since the
ordinary user only sees them when there is a bug. But I don't feel in
the position to decide or recommend such a thing, so it's just a thought.)

~Stephan


Re: [PATCH v15 23/27] bisect--helper: `bisect_replay` shell function in C

2016-12-06 Thread Pranit Bauva
Hey Stephan,

On Tue, Nov 22, 2016 at 6:19 AM, Stephan Beyer  wrote:
> Okay Pranit,
>
> this is the last patch for me to review in this series.
>
> Now that I have a coarse overview of what you did, I have the general
> remark that imho the "terms" variable should simply be global instead of
> being passed around all the time.

In a personal conversation with my mentors, we thought it is the best
fit to keep it in a struct and pass it around so that it is easier in
libification.

> I also had some other remarks but I forgot them... maybe they come to my
> mind again when I see patch series v16.
>
> I also want to remark again that I am not a Git developer and only
> reviewed this because of my interest in git-bisect. So some of my
> suggestions might conflict with other beliefs here. For example, I
> consider it very bad style to leak memory... but Git is rather written
> as a scripting tool than a genuine library, so perhaps many people here
> do not care about it as long as it works...

Thanks for taking out your time to review my series extremely
carefully. I will try to post a v16 next week probably.

> On 10/14/2016 04:14 PM, Pranit Bauva wrote:
>> diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
>> index c18ca07..b367d8d 100644
>> --- a/builtin/bisect--helper.c
>> +++ b/builtin/bisect--helper.c
>> @@ -601,7 +602,6 @@ static int bisect_start(struct bisect_terms *terms, int 
>> no_checkout,
>>   terms->term_good = arg;
>>   } else if (!strcmp(arg, "--term-bad") ||
>>!strcmp(arg, "--term-new")) {
>> - const char *next_arg;
>
> This should already have been removed in patch 15/27, not here.
>
>> @@ -875,6 +875,117 @@ static int bisect_log(void)
>>   return status;
>>  }
>>
>> +static int get_next_word(const char *line, int pos, struct strbuf *word)
>> +{
>> + int i, len = strlen(line), begin = 0;
>> + strbuf_reset(word);
>> + for (i = pos; i < len; i++) {
>> + if (line[i] == ' ' && begin)
>> + return i + 1;
>> +
>> + if (!begin)
>> + begin = 1;
>> + strbuf_addch(word, line[i]);
>> + }
>> +
>> + return i;
>> +}
>> +
>> +static int bisect_replay(struct bisect_terms *terms, const char *filename)
>> +{
>> + struct strbuf line = STRBUF_INIT;
>> + struct strbuf word = STRBUF_INIT;
>> + FILE *fp = NULL;
>
> (The initialization is not necessary here.)

Um. I think it is. Otherwise if it goes to the finish block before you
try to operate on fp, it will cause a seg fault.

>> + int res = 0;
>> +
>> + if (is_empty_or_missing_file(filename)) {
>> + error(_("no such file with name '%s' exists"), filename);
>
> The error message is misleading if the file exists but is empty.
> Maybe something like "cannot read file '%s' for replaying"?

Okay will change.

>> + res = -1;
>> + goto finish;
>
> goto fail;
> :D
>
>> + }
>> +
>> + if (bisect_reset(NULL)) {
>> + res = -1;
>> + goto finish;
>
> goto fail;
>
>> + }
>> +
>> + fp = fopen(filename, "r");
>> + if (!fp) {
>> + res = -1;
>> + goto finish;
>
> goto fail;
>
>> + }
>> +
>> + while (strbuf_getline(, fp) != EOF) {
>> + int pos = 0;
>> + while (pos < line.len) {
>> + pos = get_next_word(line.buf, pos, );
>> +
>> + if (!strcmp(word.buf, "git")) {
>> + continue;
>> + } else if (!strcmp(word.buf, "git-bisect")) {
>> + continue;
>> + } else if (!strcmp(word.buf, "bisect")) {
>> + continue;
>> + } else if (!strcmp(word.buf, "#")) {
>> + break;
>
> Maybe it is more robust to check whether word.buf begins with #

Assuming that you meant "# ", yes.

>> + }
>> +
>> + get_terms(terms);
>> + if (check_and_set_terms(terms, word.buf)) {
>> + res = -1;
>> + goto finish;
>
> goto fail;
>
>> + }
>> +
>> + if (!strcmp(word.buf, "start")) {
>> + struct argv_array argv = ARGV_ARRAY_INIT;
>> + sq_dequote_to_argv_array(line.buf+pos, );
>> + if (bisect_start(terms, 0, argv.argv, 
>> argv.argc)) {
>> + argv_array_clear();
>> + res = -1;
>> + goto finish;
>
> goto fail;
>
>> + }
>> + argv_array_clear();
>> +   

Re: [PATCH v15 10/27] bisect--helper: `check_and_set_terms` shell function in C

2016-12-06 Thread Pranit Bauva
Hey Stephan,

On Fri, Nov 18, 2016 at 1:55 AM, Stephan Beyer  wrote:
> Hi Pranit,
>
> On 10/14/2016 04:14 PM, Pranit Bauva wrote:
>> diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
>> index 3f19b68..c6c11e3 100644
>> --- a/builtin/bisect--helper.c
>> +++ b/builtin/bisect--helper.c
>> @@ -20,6 +20,7 @@ static const char * const git_bisect_helper_usage[] = {
>>   N_("git bisect--helper --bisect-clean-state"),
>>   N_("git bisect--helper --bisect-reset []"),
>>   N_("git bisect--helper --bisect-write
>>  []"),
>> + N_("git bisect--helper --bisect-check-and-set-terms  
>>  "),
>
> Here's the same as in the previous patch... I'd not use
> TERM_GOOD/TERM_BAD in capitals.

Sure.

>>   NULL
>>  };
>>
>> @@ -212,6 +213,38 @@ static int bisect_write(const char *state, const char 
>> *rev,
>>   return retval;
>>  }
>>
>> +static int set_terms(struct bisect_terms *terms, const char *bad,
>> +  const char *good)
>> +{
>> + terms->term_good = xstrdup(good);
>> + terms->term_bad = xstrdup(bad);
>> + return write_terms(terms->term_bad, terms->term_good);
>
> At this stage of the patch series I am wondering why you are setting
> "terms" here, but I guess you'll need it later.
>
> However, you are leaking memory here. Something like
>
> free(terms->term_good);
> free(terms->term_bad);
> terms->term_good = xstrdup(good);
> terms->term_bad = xstrdup(bad);
>
> should be safe (because you've always used xstrdup() for the terms
> members before). Or am I overseeing something?

Yeah it is safe.

>> @@ -278,6 +314,13 @@ int cmd_bisect__helper(int argc, const char **argv, 
>> const char *prefix)
>>   terms.term_bad = xstrdup(argv[3]);
>>   res = bisect_write(argv[0], argv[1], , nolog);
>>   break;
>> + case CHECK_AND_SET_TERMS:
>> + if (argc != 3)
>> + die(_("--check-and-set-terms requires 3 arguments"));
>> + terms.term_good = xstrdup(argv[1]);
>> + terms.term_bad = xstrdup(argv[2]);
>> + res = check_and_set_terms(, argv[0]);
>> + break;
>
> Ha! When I reviewed the last patch, I asked you why you changed the code
> from returning directly from each subcommand to setting res; break; and
> then return res at the bottom of the function.
>
> Now I see why this was useful. The two members of "terms" are again
> leaking memory: you are allocating memory by using xstrdup() but you are
> not freeing it.

I will take care about freeing the memory.

Regards,
Pranit Bauva


Re: [PATCH v15 22/27] bisect--helper: `bisect_log` shell function in C

2016-12-06 Thread Pranit Bauva
Hey Stephan,

On Fri, Nov 18, 2016 at 3:17 AM, Stephan Beyer  wrote:
> Hi,
>
> On 10/14/2016 04:14 PM, Pranit Bauva wrote:
>> diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
>> index 493034c..c18ca07 100644
>> --- a/builtin/bisect--helper.c
>> +++ b/builtin/bisect--helper.c
>> @@ -858,6 +858,23 @@ static int bisect_state(struct bisect_terms *terms, 
>> const char **argv,
>>   return -1;
>>  }
>>
>> +static int bisect_log(void)
>> +{
>> + int fd, status;
>> + fd = open(git_path_bisect_log(), O_RDONLY);
>> + if (fd < 0)
>> + return -1;
>> +
>> + status = copy_fd(fd, 1);
>
> Perhaps
>
> status = copy_fd(fd, STDOUT_FILENO);

Sure!

>> + if (status) {
>> + close(fd);
>> + return -1;
>> + }
>> +
>> + close(fd);
>> + return status;
>> +}
>
> That's weird.
> Either get rid of the if() and actually use status:
>
> status = copy_fd(fd, STDOUT_FILENO);
>
> close(fd);
> return status ? -1 : 0;

This one seems better!


Re: [PATCH v15 19/27] bisect--helper: `bisect_state` & `bisect_head` shell function in C

2016-12-06 Thread Pranit Bauva
Hey Stephan,

On Tue, Nov 22, 2016 at 5:42 AM, Stephan Beyer  wrote:
> Hi,
>
> On 10/14/2016 04:14 PM, Pranit Bauva wrote:
>> Reimplement the `bisect_state` shell function in C and also add a
>> subcommand `--bisect-state` to `git-bisect--helper` to call it from
>> git-bisect.sh .
>>
>> Using `--bisect-state` subcommand is a temporary measure to port shell
>> function to C so as to use the existing test suite. As more functions
>> are ported, this subcommand will be retired and will be called by some
>> other methods.
>>
>> `bisect_head` is called from `bisect_state` thus its not required to
>> introduce another subcommand.
>
> Missing comma before "thus", and "it is" (or "it's") instead of "its" :)

Sure, will fix.

>> diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
>> index 1767916..1481c6d 100644
>> --- a/builtin/bisect--helper.c
>> +++ b/builtin/bisect--helper.c
>> @@ -784,6 +786,79 @@ static int bisect_autostart(struct bisect_terms *terms)
>>   return 0;
>>  }
>>
>> +static char *bisect_head(void)
>> +{
>> + if (is_empty_or_missing_file(git_path_bisect_head()))
>> + return "HEAD";
>> + else
>> + return "BISECT_HEAD";
>> +}
>
> This is very shellish.
> In C I'd expect something like
>
> static int bisect_head_sha1(unsigned char *sha)
> {
> int res;
> if (is_empty_or_missing_file(git_path_bisect_head()))
> res = get_sha1("HEAD", sha);
> else
> res = get_sha1("BISECT_HEAD", sha);
>
> if (res)
> return error(_("Could not find BISECT_HEAD or HEAD."));
>
> return 0;
> }
>
>> +
>> +static int bisect_state(struct bisect_terms *terms, const char **argv,
>> + int argc)
>> +{
>> + const char *state = argv[0];
>> +
>> + get_terms(terms);
>> + if (check_and_set_terms(terms, state))
>> + return -1;
>> +
>> + if (!argc)
>> + die(_("Please call `--bisect-state` with at least one 
>> argument"));
>
> I think this check should move to cmd_bisect__helper. There are also the
> other argument number checks.

Not really. After the whole conversion, the cmdmode will cease to
exists while bisect_state will be called directly, thus it is
important to check it here.

>> +
>> + if (argc == 1 && one_of(state, terms->term_good,
>> + terms->term_bad, "skip", NULL)) {
>> + const char *bisected_head = xstrdup(bisect_head());
>> + const char *hex[1];
>
> Maybe:
> const char *hex;
>
>> + unsigned char sha1[20];
>> +
>> + if (get_sha1(bisected_head, sha1))
>> + die(_("Bad rev input: %s"), bisected_head);
>
> And instead of...
>
>> + if (bisect_write(state, sha1_to_hex(sha1), terms, 0))
>> + return -1;
>> +
>> + *hex = xstrdup(sha1_to_hex(sha1));
>> + if (check_expected_revs(hex, 1))
>> + return -1;
>
> ... simply:
>
> hex = xstrdup(sha1_to_hex(sha1));
> if (set_state(terms, state, hex)) {
> free(hex);
> return -1;
> }
> free(hex);
>
> where:

Yes I am planning to convert all places with hex rather than the sha1
but not yet, maybe in an another patch series because currently a lot
of things revolve around sha1 and changing its behaviour wouldn't
really be a part of a porting patch series.

> static int set_state(struct bisect_terms *terms, const char *state,
>  const char *hex)
> {
> if (bisect_write(state, hex, terms, 0))
> return -1;
> if (check_expected_revs(, 1))
> return -1;
> return 0;
> }
>
>> + return bisect_auto_next(terms, NULL);
>> + }
>> +
>> + if ((argc == 2 && !strcmp(state, terms->term_bad)) ||
>> + one_of(state, terms->term_good, "skip", NULL)) {
>> + int i;
>> + struct string_list hex = STRING_LIST_INIT_DUP;
>> +
>> + for (i = 1; i < argc; i++) {
>> + unsigned char sha1[20];
>> +
>> + if (get_sha1(argv[i], sha1)) {
>> + string_list_clear(, 0);
>> + die(_("Bad rev input: %s"), argv[i]);
>> + }
>> + string_list_append(, sha1_to_hex(sha1));
>> + }
>> + for (i = 0; i < hex.nr; i++) {
>
> ... And replace this:
>
>> + const char **hex_string = (const char **) 
>> [i].string;
>> + if(bisect_write(state, *hex_string, terms, 0)) {
>> + string_list_clear(, 0);
>> + return -1;
>> + }
>> + if (check_expected_revs(hex_string, 1)) {
>> + 

Re: [PATCH 10/17] pathspec: simpler logic to prefix original pathspec elements

2016-12-06 Thread Brandon Williams
On 12/06, Stefan Beller wrote:
> On Tue, Dec 6, 2016 at 1:51 PM, Brandon Williams  wrote:
> 
> > struct strbuf sb = STRBUF_INIT;
> > -   if (prefixlen && !literal_global) {
> > -   /* Preserve the actual prefix length of each 
> > pattern */
> > -   if (short_magic)
> > -   prefix_short_magic(, prefixlen, 
> > short_magic);
> > -   else if (long_magic_end) {
> > -   strbuf_add(, elt, long_magic_end - elt);
> > -   strbuf_addf(, ",prefix:%d)", prefixlen);
> > -   } else
> > -   strbuf_addf(, ":(prefix:%d)", prefixlen);
> 
> This fixes the issue with add -p . mentioned somewhere else on the mailing 
> list.
>
> > -   }
> > +
> > +   /* Preserve the actual prefix length of each pattern */
> > +   prefix_magic(, prefixlen, element_magic);
> > +
> 
> Did you find a reason why we passed magic literally, i.e. short magic
> was passed as short magic and long magic as long magic before?
> 
> I cannot think of any reason why that would have been the case,
> but I assume there had to be a reason for that.

nope, perhaps it was because we technically already have the long magic
string and the short magic needs to be converted to long magic (as you
can't mix short and long magic).

> Another note: This collides with the attr system refactoring, which I
> postpone redoing until the submodule checkout is done, so maybe
> you want to pickup this patch:
> https://public-inbox.org/git/20161110203428.30512-31-sbel...@google.com/
> which only relies on one patch prior
> https://public-inbox.org/git/20161110203428.30512-30-sbel...@google.com/

After looking at those patches I think I do something extremely similar
in a future patch in this series, the parse_long_magic patch.

-- 
Brandon Williams


Re: [PATCH 16/17] pathspec: small readability changes

2016-12-06 Thread Stefan Beller
On Tue, Dec 6, 2016 at 1:51 PM, Brandon Williams  wrote:
> A few small changes to improve readability.  This is done by grouping related
> assignments, adding blank lines, ensuring lines are <80 characters, etc.

The 'etc' sounds a bit sloppy in the commit message.
Maybe s/etc/and adding proper comments/ ?

Code looks good.


Re: [PATCH 10/17] pathspec: simpler logic to prefix original pathspec elements

2016-12-06 Thread Stefan Beller
On Tue, Dec 6, 2016 at 1:51 PM, Brandon Williams  wrote:

> struct strbuf sb = STRBUF_INIT;
> -   if (prefixlen && !literal_global) {
> -   /* Preserve the actual prefix length of each pattern 
> */
> -   if (short_magic)
> -   prefix_short_magic(, prefixlen, 
> short_magic);
> -   else if (long_magic_end) {
> -   strbuf_add(, elt, long_magic_end - elt);
> -   strbuf_addf(, ",prefix:%d)", prefixlen);
> -   } else
> -   strbuf_addf(, ":(prefix:%d)", prefixlen);

This fixes the issue with add -p . mentioned somewhere else on the mailing list.

> -   }
> +
> +   /* Preserve the actual prefix length of each pattern */
> +   prefix_magic(, prefixlen, element_magic);
> +

Did you find a reason why we passed magic literally, i.e. short magic
was passed as short magic and long magic as long magic before?

I cannot think of any reason why that would have been the case,
but I assume there had to be a reason for that.


Another note: This collides with the attr system refactoring, which I
postpone redoing until the submodule checkout is done, so maybe
you want to pickup this patch:
https://public-inbox.org/git/20161110203428.30512-31-sbel...@google.com/
which only relies on one patch prior
https://public-inbox.org/git/20161110203428.30512-30-sbel...@google.com/


Re: [PATCH v7 4/4] transport: add from_user parameter to is_transport_allowed

2016-12-06 Thread Brandon Williams
On 12/06, Jeff King wrote:
> On Mon, Dec 05, 2016 at 12:04:52PM -0800, Junio C Hamano wrote:
> 
> > > I'm sending out another reroll of this series so that in Jeff's he can
> > > just call 'get_curl_allowed_protocols(-1)' for the non-redirection curl
> > > option, which should make this test stop barfing.
> > 
> > I was hoping to eventually merge Peff's series to older maintenance
> > tracks.  How bad would it be if we rebased the v8 of this series
> > together with Peff's series to say v2.9 (or even older if it does
> > not look too bad)?
> 
> My series actually fixes existing security problems, so I'd consider it
> a bug-fix. I _think_ Brandon's series is purely about allowing more
> expressiveness in the whitelist policy, and so could be considered more
> of a feature.

Yes this was really the main intent on my series.

> So one option is to apply my series for older 'maint', and then just
> rebase Brandon's on top of that for 'master'.
> 
> I don't know if that makes things any easier. I feel funny saying "no,
> no, mine preempts yours because it is more maint-worthy", but I think
> that order does make sense.
> 
> I think it would be OK to put Brandon's on maint, too, though. It is a
> refactor of an existing security feature to make it more featureful, but
> the way it is implemented could not cause security regressions unless
> you use the new feature (IOW, we still respect the whitelist environment
> exactly as before).

Either way let me know if there is something I need to do.

-- 
Brandon Williams


Re* [BUG] Index.lock error message regression in git 2.11.0

2016-12-06 Thread Junio C Hamano
Junio C Hamano  writes:

> Perhaps the attached would fix it (not even compile tested, though)?
>
> I would prefer to make 0 to mean "show error but return -1", 1 to
> mean "die on error", and 2 to mean "be silent and return -1 on
> error", though.  Asking to be silent should be the exception for
> this error from usability and safety's point of view, and requiring
> such exceptional callers to pass LOCK_SILENT_ON_ERROR would be
> easier to "git grep" for them.

So here is the "You have to ask explicitly, if you know that it is
safe to be silent" version with a proper log message.

-- >8 --
Subject: [PATCH] lockfile: LOCK_SILENT_ON_ERROR

Recent "libify merge machinery" stopped from passing die_on_error
bit to hold_locked_index(), and lost an error message when there are
competing update in progress with "git merge --ff-only $commit", for
example.  The command still exits with a non-zero status, but that
is not of much help for an interactive user.  The last thing the
command says is "Updating $from..$to".  We used to follow it with a
big error message that makes it clear that "merge --ff-only" did not
succeed.

Introduce a new bit "LOCK_SILENT_ON_ERROR" that can be passed by
callers that do want to silence the message (because they either
make it a non-error by doing something else, or they show their own
error message to explain the situation), and show the error message
we used to give for everybody else, including the caller that was
touched by the libification in question.

I would not be surprised if some existing calls to hold_lock*()
functions that pass die_on_error=0 need to be updated to pass
LOCK_SILENT_ON_ERROR, and when this fix is taken alone, it may look
like a regression, but we are better off starting louder and squelch
the ones that we find safe to make silent than the other way around.

Reported-by: Robbie Iannucci 
Signed-off-by: Junio C Hamano 
---
 lockfile.c | 11 +--
 lockfile.h |  8 +++-
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/lockfile.c b/lockfile.c
index 9268cdf325..f7e8104449 100644
--- a/lockfile.c
+++ b/lockfile.c
@@ -174,8 +174,15 @@ int hold_lock_file_for_update_timeout(struct lock_file 
*lk, const char *path,
  int flags, long timeout_ms)
 {
int fd = lock_file_timeout(lk, path, flags, timeout_ms);
-   if (fd < 0 && (flags & LOCK_DIE_ON_ERROR))
-   unable_to_lock_die(path, errno);
+   if (fd < 0) {
+   if (flags & LOCK_DIE_ON_ERROR)
+   unable_to_lock_die(path, errno);
+   else if (!(flags & LOCK_SILENT_ON_ERROR)) {
+   struct strbuf buf = STRBUF_INIT;
+   unable_to_lock_message(path, errno, );
+   error("%s", buf.buf);
+   }
+   }
return fd;
 }
 
diff --git a/lockfile.h b/lockfile.h
index d26ad27b2b..98b4862254 100644
--- a/lockfile.h
+++ b/lockfile.h
@@ -129,9 +129,15 @@ struct lock_file {
 /*
  * If a lock is already taken for the file, `die()` with an error
  * message. If this flag is not specified, trying to lock a file that
- * is already locked returns -1 to the caller.
+ * is already locked gives the same error message and returns -1 to
+ * the caller.
  */
 #define LOCK_DIE_ON_ERROR 1
+/*
+ * ... or the function can be told to be totally silent and return
+ * -1 to the caller upon error with this flag
+ */
+#define LOCK_SILENT_ON_ERROR 2
 
 /*
  * Usually symbolic links in the destination path are resolved. This
-- 
2.11.0-270-g0b6beed61f



Re: [BUG] Index.lock error message regression in git 2.11.0

2016-12-06 Thread Junio C Hamano
Robbie Iannucci  writes:

> I just upgraded to 2.11.0 from 2.10.2, and I noticed that some
> commands no longer print an error message when the `index.lock` file
> exists (such as `git merge --ff-only`).
>
> It appears this bug was introduced in
> 55f5704da69d3e6836620f01bee0093ad5e331e8 (sequencer: lib'ify
> checkout_fast_forward()). I determined this by running the attached
> bisect script (on OS X, but I don't think that matters; I've also seen
> the error message missing on Linux):

Yes, I can see that this is not limited to OSX.  The command does
exit with non-zero status, but that is not of much help for an
interactive user.

$ git checkout v2.11.0^0
$ >.git/index.lock
$ git merge --ff-only master; echo $?
Updating 454cb6bd52..8d7a455ed5
1
$ exit

We can see that it attempted to update, but because there is no
indication that it failed, the user can easily be misled to think
that we are now at the tip of the master branch.

We used to give "fatal: Unable to create ..." followed by "Another
git process seems to be running..." advice, that would have helped
the user from the confusion.

I do not think it is the right solution to call hold_locked_index()
with die_on_error=1 from the codepath changed by your 55f5704da6
("sequencer: lib'ify checkout_fast_forward()", 2016-09-09).  Perhaps
the caller of checkout_fast_forward() needs to learn how/why the
function returned error and diagnose this case and a message?  Or
perhaps turn that die_on_error parameter from boolean to tricolor
(i.e. the caller does not want you to die, but the caller knows that
you have more information to give better error message than it does,
so please show an error message instead of silently returning -1)?

Perhaps the attached would fix it (not even compile tested, though)?

I would prefer to make 0 to mean "show error but return -1", 1 to
mean "die on error", and 2 to mean "be silent and return -1 on
error", though.  Asking to be silent should be the exception for
this error from usability and safety's point of view, and requiring
such exceptional callers to pass LOCK_SILENT_ON_ERROR would be
easier to "git grep" for them.

Dscho, what do you think?  


 lockfile.c | 11 +--
 lockfile.h |  5 +
 merge.c|  2 +-
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/lockfile.c b/lockfile.c
index 9268cdf325..f190caddb0 100644
--- a/lockfile.c
+++ b/lockfile.c
@@ -174,8 +174,15 @@ int hold_lock_file_for_update_timeout(struct lock_file 
*lk, const char *path,
  int flags, long timeout_ms)
 {
int fd = lock_file_timeout(lk, path, flags, timeout_ms);
-   if (fd < 0 && (flags & LOCK_DIE_ON_ERROR))
-   unable_to_lock_die(path, errno);
+   if (fd < 0) {
+   if (flags & LOCK_DIE_ON_ERROR)
+   unable_to_lock_die(path, errno);
+   else if (flags & LOCK_ERROR_ON_ERROR) {
+   struct strbuf buf = STRBUF_INIT;
+   unable_to_lock_message(path, errno, );
+   error("%s", buf.buf);
+   }
+   }
return fd;
 }
 
diff --git a/lockfile.h b/lockfile.h
index d26ad27b2b..6cba9c8142 100644
--- a/lockfile.h
+++ b/lockfile.h
@@ -132,6 +132,11 @@ struct lock_file {
  * is already locked returns -1 to the caller.
  */
 #define LOCK_DIE_ON_ERROR 1
+/*
+ * ... or the function can be told to show the usual error message
+ * and still return -1 to the caller with this flag
+ */
+#define LOCK_ERROR_ON_ERROR 2
 
 /*
  * Usually symbolic links in the destination path are resolved. This
diff --git a/merge.c b/merge.c
index 23866c9165..9e2e4f1a22 100644
--- a/merge.c
+++ b/merge.c
@@ -57,7 +57,7 @@ int checkout_fast_forward(const unsigned char *head,
 
refresh_cache(REFRESH_QUIET);
 
-   if (hold_locked_index(lock_file, 0) < 0)
+   if (hold_locked_index(lock_file, LOCK_ERROR_ON_ERROR) < 0)
return -1;
 
memset(, 0, sizeof(trees));


[PATCH 17/17] pathspec: remove outdated comment

2016-12-06 Thread Brandon Williams
Remove part of the function header comment to prefix_pathspec as it is
no longer relevant.

Signed-off-by: Brandon Williams 
---
 pathspec.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index 8a07b02..66db257 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -298,15 +298,6 @@ static void strip_submodule_slash_expensive(struct 
pathspec_item *item)
 /*
  * Take an element of a pathspec and check for magic signatures.
  * Append the result to the prefix. Return the magic bitmap.
- *
- * For now, we only parse the syntax and throw out anything other than
- * "top" magic.
- *
- * NEEDSWORK: This needs to be rewritten when we start migrating
- * get_pathspec() users to use the "struct pathspec" interface.  For
- * example, a pathspec element may be marked as case-insensitive, but
- * the prefix part must always match literally, and a single stupid
- * string cannot express such a case.
  */
 static unsigned prefix_pathspec(struct pathspec_item *item,
unsigned flags,
-- 
2.8.0.rc3.226.g39d4020



[PATCH 15/17] pathspec: create strip submodule slash helpers

2016-12-06 Thread Brandon Williams
Factor out the logic responsible for stripping the trailing slash on
pathspecs referencing submodules into its own function.

Signed-off-by: Brandon Williams 
---
 pathspec.c | 68 ++
 1 file changed, 42 insertions(+), 26 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index 793caf1..41aa213 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -257,6 +257,44 @@ static const char *parse_element_magic(unsigned *magic, 
int *prefix_len,
return parse_short_magic(magic, elem);
 }
 
+static void strip_submodule_slash_cheap(struct pathspec_item *item)
+{
+   int i;
+
+   if ((item->len >= 1 && item->match[item->len - 1] == '/') &&
+   (i = cache_name_pos(item->match, item->len - 1)) >= 0 &&
+   S_ISGITLINK(active_cache[i]->ce_mode)) {
+   item->len--;
+   item->match[item->len] = '\0';
+   }
+}
+
+static void strip_submodule_slash_expensive(struct pathspec_item *item)
+{
+   int i;
+
+   for (i = 0; i < active_nr; i++) {
+   struct cache_entry *ce = active_cache[i];
+   int ce_len = ce_namelen(ce);
+
+   if (!S_ISGITLINK(ce->ce_mode))
+   continue;
+
+   if (item->len <= ce_len || item->match[ce_len] != '/' ||
+   memcmp(ce->name, item->match, ce_len))
+   continue;
+
+   if (item->len == ce_len + 1) {
+   /* strip trailing slash */
+   item->len--;
+   item->match[item->len] = '\0';
+   } else {
+   die (_("Pathspec '%s' is in submodule '%.*s'"),
+item->original, ce_len, ce->name);
+   }
+   }
+}
+
 /*
  * Take an element of a pathspec and check for magic signatures.
  * Append the result to the prefix. Return the magic bitmap.
@@ -278,7 +316,7 @@ static unsigned prefix_pathspec(struct pathspec_item *item,
unsigned magic = 0, element_magic = 0;
const char *copyfrom = elt;
char *match;
-   int i, pathspec_prefix = -1;
+   int pathspec_prefix = -1;
 
/* PATHSPEC_LITERAL_PATH ignores magic */
if (!(flags & PATHSPEC_LITERAL_PATH)) {
@@ -327,33 +365,11 @@ static unsigned prefix_pathspec(struct pathspec_item 
*item,
item->len = strlen(item->match);
item->prefix = prefixlen;
 
-   if ((flags & PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP) &&
-   (item->len >= 1 && item->match[item->len - 1] == '/') &&
-   (i = cache_name_pos(item->match, item->len - 1)) >= 0 &&
-   S_ISGITLINK(active_cache[i]->ce_mode)) {
-   item->len--;
-   match[item->len] = '\0';
-   }
+   if (flags & PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP)
+   strip_submodule_slash_cheap(item);
 
if (flags & PATHSPEC_STRIP_SUBMODULE_SLASH_EXPENSIVE)
-   for (i = 0; i < active_nr; i++) {
-   struct cache_entry *ce = active_cache[i];
-   int ce_len = ce_namelen(ce);
-
-   if (!S_ISGITLINK(ce->ce_mode))
-   continue;
-
-   if (item->len <= ce_len || match[ce_len] != '/' ||
-   memcmp(ce->name, match, ce_len))
-   continue;
-   if (item->len == ce_len + 1) {
-   /* strip trailing slash */
-   item->len--;
-   match[item->len] = '\0';
-   } else
-   die (_("Pathspec '%s' is in submodule '%.*s'"),
-elt, ce_len, ce->name);
-   }
+   strip_submodule_slash_expensive(item);
 
if (magic & PATHSPEC_LITERAL)
item->nowildcard_len = item->len;
-- 
2.8.0.rc3.226.g39d4020



[PATCH 13/17] pathspec: create parse_long_magic function

2016-12-06 Thread Brandon Williams
Factor out the logic responsible for parsing long magic into its own
function.  As well as hoist the prefix check logic outside of the inner
loop as there isn't anything that needs to be done after matching
"prefix:".

Signed-off-by: Brandon Williams 
---
 pathspec.c | 92 ++
 1 file changed, 57 insertions(+), 35 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index d4d4839..1d28679 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -156,6 +156,60 @@ static int get_global_magic(int element_magic)
 }
 
 /*
+ * Parse the pathspec element looking for long magic
+ *
+ * saves all magic in 'magic'
+ * if prefix magic is used, save the prefix length in 'prefix_len'
+ * returns the position in 'elem' after all magic has been parsed
+ */
+static const char *parse_long_magic(unsigned *magic, int *prefix_len,
+   const char *elem)
+{
+   const char *pos;
+   const char *nextat;
+
+   for (pos = elem + 2; *pos && *pos != ')'; pos = nextat) {
+   size_t len = strcspn(pos, ",)");
+   int i;
+
+   if (pos[len] == ',')
+   nextat = pos + len + 1; /* handle ',' */
+   else
+   nextat = pos + len; /* handle ')' and '\0' */
+
+   if (!len)
+   continue;
+
+   if (starts_with(pos, "prefix:")) {
+   char *endptr;
+   *prefix_len = strtol(pos + 7, , 10);
+   if (endptr - pos != len)
+   die(_("invalid parameter for pathspec magic 
'prefix'"));
+   continue;
+   }
+
+   for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
+   if (strlen(pathspec_magic[i].name) == len &&
+   !strncmp(pathspec_magic[i].name, pos, len)) {
+   *magic |= pathspec_magic[i].bit;
+   break;
+   }
+   }
+
+   if (ARRAY_SIZE(pathspec_magic) <= i)
+   die(_("Invalid pathspec magic '%.*s' in '%s'"),
+   (int) len, pos, elem);
+   }
+
+   if (*pos != ')')
+   die(_("Missing ')' at the end of pathspec magic in '%s'"),
+   elem);
+   pos++;
+
+   return pos;
+}
+
+/*
  * Parse the pathspec element looking for short magic
  *
  * saves all magic in 'magic'
@@ -218,41 +272,9 @@ static unsigned prefix_pathspec(struct pathspec_item *item,
; /* nothing to do */
} else if (elt[1] == '(') {
/* longhand */
-   const char *nextat;
-   for (copyfrom = elt + 2;
-*copyfrom && *copyfrom != ')';
-copyfrom = nextat) {
-   size_t len = strcspn(copyfrom, ",)");
-   if (copyfrom[len] == ',')
-   nextat = copyfrom + len + 1;
-   else
-   /* handle ')' and '\0' */
-   nextat = copyfrom + len;
-   if (!len)
-   continue;
-   for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
-   if (strlen(pathspec_magic[i].name) == len &&
-   !strncmp(pathspec_magic[i].name, copyfrom, 
len)) {
-   element_magic |= pathspec_magic[i].bit;
-   break;
-   }
-   if (starts_with(copyfrom, "prefix:")) {
-   char *endptr;
-   pathspec_prefix = strtol(copyfrom + 7,
-, 10);
-   if (endptr - copyfrom != len)
-   die(_("invalid parameter for 
pathspec magic 'prefix'"));
-   /* "i" would be wrong, but it does not 
matter */
-   break;
-   }
-   }
-   if (ARRAY_SIZE(pathspec_magic) <= i)
-   die(_("Invalid pathspec magic '%.*s' in '%s'"),
-   (int) len, copyfrom, elt);
-   }
-   if (*copyfrom != ')')
-   die(_("Missing ')' at the end of pathspec magic in 
'%s'"), elt);
-   copyfrom++;
+   copyfrom = parse_long_magic(_magic,
+   _prefix,
+   elt);
} else {
/* shorthand */
copyfrom = parse_short_magic(_magic, elt);
-- 

[PATCH 05/17] pathspec: remove the deprecated get_pathspec function

2016-12-06 Thread Brandon Williams
Now that all callers of the old 'get_pathspec' interface have been
migrated to use the new pathspec struct interface it can be removed
from the codebase.

Since there are no more users of the '_raw' field in the pathspec struct
it can also be removed.  This patch also removes the old functionality
of modifying the const char **argv array that was passed into
parse_pathspec.  Instead the constructed 'match' string (which is a
pathspec element with the prefix prepended) is only stored in its
corresponding pathspec_item entry.

Signed-off-by: Brandon Williams 
---
 Documentation/technical/api-setup.txt |  2 --
 cache.h   |  1 -
 pathspec.c| 42 +++
 pathspec.h|  1 -
 4 files changed, 3 insertions(+), 43 deletions(-)

diff --git a/Documentation/technical/api-setup.txt 
b/Documentation/technical/api-setup.txt
index 540e455..eb1fa98 100644
--- a/Documentation/technical/api-setup.txt
+++ b/Documentation/technical/api-setup.txt
@@ -27,8 +27,6 @@ parse_pathspec(). This function takes several arguments:
 
 - prefix and args come from cmd_* functions
 
-get_pathspec() is obsolete and should never be used in new code.
-
 parse_pathspec() helps catch unsupported features and reject them
 politely. At a lower level, different pathspec-related functions may
 not support the same set of features. Such pathspec-sensitive
diff --git a/cache.h b/cache.h
index a50a61a..0f80e01 100644
--- a/cache.h
+++ b/cache.h
@@ -514,7 +514,6 @@ extern void set_git_work_tree(const char *tree);
 
 #define ALTERNATE_DB_ENVIRONMENT "GIT_ALTERNATE_OBJECT_DIRECTORIES"
 
-extern const char **get_pathspec(const char *prefix, const char **pathspec);
 extern void setup_work_tree(void);
 extern const char *setup_git_directory_gently(int *);
 extern const char *setup_git_directory(void);
diff --git a/pathspec.c b/pathspec.c
index 22ca74a..1f918cb 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -103,7 +103,7 @@ static void prefix_short_magic(struct strbuf *sb, int 
prefixlen,
  */
 static unsigned prefix_pathspec(struct pathspec_item *item,
unsigned *p_short_magic,
-   const char **raw, unsigned flags,
+   unsigned flags,
const char *prefix, int prefixlen,
const char *elt)
 {
@@ -240,7 +240,7 @@ static unsigned prefix_pathspec(struct pathspec_item *item,
if (!match)
die(_("%s: '%s' is outside repository"), elt, copyfrom);
}
-   *raw = item->match = match;
+   item->match = match;
/*
 * Prefix the pathspec (keep all magic) and assign to
 * original. Useful for passing to another command.
@@ -381,8 +381,6 @@ void parse_pathspec(struct pathspec *pathspec,
 
/* No arguments with prefix -> prefix pathspec */
if (!entry) {
-   static const char *raw[2];
-
if (flags & PATHSPEC_PREFER_FULL)
return;
 
@@ -394,10 +392,7 @@ void parse_pathspec(struct pathspec *pathspec,
item->original = prefix;
item->nowildcard_len = item->len = strlen(prefix);
item->prefix = item->len;
-   raw[0] = prefix;
-   raw[1] = NULL;
pathspec->nr = 1;
-   pathspec->_raw = raw;
return;
}
 
@@ -415,7 +410,6 @@ void parse_pathspec(struct pathspec *pathspec,
pathspec->nr = n;
ALLOC_ARRAY(pathspec->items, n);
item = pathspec->items;
-   pathspec->_raw = argv;
prefixlen = prefix ? strlen(prefix) : 0;
 
for (i = 0; i < n; i++) {
@@ -423,7 +417,7 @@ void parse_pathspec(struct pathspec *pathspec,
entry = argv[i];
 
item[i].magic = prefix_pathspec(item + i, _magic,
-   argv + i, flags,
+   flags,
prefix, prefixlen, entry);
if ((flags & PATHSPEC_LITERAL_PATH) &&
!(magic_mask & PATHSPEC_LITERAL))
@@ -457,36 +451,6 @@ void parse_pathspec(struct pathspec *pathspec,
}
 }
 
-/*
- * N.B. get_pathspec() is deprecated in favor of the "struct pathspec"
- * based interface - see pathspec.c:parse_pathspec().
- *
- * Arguments:
- *  - prefix - a path relative to the root of the working tree
- *  - pathspec - a list of paths underneath the prefix path
- *
- * Iterates over pathspec, prepending each path with prefix,
- * and return the resulting list.
- *
- * If pathspec is empty, return a singleton list containing prefix.
- *
- * If pathspec and prefix are both empty, return an empty list.
- *
- * This is typically used by built-in commands such as add.c, in order
- * to normalize argv arguments provided to the 

[PATCH 12/17] pathspec: create parse_short_magic function

2016-12-06 Thread Brandon Williams
Factor out the logic responsible for parsing short magic into its own
function.

Signed-off-by: Brandon Williams 
---
 pathspec.c | 54 --
 1 file changed, 36 insertions(+), 18 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index 08e76f6..d4d4839 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -156,6 +156,41 @@ static int get_global_magic(int element_magic)
 }
 
 /*
+ * Parse the pathspec element looking for short magic
+ *
+ * saves all magic in 'magic'
+ * returns the position in 'elem' after all magic has been parsed
+ */
+static const char *parse_short_magic(unsigned *magic, const char *elem)
+{
+   const char *pos;
+
+   for (pos = elem + 1; *pos && *pos != ':'; pos++) {
+   char ch = *pos;
+   int i;
+
+   if (!is_pathspec_magic(ch))
+   break;
+
+   for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
+   if (pathspec_magic[i].mnemonic == ch) {
+   *magic |= pathspec_magic[i].bit;
+   break;
+   }
+   }
+
+   if (ARRAY_SIZE(pathspec_magic) <= i)
+   die(_("Unimplemented pathspec magic '%c' in '%s'"),
+   ch, elem);
+   }
+
+   if (*pos == ':')
+   pos++;
+
+   return pos;
+}
+
+/*
  * Take an element of a pathspec and check for magic signatures.
  * Append the result to the prefix. Return the magic bitmap.
  *
@@ -220,24 +255,7 @@ static unsigned prefix_pathspec(struct pathspec_item *item,
copyfrom++;
} else {
/* shorthand */
-   for (copyfrom = elt + 1;
-*copyfrom && *copyfrom != ':';
-copyfrom++) {
-   char ch = *copyfrom;
-
-   if (!is_pathspec_magic(ch))
-   break;
-   for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++)
-   if (pathspec_magic[i].mnemonic == ch) {
-   element_magic |= pathspec_magic[i].bit;
-   break;
-   }
-   if (ARRAY_SIZE(pathspec_magic) <= i)
-   die(_("Unimplemented pathspec magic '%c' in 
'%s'"),
-   ch, elt);
-   }
-   if (*copyfrom == ':')
-   copyfrom++;
+   copyfrom = parse_short_magic(_magic, elt);
}
 
magic |= element_magic;
-- 
2.8.0.rc3.226.g39d4020



[PATCH 11/17] pathspec: factor global magic into its own function

2016-12-06 Thread Brandon Williams
Create helper functions to read the global magic environment variables
in additon to factoring out the global magic gathering logic into its
own function.

Signed-off-by: Brandon Williams 
---
 pathspec.c | 120 +
 1 file changed, 74 insertions(+), 46 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index 5afebd3..08e76f6 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -87,6 +87,74 @@ static void prefix_magic(struct strbuf *sb, int prefixlen, 
unsigned magic)
strbuf_addf(sb, ",prefix:%d)", prefixlen);
 }
 
+static inline int get_literal_global(void)
+{
+   static int literal_global = -1;
+
+   if (literal_global < 0)
+   literal_global = git_env_bool(GIT_LITERAL_PATHSPECS_ENVIRONMENT,
+ 0);
+   return literal_global;
+}
+
+static inline int get_glob_global(void)
+{
+   static int glob_global = -1;
+
+   if (glob_global < 0)
+   glob_global = git_env_bool(GIT_GLOB_PATHSPECS_ENVIRONMENT, 0);
+   return glob_global;
+}
+
+static inline int get_noglob_global(void)
+{
+   static int noglob_global = -1;
+
+   if (noglob_global < 0)
+   noglob_global = git_env_bool(GIT_NOGLOB_PATHSPECS_ENVIRONMENT,
+0);
+   return noglob_global;
+}
+
+static inline int get_icase_global(void)
+{
+   static int icase_global = -1;
+
+   if (icase_global < 0)
+   icase_global = git_env_bool(GIT_ICASE_PATHSPECS_ENVIRONMENT, 0);
+
+   return icase_global;
+}
+
+static int get_global_magic(int element_magic)
+{
+   int global_magic = 0;
+
+   if (get_literal_global())
+   global_magic |= PATHSPEC_LITERAL;
+
+   /* --glob-pathspec is overridden by :(literal) */
+   if (get_glob_global() && !(element_magic & PATHSPEC_LITERAL))
+   global_magic |= PATHSPEC_GLOB;
+
+   if (get_glob_global() && get_noglob_global())
+   die(_("global 'glob' and 'noglob' pathspec settings are 
incompatible"));
+
+   if (get_icase_global())
+   global_magic |= PATHSPEC_ICASE;
+
+   if ((global_magic & PATHSPEC_LITERAL) &&
+   (global_magic & ~PATHSPEC_LITERAL))
+   die(_("global 'literal' pathspec setting is incompatible "
+ "with all other global pathspec settings"));
+
+   /* --noglob-pathspec adds :(literal) _unless_ :(glob) is specified */
+   if (get_noglob_global() && !(element_magic & PATHSPEC_GLOB))
+   global_magic |= PATHSPEC_LITERAL;
+
+   return global_magic;
+}
+
 /*
  * Take an element of a pathspec and check for magic signatures.
  * Append the result to the prefix. Return the magic bitmap.
@@ -105,46 +173,12 @@ static unsigned prefix_pathspec(struct pathspec_item 
*item,
const char *prefix, int prefixlen,
const char *elt)
 {
-   static int literal_global = -1;
-   static int glob_global = -1;
-   static int noglob_global = -1;
-   static int icase_global = -1;
-   unsigned magic = 0, element_magic = 0, global_magic = 0;
+   unsigned magic = 0, element_magic = 0;
const char *copyfrom = elt;
char *match;
int i, pathspec_prefix = -1;
 
-   if (literal_global < 0)
-   literal_global = 
git_env_bool(GIT_LITERAL_PATHSPECS_ENVIRONMENT, 0);
-   if (literal_global)
-   global_magic |= PATHSPEC_LITERAL;
-
-   if (glob_global < 0)
-   glob_global = git_env_bool(GIT_GLOB_PATHSPECS_ENVIRONMENT, 0);
-   if (glob_global)
-   global_magic |= PATHSPEC_GLOB;
-
-   if (noglob_global < 0)
-   noglob_global = git_env_bool(GIT_NOGLOB_PATHSPECS_ENVIRONMENT, 
0);
-
-   if (glob_global && noglob_global)
-   die(_("global 'glob' and 'noglob' pathspec settings are 
incompatible"));
-
-
-   if (icase_global < 0)
-   icase_global = git_env_bool(GIT_ICASE_PATHSPECS_ENVIRONMENT, 0);
-   if (icase_global)
-   global_magic |= PATHSPEC_ICASE;
-
-   if ((global_magic & PATHSPEC_LITERAL) &&
-   (global_magic & ~PATHSPEC_LITERAL))
-   die(_("global 'literal' pathspec setting is incompatible "
- "with all other global pathspec settings"));
-
-   if (flags & PATHSPEC_LITERAL_PATH)
-   global_magic = 0;
-
-   if (elt[0] != ':' || literal_global ||
+   if (elt[0] != ':' || get_literal_global() ||
(flags & PATHSPEC_LITERAL_PATH)) {
; /* nothing to do */
} else if (elt[1] == '(') {
@@ -208,15 +242,9 @@ static unsigned prefix_pathspec(struct pathspec_item *item,
 
magic |= element_magic;
 
-   /* --noglob-pathspec adds :(literal) _unless_ :(glob) is specified */
-   if (noglob_global && !(magic & PATHSPEC_GLOB))
-   

[PATCH 14/17] pathspec: create parse_element_magic helper

2016-12-06 Thread Brandon Williams
Factor out the logic responsible for the magic in a pathspec element
into its own function.

Also avoid calling into the parsing functions when
`PATHSPEC_LITERAL_PATH` is specified since it causes magic to be
ignored and all paths to be treated as literals.

Signed-off-by: Brandon Williams 
---
 pathspec.c | 35 +++
 1 file changed, 19 insertions(+), 16 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index 1d28679..793caf1 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -244,6 +244,19 @@ static const char *parse_short_magic(unsigned *magic, 
const char *elem)
return pos;
 }
 
+static const char *parse_element_magic(unsigned *magic, int *prefix_len,
+  const char *elem)
+{
+   if (elem[0] != ':' || get_literal_global())
+   return elem; /* nothing to do */
+   else if (elem[1] == '(')
+   /* longhand */
+   return parse_long_magic(magic, prefix_len, elem);
+   else
+   /* shorthand */
+   return parse_short_magic(magic, elem);
+}
+
 /*
  * Take an element of a pathspec and check for magic signatures.
  * Append the result to the prefix. Return the magic bitmap.
@@ -267,24 +280,14 @@ static unsigned prefix_pathspec(struct pathspec_item 
*item,
char *match;
int i, pathspec_prefix = -1;
 
-   if (elt[0] != ':' || get_literal_global() ||
-   (flags & PATHSPEC_LITERAL_PATH)) {
-   ; /* nothing to do */
-   } else if (elt[1] == '(') {
-   /* longhand */
-   copyfrom = parse_long_magic(_magic,
-   _prefix,
-   elt);
-   } else {
-   /* shorthand */
-   copyfrom = parse_short_magic(_magic, elt);
-   }
-
-   magic |= element_magic;
-
/* PATHSPEC_LITERAL_PATH ignores magic */
-   if (!(flags & PATHSPEC_LITERAL_PATH))
+   if (!(flags & PATHSPEC_LITERAL_PATH)) {
+   copyfrom = parse_element_magic(_magic,
+  _prefix,
+  elt);
+   magic |= element_magic;
magic |= get_global_magic(element_magic);
+   }
 
if (pathspec_prefix >= 0 &&
(prefixlen || (prefix && *prefix)))
-- 
2.8.0.rc3.226.g39d4020



[PATCH 08/17] pathspec: remove unused variable from unsupported_magic

2016-12-06 Thread Brandon Williams
Removed unused variable 'n' from the 'unsupported_magic()' function.

Signed-off-by: Brandon Williams 
---
 pathspec.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index 8f367f0..ec0d590 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -333,8 +333,8 @@ static void NORETURN unsupported_magic(const char *pattern,
   unsigned short_magic)
 {
struct strbuf sb = STRBUF_INIT;
-   int i, n;
-   for (n = i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
+   int i;
+   for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
const struct pathspec_magic *m = pathspec_magic + i;
if (!(magic & m->bit))
continue;
@@ -344,7 +344,6 @@ static void NORETURN unsupported_magic(const char *pattern,
strbuf_addf(, "'%c'", m->mnemonic);
else
strbuf_addf(, "'%s'", m->name);
-   n++;
}
/*
 * We may want to substitute "this command" with a command
-- 
2.8.0.rc3.226.g39d4020



[PATCH 16/17] pathspec: small readability changes

2016-12-06 Thread Brandon Williams
A few small changes to improve readability.  This is done by grouping related
assignments, adding blank lines, ensuring lines are <80 characters, etc.

Signed-off-by: Brandon Williams 
---
 pathspec.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index 41aa213..8a07b02 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -334,6 +334,7 @@ static unsigned prefix_pathspec(struct pathspec_item *item,
if ((magic & PATHSPEC_LITERAL) && (magic & PATHSPEC_GLOB))
die(_("%s: 'literal' and 'glob' are incompatible"), elt);
 
+   /* Create match string which will be used for pathspec matching */
if (pathspec_prefix >= 0) {
match = xstrdup(copyfrom);
prefixlen = pathspec_prefix;
@@ -341,11 +342,16 @@ static unsigned prefix_pathspec(struct pathspec_item 
*item,
match = xstrdup(copyfrom);
prefixlen = 0;
} else {
-   match = prefix_path_gently(prefix, prefixlen, , 
copyfrom);
+   match = prefix_path_gently(prefix, prefixlen,
+  , copyfrom);
if (!match)
die(_("%s: '%s' is outside repository"), elt, copyfrom);
}
+
item->match = match;
+   item->len = strlen(item->match);
+   item->prefix = prefixlen;
+
/*
 * Prefix the pathspec (keep all magic) and assign to
 * original. Useful for passing to another command.
@@ -362,8 +368,6 @@ static unsigned prefix_pathspec(struct pathspec_item *item,
} else {
item->original = xstrdup(elt);
}
-   item->len = strlen(item->match);
-   item->prefix = prefixlen;
 
if (flags & PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP)
strip_submodule_slash_cheap(item);
@@ -371,13 +375,14 @@ static unsigned prefix_pathspec(struct pathspec_item 
*item,
if (flags & PATHSPEC_STRIP_SUBMODULE_SLASH_EXPENSIVE)
strip_submodule_slash_expensive(item);
 
-   if (magic & PATHSPEC_LITERAL)
+   if (magic & PATHSPEC_LITERAL) {
item->nowildcard_len = item->len;
-   else {
+   } else {
item->nowildcard_len = simple_length(item->match);
if (item->nowildcard_len < prefixlen)
item->nowildcard_len = prefixlen;
}
+
item->flags = 0;
if (magic & PATHSPEC_GLOB) {
/*
-- 
2.8.0.rc3.226.g39d4020



[PATCH 10/17] pathspec: simpler logic to prefix original pathspec elements

2016-12-06 Thread Brandon Williams
The logic used to prefix an original pathspec element with 'prefix'
magic is more general purpose and can be used for more than just short
magic.  Remove the extra code paths and rename 'prefix_short_magic' to
'prefix_magic' to better indicate that it can be used in more general
situations.

Also, slightly change the logic which decides when to prefix the
original element in order to prevent a pathspec of "." from getting
converted to "" (empty string).

Signed-off-by: Brandon Williams 
---
 pathspec.c | 33 +
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index 159f6db..5afebd3 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -74,13 +74,12 @@ static struct pathspec_magic {
{ PATHSPEC_EXCLUDE, '!', "exclude" },
 };
 
-static void prefix_short_magic(struct strbuf *sb, int prefixlen,
-  unsigned short_magic)
+static void prefix_magic(struct strbuf *sb, int prefixlen, unsigned magic)
 {
int i;
strbuf_addstr(sb, ":(");
for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++)
-   if (short_magic & pathspec_magic[i].bit) {
+   if (magic & pathspec_magic[i].bit) {
if (sb->buf[sb->len - 1] != '(')
strbuf_addch(sb, ',');
strbuf_addstr(sb, pathspec_magic[i].name);
@@ -110,8 +109,8 @@ static unsigned prefix_pathspec(struct pathspec_item *item,
static int glob_global = -1;
static int noglob_global = -1;
static int icase_global = -1;
-   unsigned magic = 0, short_magic = 0, global_magic = 0;
-   const char *copyfrom = elt, *long_magic_end = NULL;
+   unsigned magic = 0, element_magic = 0, global_magic = 0;
+   const char *copyfrom = elt;
char *match;
int i, pathspec_prefix = -1;
 
@@ -165,7 +164,7 @@ static unsigned prefix_pathspec(struct pathspec_item *item,
for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
if (strlen(pathspec_magic[i].name) == len &&
!strncmp(pathspec_magic[i].name, copyfrom, 
len)) {
-   magic |= pathspec_magic[i].bit;
+   element_magic |= pathspec_magic[i].bit;
break;
}
if (starts_with(copyfrom, "prefix:")) {
@@ -184,7 +183,6 @@ static unsigned prefix_pathspec(struct pathspec_item *item,
}
if (*copyfrom != ')')
die(_("Missing ')' at the end of pathspec magic in 
'%s'"), elt);
-   long_magic_end = copyfrom;
copyfrom++;
} else {
/* shorthand */
@@ -197,7 +195,7 @@ static unsigned prefix_pathspec(struct pathspec_item *item,
break;
for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++)
if (pathspec_magic[i].mnemonic == ch) {
-   short_magic |= pathspec_magic[i].bit;
+   element_magic |= pathspec_magic[i].bit;
break;
}
if (ARRAY_SIZE(pathspec_magic) <= i)
@@ -208,7 +206,7 @@ static unsigned prefix_pathspec(struct pathspec_item *item,
copyfrom++;
}
 
-   magic |= short_magic;
+   magic |= element_magic;
 
/* --noglob-pathspec adds :(literal) _unless_ :(glob) is specified */
if (noglob_global && !(magic & PATHSPEC_GLOB))
@@ -243,18 +241,13 @@ static unsigned prefix_pathspec(struct pathspec_item 
*item,
 * Prefix the pathspec (keep all magic) and assign to
 * original. Useful for passing to another command.
 */
-   if (flags & PATHSPEC_PREFIX_ORIGIN) {
+   if ((flags & PATHSPEC_PREFIX_ORIGIN) &&
+   prefixlen && !literal_global) {
struct strbuf sb = STRBUF_INIT;
-   if (prefixlen && !literal_global) {
-   /* Preserve the actual prefix length of each pattern */
-   if (short_magic)
-   prefix_short_magic(, prefixlen, short_magic);
-   else if (long_magic_end) {
-   strbuf_add(, elt, long_magic_end - elt);
-   strbuf_addf(, ",prefix:%d)", prefixlen);
-   } else
-   strbuf_addf(, ":(prefix:%d)", prefixlen);
-   }
+
+   /* Preserve the actual prefix length of each pattern */
+   prefix_magic(, prefixlen, element_magic);
+
strbuf_addstr(, match);
item->original = strbuf_detach(, NULL);
} else {
-- 

[PATCH 09/17] pathspec: always show mnemonic and name in unsupported_magic

2016-12-06 Thread Brandon Williams
For better clarity, always show the mnemonic and name of the unsupported
magic being used.  This lets users have a more clear understanding of
what magic feature isn't supported.  And if they supplied a mnemonic,
the user will be told what its corresponding name is which will allow
them to more easily search the man pages for that magic type.

This also avoids passing an extra parameter around the pathspec
initialization code.

Signed-off-by: Brandon Williams 
---
 pathspec.c | 18 +++---
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index ec0d590..159f6db 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -68,7 +68,7 @@ static struct pathspec_magic {
const char *name;
 } pathspec_magic[] = {
{ PATHSPEC_FROMTOP, '/', "top" },
-   { PATHSPEC_LITERAL,   0, "literal" },
+   { PATHSPEC_LITERAL,'\0', "literal" },
{ PATHSPEC_GLOB,   '\0', "glob" },
{ PATHSPEC_ICASE,  '\0', "icase" },
{ PATHSPEC_EXCLUDE, '!', "exclude" },
@@ -102,7 +102,6 @@ static void prefix_short_magic(struct strbuf *sb, int 
prefixlen,
  * string cannot express such a case.
  */
 static unsigned prefix_pathspec(struct pathspec_item *item,
-   unsigned *p_short_magic,
unsigned flags,
const char *prefix, int prefixlen,
const char *elt)
@@ -210,7 +209,6 @@ static unsigned prefix_pathspec(struct pathspec_item *item,
}
 
magic |= short_magic;
-   *p_short_magic = short_magic;
 
/* --noglob-pathspec adds :(literal) _unless_ :(glob) is specified */
if (noglob_global && !(magic & PATHSPEC_GLOB))
@@ -329,8 +327,7 @@ static int pathspec_item_cmp(const void *a_, const void *b_)
 }
 
 static void NORETURN unsupported_magic(const char *pattern,
-  unsigned magic,
-  unsigned short_magic)
+  unsigned magic)
 {
struct strbuf sb = STRBUF_INIT;
int i;
@@ -340,8 +337,9 @@ static void NORETURN unsupported_magic(const char *pattern,
continue;
if (sb.len)
strbuf_addch(, ' ');
-   if (short_magic & m->bit)
-   strbuf_addf(, "'%c'", m->mnemonic);
+
+   if (m->mnemonic)
+   strbuf_addf(, "'(%c)%s'", m->mnemonic, m->name);
else
strbuf_addf(, "'%s'", m->name);
}
@@ -413,10 +411,9 @@ void parse_pathspec(struct pathspec *pathspec,
prefixlen = prefix ? strlen(prefix) : 0;
 
for (i = 0; i < n; i++) {
-   unsigned short_magic;
entry = argv[i];
 
-   item[i].magic = prefix_pathspec(item + i, _magic,
+   item[i].magic = prefix_pathspec(item + i,
flags,
prefix, prefixlen, entry);
if ((flags & PATHSPEC_LITERAL_PATH) &&
@@ -426,8 +423,7 @@ void parse_pathspec(struct pathspec *pathspec,
nr_exclude++;
if (item[i].magic & magic_mask)
unsupported_magic(entry,
- item[i].magic & magic_mask,
- short_magic);
+ item[i].magic & magic_mask);
 
if ((flags & PATHSPEC_SYMLINK_LEADING_PATH) &&
has_symlink_leading_path(item[i].match, item[i].len)) {
-- 
2.8.0.rc3.226.g39d4020



[PATCH 04/17] ls-tree: convert show_recursive to use the pathspec struct interface

2016-12-06 Thread Brandon Williams
Convert 'show_recursive()' to use the pathspec struct interface from
using the '_raw' entry in the pathspec struct.

Signed-off-by: Brandon Williams 
---
 builtin/ls-tree.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 0e30d86..e0f4307 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -31,21 +31,18 @@ static const  char * const ls_tree_usage[] = {
 
 static int show_recursive(const char *base, int baselen, const char *pathname)
 {
-   const char **s;
+   int i;
 
if (ls_options & LS_RECURSIVE)
return 1;
 
-   s = pathspec._raw;
-   if (!s)
+   if (!pathspec.nr)
return 0;
 
-   for (;;) {
-   const char *spec = *s++;
+   for (i = 0; i < pathspec.nr; i++) {
+   const char *spec = pathspec.items[i].match;
int len, speclen;
 
-   if (!spec)
-   return 0;
if (strncmp(base, spec, baselen))
continue;
len = strlen(pathname);
@@ -59,6 +56,7 @@ static int show_recursive(const char *base, int baselen, 
const char *pathname)
continue;
return 1;
}
+   return 0;
 }
 
 static int show_tree(const unsigned char *sha1, struct strbuf *base,
-- 
2.8.0.rc3.226.g39d4020



[PATCH 07/17] mv: small code cleanup

2016-12-06 Thread Brandon Williams
Now that the call to 'parse_pathspec()' doesn't modify the passed in
const char **array there isn't a need to duplicate the pathspec element
prior to freeing the intermediate strings.  This small cleanup just
makes the code a bit easier to read.

Signed-off-by: Brandon Williams 
---
 builtin/mv.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index 4df4a12..b7cceb6 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -56,9 +56,8 @@ static const char **internal_copy_pathspec(const char *prefix,
 
/* Copy the pathspec and free the old intermediate strings */
for (i = 0; i < count; i++) {
-   const char *match = xstrdup(ps.items[i].match);
free((char *) result[i]);
-   result[i] = match;
+   result[i] = xstrdup(ps.items[i].match);
}
 
clear_pathspec();
-- 
2.8.0.rc3.226.g39d4020



[PATCH 06/17] pathspec: copy and free owned memory

2016-12-06 Thread Brandon Williams
The 'original' string entry in a pathspec_item is only duplicated some
of the time, instead always make a copy of the original and take
ownership of the memory.

Since both 'match' and 'original' string entries in a pathspec_item are
owned by the pathspec struct, they need to be freed when clearing the
pathspec struct (in 'clear_pathspec()') and duplicated when copying the
pathspec struct (in 'copy_pathspec()').

Also change the type of 'match' and 'original' to 'char *' in order to
more explicitly show the ownership of the memory.

Signed-off-by: Brandon Williams 
---
 pathspec.c | 22 ++
 pathspec.h |  4 ++--
 2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index 1f918cb..8f367f0 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -259,8 +259,9 @@ static unsigned prefix_pathspec(struct pathspec_item *item,
}
strbuf_addstr(, match);
item->original = strbuf_detach(, NULL);
-   } else
-   item->original = elt;
+   } else {
+   item->original = xstrdup(elt);
+   }
item->len = strlen(item->match);
item->prefix = prefixlen;
 
@@ -388,8 +389,8 @@ void parse_pathspec(struct pathspec *pathspec,
die("BUG: PATHSPEC_PREFER_CWD requires arguments");
 
pathspec->items = item = xcalloc(1, sizeof(*item));
-   item->match = prefix;
-   item->original = prefix;
+   item->match = xstrdup(prefix);
+   item->original = xstrdup(prefix);
item->nowildcard_len = item->len = strlen(prefix);
item->prefix = item->len;
pathspec->nr = 1;
@@ -453,13 +454,26 @@ void parse_pathspec(struct pathspec *pathspec,
 
 void copy_pathspec(struct pathspec *dst, const struct pathspec *src)
 {
+   int i;
+
*dst = *src;
ALLOC_ARRAY(dst->items, dst->nr);
COPY_ARRAY(dst->items, src->items, dst->nr);
+
+   for (i = 0; i < dst->nr; i++) {
+   dst->items[i].match = xstrdup(src->items[i].match);
+   dst->items[i].original = xstrdup(src->items[i].original);
+   }
 }
 
 void clear_pathspec(struct pathspec *pathspec)
 {
+   int i;
+
+   for (i = 0; i < pathspec->nr; i++) {
+   free(pathspec->items[i].match);
+   free(pathspec->items[i].original);
+   }
free(pathspec->items);
pathspec->items = NULL;
 }
diff --git a/pathspec.h b/pathspec.h
index 70a592e..49fd823 100644
--- a/pathspec.h
+++ b/pathspec.h
@@ -25,8 +25,8 @@ struct pathspec {
unsigned magic;
int max_depth;
struct pathspec_item {
-   const char *match;
-   const char *original;
+   char *match;
+   char *original;
unsigned magic;
int len, prefix;
int nowildcard_len;
-- 
2.8.0.rc3.226.g39d4020



[PATCH 03/17] dir: convert fill_directory to use the pathspec struct interface

2016-12-06 Thread Brandon Williams
Convert 'fill_directory()' to use the pathspec struct interface from
using the '_raw' entry in the pathspec struct.

Signed-off-by: Brandon Williams 
---
 dir.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index 7df292b..8730a4f 100644
--- a/dir.c
+++ b/dir.c
@@ -188,7 +188,8 @@ int fill_directory(struct dir_struct *dir, const struct 
pathspec *pathspec)
len = common_prefix_len(pathspec);
 
/* Read the directory and prune it */
-   read_directory(dir, pathspec->nr ? pathspec->_raw[0] : "", len, 
pathspec);
+   read_directory(dir, pathspec->nr ? pathspec->items[0].match : "",
+  len, pathspec);
return len;
 }
 
-- 
2.8.0.rc3.226.g39d4020



[PATCH 02/17] dir: convert create_simplify to use the pathspec struct interface

2016-12-06 Thread Brandon Williams
Convert 'create_simplify()' to use the pathspec struct interface from
using the '_raw' entry in the pathspec.

Signed-off-by: Brandon Williams 
---
 dir.c | 25 -
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/dir.c b/dir.c
index bfa8c8a..7df292b 100644
--- a/dir.c
+++ b/dir.c
@@ -1787,25 +1787,24 @@ static int cmp_name(const void *p1, const void *p2)
return name_compare(e1->name, e1->len, e2->name, e2->len);
 }
 
-static struct path_simplify *create_simplify(const char **pathspec)
+static struct path_simplify *create_simplify(const struct pathspec *pathspec)
 {
-   int nr, alloc = 0;
+   int i;
struct path_simplify *simplify = NULL;
 
-   if (!pathspec)
+   if (!pathspec || !pathspec->nr)
return NULL;
 
-   for (nr = 0 ; ; nr++) {
+   ALLOC_ARRAY(simplify, pathspec->nr + 1);
+   for (i = 0; i < pathspec->nr; i++) {
const char *match;
-   ALLOC_GROW(simplify, nr + 1, alloc);
-   match = *pathspec++;
-   if (!match)
-   break;
-   simplify[nr].path = match;
-   simplify[nr].len = simple_length(match);
+   match = pathspec->items[i].match;
+   simplify[i].path = match;
+   simplify[i].len = pathspec->items[i].nowildcard_len;
}
-   simplify[nr].path = NULL;
-   simplify[nr].len = 0;
+   simplify[i].path = NULL;
+   simplify[i].len = 0;
+
return simplify;
 }
 
@@ -2036,7 +2035,7 @@ int read_directory(struct dir_struct *dir, const char 
*path, int len, const stru
 * subset of positive ones, which has no impacts on
 * create_simplify().
 */
-   simplify = create_simplify(pathspec ? pathspec->_raw : NULL);
+   simplify = create_simplify(pathspec);
untracked = validate_untracked_cache(dir, len, pathspec);
if (!untracked)
/*
-- 
2.8.0.rc3.226.g39d4020



[PATCH 00/17] pathspec cleanup

2016-12-06 Thread Brandon Williams
The intent of this series is to cleanup some of the pathspec initialization
code as well as finally migrating the remaining users of the _raw field or
get_pathspec() to the pathspec struct interface.  This way both the _raw field
and get_pathspec() can be removed from the codebase.  This also removes the
functionality where parse_pathspec() modified the const char * argv array that
was passed in (which felt kind of odd to me as I wouldn't have expected the
passed in array to be modified).

I also noticed that there are memory leaks associated with the 'original' and
'match' strings.  To fix this the pathspec struct needed to take ownership of
the memory for these fields so that they can be cleaned up when clearing the
pathspec struct.

Most of the work went to simplifying the prefix_pathspec function.  This
consisted of factoring out long sections of code into their own helper
functions.  The overall result is a much more readable function.

Brandon Williams (17):
  mv: convert to using pathspec struct interface
  dir: convert create_simplify to use the pathspec struct interface
  dir: convert fill_directory to use the pathspec struct interface
  ls-tree: convert show_recursive to use the pathspec struct interface
  pathspec: remove the deprecated get_pathspec function
  pathspec: copy and free owned memory
  mv: small code cleanup
  pathspec: remove unused variable from unsupported_magic
  pathspec: always show mnemonic and name in unsupported_magic
  pathspec: simpler logic to prefix original pathspec elements
  pathspec: factor global magic into its own function
  pathspec: create parse_short_magic function
  pathspec: create parse_long_magic function
  pathspec: create parse_element_magic helper
  pathspec: create strip submodule slash helpers
  pathspec: small readability changes
  pathspec: remove outdated comment

 Documentation/technical/api-setup.txt |   2 -
 builtin/ls-tree.c |  12 +-
 builtin/mv.c  |  44 +++-
 cache.h   |   1 -
 dir.c |  28 +--
 pathspec.c| 449 +++---
 pathspec.h|   5 +-
 7 files changed, 301 insertions(+), 240 deletions(-)

-- 
2.8.0.rc3.226.g39d4020



[PATCH 01/17] mv: convert to using pathspec struct interface

2016-12-06 Thread Brandon Williams
Convert the 'internal_copy_pathspec()' function to use the pathspec
struct interface from using the deprecated 'get_pathspec()' interface.

In addition to this, fix a memory leak caused by only duplicating some
of the pathspec elements.  Instead always duplicate all of the the
pathspec elements as an intermediate step (with modificationed based on
the passed in flags).  This way the intermediate strings can then be
freed prior to duplicating the result of parse_pathspec (which contains
each of the elements with the prefix prepended).

Signed-off-by: Brandon Williams 
---
 builtin/mv.c | 45 -
 1 file changed, 32 insertions(+), 13 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index 2f43877..4df4a12 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -4,6 +4,7 @@
  * Copyright (C) 2006 Johannes Schindelin
  */
 #include "builtin.h"
+#include "pathspec.h"
 #include "lockfile.h"
 #include "dir.h"
 #include "cache-tree.h"
@@ -25,25 +26,43 @@ static const char **internal_copy_pathspec(const char 
*prefix,
 {
int i;
const char **result;
+   struct pathspec ps;
ALLOC_ARRAY(result, count + 1);
-   COPY_ARRAY(result, pathspec, count);
-   result[count] = NULL;
+
+   /* Create an intermediate copy of the pathspec based on the flags */
for (i = 0; i < count; i++) {
-   int length = strlen(result[i]);
+   int length = strlen(pathspec[i]);
int to_copy = length;
+   char *it;
while (!(flags & KEEP_TRAILING_SLASH) &&
-  to_copy > 0 && is_dir_sep(result[i][to_copy - 1]))
+  to_copy > 0 && is_dir_sep(pathspec[i][to_copy - 1]))
to_copy--;
-   if (to_copy != length || flags & DUP_BASENAME) {
-   char *it = xmemdupz(result[i], to_copy);
-   if (flags & DUP_BASENAME) {
-   result[i] = xstrdup(basename(it));
-   free(it);
-   } else
-   result[i] = it;
-   }
+
+   it = xmemdupz(pathspec[i], to_copy);
+   if (flags & DUP_BASENAME) {
+   result[i] = xstrdup(basename(it));
+   free(it);
+   } else
+   result[i] = it;
+   }
+   result[count] = NULL;
+
+   parse_pathspec(,
+  PATHSPEC_ALL_MAGIC &
+  ~(PATHSPEC_FROMTOP | PATHSPEC_LITERAL),
+  PATHSPEC_KEEP_ORDER | PATHSPEC_PREFER_CWD,
+  prefix, result);
+   assert(count == ps.nr);
+
+   /* Copy the pathspec and free the old intermediate strings */
+   for (i = 0; i < count; i++) {
+   const char *match = xstrdup(ps.items[i].match);
+   free((char *) result[i]);
+   result[i] = match;
}
-   return get_pathspec(prefix, result);
+
+   clear_pathspec();
+   return result;
 }
 
 static const char *add_slash(const char *path)
-- 
2.8.0.rc3.226.g39d4020



Re: [PATCH v15 09/27] bisect--helper: `bisect_write` shell function in C

2016-12-06 Thread Pranit Bauva
Hey Stephan,

On Thu, Nov 17, 2016 at 3:10 PM, Stephan Beyer  wrote:
> Hi,
>
> I've only got some minors to mention here ;)
>
> On 10/14/2016 04:14 PM, Pranit Bauva wrote:
>> diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
>> index c542e8b..3f19b68 100644
>> --- a/builtin/bisect--helper.c
>> +++ b/builtin/bisect--helper.c
>> @@ -19,9 +19,15 @@ static const char * const git_bisect_helper_usage[] = {
>>   N_("git bisect--helper --write-terms  "),
>>   N_("git bisect--helper --bisect-clean-state"),
>>   N_("git bisect--helper --bisect-reset []"),
>> + N_("git bisect--helper --bisect-write
>>  []"),
>>   NULL
>>  };
>
> I wouldn't write "" in capital letters. I'd use
> something like " " as you have used for
> --write-terms. Note that "git bisect --help" uses "
> " in that context.

Keeping it in small does give less strain to the eye ;)

>> @@ -149,6 +155,63 @@ static int check_expected_revs(const char **revs, int 
>> rev_nr)
>>   return 0;
>>  }
>>
>> +static int bisect_write(const char *state, const char *rev,
>> + const struct bisect_terms *terms, int nolog)
>> +{
>> + struct strbuf tag = STRBUF_INIT;
>> + struct strbuf commit_name = STRBUF_INIT;
>> + struct object_id oid;
>> + struct commit *commit;
>> + struct pretty_print_context pp = {0};
>> + FILE *fp = NULL;
>> + int retval = 0;
>> +
>> + if (!strcmp(state, terms->term_bad))
>> + strbuf_addf(, "refs/bisect/%s", state);
>> + else if (one_of(state, terms->term_good, "skip", NULL))
>> + strbuf_addf(, "refs/bisect/%s-%s", state, rev);
>> + else {
>> + error(_("Bad bisect_write argument: %s"), state);
>> + retval = -1;
>> + goto finish;
>> + }
>> +
>> + if (get_oid(rev, )) {
>> + error(_("couldn't get the oid of the rev '%s'"), rev);
>> + retval = -1;
>> + goto finish;
>> + }
>> +
>> + if (update_ref(NULL, tag.buf, oid.hash, NULL, 0,
>> +UPDATE_REFS_MSG_ON_ERR)) {
>> + retval = -1;
>> + goto finish;
>> + }
>
> I'd like to mention that the "goto fail;" trick could apply in this
> function, too.

Sure!

>> @@ -156,9 +219,10 @@ int cmd_bisect__helper(int argc, const char **argv, 
>> const char *prefix)
>>   WRITE_TERMS,
>>   BISECT_CLEAN_STATE,
>>   BISECT_RESET,
>> - CHECK_EXPECTED_REVS
>> + CHECK_EXPECTED_REVS,
>> + BISECT_WRITE
>>   } cmdmode = 0;
>> - int no_checkout = 0;
>> + int no_checkout = 0, res = 0;
>
> Why do you do this "direct return" -> "set res and return res" transition?
> You don't need it in this patch, you do not need it in the subsequent
> patches (you always set "res" exactly once after the initialization),
> and you don't need cleanup code in this function.

Initially I was using strbuf but then I switched to const char *
according to Junio's suggestion. It seems that in this version I have
forgot to free the terms.

>>   struct option options[] = {
>>   OPT_CMDMODE(0, "next-all", ,
>>N_("perform 'git bisect next'"), NEXT_ALL),
>> @@ -170,10 +234,13 @@ int cmd_bisect__helper(int argc, const char **argv, 
>> const char *prefix)
>>N_("reset the bisection state"), BISECT_RESET),
>>   OPT_CMDMODE(0, "check-expected-revs", ,
>>N_("check for expected revs"), CHECK_EXPECTED_REVS),
>> + OPT_CMDMODE(0, "bisect-write", ,
>> +  N_("write out the bisection state in BISECT_LOG"), 
>> BISECT_WRITE),
>
> That info text is confusing, especially considering that there is a
> "nolog" option. I think the action of bisect-write is two-fold: (1)
> update the refs, (2) log.

I agree that it is confusing. I couldn't find a better way to describe
it and since this would be gone after the whole conversion, I didn't
bother putting more efforts there.

Regards,
Pranit Bauva


Re: [PATCH v15 12/27] bisect--helper: `get_terms` & `bisect_terms` shell function in C

2016-12-06 Thread Pranit Bauva
Hey Stephan,

On Fri, Nov 18, 2016 at 3:02 AM, Stephan Beyer  wrote:
> Hi,
>
> On 10/14/2016 04:14 PM, Pranit Bauva wrote:
>> diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
>> index 317d671..6a5878c 100644
>> --- a/builtin/bisect--helper.c
>> +++ b/builtin/bisect--helper.c
> [...]
>> +static int bisect_terms(struct bisect_terms *terms, const char **argv, int 
>> argc)
>> +{
>> + int i;
>> + const char bisect_term_usage[] =
>> +"git bisect--helper --bisect-terms [--term-good | --term-bad | ]"
>> +"--term-old | --term-new";
>
> Three things:
>
> (1) Is that indentation intentional?

Yes it was intentional but now I cannot recollect why. I think it was
because I found something similar. Nevertheless, I will fix this
indentation/

> (2) You have a "]" at the end of the first part of the string instead of
> the end of the second part.

This should be corrected.

> (3) After the correction, bisect_term_usage and
> git_bisect_helper_usage[7] are the same strings. I don't recommend to
> use git_bisect_helper_usage[7] instead because keeping the index
> up-to-date is a maintenance hell. (At the end of your patch series it is
> a 3 instead of a 7.) However, if - for whatever reason - the usage of
> bisect--helper --bisect-terms changes, you always have to sync the two
> strings which is also nasty
>
>> +
>> + if (get_terms(terms))
>> + return error(_("no terms defined"));
>> +
>> + if (argc > 1) {
>> + usage(bisect_term_usage);
>> + return -1;
>> + }
>
> ...and since you only use it once, why not simply do something like
>
> return error(_("--bisect-term requires exactly one argument"));
>
> and drop the definition of bisect_term_usage.

Sure that would be better.

>> +
>> + if (argc == 0) {
>> + printf(_("Your current terms are %s for the old state\nand "
>> +"%s for the new state.\n"), terms->term_good,
>> +terms->term_bad);
>
> Very minor: It improves the readability if you'd split the string after
> the \n and put the "and "in the next line.

Ah. This is because of the message. If I do the other way, then it
won't match the output in one of the tests in t/t6030 thus, I am
keeping it that way in order to avoid modifying the file t/t6030.

>> + return 0;
>> + }
>> +
>> + for (i = 0; i < argc; i++) {
>> + if (!strcmp(argv[i], "--term-good"))
>> + printf("%s\n", terms->term_good);
>> + else if (!strcmp(argv[i], "--term-bad"))
>> + printf("%s\n", terms->term_bad);
>> + else
>> + die(_("invalid argument %s for 'git bisect "
>> +   "terms'.\nSupported options are: "
>> +   "--term-good|--term-old and "
>> +   "--term-bad|--term-new."), argv[i]);
>
> Hm, "return error(...)" and "die(...)" seems to be quasi-equivalent in
> this case. Because I am always looking from a library perspective, I'd
> prefer "return error(...)".

I should use return error()

>> @@ -429,6 +492,11 @@ int cmd_bisect__helper(int argc, const char **argv, 
>> const char *prefix)
>>   terms.term_bad = xstrdup(argv[1]);
>>   res = bisect_next_check(, argc == 3 ? argv[2] : NULL);
>>   break;
>> + case BISECT_TERMS:
>> + if (argc > 1)
>> + die(_("--bisect-terms requires 0 or 1 argument"));
>> + res = bisect_terms(, argv, argc);
>> + break;
>
> Also here: "terms" is leaking...

Will have to free it.

> ~Stephan


Re: [PATCH] xdiff: Do not enable XDL_FAST_HASH by default

2016-12-06 Thread Junio C Hamano
Jeff King  writes:

> This is a nice incremental step in the sense that people can still
> enable it if they want to in order to time it, play with it, etc. But
> given what we know, I wonder if the help text here should warn people.
>
> Or I guess we could move straight to dropping it entirely.
>
> Here's what that patch might look like (I retimed it just be sure, and
> was sad to see that it really _is_ making some cases faster. But I still
> think slower-but-predictable is a better default).

I like this version that drops quite a lot of code ;-)

> Subject: [PATCH] xdiff: drop XDL_FAST_HASH
> ...
> The idea of XDL_FAST_HASH is to speed up the hash
> computation. But the generated hashes have worse collision
> behavior. This means that in some cases it speeds diffs up
> (running "git log -p" on git.git improves by ~8% with it),
> but in others it can slow things down. One pathological case
> saw over a 100x slowdown[1].
>
> There may be a better hash function that covers both
> properties, but in the meantime we are better off with the
> original hash. It's slightly slower in the common case, but
> it has fewer surprising pathological cases.
>
> [1] http://public-inbox.org/git/20141222041944.ga...@peff.net/


Re: "git add -p ." raises an unexpected "warning: empty strings as pathspecs will be made invalid in upcoming releases. please use . instead if you meant to match all paths"

2016-12-06 Thread Junio C Hamano
Brandon Williams  writes:

> On 11/30, Junio C Hamano wrote:
>> Junio C Hamano  forgot to Cc: the author of the
>> most relevant change to the issue, d426430e6e ("pathspec: warn on
>> empty strings as pathspec", 2016-06-22).
>> ...
>
> I've been doing a bit of work trying to clean up the pathspec
> initialization code and I believe this can be fixed without
> having to add in this work around.  The code which does the munging is
> always trying to prefix the pathspec regardless if there is a prefix or
> not.  If instead its changed to only try and prefix the original if
> there is indeed a prefix, then it should fix the munging.

Thanks; sounds like a plan.


Re: [PATCH v2] tag, branch, for-each-ref: add --ignore-case for sorting and filtering

2016-12-06 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> This options makes sorting ignore case, which is great when you have
> branches named bug-12-do-something, Bug-12-do-some-more and
> BUG-12-do-what and want to group them together. Sorting externally may
> not be an option because we lose coloring and column layout from
> git-branch and git-tag.
>
> The same could be said for filtering, but it's probably less important
> because you can always go with the ugly pattern [bB][uU][gG]-* if you're
> desperate.
>
> You can't have case-sensitive filtering and case-insensitive sorting (or
> the other way around) with this though. For branch and tag, that should
> be no problem. for-each-ref, as a plumbing, might want finer control.
> But we can always add --{filter,sort}-ignore-case when there is a need
> for it.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---

It took me a while to figure out the interactions with topics in
flight, but I think I resolved it correctly now.  There was a topic
that added "--format" to branch and tag.

Will be pushed out as part of today's integration cycle.

Thanks.


Re: [RFC PATCH] GIT-VERSION-GEN: set --abbrev=9 to match auto-scaling

2016-12-06 Thread Ramsay Jones


On 06/12/16 18:26, Jeff King wrote:
> On Tue, Dec 06, 2016 at 10:17:55AM -0800, Junio C Hamano wrote:
> 
>> Yup, that is what I meant to say with "that is already possible" and
>> we are on the same page.  As all three of us seem to be happy with
>> just dropping --abbrev and letting describe do its default thing (as
>> configured by whoever is doing the build), let's do so.
>>
>> -- >8 --
>> From: Ramsay Jones 
>> Date: Sun, 4 Dec 2016 20:45:59 +
>> Subject: [PATCH] GIT-VERSION-GEN: do not force abbreviation length used by 
>> 'describe'
> 
> Thanks, this is a nice summary of the discussion, and the patch is
> obviously correct.

Yep, looks very good to me! (I had started to write a commit
message for this patch, when your email arrived. Needless to
say, but your message is much better than mine!)

Thanks!

ATB,
Ramsay Jones



Re: [PATCH] stash: disable renames when calling git-diff

2016-12-06 Thread Junio C Hamano
Jeff King  writes:

> Here it is in patch form. Perhaps I am missing some subtle case that
> diff-index would not handle, but it does pass the test suite. And
> looking over the original thread, I don't see any particular reason to
> prefer git-diff.

Ah, our mails crossed ;-)  I am reasonably sure that "diff HEAD" and
"diff-index HEAD" would behave identically.

> @@ -115,7 +115,7 @@ create_stash () {
>   git read-tree --index-output="$TMPindex" -m $i_tree &&
>   GIT_INDEX_FILE="$TMPindex" &&
>   export GIT_INDEX_FILE &&
> - git diff --name-only -z HEAD -- >"$TMP-stagenames" &&
> + git diff-index --name-only -z HEAD -- 
> >"$TMP-stagenames" &&
>   git update-index -z --add --remove --stdin 
> <"$TMP-stagenames" &&
>   git write-tree &&
>   rm -f "$TMPindex"

Will queue this one instead.  Thanks.



Re: [PATCH] stash: disable renames when calling git-diff

2016-12-06 Thread Jeff King
On Tue, Dec 06, 2016 at 03:11:30PM -0500, Jeff King wrote:

> > Yeah, it looks like "add -u" to me, too.  Perhaps the script was old
> > enough that it didn't exist back then?  I dunno.
> 
> Hmm, nope. It _was_ "git add -u" at one point and switched. See
> 7aa5d43cc (stash: Don't overwrite files that have gone from the index,
> 2010-04-18).
> 
> I think you are right that diff-index could work, though. I always
> forget that without "--cached", diff-index looks at the working tree
> files.

Here it is in patch form. Perhaps I am missing some subtle case that
diff-index would not handle, but it does pass the test suite. And
looking over the original thread, I don't see any particular reason to
prefer git-diff.

+cc Charles just in case he remembers anything.

-- >8 --
Subject: [PATCH] stash: prefer plumbing over git-diff

When creating a stash, we need to look at the diff between
the working tree and HEAD, and do so using the git-diff
porcelain.  Because git-diff enables porcelain config like
renames by default, this causes at least one problem. The
--name-only format will not mention the source side of a
rename, meaning we will fail to stash a deletion that is
part of a rename.

We could fix that case by passing --no-renames, but this is
a symptom of a larger problem. We should be using the
diff-index plumbing here, which does not have renames
enabled by default, and also does not respect any
potentially confusing config options.

Reported-by: Matthew Patey 
Signed-off-by: Jeff King 
---
 git-stash.sh | 2 +-
 t/t3903-stash.sh | 9 +
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/git-stash.sh b/git-stash.sh
index 4546abaae..10c284d1a 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -115,7 +115,7 @@ create_stash () {
git read-tree --index-output="$TMPindex" -m $i_tree &&
GIT_INDEX_FILE="$TMPindex" &&
export GIT_INDEX_FILE &&
-   git diff --name-only -z HEAD -- >"$TMP-stagenames" &&
+   git diff-index --name-only -z HEAD -- 
>"$TMP-stagenames" &&
git update-index -z --add --remove --stdin 
<"$TMP-stagenames" &&
git write-tree &&
rm -f "$TMPindex"
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index e1a6ccc00..2de3e18ce 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -766,4 +766,13 @@ test_expect_success 'stash list --cc shows combined diff' '
test_cmp expect actual
 '
 
+test_expect_success 'stash is not confused by partial renames' '
+   mv file renamed &&
+   git add renamed &&
+   git stash &&
+   git stash apply &&
+   test_path_is_file renamed &&
+   test_path_is_missing file
+'
+
 test_done
-- 
2.11.0.191.gdb26c57



Re: [PATCH] stash: disable renames when calling git-diff

2016-12-06 Thread Junio C Hamano
Junio C Hamano  writes:

> Jeff King  writes:
>
>> I think you are right that diff-index could work, though. I always
>> forget that without "--cached", diff-index looks at the working tree
>> files.
>
> I'll reword the log message while queuing.

The last paragraph became like so:

The correct solution is to move to an appropriate plumbing
command. But in the short term lets ask git-diff not to respect
renames.



Re: [PATCH] stash: disable renames when calling git-diff

2016-12-06 Thread Jeff King
On Tue, Dec 06, 2016 at 12:22:25PM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > I think you are right that diff-index could work, though. I always
> > forget that without "--cached", diff-index looks at the working tree
> > files.
> 
> I'll reword the log message while queuing.  It was super surprising
> to me to hear that there was something "git diff" did that three
> "git diff-*" plumbing brothers couldn't do at the basic "what to
> compare with what" level, as I wrote all four ;-)

Oops, our emails just crossed; I just sent a revised patch.

I know there are a few cases that git-diff can handle that the others
can't, but I think they are all direct blob-to-blob comparisons.

-Peff


Re: [PATCH] stash: disable renames when calling git-diff

2016-12-06 Thread Junio C Hamano
Jeff King  writes:

> I think you are right that diff-index could work, though. I always
> forget that without "--cached", diff-index looks at the working tree
> files.

I'll reword the log message while queuing.  It was super surprising
to me to hear that there was something "git diff" did that three
"git diff-*" plumbing brothers couldn't do at the basic "what to
compare with what" level, as I wrote all four ;-)


Re: [PATCH] stash: disable renames when calling git-diff

2016-12-06 Thread Jeff King
On Tue, Dec 06, 2016 at 11:51:16AM -0800, Junio C Hamano wrote:

> > I don't think there's a plumbing command which works for diffing the
> > working tree directly to a git tree. In the long run, it might be a good
> > idea to remedy that.
> 
> I do not think that one is doing anything different from "git
> diff-index --name-only -z HEAD".
> [...]
> Yeah, it looks like "add -u" to me, too.  Perhaps the script was old
> enough that it didn't exist back then?  I dunno.

Hmm, nope. It _was_ "git add -u" at one point and switched. See
7aa5d43cc (stash: Don't overwrite files that have gone from the index,
2010-04-18).

I think you are right that diff-index could work, though. I always
forget that without "--cached", diff-index looks at the working tree
files.

-Peff


Re: [PATCH] stash: disable renames when calling git-diff

2016-12-06 Thread Junio C Hamano
Jeff King  writes:

> On Tue, Dec 06, 2016 at 11:20:18AM -0800, Junio C Hamano wrote:
>
>> Jeff King  writes:
>> 
>> >> If you run:
>> >> 
>> >>   git -c diff.renames=false stash
>> >> 
>> >> then it works.
>> >
>> > And here's a patch to fix it.
>> 
>> Yuck.  This obviously has easier to bite more people since we
>> enabled the renames by default.  Thanks for a quick fix.
>> 
>> I wonder why we are using "git diff" here, not the plumbing,
>> though.
>
> I don't think there's a plumbing command which works for diffing the
> working tree directly to a git tree. In the long run, it might be a good
> idea to remedy that.

I do not think that one is doing anything different from "git
diff-index --name-only -z HEAD".

> Though I'm not sure that "git add -u" would not accomplish the same
> thing as these several commands.

Yeah, it looks like "add -u" to me, too.  Perhaps the script was old
enough that it didn't exist back then?  I dunno.



Re: [PATCH v15 18/27] bisect--helper: `bisect_autostart` shell function in C

2016-12-06 Thread Pranit Bauva
Hey Stephan,

On Mon, Nov 21, 2016 at 1:45 AM, Stephan Beyer  wrote:
> Hi,
>
> On 10/14/2016 04:14 PM, Pranit Bauva wrote:
>> diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
>> index 502bf18..1767916 100644
>> --- a/builtin/bisect--helper.c
>> +++ b/builtin/bisect--helper.c
>> @@ -422,6 +425,7 @@ static int bisect_next(...)
>>  {
>>   int res, no_checkout;
>>
>> + bisect_autostart(terms);
>
> You are not checking for return values here. (The shell code simply
> exited if there is no tty, but you don't.)

True. I didn't notice it carefully. Thanks for pointing it out.

>> @@ -754,6 +758,32 @@ static int bisect_start(struct bisect_terms *terms, int 
>> no_checkout,
>>   return retval || bisect_auto_next(terms, NULL);
>>  }
>>
>> +static int bisect_autostart(struct bisect_terms *terms)
>> +{
>> + if (is_empty_or_missing_file(git_path_bisect_start())) {
>> + const char *yesno;
>> + const char *argv[] = {NULL};
>> + fprintf(stderr, _("You need to start by \"git bisect "
>> +   "start\"\n"));
>> +
>> + if (!isatty(0))
>
> isatty(STDIN_FILENO)?

Seems better.

>> + return 1;
>> +
>> + /*
>> +  * TRANSLATORS: Make sure to include [Y] and [n] in your
>> +  * translation. THe program will only accept English input
>
> Typo "THe"

Sure.

>> +  * at this point.
>> +  */
>
> Taking "at this point" into consideration, I think the Y and n can be
> easily translated now that it is in C. I guess, by using...
>
>> + yesno = git_prompt(_("Do you want me to do it for you "
>> +  "[Y/n]? "), PROMPT_ECHO);
>> + if (starts_with(yesno, "n") || starts_with(yesno, "N"))
>
> ... starts_with(yesno, _("n")) || starts_with(yesno, _("N"))
> here (but not sure). However, this would be an extra patch on top of
> this series.

Can add it as an extra patch. Thanks for informing.

>> + exit(0);
>
> Shouldn't this also be "return 1;"? Saying "no" is the same outcome as
> not having a tty to ask for yes or no.

Yes.

>>  int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
>>  {
>>   enum {
>> @@ -790,6 +821,8 @@ int cmd_bisect__helper(int argc, const char **argv, 
>> const char *prefix)
>>N_("find the next bisection commit"), BISECT_NEXT),
>>   OPT_CMDMODE(0, "bisect-auto-next", ,
>>N_("verify the next bisection state then find the 
>> next bisection state"), BISECT_AUTO_NEXT),
>> + OPT_CMDMODE(0, "bisect-autostart", ,
>> +  N_("start the bisection if BISECT_START empty or 
>> missing"), BISECT_AUTOSTART),
>
> The word "is" is missing.

Sure. Thanks for going through these patches very carefully.

Regards,
Pranit Bauva


Re: [PATCH] stash: disable renames when calling git-diff

2016-12-06 Thread Jeff King
On Tue, Dec 06, 2016 at 11:20:18AM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> >> If you run:
> >> 
> >>   git -c diff.renames=false stash
> >> 
> >> then it works.
> >
> > And here's a patch to fix it.
> 
> Yuck.  This obviously has easier to bite more people since we
> enabled the renames by default.  Thanks for a quick fix.
> 
> I wonder why we are using "git diff" here, not the plumbing,
> though.

I don't think there's a plumbing command which works for diffing the
working tree directly to a git tree. In the long run, it might be a good
idea to remedy that.

Though I'm not sure that "git add -u" would not accomplish the same
thing as these several commands.

-Peff


Re: [PATCH v15 08/27] bisect--helper: `is_expected_rev` & `check_expected_revs` shell function in C

2016-12-06 Thread Pranit Bauva
Hey Stephan,

On Thu, Nov 17, 2016 at 5:17 AM, Stephan Beyer  wrote:
> Hi,
>
> On 10/14/2016 04:14 PM, Pranit Bauva wrote:
>> diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
>> index d84ba86..c542e8b 100644
>> --- a/builtin/bisect--helper.c
>> +++ b/builtin/bisect--helper.c
>> @@ -123,13 +123,40 @@ static int bisect_reset(const char *commit)
>>   return bisect_clean_state();
>>  }
>>
>> +static int is_expected_rev(const char *expected_hex)
>> +{
>> + struct strbuf actual_hex = STRBUF_INIT;
>> + int res = 0;
>> + if (strbuf_read_file(_hex, git_path_bisect_expected_rev(), 0) 
>> >= 40) {
>> + strbuf_trim(_hex);
>> + res = !strcmp(actual_hex.buf, expected_hex);
>> + }
>> + strbuf_release(_hex);
>> + return res;
>> +}
>
> I am not sure it does what it should.
>
> I would expect the following behavior from this function:
>  - file does not exist (or is "broken") => return 0
>  - actual_hex != expected_hex => return 0
>  - otherwise return 1
>
> If I am not wrong, the code does the following instead:
>  - file does not exist (or is "broken") => return 0
>  - actual_hex != expected_hex => return 1
>  - otherwise => return 0

Yeah, you are right. I should update this. Thanks for pointing it out.

>> +static int check_expected_revs(const char **revs, int rev_nr)
>> +{
>> + int i;
>> +
>> + for (i = 0; i < rev_nr; i++) {
>> + if (!is_expected_rev(revs[i])) {
>> + unlink_or_warn(git_path_bisect_ancestors_ok());
>> + unlink_or_warn(git_path_bisect_expected_rev());
>> + return 0;
>> + }
>> + }
>> + return 0;
>> +}
>
> Here I am not sure what the function *should* do. However, I see that it
> basically mimics the behavior of the shell function (assuming
> is_expected_rev() is implemented correctly).
>
> I don't understand why the return value is int and not void. To avoid a
> "return 0;" line when calling this function?

Initially I thought I would be using the return value but now I
realize that it is meaningless to do so. Using void seems better. :)

>> @@ -167,6 +196,8 @@ int cmd_bisect__helper(int argc, const char **argv, 
>> const char *prefix)
>>   if (argc > 1)
>>   die(_("--bisect-reset requires either zero or one 
>> arguments"));
>>   return bisect_reset(argc ? argv[0] : NULL);
>> + case CHECK_EXPECTED_REVS:
>> + return check_expected_revs(argv, argc);
>
> I note that you check the correct number of arguments for some
> subcommands and you do not check it for some other subcommands like this
> one. (I don't care, I just want to mention it.)

Here we should be able to accept any number of arguments. I think it
would be good to add a non-zero check though just to maintain the
uniformity. Though this is something programmer needs to be careful
about rather than the user.

>>   default:
>>   die("BUG: unknown subcommand '%d'", cmdmode);
>>   }
>
> ~Stephan

Regards,
Pranit Bauva


Re: [PATCH] stash: disable renames when calling git-diff

2016-12-06 Thread Junio C Hamano
Jeff King  writes:

>> If you run:
>> 
>>   git -c diff.renames=false stash
>> 
>> then it works.
>
> And here's a patch to fix it.

Yuck.  This obviously has easier to bite more people since we
enabled the renames by default.  Thanks for a quick fix.

I wonder why we are using "git diff" here, not the plumbing,
though.

>
> -- >8 --
> Subject: [PATCH] stash: disable renames when calling git-diff
>
> When creating a stash, we need to look at the diff between
> the working tree and HEAD. There's no plumbing command that
> covers this case, so we do so by calling the git-diff
> porcelain. This means we also inherit the usual list of
> porcelain options, which can impact the output in confusing
> ways.
>
> There's at least one known problem: --name-only will not
> mention the source side of a rename, meaning that we will
> fail to stash a deletion in such a case.
>
> The correct solution is to move to an appropriate plumbing
> command. But since there isn't one available, in the short
> term we can plug this particular problem by manually asking
> git-diff not to respect renames.
>
> Reported-by: Matthew Patey 
> Signed-off-by: Jeff King 
> ---
> There may be other similar problems lurking depending on the various
> config options you have set, but I don't think so. Most of the options
> only affect --patch output.
>
> I do find it interesting that --name-only does not mention the source
> side of a rename. The longer forms like --name-status mention both
> sides. Certainly --name-status does not have any way of saying "this is
> the source side of a rename", but I'd think it would simply mention both
> sides. Anyway, even if that were changed (which would fix this bug), I
> think adding --no-renames is still a good thing to be doing here.
>
>  git-stash.sh | 2 +-
>  t/t3903-stash.sh | 9 +
>  2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/git-stash.sh b/git-stash.sh
> index 4546abaae..40ab2710e 100755
> --- a/git-stash.sh
> +++ b/git-stash.sh
> @@ -115,7 +115,7 @@ create_stash () {
>   git read-tree --index-output="$TMPindex" -m $i_tree &&
>   GIT_INDEX_FILE="$TMPindex" &&
>   export GIT_INDEX_FILE &&
> - git diff --name-only -z HEAD -- >"$TMP-stagenames" &&
> + git diff --name-only -z --no-renames HEAD -- 
> >"$TMP-stagenames" &&
>   git update-index -z --add --remove --stdin 
> <"$TMP-stagenames" &&
>   git write-tree &&
>   rm -f "$TMPindex"
> diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
> index e1a6ccc00..2de3e18ce 100755
> --- a/t/t3903-stash.sh
> +++ b/t/t3903-stash.sh
> @@ -766,4 +766,13 @@ test_expect_success 'stash list --cc shows combined 
> diff' '
>   test_cmp expect actual
>  '
>  
> +test_expect_success 'stash is not confused by partial renames' '
> + mv file renamed &&
> + git add renamed &&
> + git stash &&
> + git stash apply &&
> + test_path_is_file renamed &&
> + test_path_is_missing file
> +'
> +
>  test_done


BUG: "cherry-pick A..B || git reset --hard OTHER"

2016-12-06 Thread Junio C Hamano
I was burned a few times with this in the past few years, but it did
not irritate me often enough that I didn't write it down.  But I
think this is serious enough that deserves attention from those who
were involved.

A short reproduction recipe, using objects from git.git that are
publicly available and stable, shows how bad it is.

$ git checkout v2.9.3^0

$ git cherry-pick 0582a34f52..a94bb68397
... see conflict, decide to give up backporting to
... such an old fork point.
... the git-prompt gives "|CHERRY-PICKING" correctly.

$ git reset --hard v2.10.2^0
... the git-prompt no longer says "|CHERRY-PICKING"

$ edit && git commit -m "prelim work for backporting"
[detached HEAD cc5a6a9219] prelim work for backporting

$ git cherry-pick 0582a34f52..a94bb68397
error: a cherry-pick or revert is already in progress
hint: try "git cherry-pick (--continue | --quit | --abort)"
fatal: cherry-pick failed

$ git cherry-pick --abort
... we come back to v2.9.3^0, losing the new commit!

The above shows two bugs.

 (1) The third invocation of "cherry-pick" with "--abort" to get rid
 of the state from the unfinished cherry-pick we did previously
 is necessary, but the command does not notice that we resetted
 to a new branch AND we even did some other work there.  This
 loses end-user's work.  

 "git cherry-pick --abort" should learn from "git am --abort"
 that has an extra safety to deal with the above workflow.  The
 state from the unfinished "am" is removed, but the head is not
 rewound to avoid losing end-user's work.

 You can try by replacing two instances of

$ git cherry-pick 0582a34f52..a94bb68397

 with

$ git format-patch --stdout 0582a34f52..a94bb68397 | git am

 in the above sequence, and conclude with "git am--abort" to see
 how much more pleasant and safe "git am --abort" is.

 (2) The bug in "cherry-pick --abort" is made worse because the
 git-completion script seems to be unaware of $GIT_DIR/sequencer
 and stops saying "|CHERRY-PICKING" after the step to switch to
 a different state with "git reset --hard v2.10.2^0".  If the
 prompt showed it after "git reset", the end user would have
 thought twice before starting the "prelim work".

Thanks.


Re: [PATCH v15 11/27] bisect--helper: `bisect_next_check` & bisect_voc shell function in C

2016-12-06 Thread Pranit Bauva
Hey Stephan,

Sorry for the late replies. My end semester exams just got over.

On Fri, Nov 18, 2016 at 2:29 AM, Stephan Beyer  wrote:
>
> Hi Pranit,
>
> On 10/14/2016 04:14 PM, Pranit Bauva wrote:
> > Also reimplement `bisect_voc` shell function in C and call it from
> > `bisect_next_check` implementation in C.
>
> Please don't! ;D
>
> > +static char *bisect_voc(char *revision_type)
> > +{
> > + if (!strcmp(revision_type, "bad"))
> > + return "bad|new";
> > + if (!strcmp(revision_type, "good"))
> > + return "good|old";
> > +
> > + return NULL;
> > +}
>
> Why not simply use something like this:
>
> static const char *voc[] = {
> "bad|new",
> "good|old",
> };
>
> Then...

This probably seems a good idea.

> > +static int bisect_next_check(const struct bisect_terms *terms,
> > +  const char *current_term)
> > +{
> > + int missing_good = 1, missing_bad = 1, retval = 0;
> > + char *bad_ref = xstrfmt("refs/bisect/%s", terms->term_bad);
> > + char *good_glob = xstrfmt("%s-*", terms->term_good);
> > + char *bad_syn, *good_syn;
>
> ...you don't need bad_syn and good_syn...
>
> > + bad_syn = xstrdup(bisect_voc("bad"));
> > + good_syn = xstrdup(bisect_voc("good"));
>
> ...and hence not these two lines...
>
> > + if (!is_empty_or_missing_file(git_path_bisect_start())) {
> > + error(_("You need to give me at least one %s and "
> > + "%s revision. You can use \"git bisect %s\" "
> > + "and \"git bisect %s\" for that. \n"),
> > + bad_syn, good_syn, bad_syn, good_syn);
>
> ...and write
> voc[0], voc[1], voc[0], voc[1]);
> instead...
>
> > + retval = -1;
> > + goto finish;
> > + }
> > + else {
> > + error(_("You need to start by \"git bisect start\". You "
> > + "then need to give me at least one %s and %s "
> > + "revision. You can use \"git bisect %s\" and "
> > + "\"git bisect %s\" for that.\n"),
> > + good_syn, bad_syn, bad_syn, good_syn);
>
> ...and here
> voc[1], voc[0], voc[0], voc[1]);
> ...
>
> > + retval = -1;
> > + goto finish;
> > + }
> > + goto finish;
> > +finish:
> > + if (!bad_ref)
> > + free(bad_ref);
> > + if (!good_glob)
> > + free(good_glob);
> > + if (!bad_syn)
> > + free(bad_syn);
> > + if (!good_syn)
> > + free(good_syn);
>
> ...and you can remove the 4 lines above.
>
> > + return retval;
> > +}
>
> Besides that, there are again some things that I've already mentioned
> and that can be applied here, too, for example, not capitalizing
> TERM_GOOD and TERM_BAD, the goto fail simplification, the terms memory leak.

Your suggestion simplifies things, I will use that.

Regards,
Pranit Bauva


Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin

2016-12-06 Thread Jeff King
On Tue, Dec 06, 2016 at 10:22:21AM -0800, Stefan Beller wrote:

> >> Maybe even go a step further and say that the config code needs a context
> >> "object".
> >
> > If I were writing git from scratch, I'd consider making a "struct
> > repository" object. I'm not sure how painful it would be to retro-fit it
> > at this point.
> 
> Would it be possible to introduce "the repo" struct similar to "the index"
> in cache.h?
> 
> From a submodule perspective I would very much welcome this
> object oriented approach to repositories.

I think it may be more complicated, because there's some implicit global
state in "the repo", like where files are relative to our cwd. All of
those low-level functions would have to start caring about which repo
we're talking about so they can prefix the appropriate working tree
path, etc.

For some operations that would be fine, but there are things that would
subtly fail for submodules. I'm thinking we'd end up with some code
state like:

  /* finding a repo does not modify global state; good */
  struct repository *repo = repo_discover(".");

  /* obvious repo-level operations like looking up refs can be done with
   * a repository object; good */
  repo_for_each_ref(repo, callback, NULL);

  /*
   * "enter" the repo so that we are at the top-level of the working
   * tree, etc. After this you can actually look at the index without
   * things breaking.
   */
  repo_enter(repo);

That would be enough to implement a lot of submodule-level stuff, but it
would break pretty subtly as soon as you asked the submodule about its
working tree. The solution is to make everything that accesses the
working tree aware of the idea of a working tree root besides the cwd.
But that's a pretty invasive change.

-Peff


Re: [PATCH] clone,fetch: explain the shallow-clone option a little more clearly

2016-12-06 Thread Junio C Hamano
Duy Nguyen  writes:

> On Tue, Dec 6, 2016 at 1:28 AM, Junio C Hamano  wrote:
>> I however offhand do not think the feature can be used to make the
>> repository shallower
>
> I'm pretty sure it can,...

I wrote my message after a short local testing, but it is very
possible I botched it and reached a wrong conclusion above.

If we can use the command to make it shallower, then the phrase
"deepen" would probably be what we need to be fixing in this patch:

>   OPT_STRING_LIST(0, "shallow-exclude", _not, N_("revision"),
> - N_("deepen history of shallow clone by excluding rev")),
> + N_("deepen history of shallow clone, excluding rev")),

Perhaps a shorter version of:

Adjust the depth of shallow clone so that commits that are
decendants of the named rev are made available, while commits
that are ancestors of the named rev are made beyond reach.

or something like that.  Here is my (somewhat botched) attempt:

Adjust shallow clone's history to be cut at the rev




[PATCH v2] jk/http-walker-limit-redirect rebased to maint-2.9

2016-12-06 Thread Jeff King
On Tue, Dec 06, 2016 at 01:10:08PM -0500, Jeff King wrote:

> > I think I merged yours and then Brandon's on jch/pu branches in that
> > order, and the conflict resolution should look OK.
> > 
> > I however forked yours on v2.11.0-rc1, which would need to be
> > rebased to one of the earlier maintenance tracks, before we can
> > merge it to 'next'.
> 
> Yeah, I built it on top of master.
> 
> It does depend on some of the http-walker changes Eric made a few months
> ago. In particular, 17966c0a6 (http: avoid disconnecting on 404s for
> loose objects, 2016-07-11) added some checks against the HTTP status
> code, and my series modifies the checks (mostly so that ">= 400" becomes
> ">= 300").
> 
> Rebasing on maint-2.9 means omitting those changes. That preserves the
> security properties, but means that the error handling is worse when we
> see an illegal redirect. That may be OK, though.
> 
> Since the resolution is to omit the changes entirely from my series,
> merging up to v2.11 wouldn't produce any conflicts. We'd need to have a
> separate set of patches adding those changes back in.

This actually turned out to be pretty easy. The final patch from my
original series is the one that tweaks the error-handling from
17966c0a6. The rest of them are _almost_ untouched, except that one of
the error-handling tweaks gets bumped to the final patch.

So here's a resend of the patches, rebased on your maint-2.9 branch.
Patches 1-5 should be applied directly there.

On maint-2.10, you can merge up maint-2.9, and then apply patch 6.

Hopefully that makes sense, but I've pushed branches
jk/maint-X-http-redirect to https://github.com/peff/git that show the
final configuration (and a diff of jk/maint-2.10-http-redirect shows the
outcome is identical to applying the original series on top of 2.10).

Merging up to 2.11 should be trivial.

  [1/6]: http: simplify update_url_from_redirect
  [2/6]: http: always update the base URL for redirects
  [3/6]: remote-curl: rename shadowed options variable
  [4/6]: http: make redirects more obvious
  [5/6]: http: treat http-alternates like redirects
  [6/6]: http-walker: complain about non-404 loose object errors

-Peff


Re: [RFC PATCH] GIT-VERSION-GEN: set --abbrev=9 to match auto-scaling

2016-12-06 Thread Jeff King
On Tue, Dec 06, 2016 at 10:17:55AM -0800, Junio C Hamano wrote:

> Yup, that is what I meant to say with "that is already possible" and
> we are on the same page.  As all three of us seem to be happy with
> just dropping --abbrev and letting describe do its default thing (as
> configured by whoever is doing the build), let's do so.
> 
> -- >8 --
> From: Ramsay Jones 
> Date: Sun, 4 Dec 2016 20:45:59 +
> Subject: [PATCH] GIT-VERSION-GEN: do not force abbreviation length used by 
> 'describe'

Thanks, this is a nice summary of the discussion, and the patch is
obviously correct.

-Peff


[PATCH v2 6/6] http-walker: complain about non-404 loose object errors

2016-12-06 Thread Jeff King
Since commit 17966c0a6 (http: avoid disconnecting on 404s
for loose objects, 2016-07-11), we turn off curl's
FAILONERROR option and instead manually deal with failing
HTTP codes.

However, the logic to do so only recognizes HTTP 404 as a
failure. This is probably the most common result, but if we
were to get another code, the curl result remains CURLE_OK,
and we treat it as success. We still end up detecting the
failure when we try to zlib-inflate the object (which will
fail), but instead of reporting the HTTP error, we just
claim that the object is corrupt.

Instead, let's catch anything in the 300's or above as an
error (300's are redirects which are not an error at the
HTTP level, but are an indication that we've explicitly
disabled redirects, so we should treat them as such; we
certainly don't have the resulting object content).

Note that we also fill in req->errorstr, which we didn't do
before. Without FAILONERROR, curl will not have filled this
in, and it will remain a blank string. This never mattered
for the 404 case, because in the logic below we hit the
"missing_target()" branch and print nothing. But for other
errors, we'd want to say _something_, if only to fill in the
blank slot in the error message.

Signed-off-by: Jeff King 
---
The second hunk here is new in v2; earlier it appeared in patch 3. But
arguably it goes better here anyway; I didn't even need to modify the
commit message to explain it.

 http-walker.c | 7 +--
 http.c| 2 +-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/http-walker.c b/http-walker.c
index 25a8b1ad4..c2f81cd6a 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -482,10 +482,13 @@ static int fetch_object(struct walker *walker, unsigned 
char *sha1)
 * we turned off CURLOPT_FAILONERROR to avoid losing a
 * persistent connection and got CURLE_OK.
 */
-   if (req->http_code == 404 && req->curl_result == CURLE_OK &&
+   if (req->http_code >= 300 && req->curl_result == CURLE_OK &&
(starts_with(req->url, "http://;) ||
-starts_with(req->url, "https://;)))
+starts_with(req->url, "https://;))) {
req->curl_result = CURLE_HTTP_RETURNED_ERROR;
+   xsnprintf(req->errorstr, sizeof(req->errorstr),
+ "HTTP request failed");
+   }
 
if (obj_req->state == ABORTED) {
ret = error("Request for %s aborted", hex);
diff --git a/http.c b/http.c
index ff4d88231..c25d1d540 100644
--- a/http.c
+++ b/http.c
@@ -2021,7 +2021,7 @@ static size_t fwrite_sha1_file(char *ptr, size_t eltsize, 
size_t nmemb,
if (c != CURLE_OK)
die("BUG: curl_easy_getinfo for HTTP code failed: %s",
curl_easy_strerror(c));
-   if (slot->http_code >= 400)
+   if (slot->http_code >= 300)
return size;
}
 
-- 
2.11.0.191.gdb26c57


[PATCH v2 5/6] http: treat http-alternates like redirects

2016-12-06 Thread Jeff King
The previous commit made HTTP redirects more obvious and
tightened up the default behavior. However, there's another
way for a server to ask a git client to fetch arbitrary
content: by having an http-alternates file (or a regular
alternates file, which is used as a backup).

Similar to the HTTP redirect case, a malicious server can
claim to have refs pointing at object X, return a 404 when
the client asks for X, but point to some other URL via
http-alternates, which the client will transparently fetch.
The end result is that it looks from the user's perspective
like the objects came from the malicious server, as the
other URL is not mentioned at all.

Worse, because we feed the new URL to curl ourselves, the
usual protocol restrictions do not kick in (neither curl's
default of disallowing file://, nor the protocol
whitelisting in f4113cac0 (http: limit redirection to
protocol-whitelist, 2015-09-22).

Let's apply the same rules here as we do for HTTP redirects.
Namely:

  - unless http.followRedirects is set to "always", we will
not follow remote redirects from http-alternates (or
alternates) at all

  - set CURLOPT_PROTOCOLS alongside CURLOPT_REDIR_PROTOCOLS
restrict ourselves to a known-safe set and respect any
user-provided whitelist.

  - mention alternate object stores on stderr so that the
user is aware another source of objects may be involved

The first item may prove to be too restrictive. The most
common use of alternates is to point to another path on the
same server. While it's possible for a single-server
redirect to be an attack, it takes a fairly obscure setup
(victim and evil repository on the same host, host speaks
dumb http, and evil repository has access to edit its own
http-alternates file).

So we could make the checks more specific, and only cover
cross-server redirects. But that means parsing the URLs
ourselves, rather than letting curl handle them. This patch
goes for the simpler approach. Given that they are only used
with dumb http, http-alternates are probably pretty rare.
And there's an escape hatch: the user can allow redirects on
a specific server by setting http..followRedirects to
"always".

Reported-by: Jann Horn 
Signed-off-by: Jeff King 
---
 http-walker.c  |  8 +---
 http.c |  1 +
 t/t5550-http-fetch-dumb.sh | 38 ++
 3 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/http-walker.c b/http-walker.c
index 2c721f0c3..4bff31c44 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -290,9 +290,8 @@ static void process_alternates_response(void *callback_data)
struct strbuf target = STRBUF_INIT;
strbuf_add(, base, serverlen);
strbuf_add(, data + i, posn - i - 7);
-   if (walker->get_verbosely)
-   fprintf(stderr, "Also look at %s\n",
-   target.buf);
+   warning("adding alternate object store: %s",
+   target.buf);
newalt = xmalloc(sizeof(*newalt));
newalt->next = NULL;
newalt->base = strbuf_detach(, NULL);
@@ -318,6 +317,9 @@ static void fetch_alternates(struct walker *walker, const 
char *base)
struct alternates_request alt_req;
struct walker_data *cdata = walker->data;
 
+   if (http_follow_config != HTTP_FOLLOW_ALWAYS)
+   return;
+
/*
 * If another request has already started fetching alternates,
 * wait for them to arrive and return to processing this request's
diff --git a/http.c b/http.c
index b99ade5fa..a9778bfa4 100644
--- a/http.c
+++ b/http.c
@@ -581,6 +581,7 @@ static CURL *get_curl_handle(void)
if (is_transport_allowed("ftps"))
allowed_protocols |= CURLPROTO_FTPS;
curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS, allowed_protocols);
+   curl_easy_setopt(result, CURLOPT_PROTOCOLS, allowed_protocols);
 #else
if (transport_restrict_protocols())
warning("protocol restrictions not applied to curl redirects 
because\n"
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index ad94ed7b1..22011f0b6 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -322,5 +322,43 @@ test_expect_success 'http.followRedirects defaults to 
"initial"' '
test_must_fail git clone $HTTPD_URL/redir-objects/repo.git default
 '
 
+# The goal is for a clone of the "evil" repository, which has no objects
+# itself, to cause the client to fetch objects from the "victim" repository.
+test_expect_success 'set up evil alternates scheme' '
+   victim=$HTTPD_DOCUMENT_ROOT_PATH/victim.git &&
+   git init --bare "$victim" &&
+   git -C "$victim" 

[PATCH v2 2/6] http: always update the base URL for redirects

2016-12-06 Thread Jeff King
If a malicious server redirects the initial ref
advertisement, it may be able to leak sha1s from other,
unrelated servers that the client has access to. For
example, imagine that Alice is a git user, she has access to
a private repository on a server hosted by Bob, and Mallory
runs a malicious server and wants to find out about Bob's
private repository.

Mallory asks Alice to clone an unrelated repository from her
over HTTP. When Alice's client contacts Mallory's server for
the initial ref advertisement, the server issues an HTTP
redirect for Bob's server. Alice contacts Bob's server and
gets the ref advertisement for the private repository. If
there is anything to fetch, she then follows up by asking
the server for one or more sha1 objects. But who is the
server?

If it is still Mallory's server, then Alice will leak the
existence of those sha1s to her.

Since commit c93c92f30 (http: update base URLs when we see
redirects, 2013-09-28), the client usually rewrites the base
URL such that all further requests will go to Bob's server.
But this is done by textually matching the URL. If we were
originally looking for "http://mallory/repo.git/info/refs;,
and we got pointed at "http://bob/other.git/info/refs;, then
we know that the right root is "http://bob/other.git;.

If the redirect appears to change more than just the root,
we punt and continue to use the original server. E.g.,
imagine the redirect adds a URL component that Bob's server
will ignore, like "http://bob/other.git/info/refs?dummy=1;.

We can solve this by aborting in this case rather than
silently continuing to use Mallory's server. In addition to
protecting from sha1 leakage, it's arguably safer and more
sane to refuse a confusing redirect like that in general.
For example, part of the motivation in c93c92f30 is
avoiding accidentally sending credentials over clear http,
just to get a response that says "try again over https". So
even in a non-malicious case, we'd prefer to err on the side
of caution.

The downside is that it's possible this will break a
legitimate but complicated server-side redirection scheme.
The setup given in the newly added test does work, but it's
convoluted enough that we don't need to care about it. A
more plausible case would be a server which redirects a
request for "info/refs?service=git-upload-pack" to just
"info/refs" (because it does not do smart HTTP, and for some
reason really dislikes query parameters).  Right now we
would transparently downgrade to dumb-http, but with this
patch, we'd complain (and the user would have to set
GIT_SMART_HTTP=0 to fetch).

Reported-by: Jann Horn 
Signed-off-by: Jeff King 
---
 http.c| 12 
 t/lib-httpd/apache.conf   |  8 
 t/t5551-http-fetch-smart.sh   |  4 
 t/t5812-proto-disable-http.sh |  1 +
 4 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/http.c b/http.c
index d4034a14b..718d2109b 100644
--- a/http.c
+++ b/http.c
@@ -1491,9 +1491,9 @@ static int http_request(const char *url,
  *
  * Note that this assumes a sane redirect scheme. It's entirely possible
  * in the example above to end up at a URL that does not even end in
- * "info/refs".  In such a case we simply punt, as there is not much we can
- * do (and such a scheme is unlikely to represent a real git repository,
- * which means we are likely about to abort anyway).
+ * "info/refs".  In such a case we die. There's not much we can do, such a
+ * scheme is unlikely to represent a real git repository, and failing to
+ * rewrite the base opens options for malicious redirects to do funny things.
  */
 static int update_url_from_redirect(struct strbuf *base,
const char *asked,
@@ -1511,10 +1511,14 @@ static int update_url_from_redirect(struct strbuf *base,
 
new_len = got->len;
if (!strip_suffix_mem(got->buf, _len, tail))
-   return 0; /* insane redirect scheme */
+   die(_("unable to update url base from redirection:\n"
+ "  asked for: %s\n"
+ "   redirect: %s"),
+   asked, got->buf);
 
strbuf_reset(base);
strbuf_add(base, got->buf, new_len);
+
return 1;
 }
 
diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
index 018a83a5a..9a355fb1c 100644
--- a/t/lib-httpd/apache.conf
+++ b/t/lib-httpd/apache.conf
@@ -132,6 +132,14 @@ RewriteRule ^/ftp-redir/(.*)$ ftp://localhost:1000/$1 
[R=302]
 RewriteRule ^/loop-redir/x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-(.*) /$1 
[R=302]
 RewriteRule ^/loop-redir/(.*)$ /loop-redir/x-$1 [R=302]
 
+# The first rule issues a client-side redirect to something
+# that _doesn't_ look like a git repo. The second rule is a
+# server-side rewrite, so that it turns out the odd-looking
+# thing _is_ a git repo. The "[PT]" tells Apache to match
+# the usual ScriptAlias rules for /smart.
+RewriteRule ^/insane-redir/(.*)$ /intern-redir/$1/foo 

[PATCH v2 4/6] http: make redirects more obvious

2016-12-06 Thread Jeff King
We instruct curl to always follow HTTP redirects. This is
convenient, but it creates opportunities for malicious
servers to create confusing situations. For instance,
imagine Alice is a git user with access to a private
repository on Bob's server. Mallory runs her own server and
wants to access objects from Bob's repository.

Mallory may try a few tricks that involve asking Alice to
clone from her, build on top, and then push the result:

  1. Mallory may simply redirect all fetch requests to Bob's
 server. Git will transparently follow those redirects
 and fetch Bob's history, which Alice may believe she
 got from Mallory. The subsequent push seems like it is
 just feeding Mallory back her own objects, but is
 actually leaking Bob's objects. There is nothing in
 git's output to indicate that Bob's repository was
 involved at all.

 The downside (for Mallory) of this attack is that Alice
 will have received Bob's entire repository, and is
 likely to notice that when building on top of it.

  2. If Mallory happens to know the sha1 of some object X in
 Bob's repository, she can instead build her own history
 that references that object. She then runs a dumb http
 server, and Alice's client will fetch each object
 individually. When it asks for X, Mallory redirects her
 to Bob's server. The end result is that Alice obtains
 objects from Bob, but they may be buried deep in
 history. Alice is less likely to notice.

Both of these attacks are fairly hard to pull off. There's a
social component in getting Mallory to convince Alice to
work with her. Alice may be prompted for credentials in
accessing Bob's repository (but not always, if she is using
a credential helper that caches). Attack (1) requires a
certain amount of obliviousness on Alice's part while making
a new commit. Attack (2) requires that Mallory knows a sha1
in Bob's repository, that Bob's server supports dumb http,
and that the object in question is loose on Bob's server.

But we can probably make things a bit more obvious without
any loss of functionality. This patch does two things to
that end.

First, when we encounter a whole-repo redirect during the
initial ref discovery, we now inform the user on stderr,
making attack (1) much more obvious.

Second, the decision to follow redirects is now
configurable. The truly paranoid can set the new
http.followRedirects to false to avoid any redirection
entirely. But for a more practical default, we will disallow
redirects only after the initial ref discovery. This is
enough to thwart attacks similar to (2), while still
allowing the common use of redirects at the repository
level. Since c93c92f30 (http: update base URLs when we see
redirects, 2013-09-28) we re-root all further requests from
the redirect destination, which should generally mean that
no further redirection is necessary.

As an escape hatch, in case there really is a server that
needs to redirect individual requests, the user can set
http.followRedirects to "true" (and this can be done on a
per-server basis via http.*.followRedirects config).

Reported-by: Jann Horn 
Signed-off-by: Jeff King 
---
 Documentation/config.txt   | 10 ++
 http.c | 31 +--
 http.h | 10 +-
 remote-curl.c  |  4 
 t/lib-httpd/apache.conf|  6 ++
 t/t5550-http-fetch-dumb.sh | 23 +++
 6 files changed, 81 insertions(+), 3 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index f4721a048..815333643 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1833,6 +1833,16 @@ http.userAgent::
of common USER_AGENT strings (but not including those like git/1.7.1).
Can be overridden by the `GIT_HTTP_USER_AGENT` environment variable.
 
+http.followRedirects::
+   Whether git should follow HTTP redirects. If set to `true`, git
+   will transparently follow any redirect issued by a server it
+   encounters. If set to `false`, git will treat all redirects as
+   errors. If set to `initial`, git will follow redirects only for
+   the initial request to a remote, but not for subsequent
+   follow-up HTTP requests. Since git uses the redirected URL as
+   the base for the follow-up requests, this is generally
+   sufficient. The default is `initial`.
+
 http..*::
Any of the http.* options above can be applied selectively to some URLs.
For a config key to match a URL, each element of the config key is
diff --git a/http.c b/http.c
index 718d2109b..b99ade5fa 100644
--- a/http.c
+++ b/http.c
@@ -98,6 +98,8 @@ static int http_proactive_auth;
 static const char *user_agent;
 static int curl_empty_auth;
 
+enum http_follow_config http_follow_config = HTTP_FOLLOW_INITIAL;
+
 #if LIBCURL_VERSION_NUM >= 0x071700
 /* Use CURLOPT_KEYPASSWD as is */
 #elif 

[PATCH v2 3/6] remote-curl: rename shadowed options variable

2016-12-06 Thread Jeff King
The discover_refs() function has a local "options" variable
to hold the http_get_options we pass to http_get_strbuf().
But this shadows the global "struct options" that holds our
program-level options, which cannot be accessed from this
function.

Let's give the local one a more descriptive name so we can
tell the two apart.

Signed-off-by: Jeff King 
---
 remote-curl.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/remote-curl.c b/remote-curl.c
index 6b83b7783..e3803daa3 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -254,7 +254,7 @@ static struct discovery *discover_refs(const char *service, 
int for_push)
struct strbuf effective_url = STRBUF_INIT;
struct discovery *last = last_discovery;
int http_ret, maybe_smart = 0;
-   struct http_get_options options;
+   struct http_get_options http_options;
 
if (last && !strcmp(service, last->service))
return last;
@@ -271,15 +271,15 @@ static struct discovery *discover_refs(const char 
*service, int for_push)
strbuf_addf(_url, "service=%s", service);
}
 
-   memset(, 0, sizeof(options));
-   options.content_type = 
-   options.charset = 
-   options.effective_url = _url;
-   options.base_url = 
-   options.no_cache = 1;
-   options.keep_error = 1;
+   memset(_options, 0, sizeof(http_options));
+   http_options.content_type = 
+   http_options.charset = 
+   http_options.effective_url = _url;
+   http_options.base_url = 
+   http_options.no_cache = 1;
+   http_options.keep_error = 1;
 
-   http_ret = http_get_strbuf(refs_url.buf, , );
+   http_ret = http_get_strbuf(refs_url.buf, , _options);
switch (http_ret) {
case HTTP_OK:
break;
-- 
2.11.0.191.gdb26c57



[PATCH v2 1/6] http: simplify update_url_from_redirect

2016-12-06 Thread Jeff King
This function looks for a common tail between what we asked
for and where we were redirected to, but it open-codes the
comparison. We can avoid some confusing subtractions by
using strip_suffix_mem().

Signed-off-by: Jeff King 
---
 http.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/http.c b/http.c
index d8e427b69..d4034a14b 100644
--- a/http.c
+++ b/http.c
@@ -1500,7 +1500,7 @@ static int update_url_from_redirect(struct strbuf *base,
const struct strbuf *got)
 {
const char *tail;
-   size_t tail_len;
+   size_t new_len;
 
if (!strcmp(asked, got->buf))
return 0;
@@ -1509,14 +1509,12 @@ static int update_url_from_redirect(struct strbuf *base,
die("BUG: update_url_from_redirect: %s is not a superset of %s",
asked, base->buf);
 
-   tail_len = strlen(tail);
-
-   if (got->len < tail_len ||
-   strcmp(tail, got->buf + got->len - tail_len))
+   new_len = got->len;
+   if (!strip_suffix_mem(got->buf, _len, tail))
return 0; /* insane redirect scheme */
 
strbuf_reset(base);
-   strbuf_add(base, got->buf, got->len - tail_len);
+   strbuf_add(base, got->buf, new_len);
return 1;
 }
 
-- 
2.11.0.191.gdb26c57



Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin

2016-12-06 Thread Stefan Beller
On Tue, Dec 6, 2016 at 7:09 AM, Jeff King  wrote:

>>
>> Maybe even go a step further and say that the config code needs a context
>> "object".
>
> If I were writing git from scratch, I'd consider making a "struct
> repository" object. I'm not sure how painful it would be to retro-fit it
> at this point.

Would it be possible to introduce "the repo" struct similar to "the index"
in cache.h?

>From a submodule perspective I would very much welcome this
object oriented approach to repositories.


Re: [RFC PATCH] GIT-VERSION-GEN: set --abbrev=9 to match auto-scaling

2016-12-06 Thread Junio C Hamano
Jeff King  writes:

> On Mon, Dec 05, 2016 at 10:01:19AM -0800, Junio C Hamano wrote:
>
>> > That said, I think the right patch may be to just drop --abbrev
>> > entirely.
>> > ...
>> > I think at that point it was a noop, as 7 should have been the default.
>> > And now we probably ought to drop it, so that we can use the
>> > auto-scaling default.
>> 
>> Yeah, I agree.
>> 
>> It does mean that snapshot binaries built out of the same commit in
>> the same repository before and after a repack have higher chances of
>> getting named differently, which may surprise people, but that
>> already is possible with a fixed length if the repacking involves
>> pruning (albeit with lower probabilities), and I do not think it is
>> a problem.
>
> I think that the number is already unstable, even with --abbrev, as it
> just specifies a minimum. So any operation that creates objects has a
> possibility of increasing the length. Or more likely, two people
> describing the same version may end up with different strings because
> they have different objects in their repositories (e.g., I used to
> carry's trast's git-notes archive of the mailing list which added quite
> a few objects).
>
> I agree that having it change over the course of a repack is slightly
> _more_ surprising than those cases, but ultimately the value just isn't
> stable.

Yup, that is what I meant to say with "that is already possible" and
we are on the same page.  As all three of us seem to be happy with
just dropping --abbrev and letting describe do its default thing (as
configured by whoever is doing the build), let's do so.

-- >8 --
From: Ramsay Jones 
Date: Sun, 4 Dec 2016 20:45:59 +
Subject: [PATCH] GIT-VERSION-GEN: do not force abbreviation length used by 
'describe'

The default version name for a Git binary is computed by running
"git describe" on the commit the binary is made out of, basing on a
tag whose name matches "v[0-9]*", e.g. v2.11.0-rc2-2-g7f1dc9.

In the very early days, with 9b88fcef7d ("Makefile: use git-describe
to mark the git version.", 2005-12-27), we used "--abbrev=4" to get
absolute minimum number of abbreviated commit object name.  This was
later changed to match the default minimum of 7 with bf505158d0
("Git 1.7.10.1", 2012-05-01).

These days, the "default minimum" scales automatically depending on
the size of the repository, and there is no point in specifying a
particular abbreviation length; all we wanted since Git 1.7.10.1
days was to get "something reasonable we would use by default".

Just drop "--abbrev=" from the invocation of "git describe"
and let the command pick what it thinks is appropriate, taking the
end user's configuration and the repository contents into account.

Signed-off-by: Ramsay Jones 
Helped-by: Jeff King 
Signed-off-by: Junio C Hamano 
---
 GIT-VERSION-GEN | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/GIT-VERSION-GEN b/GIT-VERSION-GEN
index 556fbfc104..f95b04bb36 100755
--- a/GIT-VERSION-GEN
+++ b/GIT-VERSION-GEN
@@ -12,7 +12,7 @@ if test -f version
 then
VN=$(cat version) || VN="$DEF_VER"
 elif test -d ${GIT_DIR:-.git} -o -f .git &&
-   VN=$(git describe --match "v[0-9]*" --abbrev=7 HEAD 2>/dev/null) &&
+   VN=$(git describe --match "v[0-9]*" HEAD 2>/dev/null) &&
case "$VN" in
*$LF*) (exit 1) ;;
v[0-9]*)
-- 
2.11.0-263-gd89495280e



Re: [PATCH v7 4/4] transport: add from_user parameter to is_transport_allowed

2016-12-06 Thread Jeff King
On Tue, Dec 06, 2016 at 09:53:53AM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > I don't know if that makes things any easier. I feel funny saying "no,
> > no, mine preempts yours because it is more maint-worthy", but I think
> > that order does make sense.
> >
> > I think it would be OK to put Brandon's on maint, too, though. It is a
> > refactor of an existing security feature to make it more featureful, but
> > the way it is implemented could not cause security regressions unless
> > you use the new feature (IOW, we still respect the whitelist environment
> > exactly as before).
> 
> I think I merged yours and then Brandon's on jch/pu branches in that
> order, and the conflict resolution should look OK.
> 
> I however forked yours on v2.11.0-rc1, which would need to be
> rebased to one of the earlier maintenance tracks, before we can
> merge it to 'next'.

Yeah, I built it on top of master.

It does depend on some of the http-walker changes Eric made a few months
ago. In particular, 17966c0a6 (http: avoid disconnecting on 404s for
loose objects, 2016-07-11) added some checks against the HTTP status
code, and my series modifies the checks (mostly so that ">= 400" becomes
">= 300").

Rebasing on maint-2.9 means omitting those changes. That preserves the
security properties, but means that the error handling is worse when we
see an illegal redirect. That may be OK, though.

Since the resolution is to omit the changes entirely from my series,
merging up to v2.11 wouldn't produce any conflicts. We'd need to have a
separate set of patches adding those changes back in.

-Peff


Re: "git add -p ." raises an unexpected "warning: empty strings as pathspecs will be made invalid in upcoming releases. please use . instead if you meant to match all paths"

2016-12-06 Thread Brandon Williams
On 11/30, Junio C Hamano wrote:
> Junio C Hamano  forgot to Cc: the author of the
> most relevant change to the issue, d426430e6e ("pathspec: warn on
> empty strings as pathspec", 2016-06-22).
> 
> > Kevin Daudt  writes:
> >
> >> On Wed, Nov 30, 2016 at 12:31:49PM -0800, Peter Urda wrote:
> >>> After upgrading to version 2.11.0 I am getting a warning about empty
> >>> strings as pathspecs while using 'patch'
> >>> 
> >>> - Ran 'git add -p .' from the root of my git repository.
> >>> 
> >>> - I was able to normally stage my changes, but was presented with a
> >>> "warning: empty strings as pathspecs will be made invalid in upcoming
> >>> releases. please use . instead if you meant to match all paths"
> >>> message.
> >>> 
> >>> - I expected no warning message since I included a "." with my original 
> >>> command.
> >>> 
> >>> I believe that I should not be seeing this warning message as I
> >>> included the requested "." pathspec.
> >
> > Yes, this seems to be caused by pathspec.c::prefix_pathspec()
> > overwriting the original pathspec "." into "".  The callchain
> > looks like this:
> >
> > builtin/add.c::interactive_add()
> >  -> parse_pathspec()
> > passes argv[] that has "." to the caller,
> > receives pathspec whose pathspec->items[].original
> > is supposed to point at the unmolested original,
> > but prefix_pathspec() munges "." into ""
> >  -> run_add_interactive()
> > which runs "git add--interactive" with
> > pathspec->items[].original as pathspecs
> >
> >
> > Perhaps this would work it around, but there should be a better way
> > to fix it (like, making sure that what we call "original" indeed
> > stays "original").
> >
> >  builtin/add.c | 13 +++--
> >  1 file changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/builtin/add.c b/builtin/add.c
> > index e8fb80b36e..137097192d 100644
> > --- a/builtin/add.c
> > +++ b/builtin/add.c
> > @@ -167,9 +167,18 @@ int run_add_interactive(const char *revision, const 
> > char *patch_mode,
> > if (revision)
> > argv_array_push(, revision);
> > argv_array_push(, "--");
> > -   for (i = 0; i < pathspec->nr; i++)
> > +   for (i = 0; i < pathspec->nr; i++) {
> > /* pass original pathspec, to be re-parsed */
> > +   if (!*pathspec->items[i].original) {
> > +   /*
> > +* work around a misfeature in parse_pathspecs()
> > +* that munges "." into "".
> > +*/
> > +   argv_array_push(, ".");
> > +   continue;
> > +   }
> > argv_array_push(, pathspec->items[i].original);
> > +   }
> >  
> > status = run_command_v_opt(argv.argv, RUN_GIT_CMD);
> > argv_array_clear();
> > @@ -180,7 +189,7 @@ int interactive_add(int argc, const char **argv, const 
> > char *prefix, int patch)
> >  {
> > struct pathspec pathspec;
> >  
> > -   parse_pathspec(, 0,
> > +   parse_pathspec(, 0,
> >PATHSPEC_PREFER_FULL |
> >PATHSPEC_SYMLINK_LEADING_PATH |
> >PATHSPEC_PREFIX_ORIGIN,

I've been doing a bit of work trying to clean up the pathspec
initialization code and I believe this can be fixed without
having to add in this work around.  The code which does the munging is
always trying to prefix the pathspec regardless if there is a prefix or
not.  If instead its changed to only try and prefix the original if
there is indeed a prefix, then it should fix the munging.

I'll try to get the series I'm working on out in the next day.

-- 
Brandon Williams


Re: [PATCH v7 4/4] transport: add from_user parameter to is_transport_allowed

2016-12-06 Thread Junio C Hamano
Jeff King  writes:

> I don't know if that makes things any easier. I feel funny saying "no,
> no, mine preempts yours because it is more maint-worthy", but I think
> that order does make sense.
>
> I think it would be OK to put Brandon's on maint, too, though. It is a
> refactor of an existing security feature to make it more featureful, but
> the way it is implemented could not cause security regressions unless
> you use the new feature (IOW, we still respect the whitelist environment
> exactly as before).

I think I merged yours and then Brandon's on jch/pu branches in that
order, and the conflict resolution should look OK.

I however forked yours on v2.11.0-rc1, which would need to be
rebased to one of the earlier maintenance tracks, before we can
merge it to 'next'.




Re: Re: Feature request: Set git svn options in .git/config file

2016-12-06 Thread Juergen Kosel
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Hello,

Am 05.12.2016 um 23:54 schrieb Eric Wong:
> So, can you confirm that svn.addAuthorFrom and svn.useLogAuthor 
> config keys work and can be documented?

yes, I can confirm, that adding this configuration keys works with git
2.1.4 work.
I have added the config keys as follows:

git config --add --bool svn.useLogAuthor true
git config --add --bool svn.addAuthorFrom true

> 
> Even better would be for you to provide a patch to the 
> documentation :) Otherwise, I can write one up sometime this week.

My English is not that well. So I prefer, if you update the
documentation :-)


Greetings
Juergen


-BEGIN PGP SIGNATURE-
Version: GnuPG v2
Comment: Using GnuPG with Icedove - http://www.enigmail.net/

iD8DBQFYRvRhYZbcPghIM9IRAjtHAJ0QaqbUxcgoPXmidIg9WALXmg1JAQCfTHFj
wgPLKXK5Ny0bP/K9vpE9KzY=
=A+Yy
-END PGP SIGNATURE-


Re: [PATCH v4] diff: handle --no-abbrev in no-index case

2016-12-06 Thread Jack Bates

On 06/12/16 09:56 AM, Jack Bates wrote:

There are two different places where the --no-abbrev option is parsed,
and two different places where SHA-1s are abbreviated. We normally parse
--no-abbrev with setup_revisions(), but in the no-index case, "git diff"
calls diff_opt_parse() directly, and diff_opt_parse() didn't handle
--no-abbrev until now. (It did handle --abbrev, however.) We normally
abbreviate SHA-1s with find_unique_abbrev(), but commit 4f03666 ("diff:
handle sha1 abbreviations outside of repository, 2016-10-20) recently
introduced a special case when you run "git diff" outside of a
repository.

setup_revisions() does also call diff_opt_parse(), but not for --abbrev
or --no-abbrev, which it handles itself. setup_revisions() sets
rev_info->abbrev, and later copies that to diff_options->abbrev. It
handles --no-abbrev by setting abbrev to zero. (This change doesn't
touch that.)

Setting abbrev to zero was broken in the outside-of-a-repository special
case, which until now resulted in a truly zero-length SHA-1, rather than
taking zero to mean do not abbreviate. The only way to trigger this bug,
however, was by running "git diff --raw" without either the --abbrev or
--no-abbrev options, because 1) without --raw it doesn't respect abbrev
(which is bizarre, but has been that way forever), 2) we silently clamp
--abbrev=0 to MINIMUM_ABBREV, and 3) --no-abbrev wasn't handled until
now.

The outside-of-a-repository case is one of three no-index cases. The
other two are when one of the files you're comparing is outside of the
repository you're in, and the --no-index option.

Signed-off-by: Jack Bates 
---
 diff.c  | 6 +-
 t/t4013-diff-various.sh | 7 +++
 t/t4013/diff.diff_--no-index_--raw_--abbrev=4_dir2_dir  | 3 +++
 t/t4013/diff.diff_--no-index_--raw_--no-abbrev_dir2_dir | 3 +++
 t/t4013/diff.diff_--no-index_--raw_dir2_dir | 3 +++
 t/t4013/diff.diff_--raw_--abbrev=4_initial  | 6 ++
 t/t4013/diff.diff_--raw_--no-abbrev_initial | 6 ++
 t/t4013/diff.diff_--raw_initial | 6 ++
 8 files changed, 39 insertions(+), 1 deletion(-)
 create mode 100644 t/t4013/diff.diff_--no-index_--raw_--abbrev=4_dir2_dir
 create mode 100644 t/t4013/diff.diff_--no-index_--raw_--no-abbrev_dir2_dir
 create mode 100644 t/t4013/diff.diff_--no-index_--raw_dir2_dir
 create mode 100644 t/t4013/diff.diff_--raw_--abbrev=4_initial
 create mode 100644 t/t4013/diff.diff_--raw_--no-abbrev_initial
 create mode 100644 t/t4013/diff.diff_--raw_initial

diff --git a/diff.c b/diff.c
index ec87283..84dba60 100644
--- a/diff.c
+++ b/diff.c
@@ -3106,7 +3106,8 @@ static const char *diff_abbrev_oid(const struct object_id 
*oid, int abbrev)
abbrev = FALLBACK_DEFAULT_ABBREV;
if (abbrev > GIT_SHA1_HEXSZ)
die("BUG: oid abbreviation out of range: %d", abbrev);
-   hex[abbrev] = '\0';
+   if (abbrev)
+   hex[abbrev] = '\0';
return hex;
}
 }
@@ -3364,6 +3365,7 @@ void diff_setup(struct diff_options *options)

options->file = stdout;

+   options->abbrev = DEFAULT_ABBREV;
options->line_termination = '\n';
options->break_opt = -1;
options->rename_limit = -1;
@@ -4024,6 +4026,8 @@ int diff_opt_parse(struct diff_options *options,
offending, optarg);
return argcount;
}
+   else if (!strcmp(arg, "--no-abbrev"))
+   options->abbrev = 0;
else if (!strcmp(arg, "--abbrev"))
options->abbrev = DEFAULT_ABBREV;
else if (skip_prefix(arg, "--abbrev=", )) {
diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index 566817e..d09acfe 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -311,6 +311,13 @@ diff --line-prefix=abc master master^ side
 diff --dirstat master~1 master~2
 diff --dirstat initial rearrange
 diff --dirstat-by-file initial rearrange
+# No-index --abbrev and --no-abbrev


I updated this comment to be consistent with the no-index vs. 
outside-of-a-repository distinction in the commit message.



+diff --raw initial
+diff --raw --abbrev=4 initial
+diff --raw --no-abbrev initial
+diff --no-index --raw dir2 dir
+diff --no-index --raw --abbrev=4 dir2 dir
+diff --no-index --raw --no-abbrev dir2 dir
 EOF

 test_expect_success 'log -S requires an argument' '
diff --git a/t/t4013/diff.diff_--no-index_--raw_--abbrev=4_dir2_dir 
b/t/t4013/diff.diff_--no-index_--raw_--abbrev=4_dir2_dir
new file mode 100644
index 000..a71b38a
--- /dev/null
+++ b/t/t4013/diff.diff_--no-index_--raw_--abbrev=4_dir2_dir
@@ -0,0 +1,3 @@
+$ git diff --no-index --raw --abbrev=4 dir2 dir
+:00 100644 ... ... A   dir/sub
+$
diff --git a/t/t4013/diff.diff_--no-index_--raw_--no-abbrev_dir2_dir 

[PATCH v4] diff: handle --no-abbrev in no-index case

2016-12-06 Thread Jack Bates
There are two different places where the --no-abbrev option is parsed,
and two different places where SHA-1s are abbreviated. We normally parse
--no-abbrev with setup_revisions(), but in the no-index case, "git diff"
calls diff_opt_parse() directly, and diff_opt_parse() didn't handle
--no-abbrev until now. (It did handle --abbrev, however.) We normally
abbreviate SHA-1s with find_unique_abbrev(), but commit 4f03666 ("diff:
handle sha1 abbreviations outside of repository, 2016-10-20) recently
introduced a special case when you run "git diff" outside of a
repository.

setup_revisions() does also call diff_opt_parse(), but not for --abbrev
or --no-abbrev, which it handles itself. setup_revisions() sets
rev_info->abbrev, and later copies that to diff_options->abbrev. It
handles --no-abbrev by setting abbrev to zero. (This change doesn't
touch that.)

Setting abbrev to zero was broken in the outside-of-a-repository special
case, which until now resulted in a truly zero-length SHA-1, rather than
taking zero to mean do not abbreviate. The only way to trigger this bug,
however, was by running "git diff --raw" without either the --abbrev or
--no-abbrev options, because 1) without --raw it doesn't respect abbrev
(which is bizarre, but has been that way forever), 2) we silently clamp
--abbrev=0 to MINIMUM_ABBREV, and 3) --no-abbrev wasn't handled until
now.

The outside-of-a-repository case is one of three no-index cases. The
other two are when one of the files you're comparing is outside of the
repository you're in, and the --no-index option.

Signed-off-by: Jack Bates 
---
 diff.c  | 6 +-
 t/t4013-diff-various.sh | 7 +++
 t/t4013/diff.diff_--no-index_--raw_--abbrev=4_dir2_dir  | 3 +++
 t/t4013/diff.diff_--no-index_--raw_--no-abbrev_dir2_dir | 3 +++
 t/t4013/diff.diff_--no-index_--raw_dir2_dir | 3 +++
 t/t4013/diff.diff_--raw_--abbrev=4_initial  | 6 ++
 t/t4013/diff.diff_--raw_--no-abbrev_initial | 6 ++
 t/t4013/diff.diff_--raw_initial | 6 ++
 8 files changed, 39 insertions(+), 1 deletion(-)
 create mode 100644 t/t4013/diff.diff_--no-index_--raw_--abbrev=4_dir2_dir
 create mode 100644 t/t4013/diff.diff_--no-index_--raw_--no-abbrev_dir2_dir
 create mode 100644 t/t4013/diff.diff_--no-index_--raw_dir2_dir
 create mode 100644 t/t4013/diff.diff_--raw_--abbrev=4_initial
 create mode 100644 t/t4013/diff.diff_--raw_--no-abbrev_initial
 create mode 100644 t/t4013/diff.diff_--raw_initial

diff --git a/diff.c b/diff.c
index ec87283..84dba60 100644
--- a/diff.c
+++ b/diff.c
@@ -3106,7 +3106,8 @@ static const char *diff_abbrev_oid(const struct object_id 
*oid, int abbrev)
abbrev = FALLBACK_DEFAULT_ABBREV;
if (abbrev > GIT_SHA1_HEXSZ)
die("BUG: oid abbreviation out of range: %d", abbrev);
-   hex[abbrev] = '\0';
+   if (abbrev)
+   hex[abbrev] = '\0';
return hex;
}
 }
@@ -3364,6 +3365,7 @@ void diff_setup(struct diff_options *options)
 
options->file = stdout;
 
+   options->abbrev = DEFAULT_ABBREV;
options->line_termination = '\n';
options->break_opt = -1;
options->rename_limit = -1;
@@ -4024,6 +4026,8 @@ int diff_opt_parse(struct diff_options *options,
offending, optarg);
return argcount;
}
+   else if (!strcmp(arg, "--no-abbrev"))
+   options->abbrev = 0;
else if (!strcmp(arg, "--abbrev"))
options->abbrev = DEFAULT_ABBREV;
else if (skip_prefix(arg, "--abbrev=", )) {
diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index 566817e..d09acfe 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -311,6 +311,13 @@ diff --line-prefix=abc master master^ side
 diff --dirstat master~1 master~2
 diff --dirstat initial rearrange
 diff --dirstat-by-file initial rearrange
+# No-index --abbrev and --no-abbrev
+diff --raw initial
+diff --raw --abbrev=4 initial
+diff --raw --no-abbrev initial
+diff --no-index --raw dir2 dir
+diff --no-index --raw --abbrev=4 dir2 dir
+diff --no-index --raw --no-abbrev dir2 dir
 EOF
 
 test_expect_success 'log -S requires an argument' '
diff --git a/t/t4013/diff.diff_--no-index_--raw_--abbrev=4_dir2_dir 
b/t/t4013/diff.diff_--no-index_--raw_--abbrev=4_dir2_dir
new file mode 100644
index 000..a71b38a
--- /dev/null
+++ b/t/t4013/diff.diff_--no-index_--raw_--abbrev=4_dir2_dir
@@ -0,0 +1,3 @@
+$ git diff --no-index --raw --abbrev=4 dir2 dir
+:00 100644 ... ... A   dir/sub
+$
diff --git a/t/t4013/diff.diff_--no-index_--raw_--no-abbrev_dir2_dir 
b/t/t4013/diff.diff_--no-index_--raw_--no-abbrev_dir2_dir
new file mode 100644
index 000..e0f0097
--- /dev/null
+++ b/t/t4013/diff.diff_--no-index_--raw_--no-abbrev_dir2_dir
@@ 

[PATCH v4] diff: handle --no-abbrev in no-index case

2016-12-06 Thread Jack Bates
There are two different places where the --no-abbrev option is parsed,
and two different places where SHA-1s are abbreviated. We normally parse
--no-abbrev with setup_revisions(), but in the no-index case, "git diff"
calls diff_opt_parse() directly, and diff_opt_parse() didn't handle
--no-abbrev until now. (It did handle --abbrev, however.) We normally
abbreviate SHA-1s with find_unique_abbrev(), but commit 4f03666 ("diff:
handle sha1 abbreviations outside of repository, 2016-10-20) recently
introduced a special case when you run "git diff" outside of a
repository.

setup_revisions() does also call diff_opt_parse(), but not for --abbrev
or --no-abbrev, which it handles itself. setup_revisions() sets
rev_info->abbrev, and later copies that to diff_options->abbrev. It
handles --no-abbrev by setting abbrev to zero. (This change doesn't
touch that.)

Setting abbrev to zero was broken in the outside-of-a-repository special
case, which until now resulted in a truly zero-length SHA-1, rather than
taking zero to mean do not abbreviate. The only way to trigger this bug,
however, was by running "git diff --raw" without either the --abbrev or
--no-abbrev options, because 1) without --raw it doesn't respect abbrev
(which is bizarre, but has been that way forever), 2) we silently clamp
--abbrev=0 to MINIMUM_ABBREV, and 3) --no-abbrev wasn't handled until
now.

The outside-of-a-repository case is one of three no-index cases. The
other two are when one of the files you're comparing is outside of the
repository you're in, and the --no-index option.

Signed-off-by: Jack Bates 
---
 diff.c  | 6 +-
 t/t4013-diff-various.sh | 7 +++
 t/t4013/diff.diff_--no-index_--raw_--abbrev=4_dir2_dir  | 3 +++
 t/t4013/diff.diff_--no-index_--raw_--no-abbrev_dir2_dir | 3 +++
 t/t4013/diff.diff_--no-index_--raw_dir2_dir | 3 +++
 t/t4013/diff.diff_--raw_--abbrev=4_initial  | 6 ++
 t/t4013/diff.diff_--raw_--no-abbrev_initial | 6 ++
 t/t4013/diff.diff_--raw_initial | 6 ++
 8 files changed, 39 insertions(+), 1 deletion(-)
 create mode 100644 t/t4013/diff.diff_--no-index_--raw_--abbrev=4_dir2_dir
 create mode 100644 t/t4013/diff.diff_--no-index_--raw_--no-abbrev_dir2_dir
 create mode 100644 t/t4013/diff.diff_--no-index_--raw_dir2_dir
 create mode 100644 t/t4013/diff.diff_--raw_--abbrev=4_initial
 create mode 100644 t/t4013/diff.diff_--raw_--no-abbrev_initial
 create mode 100644 t/t4013/diff.diff_--raw_initial

diff --git a/diff.c b/diff.c
index ec87283..84dba60 100644
--- a/diff.c
+++ b/diff.c
@@ -3106,7 +3106,8 @@ static const char *diff_abbrev_oid(const struct object_id 
*oid, int abbrev)
abbrev = FALLBACK_DEFAULT_ABBREV;
if (abbrev > GIT_SHA1_HEXSZ)
die("BUG: oid abbreviation out of range: %d", abbrev);
-   hex[abbrev] = '\0';
+   if (abbrev)
+   hex[abbrev] = '\0';
return hex;
}
 }
@@ -3364,6 +3365,7 @@ void diff_setup(struct diff_options *options)
 
options->file = stdout;
 
+   options->abbrev = DEFAULT_ABBREV;
options->line_termination = '\n';
options->break_opt = -1;
options->rename_limit = -1;
@@ -4024,6 +4026,8 @@ int diff_opt_parse(struct diff_options *options,
offending, optarg);
return argcount;
}
+   else if (!strcmp(arg, "--no-abbrev"))
+   options->abbrev = 0;
else if (!strcmp(arg, "--abbrev"))
options->abbrev = DEFAULT_ABBREV;
else if (skip_prefix(arg, "--abbrev=", )) {
diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index 566817e..d7b71a0 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -311,6 +311,13 @@ diff --line-prefix=abc master master^ side
 diff --dirstat master~1 master~2
 diff --dirstat initial rearrange
 diff --dirstat-by-file initial rearrange
+# --abbrev and --no-abbrev outside of repository
+diff --raw initial
+diff --raw --abbrev=4 initial
+diff --raw --no-abbrev initial
+diff --no-index --raw dir2 dir
+diff --no-index --raw --abbrev=4 dir2 dir
+diff --no-index --raw --no-abbrev dir2 dir
 EOF
 
 test_expect_success 'log -S requires an argument' '
diff --git a/t/t4013/diff.diff_--no-index_--raw_--abbrev=4_dir2_dir 
b/t/t4013/diff.diff_--no-index_--raw_--abbrev=4_dir2_dir
new file mode 100644
index 000..a71b38a
--- /dev/null
+++ b/t/t4013/diff.diff_--no-index_--raw_--abbrev=4_dir2_dir
@@ -0,0 +1,3 @@
+$ git diff --no-index --raw --abbrev=4 dir2 dir
+:00 100644 ... ... A   dir/sub
+$
diff --git a/t/t4013/diff.diff_--no-index_--raw_--no-abbrev_dir2_dir 
b/t/t4013/diff.diff_--no-index_--raw_--no-abbrev_dir2_dir
new file mode 100644
index 000..e0f0097
--- /dev/null
+++ 

Re: [PATCH] git-p4: add p4 shelf support

2016-12-06 Thread Vinicius Kursancew
It seems if you do shelve and then later submit the p4 workspace is
left dirty (just like --prepare-p4-only would've done).
Vinicius Alexandre Kursancew 



On Tue, Dec 6, 2016 at 8:36 AM, Luke Diamand  wrote:
> On 6 December 2016 at 02:02, Nuno Subtil  wrote:
>> Extends the submit command to support shelving a commit instead of
>> submitting it to p4 (similar to --prepare-p4-only).
>
> Is this just the same as these two changes?
>
> http://www.spinics.net/lists/git/msg290755.html
> http://www.spinics.net/lists/git/msg291103.html
>
> Thanks,
> Luke
>
>>
>> Signed-off-by: Nuno Subtil 
>> ---
>>  git-p4.py | 36 ++--
>>  1 file changed, 30 insertions(+), 6 deletions(-)
>>
>> diff --git a/git-p4.py b/git-p4.py
>> index fd5ca52..3c4be22 100755
>> --- a/git-p4.py
>> +++ b/git-p4.py
>> @@ -1286,6 +1286,8 @@ def __init__(self):
>>  optparse.make_option("--export-labels", 
>> dest="exportLabels", action="store_true"),
>>  optparse.make_option("--dry-run", "-n", dest="dry_run", 
>> action="store_true"),
>>  optparse.make_option("--prepare-p4-only", 
>> dest="prepare_p4_only", action="store_true"),
>> +optparse.make_option("--shelve-only", dest="shelve_only", 
>> action="store_true", help="Create P4 shelf for first change that would be 
>> submitted (using a new CL)"),
>> +optparse.make_option("--shelve-cl", dest="shelve_cl", 
>> help="Replace shelf under existing CL number (previously shelved files will 
>> be deleted)"),
>>  optparse.make_option("--conflict", dest="conflict_behavior",
>>   
>> choices=self.conflict_behavior_choices),
>>  optparse.make_option("--branch", dest="branch"),
>> @@ -1297,6 +1299,8 @@ def __init__(self):
>>  self.preserveUser = gitConfigBool("git-p4.preserveUser")
>>  self.dry_run = False
>>  self.prepare_p4_only = False
>> +self.shelve_only = False
>> +self.shelve_cl = None
>>  self.conflict_behavior = None
>>  self.isWindows = (platform.system() == "Windows")
>>  self.exportLabels = False
>> @@ -1496,6 +1500,12 @@ def prepareSubmitTemplate(self):
>>  else:
>>  inFilesSection = False
>>  else:
>> +if self.shelve_only and self.shelve_cl:
>> +if line.startswith("Change:"):
>> +line = "Change: %s\n" % self.shelve_cl
>> +if line.startswith("Status:"):
>> +line = "Status: pending\n"
>> +
>>  if line.startswith("Files:"):
>>  inFilesSection = True
>>
>> @@ -1785,7 +1795,11 @@ def applyCommit(self, id):
>>  if self.isWindows:
>>  message = message.replace("\r\n", "\n")
>>  submitTemplate = message[:message.index(separatorLine)]
>> -p4_write_pipe(['submit', '-i'], submitTemplate)
>> +
>> +if self.shelve_only:
>> +p4_write_pipe(['shelve', '-i', '-r'], submitTemplate)
>> +else:
>> +p4_write_pipe(['submit', '-i'], submitTemplate)
>>
>>  if self.preserveUser:
>>  if p4User:
>> @@ -1799,12 +1813,17 @@ def applyCommit(self, id):
>>  # new file.  This leaves it writable, which confuses p4.
>>  for f in pureRenameCopy:
>>  p4_sync(f, "-f")
>> -submitted = True
>> +
>> +if not self.shelve_only:
>> +submitted = True
>>
>>  finally:
>>  # skip this patch
>>  if not submitted:
>> -print "Submission cancelled, undoing p4 changes."
>> +if not self.shelve_only:
>> +print "Submission cancelled, undoing p4 changes."
>> +else:
>> +print "Change shelved, undoing p4 changes."
>>  for f in editedFiles:
>>  p4_revert(f)
>>  for f in filesToAdd:
>> @@ -2034,9 +2053,13 @@ def run(self, args):
>>  if ok:
>>  applied.append(commit)
>>  else:
>> -if self.prepare_p4_only and i < last:
>> -print "Processing only the first commit due to option" \
>> -  " --prepare-p4-only"
>> +if (self.prepare_p4_only or self.shelve_only) and i < last:
>> +if self.prepare_p4_only:
>> +print "Processing only the first commit due to 
>> option" \
>> +  " --prepare-p4-only"
>> +else:
>> +print "Processing only the first commit due to 
>> option" \
>> +  

[PATCH] stash: disable renames when calling git-diff

2016-12-06 Thread Jeff King
On Tue, Dec 06, 2016 at 10:25:30AM -0500, Jeff King wrote:

> I agree that the second stash not saving the deletion seems like a bug.
> Oddly, just stashing a deletion, like:
> 
>   rm a
>   git stash
>   git stash apply
> 
> does store it. So it's not like we can't represent the deletion. Somehow
> the presence of "b" in the index matters.
> 
> It looks like the culprit may be this part of create_stash():
> 
>   git diff --name-only -z HEAD -- >"$TMP-stagenames"
> 
> That is using the "git diff" porcelain, which will respect renames, and
> the --name-only bit mentions only "b".
> 
> If you run:
> 
>   git -c diff.renames=false stash
> 
> then it works.

And here's a patch to fix it.

-- >8 --
Subject: [PATCH] stash: disable renames when calling git-diff

When creating a stash, we need to look at the diff between
the working tree and HEAD. There's no plumbing command that
covers this case, so we do so by calling the git-diff
porcelain. This means we also inherit the usual list of
porcelain options, which can impact the output in confusing
ways.

There's at least one known problem: --name-only will not
mention the source side of a rename, meaning that we will
fail to stash a deletion in such a case.

The correct solution is to move to an appropriate plumbing
command. But since there isn't one available, in the short
term we can plug this particular problem by manually asking
git-diff not to respect renames.

Reported-by: Matthew Patey 
Signed-off-by: Jeff King 
---
There may be other similar problems lurking depending on the various
config options you have set, but I don't think so. Most of the options
only affect --patch output.

I do find it interesting that --name-only does not mention the source
side of a rename. The longer forms like --name-status mention both
sides. Certainly --name-status does not have any way of saying "this is
the source side of a rename", but I'd think it would simply mention both
sides. Anyway, even if that were changed (which would fix this bug), I
think adding --no-renames is still a good thing to be doing here.

 git-stash.sh | 2 +-
 t/t3903-stash.sh | 9 +
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/git-stash.sh b/git-stash.sh
index 4546abaae..40ab2710e 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -115,7 +115,7 @@ create_stash () {
git read-tree --index-output="$TMPindex" -m $i_tree &&
GIT_INDEX_FILE="$TMPindex" &&
export GIT_INDEX_FILE &&
-   git diff --name-only -z HEAD -- >"$TMP-stagenames" &&
+   git diff --name-only -z --no-renames HEAD -- 
>"$TMP-stagenames" &&
git update-index -z --add --remove --stdin 
<"$TMP-stagenames" &&
git write-tree &&
rm -f "$TMPindex"
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index e1a6ccc00..2de3e18ce 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -766,4 +766,13 @@ test_expect_success 'stash list --cc shows combined diff' '
test_cmp expect actual
 '
 
+test_expect_success 'stash is not confused by partial renames' '
+   mv file renamed &&
+   git add renamed &&
+   git stash &&
+   git stash apply &&
+   test_path_is_file renamed &&
+   test_path_is_missing file
+'
+
 test_done
-- 
2.11.0.191.gdb26c57



Re: Bug: stash staged file move loses original file deletion

2016-12-06 Thread Jeff King
On Tue, Dec 06, 2016 at 02:45:05PM +, Matthew Patey wrote:

> Thanks for the reply! I agree that it is weird that applying a stash with a
> move only puts the addition back in the index, and thanks for the tip on
> "index" to properly apply that. For this case I was focusing on the
> behavior of the second stash/apply, with the first stash/apply as an
> example of how to get into a weird state that triggers the odd behavior of
> the second.

Oh, heh, I totally missed that.

I agree that the second stash not saving the deletion seems like a bug.
Oddly, just stashing a deletion, like:

  rm a
  git stash
  git stash apply

does store it. So it's not like we can't represent the deletion. Somehow
the presence of "b" in the index matters.

It looks like the culprit may be this part of create_stash():

  git diff --name-only -z HEAD -- >"$TMP-stagenames"

That is using the "git diff" porcelain, which will respect renames, and
the --name-only bit mentions only "b".

If you run:

  git -c diff.renames=false stash

then it works.

-Peff


Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin

2016-12-06 Thread Jeff King
On Tue, Dec 06, 2016 at 03:48:38PM +0100, Johannes Schindelin wrote:

> > Should it blindly look at ".git/config"?
> 
> Absolutely not, of course. You did not need me to say that.
> 
> > Now your program behaves differently depending on whether you are in the
> > top-level of the working tree.
> 
> Exactly. This, BTW, is already how the code would behave if anybody called
> `git_path()` before `setup_git_directory()`, as the former function
> implicitly calls `setup_git_env()` which does *not* call
> `setup_git_directory()` but *does* set up `git_dir` which is then used by
> `do_git_config_sequence()`..
> 
> We have a few of these nasty surprises in our code base, where code
> silently assumes that global state is set up correctly, and succeeds in
> sometimes surprising ways if it is not.

Right. I have been working on fixing this. v2.11 has a ton of tweaks in
this area, and my patch to die() rather than default to ".git" is
cooking in next to catch any stragglers.

> > Should it speculatively do repo discovery, and use any discovered config
> > file?
> 
> Personally, I find the way we discover the repository most irritating. It
> seems that we have multiple, mutually incompatible code paths
> (`setup_git_directory()` and `setup_git_env()` come to mind already, and
> it does not help that consecutive calls to `setup_git_directory()` will
> yield a very unexpected outcome).

I agree. We should be killing off setup_git_env(), which is something
I've been slowly working towards over the years.

There are some annoyances with setup_git_directory(), too (like the fact
that you cannot ask "is there a git repository you can find" without
making un-recoverable changes to the process state). I think we should
fix those, too.

> > Now some commands respect config that they shouldn't (e.g., running "git
> > init foo.git" from inside another repository will accidentally pick up
> > the value of core.sharedrepository from wherever you happen to run it).
> 
> Right. That points to another problem with the way we do things: we leak
> global state from discovering a git_dir, which means that we can neither
> undo nor override it.
> 
> If we discovered our git_dir in a robust manner, `git init` could say:
> hey, this git_dir is actually not what I wanted, here is what I want.
> 
> Likewise, `git submodule` would eventually be able to run in the very same
> process as the calling `git`, as would a local fetch.

Yep, I agree with all that. I do think things _have_ been improving over
the years, and the code is way less tangled than it used to be. But
there are so many corner cases, and the code is so fundamental, that it
has been slow going. I'd be happy to review patches if you want to push
it along.

> > So I think the caller of the config code has to provide some kind of
> > context about how it is expecting to run and how the value will be used.
> 
> Yep.
> 
> Maybe even go a step further and say that the config code needs a context
> "object".

If I were writing git from scratch, I'd consider making a "struct
repository" object. I'm not sure how painful it would be to retro-fit it
at this point.

> > Right now if setup_git_directory() or similar hasn't been called, the
> > config code does not look.
> 
> Correct.
> 
> Actually, half correct. If setup_git_directory() has not been called, but,
> say, git_path() (and thereby implicitly setup_git_env()), the config code
> *does* look. At a hard-coded .git/config.

Not since b9605bc4f (config: only read .git/config from configured
repos, 2016-09-12). That's why pager.c needs its little hack.

I guess you could see that as a step backwards, but I think it is
necessary one on the road to doing it right.

> > Ideally there would be a way for a caller to say "I am running early and
> > not even sure yet if we are in a repo; please speculatively try to find
> > repo config for me".
> 
> And ideally, it would not roll *yet another* way to discover the git_dir,
> but it would reuse the same function (fixing it not to chdir()
> unilaterally).

Yes, absolutely.

> Of course, not using `chdir()` means that we have to figure out symbolic
> links somehow, in case somebody works from a symlinked subdirectory, e.g.:
> 
>   ln -s $PWD/t/ ~/test-directory
>   cd ~/test-directory
>   git log

There's work happening elsewhere[1] on making real_path() work without
calling chdir(). Which necessarily involves resolving symlinks
ourselves. I wonder if we could leverage that work here (ideally by
using real_path() under the hood, and not reimplementing the same
readlink() logic ourselves).

[1] 
http://public-inbox.org/git/1480964316-99305-1-git-send-email-bmw...@google.com/

> > The pager code does this manually, and without great accuracy; see the
> > hack in pager.c's read_early_config().
> 
> I saw it. And that is what triggered the mail you are responding to (you
> probably saw my eye-rolling between the lines).
> 
> The real question is: can we fix this? Or is there simply too 

inconsistent output from git add about ignored files

2016-12-06 Thread Yaroslav Halchenko
Dear Git gurus,

It seems that there is some inconsistency in output of git while it is
ignoring files:  

if a single path within ignored directory is provided  -- git outputs
the file path.  If multiple files are provided (some of which aren't
ignored) -- git outputs only the ignored directory name:

% git --version
git version 2.10.2
% git status
On branch master
Untracked files:
  (use "git add ..." to include in what will be committed)

bu

nothing added to commit but untracked files present (use "git add" to track)
% cat .gitignore
*.me
% git add blah.me/bu
The following paths are ignored by one of your .gitignore files:
blah.me/bu
Use -f if you really want to add them.
% git add blah.me/bu bu
The following paths are ignored by one of your .gitignore files:
blah.me
Use -f if you really want to add them.
% git status
On branch master
Changes to be committed:
  (use "git reset HEAD ..." to unstage)

new file:   bu



So note that in the first case it reports "blah.me/bu" whenever in the
second -- only "blah.me".

P.S. Please CC us in your replies.

With best regards,
-- 
Yaroslav O. Halchenko
Center for Open Neuroscience http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834   Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik


Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin

2016-12-06 Thread Johannes Schindelin
Hi Peff,

On Tue, 6 Dec 2016, Jeff King wrote:

> On Tue, Dec 06, 2016 at 02:16:35PM +0100, Johannes Schindelin wrote:
> 
> > > Johannes Schindelin  writes:
> > > 
> > > > Seriously, do you really think it is a good idea to have
> > > > git_config_get_value() *ignore* any value in .git/config?
> > > 
> > > When we do not know ".git/" directory we see is the repository we
> > > were told to work on, then we should ignore ".git/config" file.  So
> > > allowing git_config_get_value() to _ignore_ ".git/config" before the
> > > program calls setup_git_directory() does have its uses.
> > 
> > I think you are wrong. This is yet another inconsistent behavior that
> > violates the Law of Least Surprises.
> 
> There are surprises in this code any way you turn.  If we have not
> called setup_git_directory(), then how does git_config_get_value() know
> if we are in a repository or not?

My biggest surprise, frankly, would be that `git init` and `git clone` are
not special-cased.

> Should it blindly look at ".git/config"?

Absolutely not, of course. You did not need me to say that.

> Now your program behaves differently depending on whether you are in the
> top-level of the working tree.

Exactly. This, BTW, is already how the code would behave if anybody called
`git_path()` before `setup_git_directory()`, as the former function
implicitly calls `setup_git_env()` which does *not* call
`setup_git_directory()` but *does* set up `git_dir` which is then used by
`do_git_config_sequence()`..

We have a few of these nasty surprises in our code base, where code
silently assumes that global state is set up correctly, and succeeds in
sometimes surprising ways if it is not.

> Should it speculatively do repo discovery, and use any discovered config
> file?

Personally, I find the way we discover the repository most irritating. It
seems that we have multiple, mutually incompatible code paths
(`setup_git_directory()` and `setup_git_env()` come to mind already, and
it does not help that consecutive calls to `setup_git_directory()` will
yield a very unexpected outcome).

Just try to explain to a veteran software engineer why you cannot call
`setup_git_directory_gently()` multiple times and expect the very same
return value every time.

Another irritation is that some commands that clearly would like to use a
repository if there is one (such as `git diff`) are *not* marked with
RUN_SETUP_GENTLY, due to these unfortunate implementation details.

> Now some commands respect config that they shouldn't (e.g., running "git
> init foo.git" from inside another repository will accidentally pick up
> the value of core.sharedrepository from wherever you happen to run it).

Right. That points to another problem with the way we do things: we leak
global state from discovering a git_dir, which means that we can neither
undo nor override it.

If we discovered our git_dir in a robust manner, `git init` could say:
hey, this git_dir is actually not what I wanted, here is what I want.

Likewise, `git submodule` would eventually be able to run in the very same
process as the calling `git`, as would a local fetch.

> So I think the caller of the config code has to provide some kind of
> context about how it is expecting to run and how the value will be used.

Yep.

Maybe even go a step further and say that the config code needs a context
"object".

> Right now if setup_git_directory() or similar hasn't been called, the
> config code does not look.

Correct.

Actually, half correct. If setup_git_directory() has not been called, but,
say, git_path() (and thereby implicitly setup_git_env()), the config code
*does* look. At a hard-coded .git/config.

> Ideally there would be a way for a caller to say "I am running early and
> not even sure yet if we are in a repo; please speculatively try to find
> repo config for me".

And ideally, it would not roll *yet another* way to discover the git_dir,
but it would reuse the same function (fixing it not to chdir()
unilaterally).

Of course, not using `chdir()` means that we have to figure out symbolic
links somehow, in case somebody works from a symlinked subdirectory, e.g.:

ln -s $PWD/t/ ~/test-directory
cd ~/test-directory
git log

> The pager code does this manually, and without great accuracy; see the
> hack in pager.c's read_early_config().

I saw it. And that is what triggered the mail you are responding to (you
probably saw my eye-rolling between the lines).

The real question is: can we fix this? Or is there simply too great
reluctance to change the current code?

> I think the way forward is:
> 
>   1. Make that an optional behavior in git_config_with_options() so
>  other spots can reuse it (probably alias lookup, and something like
>  your difftool config).

Ideally: *any* early call to `git_config_get_value()`. Make things less
surprising.

>   2. Make it more accurate. Right now it blindly looks in .git/config,
>  but it should be able to 

Re: Bug: stash staged file move loses original file deletion

2016-12-06 Thread Jeff King
On Mon, Dec 05, 2016 at 09:37:51AM -0500, Matthew Patey wrote:

> Git version 2.8.0 (sorry can't update to more recent on my machine) on Ubuntu.

The behavior is the same on more recent versions of git, too. The short
answer is "use --index". The longer one is below.

> After moving a file, if the new file is in the index but the deletion
> of the old file is not staged, git stash loses the deletion operation.
> Repro:
> 
> 1. git mv a b
> This will have both the "deletion" and the "file added" in the index
> 
> 2. git stash; git stash pop
> Now file a is still in the index, but file b deletion is not staged.
> 
> 3. git stash; git stash show -v
> This will show that the deletion operation is not in the stash
> 
> 4. git stash pop
> Again confirms the issue, file a is in the index, but file b is
> present and unmodified in the working directory.

Thanks for a clear reproduction case. I think the oddball, though, is
not that "b" is not staged for deletion, but that the addition of "a"
_is_ staged.

Applying a stash usually does not re-stage index contents, unless you
specify --index. For example, try:

  # Make a staged change
  echo change >>a
  git add a

  # This puts the change back into the working tree, but does _not_
  # put it into the index.
  git stash apply

  # Now reset to try again.
  git reset --hard

  # This does restore the index.
  git stash apply --index

So in your case, the deletion of "b" is following that same rule. What's
unusual is that "a" is staged. There's code specifically in git-stash
specifically to make sure this is the case, but I don't remember offhand
why this is so. The code comes from the original f2c66ed19 (Add
git-stash script, 2007-06-30), which in turn seems to come from Junio's
comments in:

  http://public-inbox.org/git/7vmyyq2zrz@assigned-by-dhcp.pobox.com/

I don't see any specific reasoning, but I think it is simply that it is
confusing for the files to become untracked totally. These days we have
intent-to-add via "git add -N", so that might actually be a better
choice for this case.

Anyway, that history digression is neither here nor there for your case.
If you want to restore the index contents, use "--index". That does what
you were expecting:

  $ git mv a b
  $ git stash && git stash apply --index
  Saved working directory and index state WIP on master: 5355755 foo
  HEAD is now at 5355755 foo
  On branch master
  Changes to be committed:
  renamed:a -> b

-Peff


Re: [RFC PATCH] GIT-VERSION-GEN: set --abbrev=9 to match auto-scaling

2016-12-06 Thread Jeff King
On Mon, Dec 05, 2016 at 10:01:19AM -0800, Junio C Hamano wrote:

> > That said, I think the right patch may be to just drop --abbrev
> > entirely.
> > ...
> > I think at that point it was a noop, as 7 should have been the default.
> > And now we probably ought to drop it, so that we can use the
> > auto-scaling default.
> 
> Yeah, I agree.
> 
> It does mean that snapshot binaries built out of the same commit in
> the same repository before and after a repack have higher chances of
> getting named differently, which may surprise people, but that
> already is possible with a fixed length if the repacking involves
> pruning (albeit with lower probabilities), and I do not think it is
> a problem.

I think that the number is already unstable, even with --abbrev, as it
just specifies a minimum. So any operation that creates objects has a
possibility of increasing the length. Or more likely, two people
describing the same version may end up with different strings because
they have different objects in their repositories (e.g., I used to
carry's trast's git-notes archive of the mailing list which added quite
a few objects).

I agree that having it change over the course of a repack is slightly
_more_ surprising than those cases, but ultimately the value just isn't
stable.

-Peff


Re: git add -p doesn't honor diff.noprefix config

2016-12-06 Thread Jeff King
On Sat, Dec 03, 2016 at 07:45:18AM +0100, paddor wrote:

> I set the config diff.noprefix = true because I don't like the a/ and
> b/ prefixes, which nicely changed the output of `git diff`.
> Unfortunately, the filenames in the output of `git add --patch` are
> still prefixed.
> 
> To me, this seems like a bug. Or there's a config option missing.

The interactive-add process is a perl script built around plumbing
commands like diff-tree, diff-files, etc.  Plumbing commands do not
respect some config options, so that the output remains stable or
scripts built around them. And diff.noprefix is one of these. So scripts
have to get the value themselves and decide whether to pass it along to
the plumbing.

In this case, I think there are two steps needed:

  1. Confirm that git-add--interactive.perl can actually handle
 no-prefix patches. It feeds the patches back to git-apply, which
 may be a complication (so it may need, for example, to pass a
 special flag to git-apply).

  2. git-add--interactive.perl needs to parse the config value, and if
 set, pass the appropriate option to the diff plumbing. This should
 only be one or two lines; see how $diff_algorithm is handled in
 that script.

-Peff


Re: [PATCH v2 0/6] shallow.c improvements

2016-12-06 Thread Duy Nguyen
On Tue, Dec 6, 2016 at 8:42 PM, Jeff King  wrote:
> The final one _seems_ reasonable after reading your explanation, but I
> lack enough context to know whether or not there might be a corner case
> that you're missing. I'm inclined to trust your assessment on it.

Yeah I basically just wrote down my thoughts so somebody could maybe
spot something wrong. I'm going to think about it some more in the
next few days.
-- 
Duy


Re: [PATCH v7 4/4] transport: add from_user parameter to is_transport_allowed

2016-12-06 Thread Jeff King
On Mon, Dec 05, 2016 at 12:04:52PM -0800, Junio C Hamano wrote:

> > I'm sending out another reroll of this series so that in Jeff's he can
> > just call 'get_curl_allowed_protocols(-1)' for the non-redirection curl
> > option, which should make this test stop barfing.
> 
> I was hoping to eventually merge Peff's series to older maintenance
> tracks.  How bad would it be if we rebased the v8 of this series
> together with Peff's series to say v2.9 (or even older if it does
> not look too bad)?

My series actually fixes existing security problems, so I'd consider it
a bug-fix. I _think_ Brandon's series is purely about allowing more
expressiveness in the whitelist policy, and so could be considered more
of a feature.

So one option is to apply my series for older 'maint', and then just
rebase Brandon's on top of that for 'master'.

I don't know if that makes things any easier. I feel funny saying "no,
no, mine preempts yours because it is more maint-worthy", but I think
that order does make sense.

I think it would be OK to put Brandon's on maint, too, though. It is a
refactor of an existing security feature to make it more featureful, but
the way it is implemented could not cause security regressions unless
you use the new feature (IOW, we still respect the whitelist environment
exactly as before).

-Peff


Re: [PATCH] difftool: fix dir-diff index creation when in a subdirectory

2016-12-06 Thread Johannes Schindelin
Hi David,

On Mon, 5 Dec 2016, David Aguilar wrote:

> 9ec26e797781239b36ebccb87c590e5778358007 corrected how path arguments

How about using the "whatis" instead, i.e. "9ec26e7977 (difftool: fix
argument handling in subdirs, 2016-07-18)"?

Ciao,
Dscho


[PATCH v2 0/6] shallow.c improvements

2016-12-06 Thread Nguyễn Thái Ngọc Duy
After staring not-so-hard and not-for-so-long at the code. This is
what I come up with. Rasmus I replaced two of your commits with my
own (and thank you for giving me an opportunity to refresh my memory
with this stuff). The first two commits are new and the result of
Jeff's observation on COMMIT_SLAB_SIZE.

You may find the description here a bit different from my explanation
previously (about "exclude/shallow requests"). Well.. I was wrong.
I had the recent --exclude-tag and friends in mind, but this is about
clone/fetch/push from/to a shallow repository since 2013, no wonder I
don't remember much about it :-D

Nguyễn Thái Ngọc Duy (4):
  shallow.c: rename fields in paint_info to better express their purposes
  shallow.c: stop abusing COMMIT_SLAB_SIZE for paint_info's memory pools
  shallow.c: make paint_alloc slightly more robust
  shallow.c: remove useless code

Rasmus Villemoes (2):
  shallow.c: avoid theoretical pointer wrap-around
  shallow.c: bit manipulation tweaks

 shallow.c | 39 ---
 1 file changed, 20 insertions(+), 19 deletions(-)

-- 
2.8.2.524.g6ff3d78



Re: [PATCH v2 0/6] shallow.c improvements

2016-12-06 Thread Jeff King
On Tue, Dec 06, 2016 at 07:53:33PM +0700, Nguyễn Thái Ngọc Duy wrote:

> Nguyễn Thái Ngọc Duy (4):
>   shallow.c: rename fields in paint_info to better express their purposes
>   shallow.c: stop abusing COMMIT_SLAB_SIZE for paint_info's memory pools
>   shallow.c: make paint_alloc slightly more robust
>   shallow.c: remove useless code
> 
> Rasmus Villemoes (2):
>   shallow.c: avoid theoretical pointer wrap-around
>   shallow.c: bit manipulation tweaks

The first 5 patches look obviously good to me. The naming changes in
paint_alloc() make things much clearer, and the fixes retained from
Rasmus are all obvious improvements.

The final one _seems_ reasonable after reading your explanation, but I
lack enough context to know whether or not there might be a corner case
that you're missing. I'm inclined to trust your assessment on it.

-Peff


Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin

2016-12-06 Thread Jeff King
On Tue, Dec 06, 2016 at 02:16:35PM +0100, Johannes Schindelin wrote:

> > Johannes Schindelin  writes:
> > 
> > > Seriously, do you really think it is a good idea to have
> > > git_config_get_value() *ignore* any value in .git/config?
> > 
> > When we do not know ".git/" directory we see is the repository we were
> > told to work on, then we should ignore ".git/config" file.  So allowing
> > git_config_get_value() to _ignore_ ".git/config" before the program
> > calls setup_git_directory() does have its uses.
> 
> I think you are wrong. This is yet another inconsistent behavior that
> violates the Law of Least Surprises.

There are surprises in this code any way you turn.  If we have not
called setup_git_directory(), then how does git_config_get_value() know
if we are in a repository or not?

Should it blindly look at ".git/config"? Now your program behaves
differently depending on whether you are in the top-level of the working
tree.

Should it speculatively do repo discovery, and use any discovered config
file? Now some commands respect config that they shouldn't (e.g.,
running "git init foo.git" from inside another repository will
accidentally pick up the value of core.sharedrepository from wherever
you happen to run it).

So I think the caller of the config code has to provide some kind of
context about how it is expecting to run and how the value will be used.

Right now if setup_git_directory() or similar hasn't been called, the
config code does not look. Ideally there would be a way for a caller to
say "I am running early and not even sure yet if we are in a repo;
please speculatively try to find repo config for me".

The pager code does this manually, and without great accuracy; see the
hack in pager.c's read_early_config(). I think the way forward is:

  1. Make that an optional behavior in git_config_with_options() so
 other spots can reuse it (probably alias lookup, and something like
 your difftool config).

  2. Make it more accurate. Right now it blindly looks in .git/config,
 but it should be able to do the usual repo-detection (_without_
 actually entering the repo) to try to find a possible config file.

-Peff


Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin

2016-12-06 Thread Johannes Schindelin
Hi Junio,

On Mon, 5 Dec 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > Seriously, do you really think it is a good idea to have
> > git_config_get_value() *ignore* any value in .git/config?
> 
> When we do not know ".git/" directory we see is the repository we were
> told to work on, then we should ignore ".git/config" file.  So allowing
> git_config_get_value() to _ignore_ ".git/config" before the program
> calls setup_git_directory() does have its uses.

I think you are wrong. This is yet another inconsistent behavior that
violates the Law of Least Surprises.

> > We need to fix this.
> 
> I have a feeling that "environment modifications that cannot be undone"
> we used as the rationale in 73c2779f42 ("builtin-am: implement skeletal
> builtin am", 2015-08-04) might be overly pessimistic and in order to
> implement undo_setup_git_directory(), all we need to do may just be
> matter of doing a chdir(2) back to prefix and unsetting GIT_PREFIX
> environment, but I haven't looked into details of the setup sequence
> recently.

The problem is that we may not know enough anymore to "undo
setup_git_directory()", as some environment variables may have been set
before calling Git. If we simply unset the environment variables, we do it
incorrectly. On the other hand, the environment variables *may* have been
set by setup_git_directory(). In which case we *do* have to unset them.

The entire setup_git_directory() logic is a bit of a historically-grown
garden.

Ciao,
Dscho


[PATCH v2 2/6] shallow.c: stop abusing COMMIT_SLAB_SIZE for paint_info's memory pools

2016-12-06 Thread Nguyễn Thái Ngọc Duy
We need to allocate a "big" block of memory in paint_alloc(). The exact
size does not really matter. But the pool size has no relation with
commit-slab. Stop using that macro here.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 shallow.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/shallow.c b/shallow.c
index 8100dfd..2512ed3 100644
--- a/shallow.c
+++ b/shallow.c
@@ -431,6 +431,8 @@ void remove_nonexistent_theirs_shallow(struct shallow_info 
*info)
 
 define_commit_slab(ref_bitmap, uint32_t *);
 
+#define POOL_SIZE (512 * 1024)
+
 struct paint_info {
struct ref_bitmap ref_bitmap;
unsigned nr_bits;
@@ -447,9 +449,9 @@ static uint32_t *paint_alloc(struct paint_info *info)
if (!info->pool_count || info->free + size > info->end) {
info->pool_count++;
REALLOC_ARRAY(info->pools, info->pool_count);
-   info->free = xmalloc(COMMIT_SLAB_SIZE);
+   info->free = xmalloc(POOL_SIZE);
info->pools[info->pool_count - 1] = info->free;
-   info->end = info->free + COMMIT_SLAB_SIZE;
+   info->end = info->free + POOL_SIZE;
}
p = info->free;
info->free += size;
-- 
2.8.2.524.g6ff3d78



[PATCH v2 1/6] shallow.c: rename fields in paint_info to better express their purposes

2016-12-06 Thread Nguyễn Thái Ngọc Duy
paint_alloc() is basically malloc(), tuned for allocating a fixed number
of bits on every call without worrying about freeing any individual
allocation since all will be freed at the end. It does it by allocating
a big block of memory every time it runs out of "free memory". "slab" is
a poor choice of name, at least poorer than "pool".

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 shallow.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/shallow.c b/shallow.c
index 4d0b005..8100dfd 100644
--- a/shallow.c
+++ b/shallow.c
@@ -434,9 +434,9 @@ define_commit_slab(ref_bitmap, uint32_t *);
 struct paint_info {
struct ref_bitmap ref_bitmap;
unsigned nr_bits;
-   char **slab;
+   char **pools;
char *free, *end;
-   unsigned slab_count;
+   unsigned pool_count;
 };
 
 static uint32_t *paint_alloc(struct paint_info *info)
@@ -444,11 +444,11 @@ static uint32_t *paint_alloc(struct paint_info *info)
unsigned nr = (info->nr_bits + 31) / 32;
unsigned size = nr * sizeof(uint32_t);
void *p;
-   if (!info->slab_count || info->free + size > info->end) {
-   info->slab_count++;
-   REALLOC_ARRAY(info->slab, info->slab_count);
+   if (!info->pool_count || info->free + size > info->end) {
+   info->pool_count++;
+   REALLOC_ARRAY(info->pools, info->pool_count);
info->free = xmalloc(COMMIT_SLAB_SIZE);
-   info->slab[info->slab_count - 1] = info->free;
+   info->pools[info->pool_count - 1] = info->free;
info->end = info->free + COMMIT_SLAB_SIZE;
}
p = info->free;
@@ -624,9 +624,9 @@ void assign_shallow_commits_to_refs(struct shallow_info 
*info,
post_assign_shallow(info, _bitmap, ref_status);
 
clear_ref_bitmap(_bitmap);
-   for (i = 0; i < pi.slab_count; i++)
-   free(pi.slab[i]);
-   free(pi.slab);
+   for (i = 0; i < pi.pool_count; i++)
+   free(pi.pools[i]);
+   free(pi.pools);
free(shallow);
 }
 
-- 
2.8.2.524.g6ff3d78



[PATCH v2 6/6] shallow.c: remove useless code

2016-12-06 Thread Nguyễn Thái Ngọc Duy
Some context before we talk about the removed code.

This paint_down() is part of step 6 of 58babff (shallow.c: the 8 steps
to select new commits for .git/shallow - 2013-12-05). When we fetch from
a shallow repository, we need to know if one of the new/updated refs
needs new "shallow commits" in .git/shallow (because we don't have
enough history of those refs) and which one.

The question at step 6 is, what (new) shallow commits are required in
other to maintain reachability throughout the repository _without_
cutting our history short? To answer, we mark all commits reachable from
existing refs with UNINTERESTING ("rev-list --not --all"), mark shallow
commits with BOTTOM, then for each new/updated refs, walk through the
commit graph until we either hit UNINTERESTING or BOTTOM, marking the
ref on the commit as we walk.

After all the walking is done, we check the new shallow commits. If we
have not seen any new ref marked on a new shallow commit, we know all
new/updated refs are reachable using just our history and .git/shallow.
The shallow commit in question is not needed and can be thrown away.

So, the code.

The loop here (to walk through commits) is basically

1.  get one commit from the queue
2.  ignore if it's SEEN or UNINTERESTING
3.  mark it
4.  go through all the parents and..
5a. mark it if it's never marked before
5b. put it back in the queue

What we do in this patch is drop step 5a because it is not
necessary. The commit being marked at 5a is put back on the queue, and
will be marked at step 3 at the next iteration. The only case it will
not be marked is when the commit is already marked UNINTERESTING (5a
does not check this), which will be ignored at step 2.

But we don't care about refs marking on UNINTERESTING. We care about the
marking on _shallow commits_ that are not reachable from our current
history (and having UNINTERESTING on it means it's reachable). So it's
ok for an UNINTERESTING not to be ref-marked.

Reported-by: Rasmus Villemoes 
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 shallow.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/shallow.c b/shallow.c
index beb967e..11f7dde 100644
--- a/shallow.c
+++ b/shallow.c
@@ -512,12 +512,8 @@ static void paint_down(struct paint_info *info, const 
unsigned char *sha1,
oid_to_hex(>object.oid));
 
for (p = c->parents; p; p = p->next) {
-   uint32_t **p_refs = ref_bitmap_at(>ref_bitmap,
- p->item);
if (p->item->object.flags & SEEN)
continue;
-   if (*p_refs == NULL || *p_refs == *refs)
-   *p_refs = *refs;
commit_list_insert(p->item, );
}
}
-- 
2.8.2.524.g6ff3d78



  1   2   >