date:20130712

Re: [PATCH 7/7] push: document --lockref

2013-07-12 Thread Johannes Sixt

Am 12.07.2013 23:19, schrieb Junio C Hamano:
> Johannes Sixt  writes:
> 
>> We have three independent options that the user can choose in any
>> combination:
>>
>>  o --force given or not;
>>
>>  o --lockref semantics enabled or not;
>>
>>  o refspec with or without +;
>>
>> and these two orthogonal preconditions of the push
>>
>>  o push is fast-forward or it is not ("ff", "noff");
>>
>>  o the branch at the remote is at the expected rev or it is not
>>("match", "mismatch").
>>
>> Here is a table with the expected outcome. "ok" means that the push is
>> allowed(*), "fail" means that the push is denied. (Four more lines with
>> --force are omitted because they have "ok" in all spots.)
>>
>>ff   noff ff  noff
>>   match match mismatch mismatch
>>
>> --lockref +refspec okokdenied   denied
>> --lockref  refspec ok  denied  denied   denied
> 
> I am confused with these.  The latter is the most typical:
> 
>   git fetch
> git checkout topic
> git rebase topic
>   git push --lockref topic
> 
> where we know it is "noff" already, and we just want to make sure
> that nobody mucked with our remote while we are rebasing.

Today (without --lockref), the above sequence would fail to push.
(Because there is no + and no --force.)

> If nobody updated the remote, why should this push be denied?  And in
> order to make it succeed, you need to force with +refspec or --force,
> but that would bypass match/mismatch safety, which makes the whole
> "make sure the other end is unchanged" safety meaningless, no?

I am suggesting that +refspec would *not* override the match/mismatch
safety, but --force would.

> 
>>   +refspec okok  ok   ok
> 
> This is traditional --force.
> 
>>refspec ok  deniedok denied
> 
> We are not asking for --lockref, so match/mismatch does not affect
> the outcome.

I think you are worried that a deviation from the principle that
+refspec == --force hurts current users. But I am arguing that this is
not the case because "current" users do not use --lockref. As you have
seen from the table, without --lockref there is *no change* in behavior.

I still have not seen an example where +refspec != --force would have
unexpected consequences. (The inequality is merely that +refspec fails
on mismatch when --lockref was also given while --force does not.)

>> Notice that without --lockref semantics enabled, +refspec and refspec
>> keep the current behavior.
> 
> But I do not think the above table with --lockref makes much sense.
> 
> Let's look at noff/match case.  That is the only interesting one.
> 
> This should fail:
> 
>   git push topic
> 
> due to no-ff.

Yes.

> Your table above makes this fail:
> 
> git push --lockref topic
> 
> and the user has to force it,

Of course.

> like this?
> 
>   git push --lockref --force topic ;# or alternatively
> git push --lockref +topic
> 
> Why is it even necessary?

Because it is no-ff. How do you achieve the push today (without
--lockref)? You use one of these two options. It does not change with
--lockref.

> If you make
> 
>   git push --lockref topic
> 
> succeed in noff/match case, everything makes more sense to me.

Not to me, obviously ;)

> The --lockref option is merely a weaker form of --force but still a
> way to override the noff check.

No; --lockref only adds the check that the destination is at the
expected revision, but does *NOT* override the no-ff check. Why should
it? (This is not a rethoric question.)

(I think I said differently in an earlier messages, but back then things
were still blurry. The table in my previous message is what I mean.)

>  If the user wants to keep noff
> check, the user can simply choose not to use the option.

No. If the user wants to keep the no-ff check, she does not use the + in
the refspec and does not use --force. (Just like today.)

> Of course, that form should fail if "mismatch".  And then you can
> force it,
> 
>   git push --force [--lockref] topic
> 
> As "--force" is "anything goes", it does not matter if you give the
> other option on the command line.

... or the + in the refsepc.

-- Hannes

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] t4203: fix checks for email address remapping

2013-07-12 Thread Stefan Beller

On 07/13/2013 02:35 AM, Eric Sunshine wrote:
> Two tests in t4203-mailmap.sh set up the mapping  =>
>  in an apparent attempt to check that email address
> remapping works as expected (in addition to name remapping which is also
> tested).  To test the remapping, git-shortlog is invoked but the
> invocation lacks the -e option instructing it to show email addresses,
> hence the tests do not actually prove that address remapping succeeded.
> Fix this by instructing git-shortlog to output email addresses as well.
> 
> Signed-off-by: Eric Sunshine 
> ---
> 
> The very last git-shortlog "complex" test in the script does use -e and
> checks that email address remapping actually works, so it's not clear
> that this patch is needed. The  => 
> remapping done by the two tests touched by this patch, however, is
> misleading to the reader since it seems to imply that these two tests
> want to check address remapping as well. Perhaps a better change would
> be to remove the address remapping from these two tests.
> 
> 
>  t/t4203-mailmap.sh | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
> index 842b754..3cf64de 100755
> --- a/t/t4203-mailmap.sh
> +++ b/t/t4203-mailmap.sh
> @@ -102,10 +102,10 @@ test_expect_success 'mailmap.file non-existent' '
>  '
>  
>  cat >expect <<\EOF
> -Internal Guy (1):
> +Internal Guy  (1):
>second
>  
> -Repo Guy (1):
> +Repo Guy  (1):
>initial
>  
>  EOF
> @@ -114,15 +114,15 @@ test_expect_success 'name entry after email entry' '
>   mkdir -p internal_mailmap &&
>   echo " " >internal_mailmap/.mailmap &&
>   echo "Internal Guy " >>internal_mailmap/.mailmap &&
> - git shortlog HEAD >actual &&
> + git shortlog -e HEAD >actual &&
>   test_cmp expect actual
>  '
>  
>  cat >expect <<\EOF
> -Internal Guy (1):
> +Internal Guy  (1):
>second
>  
> -Repo Guy (1):
> +Repo Guy  (1):
>initial
>  
>  EOF
> @@ -131,7 +131,7 @@ test_expect_success 'name entry after email entry, 
> case-insensitive' '
>   mkdir -p internal_mailmap &&
>   echo " " >internal_mailmap/.mailmap &&
>   echo "Internal Guy " >>internal_mailmap/.mailmap &&+

So here it is capitalized email address (BUGS@), but at the expect file
it's still lower cased. I think this is a bug.
Junio was trying to fix it in 543f99173c2d2f648d8f846e24875150f7de03d3
(origin/jc/mailmap-case-insensitivity)
So I think we need another yet test case there:
commited:
Internal Guy 
Internal Guy 

Having just one entry in the mailmap
Internal Guy  

should still work with the "shortlog -e"

> - git shortlog HEAD >actual &&
> + git shortlog -e HEAD >actual &&
>   test_cmp expect actual
>  '
>  
> 


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] show-ref: make --head always show the HEAD ref

2013-07-12 Thread Doug Bell

On Jul 11, 2013, at 10:41 AM, Junio C Hamano  wrote:

> Doug Bell  writes:
> 
>> The docs seem to say that doing
>> 
>>  git show-ref --head --tags
>> 
>> would show both the HEAD ref and all the tag refs. However, doing
>> both --head and either of --tags or --heads would filter out the HEAD
>> ref.
>> 
>> Signed-off-by: Doug Bell 
>> ---
> 
> I think this patch fell through the cracks, and looking at it, I am
> somewhat torn.
> 
> The command help for "--head" says "show the HEAD reference", which
> may mean:
> 
> (1) in addition to everything else the command would do if there
> weren't this option specified, show HEAD;
> 
> (2) show the HEAD and nothing else; or
> 
> (3) add HEAD to the candidates to be shown, but apply the usual
> filtering rule based on --heads, --tags and/or pattern
> arguments.
> 
> While the last interpretation is what we have used since the
> beginning of the command at 358ddb62 (Add "git show-ref" builtin
> command, 2006-09-15), I tend to agree with you that the first
> interpretation may make more sense, at least from the end user's
> point of view.

> 
> But at a deeper level, it makes the command somewhat inconsistent.
> 
> What happens in the command is
> 
> - We iterate over "candidates to be shown", which is usually
>   "everything under refs/", but with "--head", HEAD is added to
>   this set.  For each of these candidates:
> 
>   - if one or more  parameters are given, reject the
> candidate ref if it does not tail-match with any of the
> patterns;
> 
>   - if either "--heads" or "--tags" is given, among the ones that
> pass  filter, check if they:
> 
> - begin with "refs/heads/" (if "--heads" is given); or
> - begin with "refs/tags/" (if "--tags" is given).
> 
> and reject those that don't.
> 
>   - show it if it is still surviving after these two tests.
> 
> And taht is why "git show-ref --tags master v1.3.0" shows only the
> v1.3.0 tag without showing the master branch, and giving "--heads"
> instead of "--tags" yields only the master branch without the tag.
> 
> The semantics your patch wants, by changing the definition of
> "--head" from (3) to (1), is:
> 
> - If "--head" is given, show HEAD no matter what.
> 
> - Iterate over everything under refs/, and for each of them, do the
>   same filter-and-show as we currently do (see above).
> 
> While I think the new semantics is also understandable as the
> current one, and personally I think it is a better behaviour than
> the current one, it will require an update to the document to
> highlight that "--head" is special-cased in a big way, to bypass all
> the filtering that is applied to normal refs.
> 
> A few additional observations (these are not complaints to this
> patch and please do not take them as such):
> 
> - The command help says "(can be combined with heads)" for "--tags"
>   and vice versa, but does not mention their interaction with
>   "--head".  This is because we take interpretation (3) above and
>   do not treat "--head" as a mechanism to add to 
>   parameter like these two.
> 
> - The command help for "--heads" and "--tags" says "only show
>   heads/tags", which technically does not contradict with "can be
>   combined with" above, but a logical consequence of combining
>   ought to be showing nothing, as a ref cannot be a head (an old
>   nomenclature for a "branch") and a tag at the same time.  
> 
> I think we should find a word better than "only" to use here, but I
> am not sure what would be a good phrase to use.
> 

The reason I had initially wanted both --tags and --head was I wanted to 
compare HEAD against all the tags to see which one(s) I was on (if any). I was 
eventually pointed to `git describe`, but I ended up just using show-ref 
without any options and filtering the result using Perl (the entire application 
is in Perl, so this wasn't a big deal). Then, yeah, I figured it was confusing 
enough to either patch the code or the docs.

For the doc changes, I think if it's explained that by default it show-ref 
shows refs/{tags,heads,remotes}, it becomes easier to explain what the options 
will end up doing. I'll put together a second patch.--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2] show-ref: make --head always show the HEAD ref

2013-07-12 Thread Doug Bell

Updated the documentation to describe the new behavior.

Doug Bell (1):
  show-ref: make --head always show the HEAD ref

 Documentation/git-show-ref.txt | 10 ++
 builtin/show-ref.c |  8 ++--
 2 files changed, 12 insertions(+), 6 deletions(-)

-- 
1.7.12.4

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] show-ref: make --head always show the HEAD ref

2013-07-12 Thread Doug Bell

The docs seem to say that doing

git show-ref --head --tags

would show both the HEAD ref and all the tag refs. However, doing
both --head and either of --tags or --heads would filter out the HEAD
ref.

Also update the documentation to describe the new behavior.

Signed-off-by: Doug Bell 
---
 Documentation/git-show-ref.txt | 10 ++
 builtin/show-ref.c |  8 ++--
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-show-ref.txt b/Documentation/git-show-ref.txt
index de4d352..b0a309b 100644
--- a/Documentation/git-show-ref.txt
+++ b/Documentation/git-show-ref.txt
@@ -21,6 +21,8 @@ commit IDs. Results can be filtered using a pattern and tags 
can be
 dereferenced into object IDs. Additionally, it can be used to test whether a
 particular ref exists.
 
+By default, shows the tags, heads, and remote refs.
+
 The --exclude-existing form is a filter that does the inverse, it shows the
 refs from stdin that don't exist in the local repository.
 
@@ -32,14 +34,14 @@ OPTIONS
 
 --head::
 
-   Show the HEAD reference.
+   Show the HEAD reference, even if it would normally be filtered out.
 
 --tags::
 --heads::
 
-   Limit to only "refs/heads" and "refs/tags", respectively.  These
-   options are not mutually exclusive; when given both, references stored
-   in "refs/heads" and "refs/tags" are displayed.
+   Limit to "refs/heads" and "refs/tags", respectively.  These options
+   are not mutually exclusive; when given both, references stored in
+   "refs/heads" and "refs/tags" are displayed.
 
 -d::
 --dereference::
diff --git a/builtin/show-ref.c b/builtin/show-ref.c
index 4a0310d..4b069e7 100644
--- a/builtin/show-ref.c
+++ b/builtin/show-ref.c
@@ -31,6 +31,9 @@ static int show_ref(const char *refname, const unsigned char 
*sha1, int flag, vo
const char *hex;
unsigned char peeled[20];
 
+   if (show_head && !strncmp(refname, "HEAD\0", 5))
+   goto match;
+
if (tags_only || heads_only) {
int match;
 
@@ -167,9 +170,10 @@ static const struct option show_ref_options[] = {
OPT_BOOLEAN(0, "verify", &verify, N_("stricter reference checking, "
"requires exact ref path")),
{ OPTION_BOOLEAN, 'h', NULL, &show_head, NULL,
- N_("show the HEAD reference"),
+ N_("show the HEAD reference, even if it would be filtered out"),
  PARSE_OPT_NOARG | PARSE_OPT_HIDDEN },
-   OPT_BOOLEAN(0, "head", &show_head, N_("show the HEAD reference")),
+   OPT_BOOLEAN(0, "head", &show_head,
+ N_("show the HEAD reference, even if it would be filtered out")),
OPT_BOOLEAN('d', "dereference", &deref_tags,
N_("dereference tags into object IDs")),
{ OPTION_CALLBACK, 's', "hash", &abbrev, N_("n"),
-- 
1.7.12.4

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 1/2] builtin: add git-check-mailmap command

2013-07-12 Thread Eric Sunshine

Introduce command check-mailmap, similar to check-attr and check-ignore,
which allows direct testing of .mailmap configuration.

As plumbing accessible to scripts and other porcelain, check-mailmap
publishes the stable, well-tested .mailmap functionality employed by
built-in Git commands.  Consequently, script authors need not
re-implement .mailmap functionality manually, thus avoiding potential
quirks and behavioral differences.

Signed-off-by: Eric Sunshine 
---
 .gitignore |  1 +
 Documentation/git-check-mailmap.txt| 47 
 Makefile   |  1 +
 builtin.h  |  1 +
 builtin/check-mailmap.c| 66 ++
 command-list.txt   |  1 +
 contrib/completion/git-completion.bash |  1 +
 git.c  |  1 +
 8 files changed, 119 insertions(+)
 create mode 100644 Documentation/git-check-mailmap.txt
 create mode 100644 builtin/check-mailmap.c

diff --git a/.gitignore b/.gitignore
index efa8db0..6b1fd1b 100644
--- a/.gitignore
+++ b/.gitignore
@@ -23,6 +23,7 @@
 /git-cat-file
 /git-check-attr
 /git-check-ignore
+/git-check-mailmap
 /git-check-ref-format
 /git-checkout
 /git-checkout-index
diff --git a/Documentation/git-check-mailmap.txt 
b/Documentation/git-check-mailmap.txt
new file mode 100644
index 000..39028ee
--- /dev/null
+++ b/Documentation/git-check-mailmap.txt
@@ -0,0 +1,47 @@
+git-check-mailmap(1)
+
+
+NAME
+
+git-check-mailmap - Show canonical names and email addresses of contacts
+
+
+SYNOPSIS
+
+[verse]
+'git check-mailmap' [options] ...
+
+
+DESCRIPTION
+---
+
+For each ``Name '' or ``'' from the command-line
+or standard input (when using `--stdin`), look up the person's canonical name
+and email address (see "Mapping Authors" below). If found, print them;
+otherwise print the input as-is.
+
+
+OPTIONS
+---
+--stdin::
+   Read contacts, one per line, from the standard input after exhausting
+   contacts provided on the command-line.
+
+
+OUTPUT
+--
+
+For each contact, a single line is output, terminated by a newline.  If the
+name is provided or known to the 'mailmap', ``Name '' is
+printed; otherwise only ``'' is printed.
+
+
+MAPPING AUTHORS
+---
+
+include::mailmap.txt[]
+
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 0600eb4..ef442eb 100644
--- a/Makefile
+++ b/Makefile
@@ -912,6 +912,7 @@ BUILTIN_OBJS += builtin/bundle.o
 BUILTIN_OBJS += builtin/cat-file.o
 BUILTIN_OBJS += builtin/check-attr.o
 BUILTIN_OBJS += builtin/check-ignore.o
+BUILTIN_OBJS += builtin/check-mailmap.o
 BUILTIN_OBJS += builtin/check-ref-format.o
 BUILTIN_OBJS += builtin/checkout-index.o
 BUILTIN_OBJS += builtin/checkout.o
diff --git a/builtin.h b/builtin.h
index 1ed8edb..8afa2de 100644
--- a/builtin.h
+++ b/builtin.h
@@ -40,6 +40,7 @@ extern int cmd_checkout(int argc, const char **argv, const 
char *prefix);
 extern int cmd_checkout_index(int argc, const char **argv, const char *prefix);
 extern int cmd_check_attr(int argc, const char **argv, const char *prefix);
 extern int cmd_check_ignore(int argc, const char **argv, const char *prefix);
+extern int cmd_check_mailmap(int argc, const char **argv, const char *prefix);
 extern int cmd_check_ref_format(int argc, const char **argv, const char 
*prefix);
 extern int cmd_cherry(int argc, const char **argv, const char *prefix);
 extern int cmd_cherry_pick(int argc, const char **argv, const char *prefix);
diff --git a/builtin/check-mailmap.c b/builtin/check-mailmap.c
new file mode 100644
index 000..8f4d809
--- /dev/null
+++ b/builtin/check-mailmap.c
@@ -0,0 +1,66 @@
+#include "builtin.h"
+#include "mailmap.h"
+#include "parse-options.h"
+#include "string-list.h"
+
+static int use_stdin;
+static const char * const check_mailmap_usage[] = {
+N_("git check-mailmap [options] ..."),
+NULL
+};
+
+static const struct option check_mailmap_options[] = {
+   OPT_BOOL(0, "stdin", &use_stdin, N_("also read contacts from stdin")),
+   OPT_END()
+};
+
+static void check_mailmap(struct string_list *mailmap, const char *contact)
+{
+   const char *name, *mail;
+   size_t namelen, maillen;
+   struct ident_split ident;
+
+   if (split_ident_line(&ident, contact, strlen(contact)))
+   die(_("unable to parse contact: %s"), contact);
+
+   name = ident.name_begin;
+   namelen = ident.name_end - ident.name_begin;
+   mail = ident.mail_begin;
+   maillen = ident.mail_end - ident.mail_begin;
+
+   map_user(mailmap, &mail, &maillen, &name, &namelen);
+
+   if (namelen)
+   printf("%.*s ", (int)namelen, name);
+   printf("<%.*s>\n", (int)maillen, mail);
+}
+
+int cmd_check_mailmap(int argc, const char **argv, const char *prefix)
+{
+   int i;
+   struct string_list mailmap = STRING_LIST_INIT_NODUP;
+
+

[PATCH v3 2/2] t4203: test check-mailmap command invocation

2013-07-12 Thread Eric Sunshine

Test the command-line interface of check-mailmap.

(Actual .mailmap functionality is already covered by existing tests.)

Signed-off-by: Eric Sunshine 
---
 t/t4203-mailmap.sh | 43 +++
 1 file changed, 43 insertions(+)

diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index 842b754..b7476f2 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -13,6 +13,11 @@ fuzz_blame () {
 }
 
 test_expect_success setup '
+   cat >contacts <<-\EOF &&
+   A U Thor 
+   nick1 
+   EOF
+
echo one >one &&
git add one &&
test_tick &&
@@ -23,6 +28,44 @@ test_expect_success setup '
git commit --author "nick1 " -m second
 '
 
+test_expect_success 'check-mailmap no arguments' '
+   test_must_fail git check-mailmap
+'
+
+test_expect_success 'check-mailmap arguments' '
+   cat >expect <<-\EOF &&
+   A U Thor 
+   nick1 
+   EOF
+   git check-mailmap \
+   "A U Thor " \
+   "nick1 " >actual &&
+   test_cmp expect actual
+'
+
+test_expect_success 'check-mailmap --stdin' '
+   cat >expect <<-\EOF &&
+   A U Thor 
+   nick1 
+   EOF
+   git check-mailmap --stdin actual &&
+   test_cmp expect actual
+'
+
+test_expect_success 'check-mailmap --stdin arguments' '
+   cat >expect <<-\EOF &&
+   Internal Guy 
+   EOF
+   cat >expect &&
+   git check-mailmap --stdin "Internal Guy " \
+   actual &&
+   test_cmp expect actual
+'
+
+test_expect_success 'check-mailmap bogus contact' '
+   test_must_fail git check-mailmap bogus
+'
+
 cat >expect <<\EOF
 A U Thor (1):
   initial
-- 
1.8.3.2.804.g0da7a53

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 0/2] add git-check-mailmap command

2013-07-12 Thread Eric Sunshine

This is a re-roll of v2 [1] which introduces git-check-mailmap. The
primary motivation for this command is to expose git's stable,
well-tested C-implementation of .mailmap functionality to scripts and
porcelains, thus relieving them of the need to re-implement support
themselves. The git-contacts [2] script (currently at es/contacts in
'pu') would be one such client.

v3 addresses comments from reviewers [1].

Changes since v2:

* Drop -z/--null option.

* Simplify grammar of DESCRIPTION and OUTPUT sections of
  Documentation/git-check-mailmap.txt.

* Normalize wrapped code indentation to match standard Git style.

* Drop v2 patches [3/4] and [4/4] which convert t4203-mailmap.sh to test
  low-level mailmap functionality directly via git-check-mailmap rather
  than indirectly via git-shortlog.

[1]: http://thread.gmane.org/gmane.comp.version-control.git/230105/
[2]: http://thread.gmane.org/gmane.comp.version-control.git/229533/

Eric Sunshine (2):
  builtin: add git-check-mailmap command
  t4203: test check-mailmap command invocation

 .gitignore |  1 +
 Documentation/git-check-mailmap.txt| 47 
 Makefile   |  1 +
 builtin.h  |  1 +
 builtin/check-mailmap.c| 66 ++
 command-list.txt   |  1 +
 contrib/completion/git-completion.bash |  1 +
 git.c  |  1 +
 t/t4203-mailmap.sh | 43 ++
 9 files changed, 162 insertions(+)
 create mode 100644 Documentation/git-check-mailmap.txt
 create mode 100644 builtin/check-mailmap.c

-- 
1.8.3.2.804.g0da7a53

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] t4203: fix checks for email address remapping

2013-07-12 Thread Eric Sunshine

Two tests in t4203-mailmap.sh set up the mapping  =>
 in an apparent attempt to check that email address
remapping works as expected (in addition to name remapping which is also
tested).  To test the remapping, git-shortlog is invoked but the
invocation lacks the -e option instructing it to show email addresses,
hence the tests do not actually prove that address remapping succeeded.
Fix this by instructing git-shortlog to output email addresses as well.

Signed-off-by: Eric Sunshine 
---

The very last git-shortlog "complex" test in the script does use -e and
checks that email address remapping actually works, so it's not clear
that this patch is needed. The  => 
remapping done by the two tests touched by this patch, however, is
misleading to the reader since it seems to imply that these two tests
want to check address remapping as well. Perhaps a better change would
be to remove the address remapping from these two tests.


 t/t4203-mailmap.sh | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index 842b754..3cf64de 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -102,10 +102,10 @@ test_expect_success 'mailmap.file non-existent' '
 '
 
 cat >expect <<\EOF
-Internal Guy (1):
+Internal Guy  (1):
   second
 
-Repo Guy (1):
+Repo Guy  (1):
   initial
 
 EOF
@@ -114,15 +114,15 @@ test_expect_success 'name entry after email entry' '
mkdir -p internal_mailmap &&
echo " " >internal_mailmap/.mailmap &&
echo "Internal Guy " >>internal_mailmap/.mailmap &&
-   git shortlog HEAD >actual &&
+   git shortlog -e HEAD >actual &&
test_cmp expect actual
 '
 
 cat >expect <<\EOF
-Internal Guy (1):
+Internal Guy  (1):
   second
 
-Repo Guy (1):
+Repo Guy  (1):
   initial
 
 EOF
@@ -131,7 +131,7 @@ test_expect_success 'name entry after email entry, 
case-insensitive' '
mkdir -p internal_mailmap &&
echo " " >internal_mailmap/.mailmap &&
echo "Internal Guy " >>internal_mailmap/.mailmap &&
-   git shortlog HEAD >actual &&
+   git shortlog -e HEAD >actual &&
test_cmp expect actual
 '
 
-- 
1.8.3.2.804.g0da7a53

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: git p4 clone not processing branches properly

2013-07-12 Thread Vitor Antunes

On Mon, Jul 8, 2013 at 3:03 PM, Matthieu Brucher
 wrote:
> Hi,
>
> I tried without spec, but then it tried importing everything, even
> though there was a .gitignore and a .git/config/exclude file.
> Then, it crashed during the importation because it could find an old
> branch (I don't have access to everything on the repository), so I
> tried importing just the recent past, but then it failed because it
> identified a branch names Branch/Main/src...
> It is starting to feel as if I will have to compromise between
> something that works but without branches and without the proper names
> (the files are named Project/Branch/Main/...) or having the proper
> names, but with all binaries, bogus branches...
> I know it is not due to git, it is mainly that Perforce and git have
> very different workflows. Or perhaps with any luck, the server is up
> to date, and I can find a way of using Perforce's bridge.

Hi Matthieu,

I created a simple test case where I added a file to one of the branches in
P4 but excluded it in the client spec. During synchronization git p4
correctly ignored it.

Please check which git version you are using. It is possible the issue you
are experiencing was fixed in a more recent version.

Also, if possible, please try creating a simple test case that replicates
your issue (p4 and p4d are free to use when the repo is only used by one
person). If you record the commands you used and share them with me I will
be able to better check this issue on my side.

Cheers,
Vitor
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Git in nutshell Inbox

2013-07-12 Thread David Aguilar

On Wed, Jul 10, 2013 at 3:10 AM, Muhammad Bashir Al-Noimi
 wrote:
> On Tue, Jul 9, 2013 at 10:47 PM, Muhammad Bashir Al-Noimi
>  wrote:
>> Now I've to make some tests because the new version is completely different
>> than the old one so I'll send a feedback soon.
>
> Thanks guys, the recent git-cola fabulous so I don't need to use the
> terminal anymore for git.

Thanks for the kind words :-)  I'm glad you like it. (I'm the git-cola author)

I noticed that you were asking about creating new repositories so I
recently added that feature to the File menu (FIle -> New
Repository...).  That feature was previously only available in the
startup dialog that is activated when the launcher icon is clicked (or
"git-cola --prompt") so you probably found it there.

I'll probably be doing an official release later this month; you won't
see it until the PPAs and packagers pickup the new release, though.

If you're interested in trying it out I often point people to its the
Git repository -- the project is setup such that it can run directly
out of its source tree without needing to "make" or install anything.
Your machine has all of git-cola's dependencies available since you've
already installed its deb package.  All it takes to run a development
version is to clone the repo and run e.g.
$HOME/src/git-cola/bin/git-cola (assuming you've cloned the project to
$HOME/src/git-cola).

I keep the "master" branch very stable so it's quite safe to run that
version, especially if you're interested in the latest/greatest.

Best,
--
David
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

What's cooking in git.git (Jul 2013, #05; Fri, 12)

2013-07-12 Thread Junio C Hamano

Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.

As many of you may know, I hate to do back-to-back "What's cooking",
but we ended up acquiring many new topics, and many existing ones
have moved from 'pu' to 'next' and 'next' to 'master'.

Those that graduated to 'master' are mostly code and documentation
clean-up patches.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* as/log-output-encoding-in-user-format (2013-07-05) 11 commits
  (merged to 'next' on 2013-07-08 at 2e1bdd9)
 + t4205 (log-pretty-formats): avoid using `sed`
 + t6006 (rev-list-format): add tests for "%b" and "%s" for the case 
i18n.commitEncoding is not set
 + t4205, t6006, t7102: make functions better readable
 + t4205 (log-pretty-formats): revert back single quotes
  (merged to 'next' on 2013-07-05 at d2c99e5)
 + t4041, t4205, t6006, t7102: use iso8859-1 rather than iso-8859-1
  (merged to 'next' on 2013-07-01 at 3318aa8)
 + t4205: replace .\+ with ..* in sed commands
  (merged to 'next' on 2013-06-28 at 4063330)
 + pretty: --format output should honor logOutputEncoding
 + pretty: Add failing tests: --format output should honor logOutputEncoding
 + t4205 (log-pretty-formats): don't hardcode SHA-1 in expected outputs
 + t7102 (reset): don't hardcode SHA-1 in expected outputs
 + t6006 (rev-list-format): don't hardcode SHA-1 in expected outputs

 "log --format=" did not honor i18n.logoutputencoding configuration
 and this attempts to fix it.


* ft/diff-rename-default-score-is-half (2013-07-05) 1 commit
  (merged to 'next' on 2013-07-09 at 6a6b57e)
 + diff-options: document default similarity index


* jc/remote-http-argv-array (2013-07-09) 1 commit
  (merged to 'next' on 2013-07-11 at 7fbe8bd)
 + remote-http: use argv-array


* jk/maint-config-multi-order (2013-07-07) 1 commit
  (merged to 'next' on 2013-07-09 at 0db1db9)
 + git-config(1): clarify precedence of multiple values


* jk/pull-to-integrate (2013-07-08) 2 commits
  (merged to 'next' on 2013-07-09 at 2ecac24)
 + pull: change the description to "integrate" changes
 + push: avoid suggesting "merging" remote changes


* ml/cygwin-does-not-have-fifo (2013-07-05) 1 commit
  (merged to 'next' on 2013-07-09 at 7d6849d)
 + test-lib.sh - cygwin does not have usable FIFOs


* ms/remote-tracking-branches-in-doc (2013-07-03) 1 commit
  (merged to 'next' on 2013-07-09 at 411a8bd)
 + Change "remote tracking" to "remote-tracking"


* rr/name-rev-stdin-doc (2013-07-07) 1 commit
  (merged to 'next' on 2013-07-09 at 7cfbff6)
 + name-rev doc: rewrite --stdin paragraph


* rs/pickaxe-simplify (2013-07-07) 1 commit
  (merged to 'next' on 2013-07-11 at c5972f7)
 + diffcore-pickaxe: simplify has_changes and contains


* tf/gitweb-extra-breadcrumbs (2013-07-04) 1 commit
  (merged to 'next' on 2013-07-09 at 525331b)
 + gitweb: allow extra breadcrumbs to prefix the trail

 An Gitweb installation that is a part of larger site can optionally
 show extra links that point at the levels higher than the Gitweb
 pages itself in the link hierarchy of pages.


* tr/test-lint-no-export-assignment-in-shell (2013-07-08) 2 commits
  (merged to 'next' on 2013-07-09 at 6f10ea2)
 + test-lint: detect 'export FOO=bar'
 + t9902: fix 'test A == B' to use = operator

--
[New Topics]

* es/check-mailmap (2013-07-11) 2 commits
 - t4203: test check-mailmap command invocation
 - builtin: add git-check-mailmap command

 A new command to allow scripts to query the mailmap information.

 Expecting a reroll to lose the -z option.


* jc/check-x-z (2013-07-11) 4 commits
 - check-attr -z: a single -z should apply to both input and output
 - check-ignore -z: a single -z should apply to both input and output
 - check-attr: the name of the character is NUL, not NULL
 - check-ignore: the name of the character is NUL, not NULL

 "git check-ignore -z" applied the NUL termination to both its input
 (with --stdin) and its output, but "git check-attr -z" ignored the
 option on the output side.

 This is potentially a backward incompatible fix.  I am tempted to
 merge this to and keep it in 'next' for a while to see if anybody
 screams before deciding if we want to do anything to help existing
 users (there may be none).


* jk/cat-file-batch-optim (2013-07-12) 8 commits
 - sha1_object_info_extended: pass object_info to helpers
 - sha1_object_info_extended: make type calculation optional
 - packed_object_info: make type lookup optional
 - packed_object_info: hoist delta type resolution to helper
 - sha1_loose_object_info: make type lookup optional
 - sha1_object_info_extended: rename "status" to "type"
 - cat-file: disable object/refname ambiguity check for batch mode
 - Merge branch 'nd/warn-ambiguous-

Re: [PATCH 4/4] add a testcase for checking case insensitivity of mailmap

2013-07-12 Thread Eric Sunshine

On Fri, Jul 12, 2013 at 5:38 PM, Junio C Hamano  wrote:
> From: Stefan Beller 
>
> Signed-off-by: Stefan Beller 
> Signed-off-by: Junio C Hamano 
> ---
>  t/t4203-mailmap.sh | 33 +
>  1 file changed, 33 insertions(+)
>
> diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
> index 842b754..07152e9 100755
> --- a/t/t4203-mailmap.sh
> +++ b/t/t4203-mailmap.sh
> +test_expect_success 'Test case sensitivity of Names' '
> +   # do a commit:
> +   echo "asdf" > test1
> +   git add test1
> +   git commit -a --author="A " -m "add test1"
> +
> +   # commit with same name, but different email
> +   # (different capitalization does the trick already,
> +   # but here I am going to use a different mail)
> +   echo "asdf" > test2
> +   git add test2
> +   git commit -a --author="A " -m "add test2"
> +
> +   # Adding the line to the mailmap should make life easy, so we know
> +   # it is the same person
> +   echo "A  " > .mailmap
> +
> +   git shortlog -sne HEAD >actual && test_cmp expect actual
> +'

I forgot to mention that the &&-chain is broken (missing) in the
entire test (except for the last line).
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC/PATCH] fetch: make --prune configurable

2013-07-12 Thread Junio C Hamano

Here is my previous review comments in a squashable patch form.  The
result seems to pass all 27 combinations (fetch.prune, remote.*.prune
and command line all are tristate yes/no/unspecified).

Without the fix-up in *.c files, three combinations seem to fail.

 Documentation/config.txt |   3 +-
 builtin/fetch.c  |  41 +---
 remote.c |   1 +
 t/t5510-fetch.sh | 118 ---
 4 files changed, 108 insertions(+), 55 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index fc39f3a..e4ce7c4 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1051,7 +1051,7 @@ fetch.unpackLimit::
 
 fetch.prune::
If true, fetch will automatically behave as if the `--prune`
-   option was given on the command line.
+   option was given on the command line.  See also `remote..prune`.
 
 format.attach::
Enable multipart/mixed attachments as the default for
@@ -1992,6 +1992,7 @@ remote..prune::
When set to true, fetching from this remote by default will also
remove any remote-tracking branches which no longer exist on the
remote (as if the `--prune` option was give on the command line).
+   Overrides `fetch.prune` settings, if any.
 
 remotes.::
The list of remotes which are fetched by "git remote update
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 082450b..08ab948 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -30,13 +30,10 @@ enum {
TAGS_SET = 2
 };
 
-enum {
-   PRUNE_UNSET = 0,
-   PRUNE_DEFAULT = 1,
-   PRUNE_FORCE = 2
-};
+static int fetch_prune_config = -1; /* unspecified */
+static int prune = -1; /* unspecified */
+#define PRUNE_BY_DEFAULT 0 /* do we prune by default? */
 
-static int prune = PRUNE_DEFAULT;
 static int all, append, dry_run, force, keep, multiple, update_head_ok, 
verbosity;
 static int progress = -1, recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
 static int tags = TAGS_DEFAULT, unshallow;
@@ -64,12 +61,10 @@ static int option_parse_recurse_submodules(const struct 
option *opt,
 static int git_fetch_config(const char *k, const char *v, void *cb)
 {
if (!strcmp(k, "fetch.prune")) {
-   int boolval = git_config_bool(k, v);
-   if (boolval)
-   prune = PRUNE_FORCE;
+   fetch_prune_config = git_config_bool(k, v);
return 0;
}
-   return git_default_config(k, v, cb);
+   return 0;
 }
 
 static struct option builtin_fetch_options[] = {
@@ -87,8 +82,8 @@ static struct option builtin_fetch_options[] = {
N_("fetch all tags and associated objects"), TAGS_SET),
OPT_SET_INT('n', NULL, &tags,
N_("do not fetch all tags (--no-tags)"), TAGS_UNSET),
-   OPT_BOOLEAN('p', "prune", &prune,
-   N_("prune remote-tracking branches no longer on remote")),
+   OPT_BOOL('p', "prune", &prune,
+N_("prune remote-tracking branches no longer on remote")),
{ OPTION_CALLBACK, 0, "recurse-submodules", NULL, N_("on-demand"),
N_("control recursive fetching of submodules"),
PARSE_OPT_OPTARG, option_parse_recurse_submodules },
@@ -756,8 +751,11 @@ static int do_fetch(struct transport *transport,
free_refs(ref_map);
return 1;
}
-   if (prune == PRUNE_FORCE || (transport->remote->prune && prune)) {
-   /* If --tags was specified, pretend the user gave us the 
canonical tags refspec */
+   if (prune) {
+   /*
+* If --tags was specified, pretend that the user gave us
+* the canonical tags refspec
+*/
if (tags == TAGS_SET) {
const char *tags_str = "refs/tags/*:refs/tags/*";
struct refspec *tags_refspec, *refspec;
@@ -866,10 +864,8 @@ static void add_options_to_argv(struct argv_array *argv)
 {
if (dry_run)
argv_array_push(argv, "--dry-run");
-   if (prune == PRUNE_FORCE)
+   if (prune > 0)
argv_array_push(argv, "--prune");
-   else if (prune == PRUNE_UNSET)
-   argv_array_push(argv, "--no-prune");
if (update_head_ok)
argv_array_push(argv, "--update-head-ok");
if (force)
@@ -936,6 +932,17 @@ static int fetch_one(struct remote *remote, int argc, 
const char **argv)
"remote name from which new revisions should be fetched."));
 
transport = transport_get(remote, NULL);
+
+   if (prune < 0) {
+   /* no command line request */
+   if (0 <= transport->remote->prune)
+   prune = transport->remote->prune;
+   else if (0 <= fetch_prune_config)
+   prune = fetch_prune_config;
+   else
+   prune = P

Re: [PATCH 4/4] add a testcase for checking case insensitivity of mailmap

2013-07-12 Thread Eric Sunshine

On Fri, Jul 12, 2013 at 5:38 PM, Junio C Hamano  wrote:
> From: Stefan Beller 
>
> Signed-off-by: Stefan Beller 
> Signed-off-by: Junio C Hamano 
> ---
>  t/t4203-mailmap.sh | 33 +
>  1 file changed, 33 insertions(+)
>
> diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
> index 842b754..07152e9 100755
> --- a/t/t4203-mailmap.sh
> +++ b/t/t4203-mailmap.sh
> @@ -409,4 +409,37 @@ test_expect_success 'Blame output (complex mapping)' '
> test_cmp expect actual.fuzz
>  '
>
> +test_expect_success 'cleanup after mailmap.blob tests' '
> +   rm -f .mailmap
> +'

This "test" was copied from earlier in the file, but the description
was not updated; it has nothing to do with mailmap.blob tests for
which cleanup has already taken place.

It's also misleading since no .mailmap file exists at this point.
Instead, configuration key mailmap.file is set to
internal_mailmap/.mailmap, so if you need to clean up anything, it
would be that.

(You're also recreating .mailmap from scratch directly in your test
via "echo foo >.mailmap", so this "test" doesn't really add anything.)

> +cat >expect <<\EOF
> + 2 A 
> + 2 Other Author 
> + 2 Santa Claus 
> + 1 A U Thor 
> + 1 CTO 
> + 1 Some Dude 
> +EOF
> +
> +test_expect_success 'Test case sensitivity of Names' '

s/Names/names/

Also, most of the test descriptions in this script start with
lowercase, so s/Test/test/ would be appropriate.

> +   # do a commit:
> +   echo "asdf" > test1

Git practice normally avoids whitespace after the redirection
operator. This particular test script, has a mix of ">foo" and ">
foo", but we may want to avoid adding more of the latter form.

> +   git add test1
> +   git commit -a --author="A " -m "add test1"
> +
> +   # commit with same name, but different email
> +   # (different capitalization does the trick already,
> +   # but here I am going to use a different mail)

Without context of the mailing list discussion, the reader does not
know what "trick" is or precisely how this may have failed in the
past. It's also not clear why the test is being performed with a
different email address entirely rather than one which differs only by
case (which is what the test description talks about).

> +   echo "asdf" > test2

Whitespace after redirection.

> +   git add test2
> +   git commit -a --author="A " -m "add test2"
> +
> +   # Adding the line to the mailmap should make life easy, so we know
> +   # it is the same person

The comment can probably be dropped, as the new .mailmap entry is
self-explanatory.

> +   echo "A  " > .mailmap

Whitespace after redirection.

> +
> +   git shortlog -sne HEAD >actual && test_cmp expect actual

For consistency with more other tests, perhaps break the line apart:

  git shortlog -sne HEAD >actual &&
  test_cmp expect actual

> +'
> +
>  test_done
> --
> 1.8.3.2-941-gda9c3c8
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 3/4] mailmap: style fixes

2013-07-12 Thread Junio C Hamano

Wrap overlong lines and format the multi-line comments to match our
coding style.

Signed-off-by: Junio C Hamano 
---
 mailmap.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/mailmap.c b/mailmap.c
index a7e92db..5a3347d 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -37,7 +37,8 @@ static void free_mailmap_info(void *p, const char *s)
 static void free_mailmap_entry(void *p, const char *s)
 {
struct mailmap_entry *me = (struct mailmap_entry *)p;
-   debug_mm("mailmap: removing entries for <%s>, with %d sub-entries\n", 
s, me->namemap.nr);
+   debug_mm("mailmap: removing entries for <%s>, with %d sub-entries\n",
+s, me->namemap.nr);
debug_mm("mailmap: - simple: '%s' <%s>\n", me->name, me->email);
free(me->name);
free(me->email);
@@ -73,7 +74,8 @@ static void add_mapping(struct string_list *map,
}
 
if (old_name == NULL) {
-   debug_mm("mailmap: adding (simple) entry for %s at index %d\n", 
old_email, index);
+   debug_mm("mailmap: adding (simple) entry for %s at index %d\n",
+old_email, index);
/* Replace current name and new email for simple entry */
if (new_name) {
free(me->name);
@@ -85,7 +87,8 @@ static void add_mapping(struct string_list *map,
}
} else {
struct mailmap_info *mi = xcalloc(1, sizeof(struct 
mailmap_info));
-   debug_mm("mailmap: adding (complex) entry for %s at index 
%d\n", old_email, index);
+   debug_mm("mailmap: adding (complex) entry for %s at index %d\n",
+old_email, index);
if (new_name)
mi->name = xstrdup(new_name);
if (new_email)
@@ -315,8 +318,10 @@ int map_user(struct string_list *map,
if (item != NULL) {
me = (struct mailmap_entry *)item->util;
if (me->namemap.nr) {
-   /* The item has multiple items, so we'll look up on 
name too */
-   /* If the name is not found, we choose the simple entry 
 */
+   /*
+* The item has multiple items, so we'll look up on 
name too.
+* If the name is not found, we choose the simple entry.
+*/
struct string_list_item *subitem;
subitem = lookup_prefix(&me->namemap, *name, *namelen);
if (subitem)
-- 
1.8.3.2-941-gda9c3c8

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 2/4] mailmap: do not downcase mailmap entries

2013-07-12 Thread Junio C Hamano

The email addresses in the records read from the .mailmap file are
downcased very early, and then used to match against e-mail
addresses in the input.  Because we do use case insensitive version
of string list to manage these entries, there is no need to do this,
and worse yet, downcasing the rewritten/canonical e-mail read from
the .mailmap file loses information.

Stop doing that, and also make the string list used to keep multiple
names for an mailmap entry case insensitive (the code that uses the
list, lookup_prefix(), expects a case insensitive match).

Signed-off-by: Junio C Hamano 
---
 mailmap.c | 20 
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/mailmap.c b/mailmap.c
index 418081e..a7e92db 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -51,14 +51,6 @@ static void add_mapping(struct string_list *map,
 {
struct mailmap_entry *me;
int index;
-   char *p;
-
-   if (old_email)
-   for (p = old_email; *p; p++)
-   *p = tolower(*p);
-   if (new_email)
-   for (p = new_email; *p; p++)
-   *p = tolower(*p);
 
if (old_email == NULL) {
old_email = new_email;
@@ -68,13 +60,17 @@ static void add_mapping(struct string_list *map,
if ((index = string_list_find_insert_index(map, old_email, 1)) < 0) {
/* mailmap entry exists, invert index value */
index = -1 - index;
+   me = (struct mailmap_entry *)map->items[index].util;
} else {
/* create mailmap entry */
-   struct string_list_item *item = 
string_list_insert_at_index(map, index, old_email);
-   item->util = xcalloc(1, sizeof(struct mailmap_entry));
-   ((struct mailmap_entry *)item->util)->namemap.strdup_strings = 
1;
+   struct string_list_item *item;
+
+   item = string_list_insert_at_index(map, index, old_email);
+   me = xcalloc(1, sizeof(struct mailmap_entry));
+   me->namemap.strdup_strings = 1;
+   me->namemap.cmp = strcasecmp;
+   item->util = me;
}
-   me = (struct mailmap_entry *)map->items[index].util;
 
if (old_name == NULL) {
debug_mm("mailmap: adding (simple) entry for %s at index %d\n", 
old_email, index);
-- 
1.8.3.2-941-gda9c3c8

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 4/4] add a testcase for checking case insensitivity of mailmap

2013-07-12 Thread Junio C Hamano

From: Stefan Beller 

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 t/t4203-mailmap.sh | 33 +
 1 file changed, 33 insertions(+)

diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index 842b754..07152e9 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -409,4 +409,37 @@ test_expect_success 'Blame output (complex mapping)' '
test_cmp expect actual.fuzz
 '
 
+test_expect_success 'cleanup after mailmap.blob tests' '
+   rm -f .mailmap
+'
+
+cat >expect <<\EOF
+ 2 A 
+ 2 Other Author 
+ 2 Santa Claus 
+ 1 A U Thor 
+ 1 CTO 
+ 1 Some Dude 
+EOF
+
+test_expect_success 'Test case sensitivity of Names' '
+   # do a commit:
+   echo "asdf" > test1
+   git add test1
+   git commit -a --author="A " -m "add test1"
+
+   # commit with same name, but different email
+   # (different capitalization does the trick already,
+   # but here I am going to use a different mail)
+   echo "asdf" > test2
+   git add test2
+   git commit -a --author="A " -m "add test2"
+
+   # Adding the line to the mailmap should make life easy, so we know
+   # it is the same person
+   echo "A  " > .mailmap
+
+   git shortlog -sne HEAD >actual && test_cmp expect actual
+'
+
 test_done
-- 
1.8.3.2-941-gda9c3c8

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 0/4] Microfixes to mailmap

2013-07-12 Thread Junio C Hamano

There were a few bugs in mailmap parsing and matching code.

Junio C Hamano (3):
  mailmap: do not lose single-letter names
  mailmap: do not downcase mailmap entries
  mailmap: style fixes

Stefan Beller (1):
  Add a testcase for checking case insensitivity of mailmap

 mailmap.c  | 37 +++--
 t/t4203-mailmap.sh | 33 +
 2 files changed, 52 insertions(+), 18 deletions(-)

-- 
1.8.3.2-941-gda9c3c8

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 1/4] mailmap: do not lose single-letter names

2013-07-12 Thread Junio C Hamano

In parse_name_and_email() function, there is this line:

*name = (nstart < nend ? nstart : NULL);

When the function is given a buffer "A  ",
nstart scans from the beginning of the buffer, skipping whitespaces
(there isn't any, so nstart points at the buffer), while nend starts
from one byte before the first '<' and skips whitespaces backwards
and stops at the first non-whitespace (i.e. it hits "A" at the
beginning of the buffer).  nstart == nend in this case for a
single-letter name, and an off-by-one error makes it fail to pick up
the name, which makes the entry equivalent to

 

without the name.

Signed-off-by: Junio C Hamano 
---
 mailmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mailmap.c b/mailmap.c
index 2a7b366..418081e 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -122,7 +122,7 @@ static char *parse_name_and_email(char *buffer, char **name,
while (nend > nstart && isspace(*nend))
--nend;
 
-   *name = (nstart < nend ? nstart : NULL);
+   *name = (nstart <= nend ? nstart : NULL);
*email = left+1;
*(nend+1) = '\0';
*right++ = '\0';
-- 
1.8.3.2-941-gda9c3c8

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 7/7] push: document --lockref

2013-07-12 Thread Junio C Hamano

Johannes Sixt  writes:

> We have three independent options that the user can choose in any
> combination:
>
>  o --force given or not;
>
>  o --lockref semantics enabled or not;
>
>  o refspec with or without +;
>
> and these two orthogonal preconditions of the push
>
>  o push is fast-forward or it is not ("ff", "noff");
>
>  o the branch at the remote is at the expected rev or it is not
>("match", "mismatch").
>
> Here is a table with the expected outcome. "ok" means that the push is
> allowed(*), "fail" means that the push is denied. (Four more lines with
> --force are omitted because they have "ok" in all spots.)
>
>ff   noff ff  noff
>   match match mismatch mismatch
>
> --lockref +refspec okokdenied   denied
> --lockref  refspec ok  denied  denied   denied

I am confused with these.  The latter is the most typical:

git fetch
git checkout topic
git rebase topic
git push --lockref topic

where we know it is "noff" already, and we just want to make sure
that nobody mucked with our remote while we are rebasing.

If nobody updated the remote, why should this push be denied?  And in
order to make it succeed, you need to force with +refspec or --force,
but that would bypass match/mismatch safety, which makes the whole
"make sure the other end is unchanged" safety meaningless, no?

>   +refspec okok  ok   ok

This is traditional --force.

>refspec ok  deniedok denied

We are not asking for --lockref, so match/mismatch does not affect
the outcome.

> Notice that without --lockref semantics enabled, +refspec and refspec
> keep the current behavior.

But I do not think the above table with --lockref makes much sense.

Let's look at noff/match case.  That is the only interesting one.

This should fail:

git push topic

due to no-ff.

Your table above makes this fail:

git push --lockref topic

and the user has to force it, like this?

git push --lockref --force topic ;# or alternatively
git push --lockref +topic

Why is it even necessary?

If you make

git push --lockref topic

succeed in noff/match case, everything makes more sense to me.

The --lockref option is merely a weaker form of --force but still a
way to override the noff check.  If the user wants to keep noff
check, the user can simply choose not to use the option.

Of course, that form should fail if "mismatch".  And then you can
force it,

git push --force [--lockref] topic

As "--force" is "anything goes", it does not matter if you give the
other option on the command line.

> (*) As we are talking only about the client-side of the push here, I'm
> saying "allowed" instead of "succeeds" because the server can have
> additional restrictions that can make the push fail.

Yes, you and I have known from the beginning that we are in
agreement on that, but it is a good idea to explicitly say so for
the sake of bystanders.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] Add a testcase for checking case insensitivity of mail map

2013-07-12 Thread Stefan Beller

Your patch seems to do it.
I added a test case which doesn't fail, if your patch is added.

Signed-off-by: Stefan Beller 

---
 t/t4203-mailmap.sh | 33 +
 1 file changed, 33 insertions(+)

diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index 842b754..07152e9 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -409,4 +409,37 @@ test_expect_success 'Blame output (complex mapping)' '
test_cmp expect actual.fuzz
 '
 
+test_expect_success 'cleanup after mailmap.blob tests' '
+   rm -f .mailmap
+'
+
+cat >expect <<\EOF
+ 2 A 
+ 2 Other Author 
+ 2 Santa Claus 
+ 1 A U Thor 
+ 1 CTO 
+ 1 Some Dude 
+EOF
+
+test_expect_success 'Test case sensitivity of Names' '
+   # do a commit:
+   echo "asdf" > test1
+   git add test1
+   git commit -a --author="A " -m "add test1"
+
+   # commit with same name, but different email
+   # (different capitalization does the trick already,
+   # but here I am going to use a different mail)
+   echo "asdf" > test2
+   git add test2
+   git commit -a --author="A " -m "add test2"
+
+   # Adding the line to the mailmap should make life easy, so we know
+   # it is the same person
+   echo "A  " > .mailmap
+
+   git shortlog -sne HEAD >actual && test_cmp expect actual
+'
+
 test_done
-- 
1.8.3.2.776.gfcf213d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3] config: add support for http..* settings

2013-07-12 Thread Aaron Schrab


At 06:07 -0700 12 Jul 2013, "Kyle J. McKay"  wrote:
I don't think it's necessary to split the URL apart though.  Whatever 
URL the user gave to git on the command line (at some point even if 
it's now stored as a remote setting in config) complete with URL-
encoding, user names, port names, etc. is the same url, possibly 
shortened, that needs to be used for the http..option setting.


This seems to be assuming that the configuration is done after the URL 
is entered and that URLs are always entered manually.  I don't think 
either of those assumptions is valid.  A user may want to specify http 
settings for all repositories on a specified host and so add settings 
for that host to ~/.gitconfig expecting those settings to be used later.  
A URL in a slightly different format may later be copy+pasted without 
the user realizing that it won't use that config due to one of the 
versions being in a non-canonical form.


I think that's simple and very easy to explain and avoids user 
confusion and surprise while still allowing a default to be set for a 
site but easily overridden for a portion of that site without needing 
to worry about the order config files are processed or the order of the 
[http ""] sections within them.


I agree that the method is easy to explain, but I think a user may very 
well be surprised and confused in a scenario like I described above.  
And having the order not matter (in some cases) for these configuration 
items deviates from how other configuration values are handled.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Bug in .mailmap handling?

2013-07-12 Thread Junio C Hamano

Junio C Hamano  writes:

> Junio C Hamano  writes:
>
>> Stefan Beller  writes:
>>
>>> git shortlog -sne
>>>  1  A 
>>>  1  A 
>>
>> This is coming from mailmap.c::add_mapping() that downcases the
>> e-mail address.
>>
>> changed_em...@example.org is mapped to a...@example.org because of this
>> downcasing, while "A " does not have any entry for it
>> in the .mailmap file, so it is given back as-is.  Hence we see two
>> distinct entries.
>
> I think it is wrong for the parser to lose information by
> downcasing.
>
> It is perfectly fine to do the comparison case insensitively, to
> allow  in the input is mangled using the
> entry for  in the .mailmap file; it is
> not fine to downcase the parsed result, especially the side that is
> used as the "rewritten result" (i.e. the tokens earlier on the line),
> as that would mean mangling the output.
>
> Let me see if I can quickly whip up a fix.

It might be just the matter of doing this.

I suspect that we could drop the downcasing of old-email, too, but
the new-email is supposed to be the rewritten result that appears on
the output, and downcasing it is very wrong.

I also suspect that this was an old workaround for the original
string-list that did not know how to match items case insensitively,
which we should have removed when we added case sensitivity support
to the string-list at around 577f63e7 (Merge branch
'ap/log-mailmap', 2013-01-20).

 mailmap.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/mailmap.c b/mailmap.c
index 418081e..c64a53d 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -56,9 +56,6 @@ static void add_mapping(struct string_list *map,
if (old_email)
for (p = old_email; *p; p++)
*p = tolower(*p);
-   if (new_email)
-   for (p = new_email; *p; p++)
-   *p = tolower(*p);
 
if (old_email == NULL) {
old_email = new_email;
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 3/3] wt-status: use "format" function attribute for status_printf

2013-07-12 Thread Jeff King

On Fri, Jul 12, 2013 at 09:10:30AM -0700, Junio C Hamano wrote:

> > You can "fix" it with -Wno-zero-format-length, so the hassle is not
> > huge. But I am also inclined to just drop this one. We have lived
> > without the extra safety for a long time, and list review does tend to
> > catch such problems in practice.
> 
> I am tempted to actually merge the original one as-is without any of
> the workaround, and just tell people to use -Wno-format-zero-length.

Yeah, I think the only downside is the cognitive burden on individual
developers who try -Wall and have to figure out that we need
-Wno-zero-format-length (and that the warnings are not interesting).

It would be nice to add it automatically to CFLAGS, but I do not know if
we can reliably detect from the Makefile that we are compiling under
gcc.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] t0008: avoid SIGPIPE race condition on fifo

2013-07-12 Thread Jeff King

On Fri, Jul 12, 2013 at 09:23:54AM -0700, Junio C Hamano wrote:

> > In that case, check-ignore gets a SIGPIPE and dies. The
> > outer shell then tries to open the output fifo but blocks
> > indefinitely, because there is no writer.  We can fix it by
> > keeping a descriptor open through the whole procedure.
> 
> Ahh, figures.
> 
> I wish I were smart enough to figure that out immediately after
> seeing the test that does funny things to "in" with "9".

I wish I were smart enough to have figured it out when I was writing the
funny stuff with "in" and "9" in the first place. :)

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Bug in .mailmap handling?

2013-07-12 Thread Junio C Hamano

Junio C Hamano  writes:

> Stefan Beller  writes:
>
>>  git shortlog -sne
>>   1  A 
>>   1  A 
>
> This is coming from mailmap.c::add_mapping() that downcases the
> e-mail address.
>
> changed_em...@example.org is mapped to a...@example.org because of this
> downcasing, while "A " does not have any entry for it
> in the .mailmap file, so it is given back as-is.  Hence we see two
> distinct entries.

I think it is wrong for the parser to lose information by
downcasing.

It is perfectly fine to do the comparison case insensitively, to
allow  in the input is mangled using the
entry for  in the .mailmap file; it is
not fine to downcase the parsed result, especially the side that is
used as the "rewritten result" (i.e. the tokens earlier on the line),
as that would mean mangling the output.

Let me see if I can quickly whip up a fix.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Bug in .mailmap handling?

2013-07-12 Thread Stefan Beller

On 07/12/2013 10:28 PM, Junio C Hamano wrote:
> Stefan Beller  writes:
> 
>>  # Adding the line to the mailmap should make life easy, so we know
>>  # it's the same person
>>  echo "A  " > .mailmap
> 
> While I was looking at this, I noticed this piece of code:
> 
> diff --git a/mailmap.c b/mailmap.c
> index 2a7b366..418081e 100644
> --- a/mailmap.c
> +++ b/mailmap.c
> @@ -122,7 +122,7 @@ static char *parse_name_and_email(char *buffer, char 
> **name,
>   while (nend > nstart && isspace(*nend))
>   --nend;
>  
> - *name = (nstart < nend ? nstart : NULL);
> + *name = (nstart <= nend ? nstart : NULL);
>   *email = left+1;
>   *(nend+1) = '\0';
>   *right++ = '\0';
> 
> The function is given a buffer "A ...", nstart scans
> from the beginning of the buffer, skipping whitespaces (there isn't
> any, so nstart points at the buffer), while nend starts from one
> byte before the first '<' and skips whitespaces backwards and ends
> at the first non-whitespace (i.e. it hits "A" at the beginning of
> the buffer).  nstart == nend in this case for a single-letter name,
> and an off-by-one error makes it fail to pick up the name, which
> makes the entry equivalent to
> 
>
> 
> without the name.  I do not think this bug affected anything you
> observed, though.
> 
>>  git shortlog -sne
>>   1  A 
>>   1  A 
> 
> This is coming from mailmap.c::add_mapping() that downcases the
> e-mail address.
> 
> changed_em...@example.org is mapped to a...@example.org because of this
> downcasing, while "A " does not have any entry for it
> in the .mailmap file, so it is given back as-is.  Hence we see two
> distinct entries.
> 

So do I understand it right, we're having 2 bugs in here?

One being triggered by the short name, only one character.
So if you want to debug the other bug with a longer name,
you can either use a made up name, or run
git shortlog -sne |grep Knut
in the git repository having the mailmap file already updated.
The way the mailmap file is written, I'd assume only one line
to be found, as of now 2 lines come up
 2  Knut Franke 
 1  Knut Franke 

which seems to downcase the whole first email.





--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Bug in .mailmap handling?

2013-07-12 Thread Junio C Hamano

Stefan Beller  writes:

>   # Adding the line to the mailmap should make life easy, so we know
>   # it's the same person
>   echo "A  " > .mailmap

While I was looking at this, I noticed this piece of code:

diff --git a/mailmap.c b/mailmap.c
index 2a7b366..418081e 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -122,7 +122,7 @@ static char *parse_name_and_email(char *buffer, char **name,
while (nend > nstart && isspace(*nend))
--nend;

-   *name = (nstart < nend ? nstart : NULL);
+   *name = (nstart <= nend ? nstart : NULL);
*email = left+1;
*(nend+1) = '\0';
*right++ = '\0';

The function is given a buffer "A ...", nstart scans
from the beginning of the buffer, skipping whitespaces (there isn't
any, so nstart points at the buffer), while nend starts from one
byte before the first '<' and skips whitespaces backwards and ends
at the first non-whitespace (i.e. it hits "A" at the beginning of
the buffer).  nstart == nend in this case for a single-letter name,
and an off-by-one error makes it fail to pick up the name, which
makes the entry equivalent to

without the name.  I do not think this bug affected anything you
observed, though.

>   git shortlog -sne
>1  A 
>1  A 

This is coming from mailmap.c::add_mapping() that downcases the
e-mail address.

changed_em...@example.org is mapped to a...@example.org because of this
downcasing, while "A " does not have any entry for it
in the .mailmap file, so it is given back as-is.  Hence we see two
distinct entries.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/7] cat-file --batch-check performance improvements

2013-07-12 Thread Jeff King

On Fri, Jul 12, 2013 at 10:23:34AM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > The results for running (in linux.git):
> >
> >   $ git rev-list --objects --all >objects
> >   $ git cat-file --batch-check='%(objectsize:disk)' /dev/null
> 
> I can see how these patches are very logical avenue to grab only
> on-disk footprint for large number of objects, but among the type,
> payload size and on-disk footprint, I find it highly narrow niche
> that a real user or script is interested _only_ in on-disk footprint
> without even worrying about the type of object.

Yeah, I agree it is a bit of a niche. However, there are other code
paths that might want only the size and not the type (e.g., we already
know the object is a blob, but want to know size before deciding how to
handle diff). But in general, I doubt the performance impact is a big
deal there. It's only measurable when you're doing millions of objects.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] .mailmap: Map email addresses to names

2013-07-12 Thread Junio C Hamano

Stefan Beller  writes:

> Additionally to adding (name, email) mappings to the
> .mailmap file, it has also been sorted alphabetically.
> (which explains the removals, which are added
> 3 lines later on again).

What is this "3 lines later on again" about?  Is it a remnant from
the previous iteration?  If so, I can remove this "(which ...)"
locally.

> While the most changes happen at the email addresses,
> we also have a name change in here. Karl Hasselström
> is now known as Karl Wiberg due to marriage. Congratulations!

Indeed.  I like this part of the log message ;-)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 7/7] push: document --lockref

2013-07-12 Thread Johannes Sixt

Am 12.07.2013 19:40, schrieb Junio C Hamano:
> Johannes Sixt  writes:
> 
>> Am 12.07.2013 00:14, schrieb Junio C Hamano:
>>> Johannes Sixt  writes:
>>>
 Again: Why not just define +refspec as the way to achieve this check?
>>>
>>> What justification do we have to break existing people's
>>> configuration that says something like:
>>>
>>> [remote "ko"]
>>> url = kernel.org:/pub/scm/git/git.git
>>> push = master
>>> push = next
>>> push = +pu
>>> push = maint
>>>
>>> by adding a _new_ requirement they may not be able to satisify?
>>> Notice that the above is a typical "push only" publishing point,
>>> where you do not need any remote tracking branches.
>>
>> Why would it break? When you do not specify --lockref, there is no
>> change whatsoever.
> 
> I thought your suggestion "Why not just define +pu as the way to
> achieve _THIS_ check?" was to make +pu to mean
> 
>   git push ko --lockref pu
> 
> which would mean "check refs/remotes/ko/pu and make sure the remote
> side still is at that commit".
> 
> If that is not what you meant, please clarify what _THIS_ is.

We have three independent options that the user can choose in any
combination:

 o --force given or not;

 o --lockref semantics enabled or not;

 o refspec with or without +;

and these two orthogonal preconditions of the push

 o push is fast-forward or it is not ("ff", "noff");

 o the branch at the remote is at the expected rev or it is not
   ("match", "mismatch").

Here is a table with the expected outcome. "ok" means that the push is
allowed(*), "fail" means that the push is denied. (Four more lines with
--force are omitted because they have "ok" in all spots.)

   ff   noff ff  noff
  match match mismatch mismatch

--lockref +refspec okokdenied   denied
--lockref  refspec ok  denied  denied   denied
  +refspec okok  ok   ok
   refspec ok  deniedok denied

Notice that without --lockref semantics enabled, +refspec and refspec
keep the current behavior.

(*) As we are talking only about the client-side of the push here, I'm
saying "allowed" instead of "succeeds" because the server can have
additional restrictions that can make the push fail.

-- Hannes

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] http.c: fix parsing of http.sslCertPasswordProtected variable

2013-07-12 Thread Junio C Hamano

Jonathan Nieder  writes:

> FWIW the GIT_SSL_CERT_PASSWORD_PROTECTED envvar has a similar "can
> only enable" behavior, but since it's documented, that's not as big
> of a problem.  Do you remember why it was written that way?

Not me ;-).

If I have to guess, it is probably that these two are thought to be
equally trivial way to defeat existing environment settings:

(unset GIT_SSL_CERT_PASSWORD_PROTECTED ; git cmd)
GIT_SSL_CERT_PASSWORD_PROTECTED=no git cmd

> When that setting was first added[1], there was some mention of
> autodetection if libcurl could learn to do that.  Did it happen?

I do not think so, but let's see if our resident cURL guru has
something to say about it.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] .mailmap: Map email addresses to names

2013-07-12 Thread Junio C Hamano

Stefan Beller  writes:

> Ok I am sending all confirmed changes now again
> in one big patch, as the sorting was wrong.
>
> Stefan Beller (1):
>   .mailmap: Map email addresses to names

So this corresponds to both of your patches, or just the first
batch?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] .mailmap: Map email addresses to names

2013-07-12 Thread Stefan Beller

People change email addresses quite often and sometimes
forget to add their entry to the mailmap file.
I have contacted lots of people, whose name occurs
multiple times in the short log having different
email addresses. The entries in the mailmap of
this patch are either confirmed by them or are trivial.
Trivial means different capitalisation of the domain
(@MIT.EDU and @mit.edu) or the domain was localhost,
(none) or @local.

Additionally to adding (name, email) mappings to the
.mailmap file, it has also been sorted alphabetically.
(which explains the removals, which are added
3 lines later on again). The sorting was done using
export LC_ALL=C; /usr/bin/sort without arguments.

While the most changes happen at the email addresses,
we also have a name change in here. Karl Hasselström
is now known as Karl Wiberg due to marriage. Congratulations!

To find out whom to contact I used the following small
script:
-
#!/bin/bash
git shortlog -sne |awk '{ NF--; $1=""; print }' |sort |uniq -d > 
mailmapdoubles
while read line ; do
# remove leading whitespace
trimmed=$(echo $line | sed -e 's/^ *//g' -e 's/ *$//g')
echo "git shortlog -sne | grep \""$trimmed"\""
done < mailmapdoubles > mailmapdoubles2
sh mailmapdoubles2
rm mailmapdoubles
rm mailmapdoubles2
-
Also interesting for similar tasks are these snippets:

# Finding out duplicates by comparing email addresses:
git shortlog -sne |awk '{ print $NF }' |sort |uniq -d

# Finding out duplicates by comparing names:
git shortlog -sne |awk '{ NF--; $1=""; print }' |sort |uniq -d
-

Signed-off-by: Stefan Beller 
---
 .mailmap | 135 +++
 1 file changed, 110 insertions(+), 25 deletions(-)

diff --git a/.mailmap b/.mailmap
index 345cce6..22d3d70 100644
--- a/.mailmap
+++ b/.mailmap
@@ -5,99 +5,184 @@
 # same person appearing not to be so.
 #
 
+ 
+Alejandro R. Sedeño  
 Alex Bennée 
+Alex Riesen  
+Alex Riesen  
+Alex Riesen  
+Alex Vandiver  
 Alexander Gavrilov 
+Alexey Shumkin  
+Anders Kaseorg  
+Anders Kaseorg  
 Aneesh Kumar K.V 
+Bernt Hansen  
+Brandon Casey  
 Brian M. Carlson 
+Bryan Larsen  
+Bryan Larsen  
 Cheng Renquan 
 Chris Shoemaker 
 Dan Johnson 
 Dana L. How 
 Dana L. How 
 Daniel Barkalow 
+David Brown  
 David D. Kilzer 
 David Kågedal 
+David Reiss  
 David S. Miller 
 Deskin Miller 
 Dirk Süsserott 
+Eric Blake  
+Eric Hanchrow  
 Eric S. Raymond 
 Erik Faye-Lund  
+Eyvind Bernhardsen  
+Florian Achleitner  

+Franck Bui-Huu  
+Frank Lichtenheld  
+Frank Lichtenheld  
 Fredrik Kuivinen 
 Frédéric Heitzmann 
 H. Merijn Brand  H.Merijn Brand 
-H. Peter Anvin 
-H. Peter Anvin 
-H. Peter Anvin 
+H. Peter Anvin  
+H. Peter Anvin  
+H. Peter Anvin  
+H. Peter Anvin  
+Han-Wen Nienhuys  Han-Wen Nienhuys 
 Horst H. von Brand 
-İsmail Dönmez 
+J. Bruce Fields  
+J. Bruce Fields  
+J. Bruce Fields  
 Jakub Narębski 
-Jay Soffian 
+Jason Riedy  
+Jason Riedy  
+Jay Soffian  
 Jeff King  
+Jeff Muizelaar  
 Joachim Berdal Haga 
-Johannes Sixt  
-Johannes Sixt  
+Johannes Schindelin  
 Johannes Sixt  
+Johannes Sixt  
+Johannes Sixt  
 Jon Loeliger 
-Jon Seymour 
-Jonathan Nieder 
+Jon Seymour  
+Jonathan Nieder  
+Jonathan del Strother  
+Josh Triplett  
+Josh Triplett  
+Julian Phillips  
 Junio C Hamano  
-Junio C Hamano  
-Junio C Hamano  
-Junio C Hamano  
 Junio C Hamano  
 Junio C Hamano  
+Junio C Hamano  
+Junio C Hamano  
 Junio C Hamano  
-Karl Hasselström 
-Kevin Leung 
+Junio C Hamano  
+Karl Wiberg  Karl Hasselström 
+Karl Wiberg  Karl Hasselström 

+Karsten Blees  
+Karsten Blees  
+Kay Sievers  
+Kay Sievers  
+Keith Cascio  
 Kent Engstrom 
+Kevin Leung 
+Kirill Smelkov  
+Kirill Smelkov  
+Knut Franke  
 Lars Doelle 
 Lars Doelle 
 Li Hong 
-Linus Torvalds  

-Linus Torvalds  
-Linus Torvalds  
 Linus Torvalds  
-Linus Torvalds  
+Linus Torvalds  
+Linus Torvalds  
 Linus Torvalds  

-Lukas Sandström 
+Linus Torvalds  
+Linus Torvalds  

+Lukas Sandström  
+Marc Khouzam  
 Marc-André Lureau 
+Marco Costalba  
+Mark Levedahl  
 Mark Rada 
 Martin Langhoff  
 Martin von Zweigbergk  
+Matt Draisey  
+Matt Kraai  
+Matthias Kestenholz  
+Matthias Urlichs  
+Matthias Urlichs  
 Michael Coleman 
 Michael J Gruber  
 Michael W. Olson 
+Michael Witten  
+Michael Witten  
 Michele Ballabio 
+Miklos Vajna  
+Namhyung Kim  
+Namhyung Kim  
 Nanako Shiraishi 
 Nanako Shiraishi 
+Nelson Elhage  
+Nelson Elhage  
 Nguyễn Thái Ngọc Duy 
- 
-Peter Krefting  
+Nicolas Morey-Chaisemartin  

+Nicolas Morey-Chaisemartin  

+Paolo Bonzini  
+Pascal Obry  
+Pascal Obry  
+Pat Notz  
+Paul Mackerras  
+Paul Mackerras  
 Peter Krefting  
+Peter Krefting  
 Petr Baudis  
+Petr Baudis  
+Phil Hord  
+Philip Jägenstedt  
+Philipp A. Hartmann  
 Philippe Bruhat 
 Ralf Thielow  
 Ramsay Allan Jones 
 René Scharfe 
 Robert Fitzsimons 
 Robert Zeh 
-Sam Vilain 
-Santi Béjar 
+Robin Rosenberg  
+Salikh Zakirov  
+Sam Vilain  
+Santi Béjar  
 Sean Estab

[PATCH] .mailmap: Map email addresses to names

2013-07-12 Thread Stefan Beller

Ok I am sending all confirmed changes now again
in one big patch, as the sorting was wrong.

Stefan Beller (1):
  .mailmap: Map email addresses to names

 .mailmap | 135 +++
 1 file changed, 110 insertions(+), 25 deletions(-)

-- 
1.8.3.2.790.g9192b0b

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] .mailmap: Map email addresses to names

2013-07-12 Thread Stefan Beller

>From the man page

*** WARNING *** The locale specified by the environment affects sort
order.  Set LC_ALL=C to get the traditional sort  order  that  uses
   native byte values.

And indeed I can avoid being nitpicked again for wrong sorting. ;)

On 07/12/2013 09:02 PM, Stefan Beller wrote:
> 
> Which tool would you recommend to sort stuff?
> Or rather the exact parameters for /usr/bin/sort ?
> 
> Thanks,
> Stefan
> 
> On 07/12/2013 08:57 PM, Jonathan Nieder wrote:
>> Stefan Beller wrote:
>>
>>> --- a/.mailmap
>>> +++ b/.mailmap
>>> @@ -5,99 +5,146 @@
>> [...]
>>>  Chris Shoemaker 
>>> -Dan Johnson 
>>>  Dana L. How 
>>>  Dana L. How 
>>>  Daniel Barkalow 
>>> +Dan Johnson 
>>
>> Small nit: the sorting looks broken here and in similar places below
>> (the usual ordering is Dan < Dana < Daniel).
>>
>> Thanks,
>> Jonathan
>>
> 

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] http.c: fix parsing of http.sslCertPasswordProtected variable

2013-07-12 Thread Jonathan Nieder

Junio C Hamano wrote:

> The existing code triggers only when the configuration variable is
> set to true.  Once the variable is set to true in a more generic
> configuration file (e.g. ~/.gitconfig), it cannot be overriden to
> false in the repository specific one (e.g. .git/config).
[...]
> --- a/http.c
> +++ b/http.c
> @@ -160,8 +160,7 @@ static int http_options(const char *var, const char 
> *value, void *cb)
>   if (!strcmp("http.sslcainfo", var))
>   return git_config_string(&ssl_cainfo, var, value);
>   if (!strcmp("http.sslcertpasswordprotected", var)) {
> - if (git_config_bool(var, value))
> - ssl_cert_password_required = 1;
> + ssl_cert_password_required = git_config_bool(var, value);

Thanks for catching it.  The documentation doesn't say anything about
this "can only enable and cannot disable" behavior and the usual
pattern is to allow later settings to override earlier ones, so this
change looks good.

Reviewed-by: Jonathan Nieder 

FWIW the GIT_SSL_CERT_PASSWORD_PROTECTED envvar has a similar "can
only enable" behavior, but since it's documented, that's not as big
of a problem.  Do you remember why it was written that way?

When that setting was first added[1], there was some mention of
autodetection if libcurl could learn to do that.  Did it happen?
(Please forgive my ignorance.)

Thanks,
Jonathan

[1] http://thread.gmane.org/gmane.comp.version-control.git/122793/focus=122816
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] .mailmap: Map email addresses to names

2013-07-12 Thread Stefan Beller


Which tool would you recommend to sort stuff?
Or rather the exact parameters for /usr/bin/sort ?

Thanks,
Stefan

On 07/12/2013 08:57 PM, Jonathan Nieder wrote:
> Stefan Beller wrote:
> 
>> --- a/.mailmap
>> +++ b/.mailmap
>> @@ -5,99 +5,146 @@
> [...]
>>  Chris Shoemaker 
>> -Dan Johnson 
>>  Dana L. How 
>>  Dana L. How 
>>  Daniel Barkalow 
>> +Dan Johnson 
> 
> Small nit: the sorting looks broken here and in similar places below
> (the usual ordering is Dan < Dana < Daniel).
> 
> Thanks,
> Jonathan
> 

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] .mailmap: Map email addresses to names

2013-07-12 Thread Jonathan Nieder

Stefan Beller wrote:

> --- a/.mailmap
> +++ b/.mailmap
> @@ -5,99 +5,146 @@
[...]
>  Chris Shoemaker 
> -Dan Johnson 
>  Dana L. How 
>  Dana L. How 
>  Daniel Barkalow 
> +Dan Johnson 

Small nit: the sorting looks broken here and in similar places below
(the usual ordering is Dan < Dana < Daniel).

Thanks,
Jonathan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] http.c: fix parsing of http.sslCertPasswordProtected variable

2013-07-12 Thread Junio C Hamano

The existing code triggers only when the configuration variable is
set to true.  Once the variable is set to true in a more generic
configuration file (e.g. ~/.gitconfig), it cannot be overriden to
false in the repository specific one (e.g. .git/config).

Signed-off-by: Junio C Hamano 
---
 http.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/http.c b/http.c
index 92aba59..37986f8 100644
--- a/http.c
+++ b/http.c
@@ -160,8 +160,7 @@ static int http_options(const char *var, const char *value, 
void *cb)
if (!strcmp("http.sslcainfo", var))
return git_config_string(&ssl_cainfo, var, value);
if (!strcmp("http.sslcertpasswordprotected", var)) {
-   if (git_config_bool(var, value))
-   ssl_cert_password_required = 1;
+   ssl_cert_password_required = git_config_bool(var, value);
return 0;
}
if (!strcmp("http.ssltry", var)) {
-- 
1.8.3.2-942-gc84dfcb

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4] config: add support for http..* settings

2013-07-12 Thread Junio C Hamano

"Kyle J. McKay"  writes:

> + if (!strcmp("sslcertpasswordprotected", key)) {
> + if (check_matched_len(opt_passwd_req, matchlen))
> + return 0;
>   if (git_config_bool(var, value))
>   ssl_cert_password_required = 1;
>   return 0;
>   }

This is not a new problem, but I think the existing code is wrong.
There is no way to countermand an earlier

[http]
sslcertpasswordprotected

in a more generic configuration file with


[http]
sslcertpasswordprotected = no

in a repository specific configuration file.

Perhaps we should fix it as a preparatory patch (1/2) before the
main "feature addition" patch.

> - if (!strcmp("http.ssltry", var)) {
> + if (!strcmp("ssltry", key)) {
> + if (check_matched_len(opt_ssl_try, matchlen))
> + return 0;
>   curl_ssl_try = git_config_bool(var, value);
>   return 0;
>   }
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4] config: add support for http..* settings

2013-07-12 Thread Junio C Hamano

"Kyle J. McKay"  writes:

> The  value is considered a match to a url if the  value
> is either an exact match or a prefix of the url which ends on a
> path component boundary ('/').  So "https://example.com/test"; will
> match "https://example.com/test"; and "https://example.com/test/too";
> but not "https://example.com/testextra";.
>
> Longer matches take precedence over shorter matches with
> environment variable settings taking precedence over all.
>
> With this configuration:
>
> [http]
> useragent = other-agent
> noEPSV = true
> [http "https://example.com";]
> useragent = example-agent
> sslVerify = false
> [http "https://example.com/path";]
> useragent = path-agent
>
> The "https://other.example.com/"; url will have useragent
> "other-agent" and sslVerify will be on.
>
> The "https://example.com/"; url will have useragent
> "example-agent" and sslVerify will be off.
>
> The "https://example.com/path/sub"; url will have useragent
> "path-agent" and sslVerify will be off.
>
> All three of the examples will have noEPSV enabled.
>
> Signed-off-by: Kyle J. McKay 
> Reviewed-by: Junio C Hamano 

I haven't reviewed this version (yet).

> ---
>
> The credentials configuration values already support url-specific
> configuration items in the form credential..*.  This patch
> adds similar support for http configuration values.

Let's move the above three lines into the proposed log message and
make it its first paragraph.  A log message should say what it is
about (i.e. what does "add support" really mean?  what kind of
functionality the new support brings to us?) upfront and then
explain what it does in detail (i.e. the explanation of matching
semantics you have as the first paragraph).

> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index b4d4887..3731a3a 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1531,6 +1531,17 @@ http.useragent::
>   of common USER_AGENT strings (but not including those like git/1.7.1).
>   Can be overridden by the 'GIT_HTTP_USER_AGENT' environment variable.
>  
> +http..*::
> + Any of the http.* options above can be applied selectively to some urls.
> + For example "http.https://example.com.useragent"; would set the user
> + agent only for https connections to example.com.  The  value
> + matches a url if it is an exact match or a prefix of the url matching
> + at a '/' boundary.

Given Peff's review, I am no longer sure if the "strictly textual
match" is the semantics we want.  At least, it is easy to explain
and understand, but it might be too limiting to be useful.

Let's assume it is what we want, for the rest of the review.

> diff --git a/http.c b/http.c
> index 2d086ae..feca70a 100644
> --- a/http.c
> +++ b/http.c
> @@ -30,6 +30,34 @@ static CURL *curl_default;
>  
>  char curl_errorstr[CURL_ERROR_SIZE];
>  
> +enum http_option_type {
> + opt_post_buffer = 0,

Do we need to have "= 0" here?

Is this order meant to match some externally defined order
(e.g. alphabetical, or the value of corresponding constants defined
in the cURL library), other than "opt_max is not a real option but
just has to be there at the end to count all of them"?

I am wondering if it would make it easier to read to move all the #ifdef
at the end immediately before opt-max, or something.

> + opt_min_sessions,
> +#ifdef USE_CURL_MULTI
> + opt_max_requests,
> +#endif
> +...
> + opt_user_agent,
> + opt_passwd_req,
> + opt_max
> +};

> +static int check_matched_len(enum http_option_type opt, size_t matchlen)
> +{
> + /*
> +  * Check the last matched length of option opt against matchlen and
> +  * return true if the last matched length is larger (meaning the config
> +  * setting should be ignored).  Upon seeing the _same_ key (i.e. new key
> +  * has the same match length which is therefore not larger) the new
> +  * setting will override the previous setting.  Otherwise return false
> +  * and record matchlen as the current last matched length of option opt.
> +  */
> + if (http_option_max_matched_len[opt] > matchlen)
> + return 1;
> + http_option_max_matched_len[opt] = matchlen;
> + return 0;
> +}

A "check_foo" that returns either 0 or 1 usually mean 1 is good and
0 is not.  A "check_foo" that returns either negative or 0 usually
mean negative is an error and 0 is good.

In general, "check_foo" is a bad name for a boolean function.  This
checks "seen_longer_match()", and it is up to the caller if it
considers good or bad to have a matching element that is longer or
shorter what it has already seen.  See below.

>  static int http_options(const char *var, const char *value, void *cb)
>  {
> - if (!strcmp("http.sslverify", var)) {
> + const char *url = (const char *)cb;

Cast?

> + if (!strcmp("sslverify", key)) {
> + if (check_matched_len(opt_ssl_verify, matchlen))
> +

Re: Bug in .mailmap handling?

2013-07-12 Thread Stefan Beller

The commands are exactly as given in the first mail.
(I run it multiple times, and this exact behavior comes up)

I think there is some problem with the capitalisation
of the first (if not all) characters. (Hence the small 'a')

So either check with these example commands yourself, or take my latest
patch for the mailmap to reproduce on real names, please.

On 07/12/2013 07:35 PM, Junio C Hamano wrote:
> Stefan Beller  writes:
> 
>> Hello,
>>
>> you may have noticed I am currently trying to bring the 
>> mailmap file of git itself up to date. I noticed
>> some behavior, which I did not expect. Have a look yourself:
>>
>> ---
>>  # prepare test environment:
>>  mkdir testmailmap
>>  cd testmailmap/
>>  git init
>>
>>  # do a commit:
>>  echo "asdf" > test1 
>>  git add test1
>>  git commit -a --author="A " -m "add test1"
>>
>>  # commit with same name, but different email 
>>  # (different capitalization does the trick already, 
>>  # but here I am going to use a different mail)
>>  echo "asdf" > test2
>>  git add test2
>>  git commit -a --author="A " -m "add test2"
>>
>>  # how do we know it's the same person?
>>  git shortlog
>>  A (2):
>>add test1
>>add test2
> 
> You don't, and it is a long known behaviour.
> 
>>  # reports as expected:
>>  git shortlog -sne
>>1  A 
>>1  A 
> 
> Yes.
> 
>>  # Adding the line to the mailmap should make life easy, so we know
>>  # it's the same person
>>  echo "A  " > .mailmap
>>
>>  # Come on, I just wanted to have it reported as one person!
>>  git shortlog -sne
>>   1  A 
>>   1  A 
> 
> Err, where does the lowercase a@ come from in the above?  Are we
> missing some steps before we get here?
> 
>>  # So let's try another line in the mailmap file, (small 'a')
>>  echo "A  " > .mailmap
> 
> This is ">", not ">>", I presume?  Otherwise changed_email is mapped
> to two destination, no?
> 
>>  # We're not there yet?
>>  git shortlog -sne
>>   1  A 
>>   1  A 
> 
> Expected, as long as some hidden set-up you did not describe that
> caused me to say "Err, where does the lowercase a@ come from" is
> there, i.e. one of the two commits is done by .
> 
>>  # Now let's write it rather explicit: 
>>  # (essentially just write 2 lines into the mailmap file)
>>  cat << EOF > .mailmap
>>  A  
>>  A  
>>  EOF
>>   
>>  # works as expected now
>>  git shortlog -sne
>>   2  A 
> 
> Makes sense.
> 
>>  # works as expected now as well
>>  git shortlog  
>>  A (2):
>>add test1
>>add test2




signature.asc
Description: OpenPGP digital signature

Re: [PATCH 7/7] push: document --lockref

2013-07-12 Thread Junio C Hamano

Johannes Sixt  writes:

> Am 12.07.2013 00:14, schrieb Junio C Hamano:
>> Johannes Sixt  writes:
>> 
>>> Again: Why not just define +refspec as the way to achieve this check?
>> 
>> What justification do we have to break existing people's
>> configuration that says something like:
>> 
>>  [remote "ko"]
>>  url = kernel.org:/pub/scm/git/git.git
>> push = master
>> push = next
>> push = +pu
>> push = maint
>> 
>> by adding a _new_ requirement they may not be able to satisify?
>> Notice that the above is a typical "push only" publishing point,
>> where you do not need any remote tracking branches.
>
> Why would it break? When you do not specify --lockref, there is no
> change whatsoever.

I thought your suggestion "Why not just define +pu as the way to
achieve _THIS_ check?" was to make +pu to mean

git push ko --lockref pu

which would mean "check refs/remotes/ko/pu and make sure the remote
side still is at that commit".

If that is not what you meant, please clarify what _THIS_ is.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Bug in .mailmap handling?

2013-07-12 Thread Junio C Hamano

Stefan Beller  writes:

> Hello,
>
> you may have noticed I am currently trying to bring the 
> mailmap file of git itself up to date. I noticed
> some behavior, which I did not expect. Have a look yourself:
>
> ---
>   # prepare test environment:
>   mkdir testmailmap
>   cd testmailmap/
>   git init
>
>   # do a commit:
>   echo "asdf" > test1 
>   git add test1
>   git commit -a --author="A " -m "add test1"
>
>   # commit with same name, but different email 
>   # (different capitalization does the trick already, 
>   # but here I am going to use a different mail)
>   echo "asdf" > test2
>   git add test2
>   git commit -a --author="A " -m "add test2"
>
>   # how do we know it's the same person?
>   git shortlog
>   A (2):
> add test1
> add test2

You don't, and it is a long known behaviour.

>   # reports as expected:
>   git shortlog -sne
> 1  A 
> 1  A 

Yes.

>   # Adding the line to the mailmap should make life easy, so we know
>   # it's the same person
>   echo "A  " > .mailmap
>
>   # Come on, I just wanted to have it reported as one person!
>   git shortlog -sne
>1  A 
>1  A 

Err, where does the lowercase a@ come from in the above?  Are we
missing some steps before we get here?

>   # So let's try another line in the mailmap file, (small 'a')
>   echo "A  " > .mailmap

This is ">", not ">>", I presume?  Otherwise changed_email is mapped
to two destination, no?

>   # We're not there yet?
>   git shortlog -sne
>1  A 
>1  A 

Expected, as long as some hidden set-up you did not describe that
caused me to say "Err, where does the lowercase a@ come from" is
there, i.e. one of the two commits is done by .

>   # Now let's write it rather explicit: 
>   # (essentially just write 2 lines into the mailmap file)
>   cat << EOF > .mailmap
>   A  
>   A  
>   EOF
>
>   # works as expected now
>   git shortlog -sne
>2  A 

Makes sense.

>   # works as expected now as well
>   git shortlog  
>   A (2):
> add test1
> add test2
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 19/19] p0003-index.sh: add perf test for the index formats

2013-07-12 Thread Thomas Gummerer

From: Thomas Rast 

Add a performance test for index version [23]/4/5 by using
git update-index --index-version=x, thus testing both the reader
and the writer speed of all index formats.

Signed-off-by: Thomas Rast 
Signed-off-by: Thomas Gummerer 
---
 t/perf/p0003-index.sh | 59 +++
 1 file changed, 59 insertions(+)
 create mode 100755 t/perf/p0003-index.sh

diff --git a/t/perf/p0003-index.sh b/t/perf/p0003-index.sh
new file mode 100755
index 000..3e02868
--- /dev/null
+++ b/t/perf/p0003-index.sh
@@ -0,0 +1,59 @@
+#!/bin/sh
+
+test_description="Tests index versions [23]/4/5"
+
+. ./perf-lib.sh
+
+test_perf_large_repo
+
+test_expect_success "convert to v3" "
+   git update-index --index-version=2
+"
+
+test_perf "v[23]: update-index" "
+   git update-index --index-version=2 >/dev/null
+"
+
+subdir=$(git ls-files | sed 's#/[^/]*$##' | grep -v '^$' | uniq | tail -n 30 | 
head -1)
+
+test_perf "v[23]: grep nonexistent -- subdir" "
+   test_must_fail git grep nonexistent -- $subdir >/dev/null
+"
+
+test_perf "v[23]: ls-files -- subdir" "
+   git ls-files $subdir >/dev/null
+"
+
+test_expect_success "convert to v4" "
+   git update-index --index-version=4
+"
+
+test_perf "v4: update-index" "
+   git update-index --index-version=4 >/dev/null
+"
+
+test_perf "v4: grep nonexistent -- subdir" "
+   test_must_fail git grep nonexistent -- $subdir >/dev/null
+"
+
+test_perf "v4: ls-files -- subdir" "
+   git ls-files $subdir >/dev/null
+"
+
+test_expect_success "convert to v5" "
+   git update-index --index-version=5
+"
+
+test_perf "v5: update-index" "
+   git update-index --index-version=5 >/dev/null
+"
+
+test_perf "v5: grep nonexistent -- subdir" "
+   test_must_fail git grep nonexistent -- $subdir >/dev/null
+"
+
+test_perf "v5: ls-files -- subdir" "
+   git ls-files $subdir >/dev/null
+"
+
+test_done
-- 
1.8.3.453.g1dfc63d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 15/19] read-cache: write index-v5

2013-07-12 Thread Thomas Gummerer

Write the index version 5 file format to disk. This version doesn't
write the cache-tree data and resolve-undo data to the file.

The main work is done when filtering out the directories from the
current in-memory format, where in the same turn also the conflicts
and the file data is calculated.

Helped-by: Nguyen Thai Ngoc Duy 
Helped-by: Thomas Rast 
Signed-off-by: Thomas Gummerer 
---
 cache.h |   8 +
 read-cache-v5.c | 594 +++-
 read-cache.c|  11 +-
 read-cache.h|   1 +
 4 files changed, 611 insertions(+), 3 deletions(-)

diff --git a/cache.h b/cache.h
index 1e5cc77..f3685f8 100644
--- a/cache.h
+++ b/cache.h
@@ -565,6 +565,7 @@ extern int unmerged_index(const struct index_state *);
 extern int verify_path(const char *path);
 extern struct cache_entry *index_name_exists(struct index_state *istate, const 
char *name, int namelen, int igncase);
 extern int index_name_pos(const struct index_state *, const char *name, int 
namelen);
+extern struct directory_entry *init_directory_entry(char *pathname, int len);
 #define ADD_CACHE_OK_TO_ADD 1  /* Ok to add */
 #define ADD_CACHE_OK_TO_REPLACE 2  /* Ok to replace file/directory */
 #define ADD_CACHE_SKIP_DFCHECK 4   /* Ok to skip DF conflict checks */
@@ -1363,6 +1364,13 @@ static inline ssize_t write_str_in_full(int fd, const 
char *str)
return write_in_full(fd, str, strlen(str));
 }
 
+/* index-v5 helper functions */
+extern char *super_directory(const char *filename);
+extern void insert_directory_entry(struct directory_entry *, struct hash_table 
*, int *, unsigned int *, uint32_t);
+extern void add_conflict_to_directory_entry(struct directory_entry *, struct 
conflict_entry *);
+extern void add_part_to_conflict_entry(struct directory_entry *, struct 
conflict_entry *, struct conflict_part *);
+extern struct conflict_entry *create_new_conflict(char *, int, int);
+
 /* pager.c */
 extern void setup_pager(void);
 extern const char *pager_program;
diff --git a/read-cache-v5.c b/read-cache-v5.c
index 0b9c320..33667d7 100644
--- a/read-cache-v5.c
+++ b/read-cache-v5.c
@@ -769,9 +769,601 @@ static int read_index_v5(struct index_state *istate, void 
*mmap,
return 0;
 }
 
+#define WRITE_BUFFER_SIZE 8192
+static unsigned char write_buffer[WRITE_BUFFER_SIZE];
+static unsigned long write_buffer_len;
+
+static int ce_write_flush(int fd)
+{
+   unsigned int buffered = write_buffer_len;
+   if (buffered) {
+   if (write_in_full(fd, write_buffer, buffered) != buffered)
+   return -1;
+   write_buffer_len = 0;
+   }
+   return 0;
+}
+
+static int ce_write(uint32_t *crc, int fd, void *data, unsigned int len)
+{
+   if (crc)
+   *crc = crc32(*crc, (Bytef*)data, len);
+   while (len) {
+   unsigned int buffered = write_buffer_len;
+   unsigned int partial = WRITE_BUFFER_SIZE - buffered;
+   if (partial > len)
+   partial = len;
+   memcpy(write_buffer + buffered, data, partial);
+   buffered += partial;
+   if (buffered == WRITE_BUFFER_SIZE) {
+   write_buffer_len = buffered;
+   if (ce_write_flush(fd))
+   return -1;
+   buffered = 0;
+   }
+   write_buffer_len = buffered;
+   len -= partial;
+   data = (char *) data + partial;
+   }
+   return 0;
+}
+
+static int ce_flush(int fd)
+{
+   unsigned int left = write_buffer_len;
+
+   if (left)
+   write_buffer_len = 0;
+
+   if (write_in_full(fd, write_buffer, left) != left)
+   return -1;
+
+   return 0;
+}
+
+static void ce_smudge_racily_clean_entry(struct cache_entry *ce)
+{
+   /*
+* This method shall only be called if the timestamp of ce
+* is racy (check with is_racy_timestamp). If the timestamp
+* is racy, the writer will set the CE_SMUDGED flag.
+*
+* The reader (match_stat_basic) will then take care
+* of checking if the entry is really changed or not, by
+* taking into account the size and the stat_crc and if
+* that hasn't changed checking the sha1.
+*/
+   ce->ce_flags |= CE_SMUDGED;
+}
+
+char *super_directory(const char *filename)
+{
+   char *slash;
+
+   slash = strrchr(filename, '/');
+   if (slash)
+   return xmemdupz(filename, slash-filename);
+   return NULL;
+}
+
+struct directory_entry *init_directory_entry(char *pathname, int len)
+{
+   struct directory_entry *de = xmalloc(directory_entry_size(len));
+
+   memcpy(de->pathname, pathname, len);
+   de->pathname[len] = '\0';
+   de->de_flags  = 0;
+   de->de_foffset= 0;
+   de->de_cr = 0;
+   de->de_ncr= 0;
+   de->de_nsubtrees  = 0;
+   de->de

[PATCH v2 17/19] read-cache: write resolve-undo data for index-v5

2013-07-12 Thread Thomas Gummerer

Make git read the resolve-undo data from the index.

Since the resolve-undo data is joined with the conflicts in
the ondisk format of the index file version 5, conflicts and
resolved data is read at the same time, and the resolve-undo
data is then converted to the in-memory format.

Helped-by: Thomas Rast 
Signed-off-by: Thomas Gummerer 
---
 read-cache-v5.c | 94 +
 1 file changed, 94 insertions(+)

diff --git a/read-cache-v5.c b/read-cache-v5.c
index cd819b4..093ee1a 100644
--- a/read-cache-v5.c
+++ b/read-cache-v5.c
@@ -992,6 +992,99 @@ static void cache_tree_to_ondisk_v5(struct hash_table 
*table, struct cache_tree
convert_one_to_ondisk_v5(table, root, "", 0, 0);
 }
 
+static void resolve_undo_to_ondisk_v5(struct hash_table *table,
+ struct string_list *resolve_undo,
+ unsigned int *ndir, int *total_dir_len,
+ struct directory_entry *de)
+{
+   struct string_list_item *item;
+   struct directory_entry *search;
+
+   if (!resolve_undo)
+   return;
+   for_each_string_list_item(item, resolve_undo) {
+   struct conflict_entry *conflict_entry;
+   struct resolve_undo_info *ui = item->util;
+   char *super;
+   int i, dir_len, len;
+   uint32_t crc;
+   struct directory_entry *found, *current, *new_tree;
+
+   if (!ui)
+   continue;
+
+   super = super_directory(item->string);
+   if (!super)
+   dir_len = 0;
+   else
+   dir_len = strlen(super);
+   crc = crc32(0, (Bytef*)super, dir_len);
+   found = lookup_hash(crc, table);
+   current = NULL;
+   new_tree = NULL;
+
+   while (!found) {
+   struct directory_entry *new;
+
+   new = init_directory_entry(super, dir_len);
+   if (!current)
+   current = new;
+   insert_directory_entry(new, table, total_dir_len, ndir, 
crc);
+   if (new_tree != NULL)
+   new->de_nsubtrees = 1;
+   new->next = new_tree;
+   new_tree = new;
+   super = super_directory(super);
+   if (!super)
+   dir_len = 0;
+   else
+   dir_len = strlen(super);
+   crc = crc32(0, (Bytef*)super, dir_len);
+   found = lookup_hash(crc, table);
+   }
+   search = found;
+   while (search->next_hash && strcmp(super, search->pathname) != 
0)
+   search = search->next_hash;
+   if (search && !current)
+   current = search;
+   if (!search && !current)
+   current = new_tree;
+   if (!super && new_tree) {
+   new_tree->next = de->next;
+   de->next = new_tree;
+   de->de_nsubtrees++;
+   } else if (new_tree) {
+   struct directory_entry *temp;
+
+   search = de->next;
+   while (strcmp(super, search->pathname))
+   search = search->next;
+   temp = new_tree;
+   while (temp->next)
+   temp = temp->next;
+   search->de_nsubtrees++;
+   temp->next = search->next;
+   search->next = new_tree;
+   }
+
+   len = strlen(item->string);
+   conflict_entry = create_new_conflict(item->string, len, 
current->de_pathlen);
+   add_conflict_to_directory_entry(current, conflict_entry);
+   for (i = 0; i < 3; i++) {
+   if (ui->mode[i]) {
+   struct conflict_part *cp;
+
+   cp = xmalloc(sizeof(struct conflict_part));
+   cp->flags = (i + 1) << CONFLICT_STAGESHIFT;
+   cp->entry_mode = ui->mode[i];
+   cp->next = NULL;
+   hashcpy(cp->sha1, ui->sha1[i]);
+   add_part_to_conflict_entry(current, 
conflict_entry, cp);
+   }
+   }
+   }
+}
+
 static struct directory_entry *compile_directory_data(struct index_state 
*istate,
int nfile,
unsigned int *ndir,
@@ -1099,6 +1192,7 @@ static struct directory_entry 
*compile_directory_data(struc

[PATCH v2 16/19] read-cache: write index-v5 cache-tree data

2013-07-12 Thread Thomas Gummerer

Write the cache-tree data for the index version 5 file format. The
in-memory cache-tree data is converted to the ondisk format, by adding
it to the directory entries, that were compiled from the cache-entries
in the step before.

Signed-off-by: Thomas Gummerer 
---
 read-cache-v5.c | 53 +
 1 file changed, 53 insertions(+)

diff --git a/read-cache-v5.c b/read-cache-v5.c
index 33667d7..cd819b4 100644
--- a/read-cache-v5.c
+++ b/read-cache-v5.c
@@ -941,6 +941,57 @@ static struct conflict_entry 
*create_conflict_entry_from_ce(struct cache_entry *
return create_new_conflict(ce->name, ce_namelen(ce), pathlen);
 }
 
+static void convert_one_to_ondisk_v5(struct hash_table *table, struct 
cache_tree *it,
+   const char *path, int pathlen, uint32_t crc)
+{
+   int i;
+   struct directory_entry *found, *search;
+
+   crc = crc32(crc, (Bytef*)path, pathlen);
+   found = lookup_hash(crc, table);
+   search = found;
+   while (search && strcmp(path, search->pathname + search->de_pathlen - 
strlen(path)) != 0)
+   search = search->next_hash;
+   if (!search)
+   return;
+   /*
+* The number of subtrees is already calculated by
+* compile_directory_data, therefore we only need to
+* add the entry_count
+*/
+   search->de_nentries = it->entry_count;
+   if (0 <= it->entry_count)
+   hashcpy(search->sha1, it->sha1);
+   if (strcmp(path, "") != 0)
+   crc = crc32(crc, (Bytef*)"/", 1);
+
+#if DEBUG
+   if (0 <= it->entry_count)
+   fprintf(stderr, "cache-tree <%.*s> (%d ent, %d subtree) %s\n",
+   pathlen, path, it->entry_count, it->subtree_nr,
+   sha1_to_hex(it->sha1));
+   else
+   fprintf(stderr, "cache-tree <%.*s> (%d subtree) invalid\n",
+   pathlen, path, it->subtree_nr);
+#endif
+
+   for (i = 0; i < it->subtree_nr; i++) {
+   struct cache_tree_sub *down = it->down[i];
+   if (i) {
+   struct cache_tree_sub *prev = it->down[i-1];
+   if (subtree_name_cmp(down->name, down->namelen,
+prev->name, prev->namelen) <= 0)
+   die("fatal - unsorted cache subtree");
+   }
+   convert_one_to_ondisk_v5(table, down->cache_tree, down->name, 
down->namelen, crc);
+   }
+}
+
+static void cache_tree_to_ondisk_v5(struct hash_table *table, struct 
cache_tree *root)
+{
+   convert_one_to_ondisk_v5(table, root, "", 0, 0);
+}
+
 static struct directory_entry *compile_directory_data(struct index_state 
*istate,
int nfile,
unsigned int *ndir,
@@ -1046,6 +1097,8 @@ static struct directory_entry 
*compile_directory_data(struct index_state *istate
previous_entry->next = no_subtrees;
}
}
+   if (istate->cache_tree)
+   cache_tree_to_ondisk_v5(&table, istate->cache_tree);
return de;
 }
 
-- 
1.8.3.453.g1dfc63d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 18/19] update-index.c: rewrite index when index-version is given

2013-07-12 Thread Thomas Gummerer

Make update-index always rewrite the index when a index-version
is given, even if the index already has the right version.
This option is used for performance testing the writer and
reader.

Signed-off-by: Thomas Gummerer 
---
 builtin/update-index.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 4c6e3a6..7e723c0 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -6,6 +6,7 @@
 #include "cache.h"
 #include "quote.h"
 #include "cache-tree.h"
+#include "read-cache.h"
 #include "tree-walk.h"
 #include "builtin.h"
 #include "refs.h"
@@ -863,8 +864,7 @@ int cmd_update_index(int argc, const char **argv, const 
char *prefix)
preferred_index_format,
INDEX_FORMAT_LB, INDEX_FORMAT_UB);
 
-   if (the_index.version != preferred_index_format)
-   active_cache_changed = 1;
+   active_cache_changed = 1;
the_index.version = preferred_index_format;
}
 
-- 
1.8.3.453.g1dfc63d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 13/19] read-cache: read resolve-undo data

2013-07-12 Thread Thomas Gummerer

Make git read the resolve-undo data from the index.

Since the resolve-undo data is joined with the conflicts in
the ondisk format of the index file version 5, conflicts and
resolved data is read at the same time, and the resolve-undo
data is then converted to the in-memory format.

Helped-by: Thomas Rast 
Signed-off-by: Thomas Gummerer 
---
 read-cache-v5.c | 39 +++
 1 file changed, 39 insertions(+)

diff --git a/read-cache-v5.c b/read-cache-v5.c
index 00112ea..853b97d 100644
--- a/read-cache-v5.c
+++ b/read-cache-v5.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "read-cache.h"
+#include "string-list.h"
 #include "resolve-undo.h"
 #include "cache-tree.h"
 #include "dir.h"
@@ -447,6 +448,43 @@ static int read_conflicts(struct conflict_entry **head,
return 0;
 }
 
+static void resolve_undo_convert_v5(struct index_state *istate,
+   struct conflict_entry *conflict)
+{
+   int i;
+
+   while (conflict) {
+   struct string_list_item *lost;
+   struct resolve_undo_info *ui;
+   struct conflict_part *cp;
+
+   if (conflict->entries &&
+   (conflict->entries->flags & CONFLICT_CONFLICTED) != 0) {
+   conflict = conflict->next;
+   continue;
+   }
+   if (!istate->resolve_undo) {
+   istate->resolve_undo = xcalloc(1, sizeof(struct 
string_list));
+   istate->resolve_undo->strdup_strings = 1;
+   }
+
+   lost = string_list_insert(istate->resolve_undo, conflict->name);
+   if (!lost->util)
+   lost->util = xcalloc(1, sizeof(*ui));
+   ui = lost->util;
+
+   cp = conflict->entries;
+   for (i = 0; i < 3; i++)
+   ui->mode[i] = 0;
+   while (cp) {
+   ui->mode[conflict_stage(cp) - 1] = cp->entry_mode;
+   hashcpy(ui->sha1[conflict_stage(cp) - 1], cp->sha1);
+   cp = cp->next;
+   }
+   conflict = conflict->next;
+   }
+}
+
 static int read_entries(struct index_state *istate, struct directory_entry 
**de,
unsigned int *entry_offset, void **mmap,
unsigned long mmap_size, unsigned int *nr,
@@ -460,6 +498,7 @@ static int read_entries(struct index_state *istate, struct 
directory_entry **de,
conflict_queue = NULL;
if (read_conflicts(&conflict_queue, *de, mmap, mmap_size) < 0)
return -1;
+   resolve_undo_convert_v5(istate, conflict_queue);
for (i = 0; i < (*de)->de_nfiles; i++) {
if (read_entry(&ce,
   *de,
-- 
1.8.3.453.g1dfc63d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 12/19] read-cache: read index-v5

2013-07-12 Thread Thomas Gummerer

Make git read the index file version 5 without complaining.

This version of the reader doesn't read neither the cache-tree
nor the resolve undo data, but doesn't choke on an index that
includes such data.

Helped-by: Junio C Hamano 
Helped-by: Nguyen Thai Ngoc Duy 
Helped-by: Thomas Rast 
Signed-off-by: Thomas Gummerer 
---
 Makefile|   1 +
 cache.h |  75 ++-
 read-cache-v5.c | 638 
 read-cache.h|   1 +
 4 files changed, 714 insertions(+), 1 deletion(-)
 create mode 100644 read-cache-v5.c

diff --git a/Makefile b/Makefile
index 73369ae..80e35f5 100644
--- a/Makefile
+++ b/Makefile
@@ -856,6 +856,7 @@ LIB_OBJS += quote.o
 LIB_OBJS += reachable.o
 LIB_OBJS += read-cache.o
 LIB_OBJS += read-cache-v2.o
+LIB_OBJS += read-cache-v5.o
 LIB_OBJS += reflog-walk.o
 LIB_OBJS += refs.o
 LIB_OBJS += remote.o
diff --git a/cache.h b/cache.h
index 2097105..1e5cc77 100644
--- a/cache.h
+++ b/cache.h
@@ -99,7 +99,7 @@ unsigned long git_deflate_bound(git_zstream *, unsigned long);
 #define CACHE_SIGNATURE 0x44495243 /* "DIRC" */
 
 #define INDEX_FORMAT_LB 2
-#define INDEX_FORMAT_UB 4
+#define INDEX_FORMAT_UB 5
 
 /*
  * The "cache_time" is just the low 32 bits of the
@@ -121,6 +121,15 @@ struct stat_data {
unsigned int sd_size;
 };
 
+/*
+ * The *next pointer is used in read_entries_v5 for holding
+ * all the elements of a directory, and points to the next
+ * cache_entry in a directory.
+ *
+ * It is reset by the add_name_hash call in set_index_entry
+ * to set it to point to the next cache_entry in the
+ * correct in-memory format ordering.
+ */
 struct cache_entry {
struct stat_data ce_stat_data;
unsigned int ce_mode;
@@ -132,11 +141,59 @@ struct cache_entry {
char name[FLEX_ARRAY]; /* more */
 };
 
+struct directory_entry {
+   struct directory_entry *next;
+   struct directory_entry *next_hash;
+   struct cache_entry *ce;
+   struct cache_entry *ce_last;
+   struct conflict_entry *conflict;
+   struct conflict_entry *conflict_last;
+   unsigned int conflict_size;
+   unsigned int de_foffset;
+   unsigned int de_cr;
+   unsigned int de_ncr;
+   unsigned int de_nsubtrees;
+   unsigned int de_nfiles;
+   unsigned int de_nentries;
+   unsigned char sha1[20];
+   unsigned short de_flags;
+   unsigned int de_pathlen;
+   char pathname[FLEX_ARRAY];
+};
+
+struct conflict_part {
+   struct conflict_part *next;
+   unsigned short flags;
+   unsigned short entry_mode;
+   unsigned char sha1[20];
+};
+
+struct conflict_entry {
+   struct conflict_entry *next;
+   unsigned int nfileconflicts;
+   struct conflict_part *entries;
+   unsigned int namelen;
+   unsigned int pathlen;
+   char name[FLEX_ARRAY];
+};
+
+struct ondisk_conflict_part {
+   unsigned short flags;
+   unsigned short entry_mode;
+   unsigned char sha1[20];
+};
+
+#define CE_NAMEMASK  (0x0fff)
 #define CE_STAGEMASK (0x3000)
 #define CE_EXTENDED  (0x4000)
 #define CE_VALID (0x8000)
+#define CE_SMUDGED   (0x0400) /* index v5 only flag */
 #define CE_STAGESHIFT 12
 
+#define CONFLICT_CONFLICTED (0x8000)
+#define CONFLICT_STAGESHIFT 13
+#define CONFLICT_STAGEMASK (0x6000)
+
 /*
  * Range 0x in ce_flags is divided into
  * two parts: in-memory flags and on-disk ones.
@@ -173,6 +230,18 @@ struct cache_entry {
 #define CE_EXTENDED_FLAGS (CE_INTENT_TO_ADD | CE_SKIP_WORKTREE)
 
 /*
+ * Representation of the extended on-disk flags in the v5 format.
+ * They must not collide with the ordinary on-disk flags, and need to
+ * fit in 16 bits.  Note however that v5 does not save the name
+ * length.
+ */
+#define CE_INTENT_TO_ADD_V5  (0x4000)
+#define CE_SKIP_WORKTREE_V5  (0x0800)
+#if (CE_VALID|CE_STAGEMASK) & (CE_INTENTTOADD_V5|CE_SKIPWORKTREE_V5)
+#error "v5 on-disk flags collide with ordinary on-disk flags"
+#endif
+
+/*
  * Safeguard to avoid saving wrong flags:
  *  - CE_EXTENDED2 won't get saved until its semantic is known
  *  - Bits in 0x have been saved in ce_flags already
@@ -211,6 +280,8 @@ static inline unsigned create_ce_flags(unsigned stage)
 #define ce_skip_worktree(ce) ((ce)->ce_flags & CE_SKIP_WORKTREE)
 #define ce_mark_uptodate(ce) ((ce)->ce_flags |= CE_UPTODATE)
 
+#define conflict_stage(c) ((CONFLICT_STAGEMASK & (c)->flags) >> 
CONFLICT_STAGESHIFT)
+
 #define ce_permissions(mode) (((mode) & 0100) ? 0755 : 0644)
 static inline unsigned int create_ce_mode(unsigned int mode)
 {
@@ -258,6 +329,8 @@ static inline unsigned int canon_mode(unsigned int mode)
 }
 
 #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)
+#define directory_entry_size(len) (offsetof(struct directory_entry,pathname) + 
(len) + 1)
+#define conflict_entry_size(len) (offsetof(struct conflict_entry,name) + (len) 
+ 1)
 
 /*
  * Options by which the index should be filtered when read partially.
diff --git a/read-cache-v5

[PATCH v2 14/19] read-cache: read cache-tree in index-v5

2013-07-12 Thread Thomas Gummerer

Since the cache-tree data is saved as part of the directory data,
we already read it at the beginning of the index. The cache-tree
is only converted from this directory data.

The cache-tree data is arranged in a tree, with the children sorted by
pathlen at each node, while the ondisk format is sorted lexically.
So we have to rebuild this format from the on-disk directory list.

Signed-off-by: Thomas Gummerer 
---
 cache-tree.c|   2 +-
 cache-tree.h|   6 
 read-cache-v5.c | 100 
 3 files changed, 107 insertions(+), 1 deletion(-)

diff --git a/cache-tree.c b/cache-tree.c
index 37e4d00..f4b0917 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -31,7 +31,7 @@ void cache_tree_free(struct cache_tree **it_p)
*it_p = NULL;
 }
 
-static int subtree_name_cmp(const char *one, int onelen,
+int subtree_name_cmp(const char *one, int onelen,
const char *two, int twolen)
 {
if (onelen < twolen)
diff --git a/cache-tree.h b/cache-tree.h
index 55d0f59..9aac493 100644
--- a/cache-tree.h
+++ b/cache-tree.h
@@ -21,10 +21,16 @@ struct cache_tree {
struct cache_tree_sub **down;
 };
 
+struct directory_queue {
+   struct directory_queue *down;
+   struct directory_entry *de;
+};
+
 struct cache_tree *cache_tree(void);
 void cache_tree_free(struct cache_tree **);
 void cache_tree_invalidate_path(struct cache_tree *, const char *);
 struct cache_tree_sub *cache_tree_sub(struct cache_tree *, const char *);
+int subtree_name_cmp(const char *, int, const char *, int);
 
 void cache_tree_write(struct strbuf *, struct cache_tree *root);
 struct cache_tree *cache_tree_read(const char *buffer, unsigned long size);
diff --git a/read-cache-v5.c b/read-cache-v5.c
index 853b97d..0b9c320 100644
--- a/read-cache-v5.c
+++ b/read-cache-v5.c
@@ -448,6 +448,103 @@ static int read_conflicts(struct conflict_entry **head,
return 0;
 }
 
+static struct cache_tree *convert_one(struct directory_queue *queue, int dirnr)
+{
+   int i, subtree_nr;
+   struct cache_tree *it;
+   struct directory_queue *down;
+
+   it = cache_tree();
+   it->entry_count = queue[dirnr].de->de_nentries;
+   subtree_nr = queue[dirnr].de->de_nsubtrees;
+   if (0 <= it->entry_count)
+   hashcpy(it->sha1, queue[dirnr].de->sha1);
+
+   /*
+* Just a heuristic -- we do not add directories that often but
+* we do not want to have to extend it immediately when we do,
+* hence +2.
+*/
+   it->subtree_alloc = subtree_nr + 2;
+   it->down = xcalloc(it->subtree_alloc, sizeof(struct cache_tree_sub *));
+   down = queue[dirnr].down;
+   for (i = 0; i < subtree_nr; i++) {
+   struct cache_tree *sub;
+   struct cache_tree_sub *subtree;
+   char *buf, *name;
+
+   name = "";
+   buf = strtok(down[i].de->pathname, "/");
+   while (buf) {
+   name = buf;
+   buf = strtok(NULL, "/");
+   }
+   sub = convert_one(down, i);
+   if(!sub)
+   goto free_return;
+   subtree = cache_tree_sub(it, name);
+   subtree->cache_tree = sub;
+   }
+   if (subtree_nr != it->subtree_nr)
+   die("cache-tree: internal error");
+   return it;
+ free_return:
+   cache_tree_free(&it);
+   return NULL;
+}
+
+static int compare_cache_tree_elements(const void *a, const void *b)
+{
+   const struct directory_entry *de1, *de2;
+
+   de1 = ((const struct directory_queue *)a)->de;
+   de2 = ((const struct directory_queue *)b)->de;
+   return subtree_name_cmp(de1->pathname, de1->de_pathlen,
+   de2->pathname, de2->de_pathlen);
+}
+
+static struct directory_entry *sort_directories(struct directory_entry *de,
+   struct directory_queue *queue)
+{
+   int i, nsubtrees;
+
+   nsubtrees = de->de_nsubtrees;
+   for (i = 0; i < nsubtrees; i++) {
+   struct directory_entry *new_de;
+   de = de->next;
+   new_de = xmalloc(directory_entry_size(de->de_pathlen));
+   memcpy(new_de, de, directory_entry_size(de->de_pathlen));
+   queue[i].de = new_de;
+   if (de->de_nsubtrees) {
+   queue[i].down = xcalloc(de->de_nsubtrees,
+   sizeof(struct directory_queue));
+   de = sort_directories(de,
+   queue[i].down);
+   }
+   }
+   qsort(queue, nsubtrees, sizeof(struct directory_queue),
+   compare_cache_tree_elements);
+   return de;
+}
+
+/*
+ * This function modifies the directory argument that is given to it.
+ * Don't use it if the directory entries are still needed after.
+ */
+sta

[PATCH v2 11/19] read-cache: make in-memory format aware of stat_crc

2013-07-12 Thread Thomas Gummerer

Make the in-memory format aware of the stat_crc used by index-v5.
It is simply ignored by index version prior to v5.

Signed-off-by: Thomas Gummerer 
---
 cache.h  |  1 +
 read-cache.c | 25 +
 2 files changed, 26 insertions(+)

diff --git a/cache.h b/cache.h
index 455b772..2097105 100644
--- a/cache.h
+++ b/cache.h
@@ -127,6 +127,7 @@ struct cache_entry {
unsigned int ce_flags;
unsigned int ce_namelen;
unsigned char sha1[20];
+   uint32_t ce_stat_crc;
struct cache_entry *next; /* used by name_hash */
char name[FLEX_ARRAY]; /* more */
 };
diff --git a/read-cache.c b/read-cache.c
index ab716ed..9bfbb4f 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -108,6 +108,29 @@ int match_stat_data(const struct stat_data *sd, struct 
stat *st)
return changed;
 }
 
+static uint32_t calculate_stat_crc(struct cache_entry *ce)
+{
+   unsigned int ctimens = 0;
+   uint32_t stat, stat_crc;
+
+   stat = htonl(ce->ce_stat_data.sd_ctime.sec);
+   stat_crc = crc32(0, (Bytef*)&stat, 4);
+#ifdef USE_NSEC
+   ctimens = ce->ce_stat_data.sd_ctime.nsec;
+#endif
+   stat = htonl(ctimens);
+   stat_crc = crc32(stat_crc, (Bytef*)&stat, 4);
+   stat = htonl(ce->ce_stat_data.sd_ino);
+   stat_crc = crc32(stat_crc, (Bytef*)&stat, 4);
+   stat = htonl(ce->ce_stat_data.sd_dev);
+   stat_crc = crc32(stat_crc, (Bytef*)&stat, 4);
+   stat = htonl(ce->ce_stat_data.sd_uid);
+   stat_crc = crc32(stat_crc, (Bytef*)&stat, 4);
+   stat = htonl(ce->ce_stat_data.sd_gid);
+   stat_crc = crc32(stat_crc, (Bytef*)&stat, 4);
+   return stat_crc;
+}
+
 /*
  * This only updates the "non-critical" parts of the directory
  * cache, ie the parts that aren't tracked by GIT, and only used
@@ -122,6 +145,8 @@ void fill_stat_cache_info(struct cache_entry *ce, struct 
stat *st)
 
if (S_ISREG(st->st_mode))
ce_mark_uptodate(ce);
+
+   ce->ce_stat_crc = calculate_stat_crc(ce);
 }
 
 static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
-- 
1.8.3.453.g1dfc63d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 10/19] documentation: add documentation of the index-v5 file format

2013-07-12 Thread Thomas Gummerer

Add a documentation of the index file format version 5 to
Documentation/technical.

Helped-by: Michael Haggerty 
Helped-by: Junio C Hamano 
Helped-by: Thomas Rast 
Helped-by: Nguyen Thai Ngoc Duy 
Helped-by: Robin Rosenberg 
Signed-off-by: Thomas Gummerer 
---
 Documentation/technical/index-file-format-v5.txt | 296 +++
 1 file changed, 296 insertions(+)
 create mode 100644 Documentation/technical/index-file-format-v5.txt

diff --git a/Documentation/technical/index-file-format-v5.txt 
b/Documentation/technical/index-file-format-v5.txt
new file mode 100644
index 000..4213087
--- /dev/null
+++ b/Documentation/technical/index-file-format-v5.txt
@@ -0,0 +1,296 @@
+GIT index format
+
+
+== The git index
+
+   The git index file (.git/index) documents the status of the files
+ in the git staging area.
+
+   The staging area is used for preparing commits, merging, etc.
+
+== The git index file format
+
+   All binary numbers are in network byte order. Version 5 is described
+ here. The index file consists of various sections. They appear in
+ the following order in the file.
+
+   - header: the description of the index format, including it's signature,
+ version and various other fields that are used internally.
+
+   - diroffsets (ndir entries of "direcotry offset"): A 4-byte offset
+   relative to the beginning of the "direntries block" (see below)
+   for each of the ndir directories in the index, sorted by pathname
+   (of the directory it's pointing to). [1]
+
+   - direntries (ndir entries of "directory offset"): A directory entry
+   for each of the ndir directories in the index, sorted by pathname
+   (see below). [2]
+
+   - fileoffsets (nfile entries of "file offset"): A 4-byte offset
+   relative to the beginning of the fileentries block (see below)
+   for each of the nfile files in the index. [1]
+
+   - fileentries (nfile entries of "file entry"): A file entry for
+   each of the nfile files in the index (see below).
+
+   - crdata: A number of entries for conflicted data/resolved conflicts
+   (see below).
+
+   - Extensions (Currently none, see below in the future)
+
+ Extensions are identified by signature. Optional extensions can
+ be ignored if GIT does not understand them.
+
+ GIT supports an arbitrary number of extension, but currently none
+ is implemented. [3]
+
+ extsig (32-bits): extension signature. If the first byte is 'A'..'Z'
+ the extension is optional and can be ignored.
+
+ extsize (32-bits): size of the extension, excluding the header
+   (extsig, extsize, extchecksum).
+
+ extchecksum (32-bits): crc32 checksum of the extension signature
+   and size.
+
+- Extension data.
+
+== Header
+   sig (32-bits): Signature:
+ The signature is { 'D', 'I', 'R', 'C' } (stands for "dircache")
+
+   vnr (32-bits): Version number:
+ The current supported versions are 2, 3, 4 and 5.
+
+   ndir (32-bits): number of directories in the index.
+
+   nfile (32-bits): number of file entries in the index.
+
+   fblockoffset (32-bits): offset to the file block, relative to the
+ beginning of the file.
+
+   - Offset to the extensions.
+
+ nextensions (32-bits): number of extensions.
+
+ extoffset (32-bits): offset to the extension. (Possibly none, as
+   many as indicated in the 4-byte number of extensions)
+
+   headercrc (32-bits): crc checksum including the header and the
+ offsets to the extensions.
+
+
+== Directory offsets (diroffsets)
+
+  diroffset (32-bits): offset to the directory relative to the beginning
+of the index file. There are ndir + 1 offsets in the diroffset table,
+the last is pointing to the end of the last direntry. With this last
+entry, we are able to replace the strlen of when reading the directory
+name, by calculating it from diroffset[n+1]-diroffset[n]-61.  61 is the
+size of the directory data, which follows each each directory + the
+crc sum + the NUL byte.
+
+  This part is needed for making the directory entries bisectable and
+thus allowing a binary search.
+
+== Directory entry (direntries)
+
+  Directory entries are sorted in lexicographic order by the name
+of their path starting with the root.
+
+  pathname (variable length, nul terminated): relative to top level
+directory (without the leading slash). '/' is used as path
+separator. A string of length 0 ('') indicates the root directory.
+The special path components ".", and ".." (without quotes) are
+disallowed. The path also includes a trailing slash. [9]
+
+  foffset (32-bits): offset to the lexicographically first file in
+the file offsets (fileoffsets), relative to the beginning of
+the fileoffset block.
+
+  cr (32-bits): offset to conflicted/resolved data at the end of the
+index. 0 if there is no such data. [4]
+
+  ncr (32-bits): number of conflicted/resolved data entries at the
+end

[PATCH] .mailmap: Map email addresses to names

2013-07-12 Thread Stefan Beller

The same kind of cleanup as sent earlier today
(2e2ae79df4fabea0157c5eb527b5396eb89185a1 locally here)

I asked all the people before, whether
they like their lines added. Many had
requests to change the order of the mail address.

When having this patch applied, you'll notice the
bug as described here
http://marc.info/?l=git&m=137364524514927&w=2
http://www.mail-archive.com/git@vger.kernel.org/msg31964.html
("Bug in .mailmap handling?", for example look for Knut Franke)

Signed-off-by: Stefan Beller 
---
 .mailmap | 28 ++--
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/.mailmap b/.mailmap
index 1179767..1d6ba17 100644
--- a/.mailmap
+++ b/.mailmap
@@ -7,6 +7,7 @@
 
 Alejandro R. Sedeño  
 Alexander Gavrilov 
+Alexey Shumkin  
 Alex Bennée 
 Alex Riesen  
 Alex Riesen  
@@ -18,12 +19,15 @@ anonymous 
 anonymous 
 Brandon Casey  
 Brian M. Carlson 
+Bryan Larsen  
+Bryan Larsen  
 Cheng Renquan 
 Chris Shoemaker 
 Dana L. How 
 Dana L. How 
 Daniel Barkalow 
 Dan Johnson 
+David Brown  
 David D. Kilzer 
 David Kågedal 
 David Reiss  
@@ -31,7 +35,10 @@ David S. Miller 
 Deskin Miller 
 Dirk Süsserott 
 Eric S. Raymond 
+Eric Blake  
+Eric Hanchrow  
 Erik Faye-Lund  
+Eyvind Bernhardsen  
 Florian Achleitner  

 Franck Bui-Huu  
 Frank Lichtenheld  
@@ -47,19 +54,25 @@ H. Peter Anvin  

 H. Peter Anvin  
 İsmail Dönmez 
 Jakub Narębski 
+Jason Riedy  
+Jason Riedy  
 Jay Soffian  
 J. Bruce Fields  
 J. Bruce Fields  
 J. Bruce Fields  
 Jeff King  
+Jeff Muizelaar  
 Joachim Berdal Haga 
 Johannes Schindelin  
 Johannes Sixt  
 Johannes Sixt  
 Johannes Sixt  
+Jonathan del Strother  
 Jonathan Nieder  
 Jon Loeliger 
-Jon Seymour 
+Jon Seymour  
+Josh Triplett  
+Josh Triplett  
 Junio C Hamano  
 Junio C Hamano  
 Junio C Hamano  
@@ -71,11 +84,14 @@ Karl Wiberg  Karl Hasselström 

 Karl Wiberg  Karl Hasselström 

 Kay Sievers  
 Kay Sievers  
+Karsten Blees  
+Karsten Blees  
 Keith Cascio  
 Kent Engstrom 
 Kevin Leung 
 Kirill Smelkov  
 Kirill Smelkov  
+Knut Franke  
 Lars Doelle 
 Lars Doelle 
 Li Hong 
@@ -85,11 +101,14 @@ Linus Torvalds  

 Linus Torvalds  
 Linus Torvalds  

 Linus Torvalds  

-Lukas Sandström 
+Lukas Sandström  
+Matt Kraai  
 Marc-André Lureau 
+Mark Levedahl  
 Mark Rada 
 Martin Langhoff  
 Martin von Zweigbergk  
+Matt Draisey  
 Matthias Kestenholz  
 Matthias Urlichs  
 Matthias Urlichs  
@@ -104,8 +123,11 @@ Namhyung Kim  
 Namhyung Kim  
 Nanako Shiraishi 
 Nanako Shiraishi 
+Nelson Elhage  
+Nelson Elhage  
 Nguyễn Thái Ngọc Duy 
  
+Paolo Bonzini  
 Pascal Obry  
 Pascal Obry  
 Paul Mackerras  
@@ -114,6 +136,7 @@ Peter Krefting  

 Peter Krefting  
 Petr Baudis  
 Petr Baudis  
+Phil Hord  
 Philippe Bruhat 
 Ralf Thielow  
 Ramsay Allan Jones 
@@ -145,6 +168,7 @@ Uwe Kleine-König 
 Uwe Kleine-König 
 Ville Skyttä 
 Vitaly "_Vi" Shukela 
+W. Trevor King  
 William Pursell 
 YOSHIFUJI Hideaki 
 
-- 
1.8.3.2.790.g9192b0b

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/2] open() error checking

2013-07-12 Thread Junio C Hamano

Thomas Rast  writes:

> #1 is Dale's suggested change.  Dale, to include it we'd need your
> Signed-off-by as per Documentation/SubmittingPatches.
>
> #2 is a similar error-checking fix; I reviewed 'git grep "\bopen\b"'
> and found one case where the return value was obviously not tested.
> The corresponding Windows code path has the same problem, but I dare
> not touch it; perhaps someone from the Windows side can look into it?
>
> I originally had a four-patch series to open 0/1/2 from /dev/null, but
> then I noticed that this was shot down in 2008:
>
>   http://thread.gmane.org/gmane.comp.version-control.git/93605/focus=93896

The way I recall the thread was not "shot down" but more like
"fizzled out without seeing a clear consensus".  As a normal POSIX
program, we do rely on fd#2 connected to an error stream, and I do
agree with the general sentiment of that old thread that it is very
wrong for warning() or die() to write to a pipe or file descriptor
we opened for some other purpose, corrupting the destination.

I briefly wondered if we can do the sanity check lazily (e.g. upon
first warning() see of fd#2 is open and otherwise die silently), but
we may open a fd (e.g. to create a new loose object) that may happen
to grab fd#2 and then it is too late for us to do anything about it,
so...

> Do you want to resurrect this?
>
> The worst part about it is that because we don't have a stderr to rely
> on, we can't simply die("stop playing mind games").

Right.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 09/19] ls-files.c: use index api

2013-07-12 Thread Thomas Gummerer

Use the index api to read only part of the index, if the on-disk version
of the index is index-v5.

Signed-off-by: Thomas Gummerer 
---
 builtin/ls-files.c | 31 ---
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 08d9786..80cc398 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -31,6 +31,7 @@ static const char *prefix;
 static int max_prefix_len;
 static int prefix_len;
 static const char **pathspec;
+static struct pathspec pathspec_struct;
 static int error_unmatch;
 static char *ps_matched;
 static const char *with_tree;
@@ -457,6 +458,7 @@ int cmd_ls_files(int argc, const char **argv, const char 
*cmd_prefix)
struct dir_struct dir;
struct exclude_list *el;
struct string_list exclude_list = STRING_LIST_INIT_NODUP;
+   struct filter_opts *opts = xmalloc(sizeof(*opts));
struct option builtin_ls_files_options[] = {
{ OPTION_CALLBACK, 'z', NULL, NULL, NULL,
N_("paths are separated with NUL character"),
@@ -522,9 +524,6 @@ int cmd_ls_files(int argc, const char **argv, const char 
*cmd_prefix)
prefix_len = strlen(prefix);
git_config(git_default_config, NULL);
 
-   if (read_cache() < 0)
-   die("index file corrupt");
-
argc = parse_options(argc, argv, prefix, builtin_ls_files_options,
ls_files_usage, 0);
el = add_exclude_list(&dir, EXC_CMDL, "--exclude option");
@@ -556,14 +555,7 @@ int cmd_ls_files(int argc, const char **argv, const char 
*cmd_prefix)
setup_work_tree();
 
pathspec = get_pathspec(prefix, argv);
-
-   /* be nice with submodule paths ending in a slash */
-   if (pathspec)
-   strip_trailing_slash_from_submodules();
-
-   /* Find common prefix for all pathspec's */
-   max_prefix = common_prefix(pathspec);
-   max_prefix_len = max_prefix ? strlen(max_prefix) : 0;
+   init_pathspec(&pathspec_struct, pathspec);
 
/* Treat unmatching pathspec elements as errors */
if (pathspec && error_unmatch) {
@@ -573,6 +565,23 @@ int cmd_ls_files(int argc, const char **argv, const char 
*cmd_prefix)
ps_matched = xcalloc(1, num);
}
 
+   if (!with_tree) {
+   memset(opts, 0, sizeof(*opts));
+   opts->pathspec = &pathspec_struct;
+   opts->read_staged = 1;
+   if (show_resolve_undo)
+   opts->read_resolve_undo = 1;
+   read_cache_filtered(opts);
+   } else {
+   read_cache();
+   }
+   /* be nice with submodule paths ending in a slash */
+   if (pathspec)
+   strip_trailing_slash_from_submodules();
+
+   max_prefix = common_prefix(pathspec);
+   max_prefix_len = max_prefix ? strlen(max_prefix) : 0;
+
if ((dir.flags & DIR_SHOW_IGNORED) && !exc_given)
die("ls-files --ignored needs some exclude pattern");
 
-- 
1.8.3.453.g1dfc63d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 08/19] grep.c: Use index api

2013-07-12 Thread Thomas Gummerer

Signed-off-by: Thomas Gummerer 
---
 builtin/grep.c | 71 ++
 1 file changed, 37 insertions(+), 34 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index a419cda..8b02644 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -368,41 +368,33 @@ static void run_pager(struct grep_opt *opt, const char 
*prefix)
free(argv);
 }
 
-static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec, 
int cached)
+struct grep_opts {
+   struct grep_opt *opt;
+   const struct pathspec *pathspec;
+   int cached;
+   int hit;
+};
+
+static int grep_cache(struct cache_entry *ce, void *cb_data)
 {
-   int hit = 0;
-   int nr;
-   read_cache();
+   struct grep_opts *opts = cb_data;
 
-   for (nr = 0; nr < active_nr; nr++) {
-   struct cache_entry *ce = active_cache[nr];
-   if (!S_ISREG(ce->ce_mode))
-   continue;
-   if (!match_pathspec_depth(pathspec, ce->name, ce_namelen(ce), 
0, NULL))
-   continue;
-   /*
-* If CE_VALID is on, we assume worktree file and its cache 
entry
-* are identical, even if worktree file has been modified, so 
use
-* cache version instead
-*/
-   if (cached || (ce->ce_flags & CE_VALID) || 
ce_skip_worktree(ce)) {
-   if (ce_stage(ce))
-   continue;
-   hit |= grep_sha1(opt, ce->sha1, ce->name, 0, ce->name);
-   }
-   else
-   hit |= grep_file(opt, ce->name);
-   if (ce_stage(ce)) {
-   do {
-   nr++;
-   } while (nr < active_nr &&
-!strcmp(ce->name, active_cache[nr]->name));
-   nr--; /* compensate for loop control */
-   }
-   if (hit && opt->status_only)
-   break;
-   }
-   return hit;
+   if (!S_ISREG(ce->ce_mode))
+   return 0;
+   if (!match_pathspec_depth(opts->pathspec, ce->name, ce_namelen(ce), 0, 
NULL))
+   return 0;
+   /*
+* If CE_VALID is on, we assume worktree file and its cache entry
+* are identical, even if worktree file has been modified, so use
+* cache version instead
+*/
+   if (opts->cached || (ce->ce_flags & CE_VALID) || ce_skip_worktree(ce))
+   opts->hit |= grep_sha1(opts->opt, ce->sha1, ce->name, 0, 
ce->name);
+   else
+   opts->hit |= grep_file(opts->opt, ce->name);
+   if (opts->hit && opts->opt->status_only)
+   return 1;
+   return 0;
 }
 
 static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
@@ -895,10 +887,21 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
} else if (0 <= opt_exclude) {
die(_("--[no-]exclude-standard cannot be used for tracked 
contents."));
} else if (!list.nr) {
+   struct grep_opts opts;
+   struct filter_opts *filter_opts = xmalloc(sizeof(*filter_opts));
+
if (!cached)
setup_work_tree();
 
-   hit = grep_cache(&opt, &pathspec, cached);
+   memset(filter_opts, 0, sizeof(*filter_opts));
+   filter_opts->pathspec = &pathspec;
+   opts.opt = &opt;
+   opts.pathspec = &pathspec;
+   opts.cached = cached;
+   opts.hit = 0;
+   read_cache_filtered(filter_opts);
+   for_each_cache_entry(grep_cache, &opts);
+   hit = opts.hit;
} else {
if (cached)
die(_("both --cached and trees are given."));
-- 
1.8.3.453.g1dfc63d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 07/19] make sure partially read index is not changed

2013-07-12 Thread Thomas Gummerer

A partially read index file currently cannot be written to disk.  Make
sure that never happens, by erroring out when a caller tries to change a
partially read index.  The caller is responsible for reading the whole
index when it's trying to change it later.

Forcing the caller to load the right part of the index file instead of
re-reading it when changing it, gives a bit of a performance advantage,
by avoiding to read parts of the index twice.

Signed-off-by: Thomas Gummerer 
---
 builtin/update-index.c |  4 
 cache.h|  1 +
 read-cache-v2.c|  2 ++
 read-cache.c   | 10 ++
 4 files changed, 17 insertions(+)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 5c7762e..4c6e3a6 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -49,6 +49,8 @@ static int mark_ce_flags(const char *path, int flag, int mark)
int namelen = strlen(path);
int pos = cache_name_pos(path, namelen);
if (0 <= pos) {
+   if (active_cache_partially_read)
+   die("BUG: Can't change a partially read index");
if (mark)
active_cache[pos]->ce_flags |= flag;
else
@@ -253,6 +255,8 @@ static void chmod_path(int flip, const char *path)
pos = cache_name_pos(path, strlen(path));
if (pos < 0)
goto fail;
+   if (active_cache_partially_read)
+   die("BUG: Can't change a partially read index");
ce = active_cache[pos];
mode = ce->ce_mode;
if (!S_ISREG(mode))
diff --git a/cache.h b/cache.h
index d305d21..455b772 100644
--- a/cache.h
+++ b/cache.h
@@ -311,6 +311,7 @@ extern void free_name_hash(struct index_state *istate);
 #define active_alloc (the_index.cache_alloc)
 #define active_cache_changed (the_index.cache_changed)
 #define active_cache_tree (the_index.cache_tree)
+#define active_cache_partially_read (the_index.filter_opts)
 
 #define read_cache() read_index(&the_index)
 #define read_cache_from(path) read_index_from(&the_index, (path))
diff --git a/read-cache-v2.c b/read-cache-v2.c
index 51b618f..f3c0685 100644
--- a/read-cache-v2.c
+++ b/read-cache-v2.c
@@ -479,6 +479,8 @@ static int write_index_v2(struct index_state *istate, int 
newfd)
struct stat st;
struct strbuf previous_name_buf = STRBUF_INIT, *previous_name;
 
+   if (istate->filter_opts)
+   die("BUG: index: cannot write a partially read index");
for (i = removed = extended = 0; i < entries; i++) {
if (cache[i]->ce_flags & CE_REMOVE)
removed++;
diff --git a/read-cache.c b/read-cache.c
index 9053d43..ab716ed 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -30,6 +30,8 @@ static void replace_index_entry(struct index_state *istate, 
int nr, struct cache
 {
struct cache_entry *old = istate->cache[nr];
 
+   if (istate->filter_opts)
+   die("BUG: Can't change a partially read index");
remove_name_hash(istate, old);
set_index_entry(istate, nr, ce);
istate->cache_changed = 1;
@@ -467,6 +469,8 @@ int remove_index_entry_at(struct index_state *istate, int 
pos)
 {
struct cache_entry *ce = istate->cache[pos];
 
+   if (istate->filter_opts)
+   die("BUG: Can't change a partially read index");
record_resolve_undo(istate, ce);
remove_name_hash(istate, ce);
istate->cache_changed = 1;
@@ -973,6 +977,8 @@ int add_index_entry(struct index_state *istate, struct 
cache_entry *ce, int opti
 {
int pos;
 
+   if (istate->filter_opts)
+   die("BUG: Can't change a partially read index");
if (option & ADD_CACHE_JUST_APPEND)
pos = istate->cache_nr;
else {
@@ -1173,6 +1179,8 @@ int refresh_index(struct index_state *istate, unsigned 
int flags, const char **p
/* If we are doing --really-refresh that
 * means the index is not valid anymore.
 */
+   if (istate->filter_opts)
+   die("BUG: Can't change a partially read 
index");
ce->ce_flags &= ~CE_VALID;
istate->cache_changed = 1;
}
@@ -1331,6 +1339,8 @@ int read_index_filtered_from(struct index_state *istate, 
const char *path,
void *mmap;
size_t mmap_size;
 
+   if (istate->filter_opts)
+   die("BUG: Can't re-read partially read index");
errno = EBUSY;
if (istate->initialized)
return istate->cache_nr;
-- 
1.8.3.453.g1dfc63d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 06/19] read-cache: add index reading api

2013-07-12 Thread Thomas Gummerer

Add an api for access to the index file.  Currently there is only a very
basic api for accessing the index file, which only allows a full read of
the index, and lets the users of the data filter it.  The new index api
gives the users the possibility to use only part of the index and
provides functions for iterating over and accessing cache entries.

This simplifies future improvements to the in-memory format, as changes
will be concentrated on one file, instead of the whole git source code.

Helped-by: Nguyễn Thái Ngọc Duy 
Signed-off-by: Thomas Gummerer 
---
 cache.h | 42 +-
 read-cache-v2.c | 35 +--
 read-cache.c| 44 
 read-cache.h|  8 +++-
 4 files changed, 121 insertions(+), 8 deletions(-)

diff --git a/cache.h b/cache.h
index 5082b34..d305d21 100644
--- a/cache.h
+++ b/cache.h
@@ -127,7 +127,7 @@ struct cache_entry {
unsigned int ce_flags;
unsigned int ce_namelen;
unsigned char sha1[20];
-   struct cache_entry *next;
+   struct cache_entry *next; /* used by name_hash */
char name[FLEX_ARRAY]; /* more */
 };
 
@@ -258,6 +258,29 @@ static inline unsigned int canon_mode(unsigned int mode)
 
 #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)
 
+/*
+ * Options by which the index should be filtered when read partially.
+ *
+ * pathspec: The pathspec which the index entries have to match
+ * seen: Used to return the seen parameter from match_pathspec()
+ * max_prefix_len: The common prefix length of the pathspecs
+ *
+ * read_staged: used to indicate if the conflicted entries (entries
+ * with a stage) should be included
+ * read_cache_tree: used to indicate if the cache-tree should be read
+ * read_resolve_undo: used to indicate if the resolve undo data should
+ * be read
+ */
+struct filter_opts {
+   const struct pathspec *pathspec;
+   char *seen;
+   int max_prefix_len;
+
+   int read_staged;
+   int read_cache_tree;
+   int read_resolve_undo;
+};
+
 struct index_state {
struct cache_entry **cache;
unsigned int version;
@@ -270,6 +293,8 @@ struct index_state {
struct hash_table name_hash;
struct hash_table dir_hash;
struct index_ops *ops;
+   struct internal_ops *internal_ops;
+   struct filter_opts *filter_opts;
 };
 
 extern struct index_state the_index;
@@ -311,6 +336,12 @@ extern void free_name_hash(struct index_state *istate);
 #define unmerge_cache_entry_at(at) unmerge_index_entry_at(&the_index, at)
 #define unmerge_cache(pathspec) unmerge_index(&the_index, pathspec)
 #define read_blob_data_from_cache(path, sz) 
read_blob_data_from_index(&the_index, (path), (sz))
+
+/* index api */
+#define read_cache_filtered(opts) read_index_filtered(&the_index, (opts))
+#define read_cache_filtered_from(path, opts) 
read_index_filtered_from(&the_index, (path), (opts))
+#define for_each_cache_entry(fn, cb_data) \
+   for_each_index_entry(&the_index, (fn), (cb_data))
 #endif
 
 enum object_type {
@@ -438,6 +469,15 @@ extern int init_db(const char *template_dir, unsigned int 
flags);
} \
} while (0)
 
+/* index api */
+extern int read_index_filtered(struct index_state *, struct filter_opts *opts);
+extern int read_index_filtered_from(struct index_state *, const char *path, 
struct filter_opts *opts);
+
+typedef int each_cache_entry_fn(struct cache_entry *ce, void *);
+extern int for_each_index_entry(struct index_state *istate,
+   each_cache_entry_fn, void *);
+
+
 /* Initialize and use the cache information */
 extern int read_index(struct index_state *);
 extern int read_index_preload(struct index_state *, const char **pathspec);
diff --git a/read-cache-v2.c b/read-cache-v2.c
index a6883c3..51b618f 100644
--- a/read-cache-v2.c
+++ b/read-cache-v2.c
@@ -3,6 +3,7 @@
 #include "resolve-undo.h"
 #include "cache-tree.h"
 #include "varint.h"
+#include "dir.h"
 
 /* Mask for the name length in ce_flags in the on-disk index */
 #define CE_NAMEMASK  (0x0fff)
@@ -207,8 +208,14 @@ static int read_index_extension(struct index_state *istate,
return 0;
 }
 
+/*
+ * The performance is the same if we read the whole index or only
+ * part of it, therefore we always read the whole index to avoid
+ * having to re-read it later.  The filter_opts will determine
+ * what part of the index is used when retrieving the cache-entries.
+ */
 static int read_index_v2(struct index_state *istate, void *mmap,
-unsigned long mmap_size)
+unsigned long mmap_size, struct filter_opts *opts)
 {
int i;
unsigned long src_offset;
@@ -238,7 +245,6 @@ static int read_index_v2(struct index_state *istate, void 
*mmap,
disk_ce = (struct ondisk_cache_entry *)((char *)mmap + 
src_offset);
ce = create_from_disk(disk_ce, &con

[PATCH v2 05/19] Add documentation for the index api

2013-07-12 Thread Thomas Gummerer

Add documentation for the index reading api.  This also includes
documentation for the new api functions introduced in the next patch.

Helped-by: Nguyễn Thái Ngọc Duy 
Signed-off-by: Thomas Gummerer 
---
 Documentation/technical/api-in-core-index.txt | 54 +--
 1 file changed, 50 insertions(+), 4 deletions(-)

diff --git a/Documentation/technical/api-in-core-index.txt 
b/Documentation/technical/api-in-core-index.txt
index adbdbf5..9b8c37c 100644
--- a/Documentation/technical/api-in-core-index.txt
+++ b/Documentation/technical/api-in-core-index.txt
@@ -1,14 +1,60 @@
 in-core index API
 =
 
+Reading API
+---
+
+`cache`::
+
+   An array of cache entries.  This is used to access the cache
+   entries directly.  Use `index_name_pos` to search for the
+   index of a specific cache entry.
+
+`read_index_filtered`::
+
+   Read a part of the index, filtered by the pathspec given in
+   the opts.  The function may load more than necessary, so the
+   caller still responsible to apply filters appropriately.  The
+   filtering is only done for performance reasons, as it's
+   possible to only read part of the index when the on-disk
+   format is index-v5.
+
+   To iterate only over the entries that match the pathspec, use
+   the for_each_index_entry function.
+
+`read_index`::
+
+   Read the whole index file from disk.
+
+`index_name_pos`::
+
+   Find a cache_entry with name in the index.  Returns pos if an
+   entry is matched exactly and -1-pos if an entry is matched
+   partially.
+   e.g.
+   index:
+   file1
+   file2
+   path/file1
+   zzz
+
+   index_name_pos("path/file1", 10) returns 2, while
+   index_name_pos("path", 4) returns -3
+
+`for_each_index_entry`::
+
+   Iterates over all cache_entries in the index filtered by
+   filter_opts in the index_state.  For each cache entry fn is
+   executed with cb_data as callback data.  From within the loop
+   do `return 0` to continue, or `return 1` to break the loop.
+
+TODO
+
 Talk about  and , things like:
 
-* cache -> the_index macros
-* read_index()
 * write_index()
 * ie_match_stat() and ie_modified(); how they are different and when to
   use which.
-* index_name_pos()
 * remove_index_entry_at()
 * remove_file_from_index()
 * add_file_to_index()
@@ -18,4 +64,4 @@ Talk about  and , things like:
 * cache_tree_invalidate_path()
 * cache_tree_update()
 
-(JC, Linus)
+(JC, Linus, Thomas Gummerer)
-- 
1.8.3.453.g1dfc63d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 04/19] read-cache: Re-read index if index file changed

2013-07-12 Thread Thomas Gummerer

Add the possibility of re-reading the index file, if it changed
while reading.

The index file might change during the read, causing outdated
information to be displayed. We check if the index file changed
by using its stat data as heuristic.

Helped-by: Ramsay Jones 
Signed-off-by: Thomas Gummerer 
---
 read-cache.c | 91 +---
 1 file changed, 57 insertions(+), 34 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 1e7ffc2..3e3a0e2 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1275,11 +1275,31 @@ int read_index(struct index_state *istate)
return read_index_from(istate, get_index_file());
 }
 
+static int index_changed(struct stat *st_old, struct stat *st_new)
+{
+   if (st_old->st_mtime != st_new->st_mtime ||
+#if !defined (__CYGWIN__)
+   st_old->st_uid   != st_new->st_uid ||
+   st_old->st_gid   != st_new->st_gid ||
+   st_old->st_ino   != st_new->st_ino ||
+#endif
+#if USE_NSEC
+   ST_MTIME_NSEC(*st_old) != ST_MTIME_NSEC(*st_new) ||
+#endif
+#if USE_STDEV
+   st_old->st_dev != st_new->st_dev ||
+#endif
+   st_old->st_size != st_new->st_size)
+   return 1;
+
+   return 0;
+}
+
 /* remember to discard_cache() before reading a different cache! */
 int read_index_from(struct index_state *istate, const char *path)
 {
-   int fd;
-   struct stat st;
+   int fd, err, i;
+   struct stat st_old, st_new;
struct cache_version_header *hdr;
void *mmap;
size_t mmap_size;
@@ -1291,41 +1311,44 @@ int read_index_from(struct index_state *istate, const 
char *path)
errno = ENOENT;
istate->timestamp.sec = 0;
istate->timestamp.nsec = 0;
+   for (i = 0; i < 50; i++) {
+   err = 0;
+   fd = open(path, O_RDONLY);
+   if (fd < 0) {
+   if (errno == ENOENT)
+   return 0;
+   die_errno("index file open failed");
+   }
 
-   fd = open(path, O_RDONLY);
-   if (fd < 0) {
-   if (errno == ENOENT)
-   return 0;
-   die_errno("index file open failed");
+   if (fstat(fd, &st_old))
+   die_errno("cannot stat the open index");
+
+   errno = EINVAL;
+   mmap_size = xsize_t(st_old.st_size);
+   mmap = xmmap(NULL, mmap_size, PROT_READ | PROT_WRITE, 
MAP_PRIVATE, fd, 0);
+   close(fd);
+   if (mmap == MAP_FAILED)
+   die_errno("unable to map index file");
+
+   hdr = mmap;
+   if (verify_hdr_version(istate, hdr, mmap_size) < 0)
+   err = 1;
+
+   if (istate->ops->verify_hdr(mmap, mmap_size) < 0)
+   err = 1;
+
+   if (istate->ops->read_index(istate, mmap, mmap_size) < 0)
+   err = 1;
+   istate->timestamp.sec = st_old.st_mtime;
+   istate->timestamp.nsec = ST_MTIME_NSEC(st_old);
+   if (lstat(path, &st_new))
+   die_errno("cannot stat the open index");
+
+   munmap(mmap, mmap_size);
+   if (!index_changed(&st_old, &st_new) && !err)
+   return istate->cache_nr;
}
 
-   if (fstat(fd, &st))
-   die_errno("cannot stat the open index");
-
-   errno = EINVAL;
-   mmap_size = xsize_t(st.st_size);
-   mmap = xmmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 
0);
-   close(fd);
-   if (mmap == MAP_FAILED)
-   die_errno("unable to map index file");
-
-   hdr = mmap;
-   if (verify_hdr_version(istate, hdr, mmap_size) < 0)
-   goto unmap;
-
-   if (istate->ops->verify_hdr(mmap, mmap_size) < 0)
-   goto unmap;
-
-   if (istate->ops->read_index(istate, mmap, mmap_size) < 0)
-   goto unmap;
-   istate->timestamp.sec = st.st_mtime;
-   istate->timestamp.nsec = ST_MTIME_NSEC(st);
-
-   munmap(mmap, mmap_size);
-   return istate->cache_nr;
-
-unmap:
-   munmap(mmap, mmap_size);
die("index file corrupt");
 }
 
-- 
1.8.3.453.g1dfc63d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 03/19] read-cache: move index v2 specific functions to their own file

2013-07-12 Thread Thomas Gummerer

Move index version 2 specific functions to their own file. The non-index
specific functions will be in read-cache.c, while the index version 2
specific functions will be in read-cache-v2.c.

Helped-by: Nguyen Thai Ngoc Duy 
Signed-off-by: Thomas Gummerer 
---
 Makefile |   2 +
 cache.h  |  16 +-
 read-cache-v2.c  | 556 +
 read-cache.c | 575 ---
 read-cache.h |  57 +
 test-index-version.c |   5 +
 6 files changed, 661 insertions(+), 550 deletions(-)
 create mode 100644 read-cache-v2.c
 create mode 100644 read-cache.h

diff --git a/Makefile b/Makefile
index 5a68fe5..73369ae 100644
--- a/Makefile
+++ b/Makefile
@@ -711,6 +711,7 @@ LIB_H += progress.h
 LIB_H += prompt.h
 LIB_H += quote.h
 LIB_H += reachable.h
+LIB_H += read-cache.h
 LIB_H += reflog-walk.h
 LIB_H += refs.h
 LIB_H += remote.h
@@ -854,6 +855,7 @@ LIB_OBJS += prompt.o
 LIB_OBJS += quote.o
 LIB_OBJS += reachable.o
 LIB_OBJS += read-cache.o
+LIB_OBJS += read-cache-v2.o
 LIB_OBJS += reflog-walk.o
 LIB_OBJS += refs.o
 LIB_OBJS += remote.o
diff --git a/cache.h b/cache.h
index 7af853b..5082b34 100644
--- a/cache.h
+++ b/cache.h
@@ -95,19 +95,8 @@ unsigned long git_deflate_bound(git_zstream *, unsigned 
long);
  */
 #define DEFAULT_GIT_PORT 9418
 
-/*
- * Basic data structures for the directory cache
- */
 
 #define CACHE_SIGNATURE 0x44495243 /* "DIRC" */
-struct cache_version_header {
-   unsigned int hdr_signature;
-   unsigned int hdr_version;
-};
-
-struct cache_header {
-   unsigned int hdr_entries;
-};
 
 #define INDEX_FORMAT_LB 2
 #define INDEX_FORMAT_UB 4
@@ -280,6 +269,7 @@ struct index_state {
 initialized : 1;
struct hash_table name_hash;
struct hash_table dir_hash;
+   struct index_ops *ops;
 };
 
 extern struct index_state the_index;
@@ -489,8 +479,8 @@ extern void *read_blob_data_from_index(struct index_state 
*, const char *, unsig
 #define CE_MATCH_RACY_IS_DIRTY 02
 /* do stat comparison even if CE_SKIP_WORKTREE is true */
 #define CE_MATCH_IGNORE_SKIP_WORKTREE  04
-extern int ie_match_stat(const struct index_state *, const struct cache_entry 
*, struct stat *, unsigned int);
-extern int ie_modified(const struct index_state *, const struct cache_entry *, 
struct stat *, unsigned int);
+extern int ie_match_stat(struct index_state *, const struct cache_entry *, 
struct stat *, unsigned int);
+extern int ie_modified(struct index_state *, const struct cache_entry *, 
struct stat *, unsigned int);
 
 #define PATHSPEC_ONESTAR 1 /* the pathspec pattern sastisfies GFNM_ONESTAR 
*/
 
diff --git a/read-cache-v2.c b/read-cache-v2.c
new file mode 100644
index 000..a6883c3
--- /dev/null
+++ b/read-cache-v2.c
@@ -0,0 +1,556 @@
+#include "cache.h"
+#include "read-cache.h"
+#include "resolve-undo.h"
+#include "cache-tree.h"
+#include "varint.h"
+
+/* Mask for the name length in ce_flags in the on-disk index */
+#define CE_NAMEMASK  (0x0fff)
+
+struct cache_header {
+   unsigned int hdr_entries;
+};
+
+/*
+ * Index File I/O
+ */
+
+/*
+ * dev/ino/uid/gid/size are also just tracked to the low 32 bits
+ * Again - this is just a (very strong in practice) heuristic that
+ * the inode hasn't changed.
+ *
+ * We save the fields in big-endian order to allow using the
+ * index file over NFS transparently.
+ */
+struct ondisk_cache_entry {
+   struct cache_time ctime;
+   struct cache_time mtime;
+   unsigned int dev;
+   unsigned int ino;
+   unsigned int mode;
+   unsigned int uid;
+   unsigned int gid;
+   unsigned int size;
+   unsigned char sha1[20];
+   unsigned short flags;
+   char name[FLEX_ARRAY]; /* more */
+};
+
+/*
+ * This struct is used when CE_EXTENDED bit is 1
+ * The struct must match ondisk_cache_entry exactly from
+ * ctime till flags
+ */
+struct ondisk_cache_entry_extended {
+   struct cache_time ctime;
+   struct cache_time mtime;
+   unsigned int dev;
+   unsigned int ino;
+   unsigned int mode;
+   unsigned int uid;
+   unsigned int gid;
+   unsigned int size;
+   unsigned char sha1[20];
+   unsigned short flags;
+   unsigned short flags2;
+   char name[FLEX_ARRAY]; /* more */
+};
+
+/* These are only used for v3 or lower */
+#define align_flex_name(STRUCT,len) ((offsetof(struct STRUCT,name) + (len) + 
8) & ~7)
+#define ondisk_cache_entry_size(len) align_flex_name(ondisk_cache_entry,len)
+#define ondisk_cache_entry_extended_size(len) 
align_flex_name(ondisk_cache_entry_extended,len)
+#define ondisk_ce_size(ce) (((ce)->ce_flags & CE_EXTENDED) ? \
+   ondisk_cache_entry_extended_size(ce_namelen(ce)) : \
+   ondisk_cache_entry_size(ce_namelen(ce)))
+
+static int veri

[PATCH v2 02/19] read-cache: split index file version specific functionality

2013-07-12 Thread Thomas Gummerer

Split index file version specific functionality to their own functions,
to prepare for moving the index file version specific parts to their own
file.  This makes it easier to add a new index file format later.

Signed-off-by: Thomas Gummerer 
---
 cache.h  |   5 +-
 read-cache.c | 130 +--
 test-index-version.c |   2 +-
 3 files changed, 90 insertions(+), 47 deletions(-)

diff --git a/cache.h b/cache.h
index c288678..7af853b 100644
--- a/cache.h
+++ b/cache.h
@@ -100,9 +100,12 @@ unsigned long git_deflate_bound(git_zstream *, unsigned 
long);
  */
 
 #define CACHE_SIGNATURE 0x44495243 /* "DIRC" */
-struct cache_header {
+struct cache_version_header {
unsigned int hdr_signature;
unsigned int hdr_version;
+};
+
+struct cache_header {
unsigned int hdr_entries;
 };
 
diff --git a/read-cache.c b/read-cache.c
index d5201f9..93947bf 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1268,10 +1268,8 @@ struct ondisk_cache_entry_extended {
ondisk_cache_entry_extended_size(ce_namelen(ce)) : \
ondisk_cache_entry_size(ce_namelen(ce)))
 
-static int verify_hdr(struct cache_header *hdr, unsigned long size)
+static int verify_hdr_version(struct cache_version_header *hdr, unsigned long 
size)
 {
-   git_SHA_CTX c;
-   unsigned char sha1[20];
int hdr_version;
 
if (hdr->hdr_signature != htonl(CACHE_SIGNATURE))
@@ -1279,10 +1277,22 @@ static int verify_hdr(struct cache_header *hdr, 
unsigned long size)
hdr_version = ntohl(hdr->hdr_version);
if (hdr_version < INDEX_FORMAT_LB || INDEX_FORMAT_UB < hdr_version)
return error("bad index version %d", hdr_version);
+   return 0;
+}
+
+static int verify_hdr(void *mmap, unsigned long size)
+{
+   git_SHA_CTX c;
+   unsigned char sha1[20];
+
+   if (size < sizeof(struct cache_version_header)
+   + sizeof(struct cache_header) + 20)
+   die("index file smaller than expected");
+
git_SHA1_Init(&c);
-   git_SHA1_Update(&c, hdr, size - 20);
+   git_SHA1_Update(&c, mmap, size - 20);
git_SHA1_Final(sha1, &c);
-   if (hashcmp(sha1, (unsigned char *)hdr + size - 20))
+   if (hashcmp(sha1, (unsigned char *)mmap + size - 20))
return error("bad index file sha1 signature");
return 0;
 }
@@ -1424,47 +1434,19 @@ static struct cache_entry *create_from_disk(struct 
ondisk_cache_entry *ondisk,
return ce;
 }
 
-/* remember to discard_cache() before reading a different cache! */
-int read_index_from(struct index_state *istate, const char *path)
+static int read_index_v2(struct index_state *istate, void *mmap, unsigned long 
mmap_size)
 {
-   int fd, i;
-   struct stat st;
+   int i;
unsigned long src_offset;
-   struct cache_header *hdr;
-   void *mmap;
-   size_t mmap_size;
+   struct cache_version_header *hdr;
+   struct cache_header *hdr_v2;
struct strbuf previous_name_buf = STRBUF_INIT, *previous_name;
 
-   if (istate->initialized)
-   return istate->cache_nr;
-
-   istate->timestamp.sec = 0;
-   istate->timestamp.nsec = 0;
-   fd = open(path, O_RDONLY);
-   if (fd < 0) {
-   if (errno == ENOENT)
-   return 0;
-   die_errno("index file open failed");
-   }
-
-   if (fstat(fd, &st))
-   die_errno("cannot stat the open index");
-
-   mmap_size = xsize_t(st.st_size);
-   if (mmap_size < sizeof(struct cache_header) + 20)
-   die("index file smaller than expected");
-
-   mmap = xmmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 
0);
-   if (mmap == MAP_FAILED)
-   die_errno("unable to map index file");
-   close(fd);
-
hdr = mmap;
-   if (verify_hdr(hdr, mmap_size) < 0)
-   goto unmap;
+   hdr_v2 = (struct cache_header *)((char *)mmap + sizeof(*hdr));
 
istate->version = ntohl(hdr->hdr_version);
-   istate->cache_nr = ntohl(hdr->hdr_entries);
+   istate->cache_nr = ntohl(hdr_v2->hdr_entries);
istate->cache_alloc = alloc_nr(istate->cache_nr);
istate->cache = xcalloc(istate->cache_alloc, sizeof(*istate->cache));
istate->initialized = 1;
@@ -1474,7 +1456,7 @@ int read_index_from(struct index_state *istate, const 
char *path)
else
previous_name = NULL;
 
-   src_offset = sizeof(*hdr);
+   src_offset = sizeof(*hdr) + sizeof(*hdr_v2);
for (i = 0; i < istate->cache_nr; i++) {
struct ondisk_cache_entry *disk_ce;
struct cache_entry *ce;
@@ -1487,8 +1469,6 @@ int read_index_from(struct index_state *istate, const 
char *path)
src_offset += consumed;
}
strbuf_release(&previous_name_buf);
-   istate->timestamp.sec = st.st_mtime;
-   istate->t

[PATCH v2 01/19] t2104: Don't fail for index versions other than [23]

2013-07-12 Thread Thomas Gummerer

t2104 currently checks for the exact index version 2 or 3,
depending if there is a skip-worktree flag or not. Other
index versions do not use extended flags and thus cannot
be tested for version changes.

Make this test update the index to version 2 at the beginning
of the test. Testing the skip-worktree flags for the default
index format is still covered by t7011 and t7012.

Signed-off-by: Thomas Gummerer 
---
 t/t2104-update-index-skip-worktree.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/t2104-update-index-skip-worktree.sh 
b/t/t2104-update-index-skip-worktree.sh
index 1d0879b..bd9644f 100755
--- a/t/t2104-update-index-skip-worktree.sh
+++ b/t/t2104-update-index-skip-worktree.sh
@@ -22,6 +22,7 @@ H sub/2
 EOF
 
 test_expect_success 'setup' '
+   git update-index --index-version=2 &&
mkdir sub &&
touch ./1 ./2 sub/1 sub/2 &&
git add 1 2 sub/1 sub/2 &&
-- 
1.8.3.453.g1dfc63d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 00/19] Index-v5

2013-07-12 Thread Thomas Gummerer

Hi,

previous rounds (without api) are at $gmane/202752, $gmane/202923,
$gmane/203088 and $gmane/203517, the previous round with api was at
$gmane/229732.  Thanks to Junio, Duy and Eric for their comments on
the previous round.

Changes since the previous round:

Add documentation for the index reading api (was sent later in the
previous series as 5.5/22).

read-cache: add index reading api

  - Remove the next_ce pointer, and use index_state->cache[] as part
of the api, making a few functions unnecessary (next_index_entry,
get_index_entry_by_name, sort_index). Those were removed.

  - Replace const char **pathspec with const struct pathspec

  - remove the index_change_filter_opts function.  For better
performance the caller should make up its mind before reading
anything which part of the index file should be read.  Changing
the filter_opts later makes us read part of the index again, which
is sub-optimal.

  - set the filter_opts to NULL when discarding the cache.

  - discussed and didn't change: replacing the pathspec with
tree_entry_interesting doesn't work well.  Using common_prefix()
to calculate the adjusted_pathspec would work, but has some
disadvantages when glob is used in the pathspec.

make sure partially read index is not changed

  - instead of using the index_change_filter_opts function to set the
filter_opts to NULL before changing the index, die() when the
caller tries that.  

  - use filter_opts to check if the index was partially read instead
of an extra flag.

  - die() when trying to re-read a partially read index.

patches (6,7,8,9,12/22) were removed, as they are no longer needed
when keeping index_state->cache[] as part of the api.

ls-files: use index api

  - only use the new api for loading the filtered index, but don't use
the for_each_index_entry function, as it would need to change the
filter_opts after reading the index.  That will be made obsolete
by a series that Duy has cooking [1], after which the loops can be
switched out by using for_each_index_entry.

Other than that, the typos found by Eric were fixed.

[1] http://thread.gmane.org/gmane.comp.version-control.git/229732/focus=230023

Thomas Gummerer (18):
  t2104: Don't fail for index versions other than [23]
  read-cache: split index file version specific functionality
  read-cache: move index v2 specific functions to their own file
  read-cache: Re-read index if index file changed
  Add documentation for the index api
  read-cache: add index reading api
  make sure partially read index is not changed
  grep.c: Use index api
  ls-files.c: use index api
  documentation: add documentation of the index-v5 file format
  read-cache: make in-memory format aware of stat_crc
  read-cache: read index-v5
  read-cache: read resolve-undo data
  read-cache: read cache-tree in index-v5
  read-cache: write index-v5
  read-cache: write index-v5 cache-tree data
  read-cache: write resolve-undo data for index-v5
  update-index.c: rewrite index when index-version is given

Thomas Rast (1):
  p0003-index.sh: add perf test for the index formats

 Documentation/technical/api-in-core-index.txt|   54 +-
 Documentation/technical/index-file-format-v5.txt |  296 +
 Makefile |3 +
 builtin/grep.c   |   71 +-
 builtin/ls-files.c   |   31 +-
 builtin/update-index.c   |8 +-
 cache-tree.c |2 +-
 cache-tree.h |6 +
 cache.h  |  140 +-
 read-cache-v2.c  |  589 +
 read-cache-v5.c  | 1516 ++
 read-cache.c |  676 +++---
 read-cache.h |   65 +
 t/perf/p0003-index.sh|   59 +
 t/t2104-update-index-skip-worktree.sh|1 +
 test-index-version.c |7 +-
 16 files changed, 2942 insertions(+), 582 deletions(-)
 create mode 100644 Documentation/technical/index-file-format-v5.txt
 create mode 100644 read-cache-v2.c
 create mode 100644 read-cache-v5.c
 create mode 100644 read-cache.h
 create mode 100755 t/perf/p0003-index.sh

-- 
1.8.3.453.g1dfc63d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/7] cat-file --batch-check performance improvements

2013-07-12 Thread Junio C Hamano

Jeff King  writes:

> The results for running (in linux.git):
>
>   $ git rev-list --objects --all >objects
>   $ git cat-file --batch-check='%(objectsize:disk)' /dev/null

I can see how these patches are very logical avenue to grab only
on-disk footprint for large number of objects, but among the type,
payload size and on-disk footprint, I find it highly narrow niche
that a real user or script is interested _only_ in on-disk footprint
without even worrying about the type of object.

> ... (though I think the result actually cleans up the
> sha1_object_info_extended interface a bit, and is worth it).

I tend to agree, especially eyeballing the result of 7/7.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 7/7] push: document --lockref

2013-07-12 Thread Johannes Sixt

Am 12.07.2013 00:14, schrieb Junio C Hamano:
> Johannes Sixt  writes:
> 
>> Again: Why not just define +refspec as the way to achieve this check?
> 
> What justification do we have to break existing people's
> configuration that says something like:
> 
>   [remote "ko"]
>   url = kernel.org:/pub/scm/git/git.git
> push = master
> push = next
> push = +pu
> push = maint
> 
> by adding a _new_ requirement they may not be able to satisify?
> Notice that the above is a typical "push only" publishing point,
> where you do not need any remote tracking branches.

Why would it break? When you do not specify --lockref, there is no
change whatsoever.

To achieve any safety at all for these push-only refs, you have to be
very explicit by saying --lockref=pu:$myoldpu
--lockref=master:$myoldmaster etc, and what is wrong if in this case
--lockref semantics are applied, but only pu is allowed to be no-ff?

> I am not opposed if your proposal were to introduce a new syntax
> element that calls for this new feature, e.g.
> 
>   [remote "ko"]
>   url = kernel.org:/pub/scm/git/git.git
> push = *pu
> fetch = +refs/heads/*:refs/remotes/ko/*
> 
> but changing what "+" means to something new will simply not fly.

I still do not see why we need two different kinds of ways to spell the
same strong kind of override (--force and +refspec) under the presence
of --lockref, and why we need a third one (--allow-no-ff) to give a
weaker kind of override (that makes sense only when --lockref was given
in the first place).

-- Hannes

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] t0008: avoid SIGPIPE race condition on fifo

2013-07-12 Thread Brian Gernhardt


On Jul 12, 2013, at 6:35 AM, Jeff King  wrote:

> Subject: [PATCH] t0008: avoid SIGPIPE race condition on fifo

Was able to complete a prove run with this patch.  Many thanks.

~~ Brian
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] .mailmap: Map email addresses to names

2013-07-12 Thread Junio C Hamano

Stefan Beller  writes:

> People change email addresses quite often and sometimes
> forget to add their entry to the mailmap file.
> I have contacted lots of people, whose name occurs
> multiple times in the short log having different
> email addresses. The entries in the mailmap of
> this patch are either confirmed by them or are trivial.
> Trivial means different capitalisation of the domain
> (@MIT.EDU and @mit.edu) or the domain was localhost,
> (none) or @local.
>
> Additionally to adding (name, email) mappings to the
> .mailmap file, it has also been sorted alphabetically.
> (which explains the removals, which are added
> 3 lines later on again)
>
> While the most changes happen at the email addresses,
> we also have a name change in here. Karl Hasselström
> is now known as Karl Wiberg due to marriage. Congratulations!
>
> To find out whom to contact I used the following small
> script:
> ---
> #!/bin/bash
> git shortlog -sne |awk '{ NF--; $1=""; print }' |sort |uniq -d > 
> mailmapdoubles
> while read line ; do
> # remove leading whitespace
> trimmed=$(echo $line | sed -e 's/^ *//g' -e 's/ *$//g')
> echo "git shortlog -sne | grep \""$trimmed"\""
> done < mailmapdoubles > mailmapdoubles2
> sh mailmapdoubles2
> rm mailmapdoubles
> rm mailmapdoubles2
> ---
> Also interesting for similar tasks are these snippets:
>
> # Finding out duplicates by comparing email addresses:
> git shortlog -sne |awk '{ print $NF }' |sort |uniq -d
>
> # Finding out duplicates by comparing names:
> git shortlog -sne |awk '{ NF--; $1=""; print }' |sort |uniq -d
> ---
>
> Signed-off-by: Stefan Beller 
> ---

Thanks, but please be careful about these three-dashes when sending
the next batch.  As you may have already guessed, Git cannot guess
reliably which one of the abouve four three-dash lines is the end of
the proposed log message, and cuts at the first one.

>  .mailmap | 95 
> 
>  1 file changed, 71 insertions(+), 24 deletions(-)
>
> diff --git a/.mailmap b/.mailmap
> index 345cce6..1179767 100644
> --- a/.mailmap
> +++ b/.mailmap
> @@ -5,99 +5,146 @@
>  # same person appearing not to be so.
>  #
>  
> -Alex Bennée 
> +Alejandro R. Sedeño  
>  Alexander Gavrilov 
> +Alex Bennée 
> +Alex Riesen  
> +Alex Riesen  
> +Alex Riesen  
> +Anders Kaseorg  
> +Anders Kaseorg  
>  Aneesh Kumar K.V 
> +anonymous 
> +anonymous 
> +Brandon Casey  
>  Brian M. Carlson 
>  Cheng Renquan 
>  Chris Shoemaker 
> -Dan Johnson 
>  Dana L. How 
>  Dana L. How 
>  Daniel Barkalow 
> +Dan Johnson 
>  David D. Kilzer 
>  David Kågedal 
> +David Reiss  
>  David S. Miller 
>  Deskin Miller 
>  Dirk Süsserott 
>  Eric S. Raymond 
>  Erik Faye-Lund  
> -Fredrik Kuivinen 
> +Florian Achleitner  
> 
> +Franck Bui-Huu  
> +Frank Lichtenheld  
> +Frank Lichtenheld  
>  Frédéric Heitzmann 
> +Fredrik Kuivinen 
> +Han-Wen Nienhuys  Han-Wen Nienhuys 
>  H. Merijn Brand  H.Merijn Brand 
> -H. Peter Anvin 
> -H. Peter Anvin 
> -H. Peter Anvin 
>  Horst H. von Brand 
> +H. Peter Anvin  
> +H. Peter Anvin  
> +H. Peter Anvin  
> +H. Peter Anvin  
>  İsmail Dönmez 
>  Jakub Narębski 
> -Jay Soffian 
> +Jay Soffian  
> +J. Bruce Fields  
> +J. Bruce Fields  
> +J. Bruce Fields  
>  Jeff King  
>  Joachim Berdal Haga 
> +Johannes Schindelin  
>  Johannes Sixt  
> -Johannes Sixt  
>  Johannes Sixt  
> +Johannes Sixt  
> +Jonathan Nieder  
>  Jon Loeliger 
>  Jon Seymour 
> -Jonathan Nieder 
>  Junio C Hamano  
> -Junio C Hamano  
> -Junio C Hamano  
> -Junio C Hamano  
>  Junio C Hamano  
>  Junio C Hamano  
> +Junio C Hamano  
> +Junio C Hamano  
>  Junio C Hamano  
> -Karl Hasselström 
> -Kevin Leung 
> +Junio C Hamano  
> +Karl Wiberg  Karl Hasselström 
> +Karl Wiberg  Karl Hasselström 
> 
> +Kay Sievers  
> +Kay Sievers  
> +Keith Cascio  
>  Kent Engstrom 
> +Kevin Leung 
> +Kirill Smelkov  
> +Kirill Smelkov  
>  Lars Doelle 
>  Lars Doelle 
>  Li Hong 
> -Linus Torvalds  
> 
> -Linus Torvalds  
> -Linus Torvalds  
>  Linus Torvalds  
> +Linus Torvalds  
> +Linus Torvalds  
>  Linus Torvalds  
>  Linus Torvalds  
> 
> +Linus Torvalds  
> 
>  Lukas Sandström 
>  Marc-André Lureau 
>  Mark Rada 
>  Martin Langhoff  
>  Martin von Zweigbergk  
> 
> +Matthias Kestenholz  
> +Matthias Urlichs  
> +Matthias Urlichs  
>  Michael Coleman 
>  Michael J Gruber  
> 
> +Michael Witten  
> +Michael Witten  
>  Michael W. Olson 
>  Michele Ballabio 
> +Miklos Vajna  
> +Namhyung Kim  
> +Namhyung Kim  
>  Nanako Shiraishi 
>  Nanako Shiraishi 
>  Nguyễn Thái Ngọc Duy 
>   
> -Peter Krefting  
> 
> +Pascal Obry  
> +Pascal Obry  
> +Paul Mackerras  
> +Paul Mackerras  
>  Peter Krefting  
> +Peter Krefting  
> 
>  Petr Baudis  
> +Petr Baudis  
>  Philippe Bruhat 
>  Ralf Thielow  
>  Ramsay Allan Jones 
>  René Scharfe 
>  Robert Fitzsimons 
>  Robert Zeh 
> +Robin Rosenberg  
> 
> +Salikh Zakirov  
>  Sam Vilain 
> -Santi Béjar 
> +Santi Béja

Re: [PATCH v2] Corrections to the mailmap file

2013-07-12 Thread Junio C Hamano

Stefan Beller  writes:

> By now I contacted more than half the people, who might
> get some .mailmap entries. Not all of them have responded,
> but maybe 2/3 of those, whom I contacted.
>
> I used 2 branches to get this work done. One branch having
> all the proposed patches, where each patch just changes one
> name, so I can send it to that specific person for review.
> The second branch would be slightly behind the first branch
> and only have the patches of the confirmed .mailmap changes.
> The following patch is a squashed version of the confirmed 
> branch.
>
> Whenever somebody confirmed their patch, I'd include it
> into the confirmed branch and rebase the first branch on top
> of it. That works fine, if there are no many commits
> in between, so no merge conflicts occur.
>
> Junio, therefore I'd ask to include the following patch as 
> the 1/3 milestone in the mailmap completion, so the number of
> my local patches floating around is reduced.
>
> The 6 patches sent at 4th July are not required anymore,
> but the following patch directly applies to your master branch.
>
> Stefan Beller (1):
>   .mailmap: Map email addresses to names
>
>  .mailmap | 95 
> 
>  1 file changed, 71 insertions(+), 24 deletions(-)

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] t0008: avoid SIGPIPE race condition on fifo

2013-07-12 Thread Junio C Hamano

Jeff King  writes:

> Subject: [PATCH] t0008: avoid SIGPIPE race condition on fifo
>
> To test check-ignore's --stdin feature, we use two fifos to
> send and receive data. We carefully keep a descriptor to its
> input open so that it does not receive EOF between input
> lines. However, we do not do the same for its output. That
> means there is a potential race condition in which
> check-ignore has opened the output pipe once (when we read
> the first line), and then writes the second line before we
> have re-opened the pipe.
>
> In that case, check-ignore gets a SIGPIPE and dies. The
> outer shell then tries to open the output fifo but blocks
> indefinitely, because there is no writer.  We can fix it by
> keeping a descriptor open through the whole procedure.

Ahh, figures.

I wish I were smart enough to figure that out immediately after
seeing the test that does funny things to "in" with "9".

Thanks.

> This should also help if check-ignore dies for any other
> reason (we would already have opened the fifo and would
> therefore not block, but just get EOF on read).
>
> However, we are technically still susceptible to
> check-ignore dying early, before we have opened the fifo.
> This is an unlikely race and shouldn't generally happen in
> practice, though, so we can hopefully ignore it.
>
> Signed-off-by: Jeff King 
> ---
>  t/t0008-ignores.sh | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/t/t0008-ignores.sh b/t/t0008-ignores.sh
> index a56db80..c29342d 100755
> --- a/t/t0008-ignores.sh
> +++ b/t/t0008-ignores.sh
> @@ -697,13 +697,21 @@ test_expect_success PIPE 'streaming support for 
> --stdin' '
>   # shell, and then echo to the fd. We make sure to close it at
>   # the end, so that the subprocess does get EOF and dies
>   # properly.
> + #
> + # Similarly, we must keep "out" open so that check-ignore does
> + # not ever get SIGPIPE trying to write to us. Not only would that
> + # produce incorrect results, but then there would be no writer on the
> + # other end of the pipe, and we would potentially block forever trying
> + # to open it.
>   exec 9>in &&
> + exec 8   test_when_finished "exec 9>&-" &&
> + test_when_finished "exec 8<&-" &&
>   echo >&9 one &&
> - read response  + read response <&8 &&
>   echo "$response" | grep "^\.gitignore:1:one one" &&
>   echo >&9 two &&
> - read response  + read response <&8 &&
>   echo "$response" | grep "^::two"
>  '
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 3/3] wt-status: use "format" function attribute for status_printf

2013-07-12 Thread Junio C Hamano

Jeff King  writes:

> On Tue, Jul 09, 2013 at 10:26:04PM -0700, Junio C Hamano wrote:
>
>> Jeff King  writes:
>> ...
>> > I'm torn on this one. It really does provide us with more compile-time
>> > safety checks, but it's annoying that "-Wall -Werror" will no longer
>> > work out of the box.
>> 
>> Yeah, that is a show-stopper for me X-<.
>
> You can "fix" it with -Wno-zero-format-length, so the hassle is not
> huge. But I am also inclined to just drop this one. We have lived
> without the extra safety for a long time, and list review does tend to
> catch such problems in practice.

I am tempted to actually merge the original one as-is without any of
the workaround, and just tell people to use -Wno-format-zero-length.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Bug in .mailmap handling?

2013-07-12 Thread Stefan Beller

Hello,

you may have noticed I am currently trying to bring the 
mailmap file of git itself up to date. I noticed
some behavior, which I did not expect. Have a look yourself:

---
# prepare test environment:
mkdir testmailmap
cd testmailmap/
git init

# do a commit:
echo "asdf" > test1 
git add test1
git commit -a --author="A " -m "add test1"

# commit with same name, but different email 
# (different capitalization does the trick already, 
# but here I am going to use a different mail)
echo "asdf" > test2
git add test2
git commit -a --author="A " -m "add test2"

# how do we know it's the same person?
git shortlog
A (2):
  add test1
  add test2

# reports as expected:
git shortlog -sne
  1  A 
  1  A 
  
# Adding the line to the mailmap should make life easy, so we know
# it's the same person
echo "A  " > .mailmap

# Come on, I just wanted to have it reported as one person!
git shortlog -sne
 1  A 
 1  A 
 
# So let's try another line in the mailmap file, (small 'a')
echo "A  " > .mailmap

# We're not there yet?
git shortlog -sne
 1  A 
 1  A 

# Now let's write it rather explicit: 
# (essentially just write 2 lines into the mailmap file)
cat << EOF > .mailmap
A  
A  
EOF
 
# works as expected now
git shortlog -sne
 2  A 

# works as expected now as well
git shortlog  
A (2):
  add test1
  add test2

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Cygwin, git and x: directory

2013-07-12 Thread Mark Levedahl


On 07/12/2013 08:42 AM, Mikko Rapeli wrote:

(please Cc: me in replies, not subscribed to the lists)

Hi Cygwin and git developers,

Does following scenario show signs of bugs in Cygwin and/or git?

# setup git repo
$ cd /tmp
$ mkdir foo && cd foo
$ git init

# create x: directory
$ mkdir x:
$ ls
x:



I would report this on the Cygwin list, not here. Any use of dos file 
paths using a Cygwin tool is not recommended, officially only POSIX 
paths are supported. If cygwin's Cmake is generating dos style paths, 
that is a bug that needs reporting to the Cygwin list. Also, I'm not 
sure how the developers will react to mishandling a directory named x:, 
but the behaviour you see is a limitation of the Cygwin platform, not of 
git.


Mark
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Webmail actualización

2013-07-12 Thread Maruti Suzuki


Webmail actualización
20GB 23GB
 
Su buzón ha superado el límite de almacenamiento lo establecido por el  
administrador y usted no será capaz de recibir nuevos correos hasta  
que vuelva a validar. Para revalidar:  
https://docs.google.com/a/uc.cl/spreadsheet/viewform?formkey=dC1aMjNqcThwR3BLaVJWUy1fTlFDNmc6MQ

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3] config: add support for http..* settings

2013-07-12 Thread Kyle J. McKay


On Jul 12, 2013, at 02:59, Jeff King wrote:

On Thu, Jul 11, 2013 at 02:50:03PM -0700, Kyle J. McKay wrote:

A few comments on the implementation:


+enum http_option_type {
+   opt_post_buffer = 0,
+   opt_min_sessions,


We usually put enum fields in ALL_CAPS to mark them as constants  
(though

you can find few exceptions in the code).


OK.


+static size_t http_options_url_match_prefix(const char *url,
+   const char *url_prefix,
+   size_t url_prefix_len)
+{


It looks like you're matching the URLs as raw strings, and I don't see
any canonicalization going on. What happens if I have
"https://example.com/foo+bar"; in my config, but then I visit
"https://example.comfoo%20bar";?


The documentation for http..* says:


+http..*::

[snip]
+   Note that  must match the url exactly (other than  
possibly being a
+   prefix).  This means any user, password and/or port setting  
that appears
+   in a url must also appear in  to have a successful  
match.

+


Or what about optional components? If I have "https://example.com";  
in my

config, but I am visiting "https://p...@example.com/";, shouldn't it
match? The config spec is more general than my specific URL; you would
not want it to match in the opposite direction, though.


They have to be included to match.  Any URL-encoding present in the  
URL given to git needs to be duplicated in the URL in the config file  
exactly or there will not be a match.


That does make your by-length ordering impossible, but I don't think  
you

can do it in the general case. If I have:

 [http "http://p...@example.com";] foo = 1
 [http "http://example.com:8080";] foo = 2

and I visit "http://p...@example.com:8080";, which one is the winner?


If I were to spilt everything out, then I would only have the second  
one match.  The first config is on a different port, quite possibly an  
entirely different service.  It shouldn't match.  Consider if you were  
port forwarding with ssh, services from many different machines would  
all be on localhost on different ports.  The second one is on the same  
port and matches because it's slightly more general (missing the user  
name), but it's clearly talking to the same service.


As the patch stands now, neither of them would match since the  
documentation requires an exact match (except for possibly being a  
prefix).


I don't think it's necessary to split the URL apart though.  Whatever  
URL the user gave to git on the command line (at some point even if  
it's now stored as a remote setting in config) complete with URL- 
encoding, user names, port names, etc. is the same url, possibly  
shortened, that needs to be used for the http..option setting.


I think that's simple and very easy to explain and avoids user  
confusion and surprise while still allowing a default to be set for a  
site but easily overridden for a portion of that site without needing  
to worry about the order config files are processed or the order of  
the [http ""] sections within them.


I don't think there is an unambiguous definition. I'd suggest  
instead just

using the usual "last one wins" strategy that our config uses. It has
the unfortunate case that:

 [http "http://example.com";]
foo = 1
 [http]
foo = 2

will always choose http.foo=2, but I don't think that is a big problem
in practice.


I think that violates the principle of least surprise.  In this case:

[http "https://cacert.org";]
  sslVerify = false
[http]
  sslVerify = true

the intention here is very clear -- for cacert.org only, sslVerify  
should be false.  To have the [http] setting step on the [http "https://cacert.org 
"] setting seems completely surprising and unexpected.


The "last one wins" is still in effect though for the same paths.  If  
you have:


# System config
[http "https://cacert.org";]
  sslVerify = false

# Global config
[http "https://cacert.org";]
  sslVerify = true

The setting from the Global config still wins.


You often only set one or the other, and in the cases where
you want to have a non-standard default and then override it with
another host-specific case, the "last one wins" rule makes it simple  
to

explain that you need to specify keys in most-general to most-specific
order.


Unless, of course, different config files are involved and that's not  
possible without duplicating entries into the later-processed config  
file.



static int http_options(const char *var, const char *value, void *cb)
{
-   if (!strcmp("http.sslverify", var)) {
+   const char *url = (const char *)cb;


No need to cast from void, this isn't C++. :)


Indeed.


The rest of the http_options() changes look like what I'd expect.

@@ -344,7 +479,7 @@ void http_init(struct remote *remote, const  
char *url, int proactive_auth)


http_is_verbose = 0;

-   git_config(http_options, NULL);
+   git_config(http_options, (void *)url);


No need to cas

Cygwin, git and x: directory

2013-07-12 Thread Mikko Rapeli

(please Cc: me in replies, not subscribed to the lists)

Hi Cygwin and git developers,

Does following scenario show signs of bugs in Cygwin and/or git?

# setup git repo
$ cd /tmp
$ mkdir foo && cd foo
$ git init

# create x: directory
$ mkdir x:
$ ls
x:

# create Windows X: drive, cygwin utils can work with both unix and dos style
# path names
$ mkdir c:/temp/bar
$ subst x: c:/temp/bar
$ touch x:/file.txt
$ ls x:/
file.txt

# clean git tree from non-tracked files
$ git clean -d -x -f
Removing x:/

# observe results, git did rm -rf on the X drive instead of the local
# directory named x:
$ ls
x:
$ file x\:
x:: directory
$ ls x:/
ls: cannot access x:/: No such file or directory
$ ls c:/temp/bar
ls: cannot access c:/temp/bar: No such file or directory
$ subst
X:\: => C:\temp\bar

In real life CMake created C: file in a build tree -- which is also a bug
but a separate one -- which resulted in obviously catastrophic results.

-Mikko
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] .mailmap: Map email addresses to names

2013-07-12 Thread Stefan Beller

People change email addresses quite often and sometimes
forget to add their entry to the mailmap file.
I have contacted lots of people, whose name occurs
multiple times in the short log having different
email addresses. The entries in the mailmap of
this patch are either confirmed by them or are trivial.
Trivial means different capitalisation of the domain
(@MIT.EDU and @mit.edu) or the domain was localhost,
(none) or @local.

Additionally to adding (name, email) mappings to the
.mailmap file, it has also been sorted alphabetically.
(which explains the removals, which are added
3 lines later on again)

While the most changes happen at the email addresses,
we also have a name change in here. Karl Hasselström
is now known as Karl Wiberg due to marriage. Congratulations!

To find out whom to contact I used the following small
script:
---
#!/bin/bash
git shortlog -sne |awk '{ NF--; $1=""; print }' |sort |uniq -d > 
mailmapdoubles
while read line ; do
# remove leading whitespace
trimmed=$(echo $line | sed -e 's/^ *//g' -e 's/ *$//g')
echo "git shortlog -sne | grep \""$trimmed"\""
done < mailmapdoubles > mailmapdoubles2
sh mailmapdoubles2
rm mailmapdoubles
rm mailmapdoubles2
---
Also interesting for similar tasks are these snippets:

# Finding out duplicates by comparing email addresses:
git shortlog -sne |awk '{ print $NF }' |sort |uniq -d

# Finding out duplicates by comparing names:
git shortlog -sne |awk '{ NF--; $1=""; print }' |sort |uniq -d
---

Signed-off-by: Stefan Beller 
---
 .mailmap | 95 
 1 file changed, 71 insertions(+), 24 deletions(-)

diff --git a/.mailmap b/.mailmap
index 345cce6..1179767 100644
--- a/.mailmap
+++ b/.mailmap
@@ -5,99 +5,146 @@
 # same person appearing not to be so.
 #
 
-Alex Bennée 
+Alejandro R. Sedeño  
 Alexander Gavrilov 
+Alex Bennée 
+Alex Riesen  
+Alex Riesen  
+Alex Riesen  
+Anders Kaseorg  
+Anders Kaseorg  
 Aneesh Kumar K.V 
+anonymous 
+anonymous 
+Brandon Casey  
 Brian M. Carlson 
 Cheng Renquan 
 Chris Shoemaker 
-Dan Johnson 
 Dana L. How 
 Dana L. How 
 Daniel Barkalow 
+Dan Johnson 
 David D. Kilzer 
 David Kågedal 
+David Reiss  
 David S. Miller 
 Deskin Miller 
 Dirk Süsserott 
 Eric S. Raymond 
 Erik Faye-Lund  
-Fredrik Kuivinen 
+Florian Achleitner  

+Franck Bui-Huu  
+Frank Lichtenheld  
+Frank Lichtenheld  
 Frédéric Heitzmann 
+Fredrik Kuivinen 
+Han-Wen Nienhuys  Han-Wen Nienhuys 
 H. Merijn Brand  H.Merijn Brand 
-H. Peter Anvin 
-H. Peter Anvin 
-H. Peter Anvin 
 Horst H. von Brand 
+H. Peter Anvin  
+H. Peter Anvin  
+H. Peter Anvin  
+H. Peter Anvin  
 İsmail Dönmez 
 Jakub Narębski 
-Jay Soffian 
+Jay Soffian  
+J. Bruce Fields  
+J. Bruce Fields  
+J. Bruce Fields  
 Jeff King  
 Joachim Berdal Haga 
+Johannes Schindelin  
 Johannes Sixt  
-Johannes Sixt  
 Johannes Sixt  
+Johannes Sixt  
+Jonathan Nieder  
 Jon Loeliger 
 Jon Seymour 
-Jonathan Nieder 
 Junio C Hamano  
-Junio C Hamano  
-Junio C Hamano  
-Junio C Hamano  
 Junio C Hamano  
 Junio C Hamano  
+Junio C Hamano  
+Junio C Hamano  
 Junio C Hamano  
-Karl Hasselström 
-Kevin Leung 
+Junio C Hamano  
+Karl Wiberg  Karl Hasselström 
+Karl Wiberg  Karl Hasselström 

+Kay Sievers  
+Kay Sievers  
+Keith Cascio  
 Kent Engstrom 
+Kevin Leung 
+Kirill Smelkov  
+Kirill Smelkov  
 Lars Doelle 
 Lars Doelle 
 Li Hong 
-Linus Torvalds  

-Linus Torvalds  
-Linus Torvalds  
 Linus Torvalds  
+Linus Torvalds  
+Linus Torvalds  
 Linus Torvalds  
 Linus Torvalds  

+Linus Torvalds  

 Lukas Sandström 
 Marc-André Lureau 
 Mark Rada 
 Martin Langhoff  
 Martin von Zweigbergk  
+Matthias Kestenholz  
+Matthias Urlichs  
+Matthias Urlichs  
 Michael Coleman 
 Michael J Gruber  
+Michael Witten  
+Michael Witten  
 Michael W. Olson 
 Michele Ballabio 
+Miklos Vajna  
+Namhyung Kim  
+Namhyung Kim  
 Nanako Shiraishi 
 Nanako Shiraishi 
 Nguyễn Thái Ngọc Duy 
  
-Peter Krefting  
+Pascal Obry  
+Pascal Obry  
+Paul Mackerras  
+Paul Mackerras  
 Peter Krefting  
+Peter Krefting  
 Petr Baudis  
+Petr Baudis  
 Philippe Bruhat 
 Ralf Thielow  
 Ramsay Allan Jones 
 René Scharfe 
 Robert Fitzsimons 
 Robert Zeh 
+Robin Rosenberg  
+Salikh Zakirov  
 Sam Vilain 
-Santi Béjar 
+Santi Béjar  
 Sean Estabrooks 
+Sebastian Schuberth  
 Shawn O. Pearce 
-Steven Grimm 
+Stephen Boyd  
+Steven Grimm  
+Sven Verdoolaege  
+Sven Verdoolaege  
 Tay Ray Chuan 
 Theodore Ts'o 
+Thomas Ackermann  
 Thomas Rast  
+Timo Hirvonen  
+Toby Allsopp  
 Tony Luck 
-Uwe Kleine-König 
-Uwe Kleine-König 
 Uwe Kleine-König 
+Uwe Kleine-König 
+Uwe Kleine-König 
 Uwe Kleine-König 
 Uwe Kleine-König 
 Ville Skyttä 
 Vitaly "_Vi" Shukela 
 William Pursell 
 YOSHIFUJI Hideaki 
-anonymous 
-anonymous 
+
-- 
1.8.3.2.636.g7943f03

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/ma

[PATCH v2] Corrections to the mailmap file

2013-07-12 Thread Stefan Beller

By now I contacted more than half the people, who might
get some .mailmap entries. Not all of them have responded,
but maybe 2/3 of those, whom I contacted.

I used 2 branches to get this work done. One branch having
all the proposed patches, where each patch just changes one
name, so I can send it to that specific person for review.
The second branch would be slightly behind the first branch
and only have the patches of the confirmed .mailmap changes.
The following patch is a squashed version of the confirmed 
branch.

Whenever somebody confirmed their patch, I'd include it
into the confirmed branch and rebase the first branch on top
of it. That works fine, if there are no many commits
in between, so no merge conflicts occur.

Junio, therefore I'd ask to include the following patch as 
the 1/3 milestone in the mailmap completion, so the number of
my local patches floating around is reduced.

The 6 patches sent at 4th July are not required anymore,
but the following patch directly applies to your master branch.

Stefan Beller (1):
  .mailmap: Map email addresses to names

 .mailmap | 95 
 1 file changed, 71 insertions(+), 24 deletions(-)

-- 
1.8.3.2.636.g7943f03

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v4] config: add support for http..* settings

2013-07-12 Thread Kyle J. McKay

The  value is considered a match to a url if the  value
is either an exact match or a prefix of the url which ends on a
path component boundary ('/').  So "https://example.com/test"; will
match "https://example.com/test"; and "https://example.com/test/too";
but not "https://example.com/testextra";.

Longer matches take precedence over shorter matches with
environment variable settings taking precedence over all.

With this configuration:

[http]
useragent = other-agent
noEPSV = true
[http "https://example.com";]
useragent = example-agent
sslVerify = false
[http "https://example.com/path";]
useragent = path-agent

The "https://other.example.com/"; url will have useragent
"other-agent" and sslVerify will be on.

The "https://example.com/"; url will have useragent
"example-agent" and sslVerify will be off.

The "https://example.com/path/sub"; url will have useragent
"path-agent" and sslVerify will be off.

All three of the examples will have noEPSV enabled.

Signed-off-by: Kyle J. McKay 
Reviewed-by: Junio C Hamano 
---

The credentials configuration values already support url-specific
configuration items in the form credential..*.  This patch
adds similar support for http configuration values.

This version of the patch addresses the following feedback:

  * The url[url_prefix_len-1] test can now never succeed.  It is removed.

  * More text added to check_matched_len to clarify behavior when two
options have the same key.

 Documentation/config.txt |  11 
 http.c   | 168 ++-
 2 files changed, 162 insertions(+), 17 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index b4d4887..3731a3a 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1531,6 +1531,17 @@ http.useragent::
of common USER_AGENT strings (but not including those like git/1.7.1).
Can be overridden by the 'GIT_HTTP_USER_AGENT' environment variable.
 
+http..*::
+   Any of the http.* options above can be applied selectively to some urls.
+   For example "http.https://example.com.useragent"; would set the user
+   agent only for https connections to example.com.  The  value
+   matches a url if it is an exact match or a prefix of the url matching
+   at a '/' boundary.  Longer  matches take precedence over shorter
+   ones with the environment variable settings taking precedence over all.
+   Note that  must match the url exactly (other than possibly being a
+   prefix).  This means any user, password and/or port setting that appears
+   in a url must also appear in  to have a successful match.
+
 i18n.commitEncoding::
Character encoding the commit messages are stored in; Git itself
does not care per se, but this information is necessary e.g. when
diff --git a/http.c b/http.c
index 2d086ae..feca70a 100644
--- a/http.c
+++ b/http.c
@@ -30,6 +30,34 @@ static CURL *curl_default;
 
 char curl_errorstr[CURL_ERROR_SIZE];
 
+enum http_option_type {
+   opt_post_buffer = 0,
+   opt_min_sessions,
+#ifdef USE_CURL_MULTI
+   opt_max_requests,
+#endif
+   opt_ssl_verify,
+   opt_ssl_try,
+   opt_ssl_cert,
+#if LIBCURL_VERSION_NUM >= 0x070903
+   opt_ssl_key,
+#endif
+#if LIBCURL_VERSION_NUM >= 0x070908
+   opt_ssl_capath,
+#endif
+   opt_ssl_cainfo,
+   opt_low_speed,
+   opt_low_time,
+   opt_no_epsv,
+   opt_http_proxy,
+   opt_cookie_file,
+   opt_user_agent,
+   opt_passwd_req,
+   opt_max
+};
+
+static size_t http_option_max_matched_len[opt_max];
+
 static int curl_ssl_verify = -1;
 static int curl_ssl_try;
 static const char *ssl_cert;
@@ -141,34 +169,121 @@ static void process_curl_messages(void)
 }
 #endif
 
+static size_t http_options_url_match_prefix(const char *url,
+   const char *url_prefix,
+   size_t url_prefix_len)
+{
+   /*
+* url_prefix matches url if url_prefix is an exact match for url or it
+* is a prefix of url and the match ends on a path component boundary.
+* Both url and url_prefix are considered to have an implicit '/' on the
+* end for matching purposes if they do not already.
+*
+* url must be NUL terminated.  url_prefix_len is the length of
+* url_prefix which need not be NUL terminated.
+*
+* The return value is the length of the match in characters (excluding
+* any final '/') or 0 for no match.  Passing "/" as url_prefix will
+* always cause 0 to be returned.
+*/
+   size_t url_len;
+   if (url_prefix_len && url_prefix[url_prefix_len - 1] == '/')
+   url_prefix_len--;
+   if (!url_prefix_len || strncmp(url, url_prefix, url_prefix_len))
+   return 0;
+   url_len = strlen(url);
+   if ((url_len == url_prefix_len) || (url[

[PATCH] t0008: avoid SIGPIPE race condition on fifo

2013-07-12 Thread Jeff King

On Thu, Jul 11, 2013 at 09:09:55AM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > On Wed, Jul 10, 2013 at 12:36:40PM -0400, Brian Gernhardt wrote:
> >
> >> The newest test in t0008 "streaming support for --stdin", seems to
> >> hang sporadically on my MacBook Pro (running 10.8.4).  The hang seems
> >> to be related to running it in parallel with other tests, as I can
> >> only reliably cause it by running with prove  and -j 3.  However, once
> >> that has hung I am able to semi-reliably have it occur by running the
> >> test separately (with the test hung in the background, using separate
> >> trash directories via the --root option).
> >
> > I can't replicate the hang here (on Linux) doing:
> >
> >   for i in `seq 1 30`; do
> >   ./t0008-ignores.sh --root=/tmp/foo/$i &
> >   done
> 
> It seems to hang on me with bash, but not dash, here.

Thanks, I was able to replicate it with bash, and like Brian, I saw it
hanging in the second read. strace revealed that we were blocked on
open("out").

The patch below should fix it. I'm still not sure why the choice of
shell matters; it may simply be a timing fluke that bash is more likely
to hit for some reason.

-- >8 --
Subject: [PATCH] t0008: avoid SIGPIPE race condition on fifo

To test check-ignore's --stdin feature, we use two fifos to
send and receive data. We carefully keep a descriptor to its
input open so that it does not receive EOF between input
lines. However, we do not do the same for its output. That
means there is a potential race condition in which
check-ignore has opened the output pipe once (when we read
the first line), and then writes the second line before we
have re-opened the pipe.

In that case, check-ignore gets a SIGPIPE and dies. The
outer shell then tries to open the output fifo but blocks
indefinitely, because there is no writer.  We can fix it by
keeping a descriptor open through the whole procedure.

This should also help if check-ignore dies for any other
reason (we would already have opened the fifo and would
therefore not block, but just get EOF on read).

However, we are technically still susceptible to
check-ignore dying early, before we have opened the fifo.
This is an unlikely race and shouldn't generally happen in
practice, though, so we can hopefully ignore it.

Signed-off-by: Jeff King 
---
 t/t0008-ignores.sh | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/t/t0008-ignores.sh b/t/t0008-ignores.sh
index a56db80..c29342d 100755
--- a/t/t0008-ignores.sh
+++ b/t/t0008-ignores.sh
@@ -697,13 +697,21 @@ test_expect_success PIPE 'streaming support for --stdin' '
# shell, and then echo to the fd. We make sure to close it at
# the end, so that the subprocess does get EOF and dies
# properly.
+   #
+   # Similarly, we must keep "out" open so that check-ignore does
+   # not ever get SIGPIPE trying to write to us. Not only would that
+   # produce incorrect results, but then there would be no writer on the
+   # other end of the pipe, and we would potentially block forever trying
+   # to open it.
exec 9>in &&
+   exec 8&-" &&
+   test_when_finished "exec 8<&-" &&
echo >&9 one &&
-   read response &9 two &&
-   read response http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/7] cat-file: disable object/refname ambiguity check for batch mode

2013-07-12 Thread Michael Haggerty

On 07/12/2013 11:22 AM, Jeff King wrote:
> Yet another option is to consider what the check is doing, and
> accomplish the same thing in a different way. The real pain is that we
> are individually trying to resolve each object by hitting the filesystem
> (and doing lots of silly checks on the refname format, when we know it
> must be valid).
> 
> We don't actually care in this case if the ref list is up to date (we
> are not trying to update or read a ref, but only know if it exists, and
> raciness is OK). IOW, could we replace the dwim_ref call for the warning
> with something that directly queries the ref cache?

I think it would be quite practical to add an API something like

struct ref_snapshot *get_ref_snapshot(const char *prefix)
void release_ref_snapshot(struct ref_snapshot *)
int lookup_ref(struct ref_snapshot *, const char *refname,
   unsigned char *sha1, int *flags)

where prefix is the part of the refs tree that you want included in the
snapshot (e.g., "refs/heads") and ref_snapshot is probably opaque
outside of the refs module.

Symbolic refs, which are currently not stored in the ref_cache, would
have to be added because otherwise we would have to do all of the
lookups anyway.

I think this would be a good step to take for many reasons, including
because it would be another useful step in the direction of ref
transactions.

But with particular respect to "git cat-file", I see problems:

1. get_ref_snapshot() would have to read all loose and packed refs
within the specified subtree, because loose refs have to be read before
packed refs.  So the call would be expensive if there are a lot of loose
refs.  And DWIM wouldn't know in advance where the references might be,
so it would have to set prefix="".  If many refs are looked up, then it
would presumably be worth it.  But if only a couple of lookups are done
and there are a lot of loose refs, then using a cache would probably
slow things down.

The slowdown could be ameliorated by adding some more intelligence, for
example only populating the loose refs cache after a certain number of
lookups have already been done.

2. A "git cat-file --batch" process can be long-lived.  What guarantees
would users expect regarding its lookup results?  Currently, its ref
lookups reflect the state of the repo at the moment the commit
identifier is written into the pipe.  Using a cache like this would mean
that ref lookups would always reflect the snapshot taken at the start of
the "git cat-file" run, regardless of whether the script using it might
have added or modified some references since then.  I think this would
have to be considered a regression.

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v5 0/5] allow more sources for config values

2013-07-12 Thread Jeff King

On Thu, Jul 11, 2013 at 04:26:02PM -0700, Junio C Hamano wrote:

> Thanks.
> 
> The differences since the last round I see are these.  And I think
> the series overall makes sense (I haven't look hard enough to pick
> any nits yet, though).

Both v4 and the interdiff look fine to me. I gave it one more cursory
read-through, and I don't see any problems. So:

Acked-by: Jeff King 

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3] config: add support for http..* settings

2013-07-12 Thread Jeff King

On Thu, Jul 11, 2013 at 02:50:03PM -0700, Kyle J. McKay wrote:

> The  value is considered a match to a url if the  value
> is either an exact match or a prefix of the url which ends on a
> path component boundary ('/').  So "https://example.com/test"; will
> match "https://example.com/test"; and "https://example.com/test/too";
> but not "https://example.com/testextra";.
> 
> Longer matches take precedence over shorter matches with
> environment variable settings taking precedence over all.
> 
> With this configuration:
> 
> [http]
> useragent = other-agent
> noEPSV = true
> [http "https://example.com";]
> useragent = example-agent
> sslVerify = false
> [http "https://example.com/path";]
> useragent = path-agent

I like the general direction you are going, and especially the path
prefix matching is something I had always meant to implement for the
credential code.

A few comments on the implementation:

> +enum http_option_type {
> + opt_post_buffer = 0,
> + opt_min_sessions,
> +#ifdef USE_CURL_MULTI
> + opt_max_requests,
> +#endif
> + opt_ssl_verify,
> + opt_ssl_try,
> + opt_ssl_cert,
> +#if LIBCURL_VERSION_NUM >= 0x070903
> + opt_ssl_key,
> +#endif
> +#if LIBCURL_VERSION_NUM >= 0x070908
> + opt_ssl_capath,
> +#endif
> + opt_ssl_cainfo,
> + opt_low_speed,
> + opt_low_time,
> + opt_no_epsv,
> + opt_http_proxy,
> + opt_cookie_file,
> + opt_user_agent,
> + opt_passwd_req,
> + opt_max
> +};

We usually put enum fields in ALL_CAPS to mark them as constants (though
you can find few exceptions in the code).

> +static size_t http_options_url_match_prefix(const char *url,
> + const char *url_prefix,
> + size_t url_prefix_len)
> +{

It looks like you're matching the URLs as raw strings, and I don't see
any canonicalization going on. What happens if I have
"https://example.com/foo+bar"; in my config, but then I visit
"https://example.comfoo%20bar";?

Or what about optional components? If I have "https://example.com"; in my
config, but I am visiting "https://p...@example.com/";, shouldn't it
match? The config spec is more general than my specific URL; you would
not want it to match in the opposite direction, though.

I think you would want to break down the URL into fields, canonicalize
each field by undoing any encoding, and then compare the broken-down
URLs, as credential_match does (adding in your prefix/boundary matching to
the path instead of a straight string comparison).

I think you could mostly reuse the code there by doing:

  1. Create a "struct url" that contains the broken-down form; "struct
 credential" would contain a url.

  2. Pull out credential_from_url into "int url_parse(struct url *,
 const char *)".

  3. Pull out credential_match into "int url_match(struct url *, struct
 url *)".

  4. Parse and compare URLs from "http.$URL.*" during the config read
 (similar to credential_config_callback).

That does make your by-length ordering impossible, but I don't think you
can do it in the general case. If I have:

  [http "http://p...@example.com";] foo = 1
  [http "http://example.com:8080";] foo = 2

and I visit "http://p...@example.com:8080";, which one is the winner? I
don't think there is an unambiguous definition. I'd suggest instead just
using the usual "last one wins" strategy that our config uses. It has
the unfortunate case that:

  [http "http://example.com";]
 foo = 1
  [http]
 foo = 2

will always choose http.foo=2, but I don't think that is a big problem
in practice. You often only set one or the other, and in the cases where
you want to have a non-standard default and then override it with
another host-specific case, the "last one wins" rule makes it simple to
explain that you need to specify keys in most-general to most-specific
order.

>  static int http_options(const char *var, const char *value, void *cb)
>  {
> - if (!strcmp("http.sslverify", var)) {
> + const char *url = (const char *)cb;

No need to cast from void, this isn't C++. :)

The rest of the http_options() changes look like what I'd expect.

> @@ -344,7 +479,7 @@ void http_init(struct remote *remote, const char *url, 
> int proactive_auth)
>  
>   http_is_verbose = 0;
>  
> - git_config(http_options, NULL);
> + git_config(http_options, (void *)url);

No need to cast again. :)

So this is the URL that we got on the command line. Which means that if
we get a redirect, we will not re-examine the options. I think that's
OK; we do not even _see_ the redirect, as it happens internally within
curl. The credential code has the same problem, but it is not a security
issue because curl clears the credentials on redirect.

However, it may be worth mentioning in the documentation that the config
options operate on the URL you give git, _not_ necessarily on the actual
URL you are hitting at any given moment

Re: [PATCH 1/7] cat-file: disable object/refname ambiguity check for batch mode

2013-07-12 Thread Jeff King

On Fri, Jul 12, 2013 at 10:47:46AM +0200, Michael Haggerty wrote:

> > The solution feels a little hacky, but I'm not sure there is a better
> > one short of reverting the warning entirely.
> > 
> > We could also tie it to warn_ambiguous_refs (or add another config
> > variable), but I don't think that makes sense. It is not about "do I
> > care about ambiguities in this repository", but rather "I am going to
> > do a really large number of sha1 resolutions, and I do not want to pay
> > the price in this particular code path for the warning").
> > 
> > There may be other sites that resolve a large number of refs and run
> > into this, but I couldn't think of any. Doing for_each_ref would not
> > have the same problem, as we already know they are refs there.
> 
> ISTM that callers of --batch-check would usually know whether they are
> feeding it SHA-1s or arbitrary object specifiers.  (And in most cases
> when there are a large number of them, they are probably all SHA-1s).

Thanks for bringing this up; I had meant to outline this option in the
cover letter, but forgot when posting.

If were designing cat-file from scratch, I'd consider having --batch
take just 40-hex sha1s in the first place. Since we can't do that now
for backwards compatibility, having an option is the best we can do
there.

But having to use such an option to avoid a 10x slowdown just strikes me
as ugly for two reasons:

  1. It's part of the user-visible interface, and it seems like an extra
 "wtf?" moment when somebody wonders why their script is painfully
 slow, and why they need a magic option to fix it. We're cluttering
 the interface forever.

  2. It's a sign that the refname check was put in for a specific
 performance profile (resolving one or a few refs), but didn't
 consider resolving a large number. I'm wondering what other
 performance regressions it's possible to trigger.

> So instead of framing this change as "skip object refname ambiguity
> check" (i.e., sacrifice some correctness for performance), perhaps it
> would be less hacky to offer the following new feature:

I wouldn't frame it as "correctness for performance" at all. The check
does not impact behavior at all, and is purely informational (to catch
quite a rare situation, too). I'd argue that the check simply isn't that
interesting in this case in the first place.

> To cat-file we could add an option like "--sha1-only" or "--literal" or
> "--no-dwim" (... better names are failing me) which would skip *all*
> dwimming of 40-character strings.  It would also assume that any shorter
> strings are abbreviated SHA-1s and fail if they are not.  This would be
> a nice feature by itself ("these are object names, dammit, and don't try
> to tell me differently!") and would have the additional small advantage
> of speeding up lookups of abbreviated SHA-1s, which (regardless of your
> patch) otherwise go through the whole DWIM process.

I can see in theory that somebody might want that, but I am having a
hard time thinking of a practical use.

> I understand that such an option alone would leave some old scripts
> suffering the performance regression caused by the check, but in most
> cases it would be easy to fix them.

I'm less worried about old scripts, and more worried about carrying
around a confusing option forever, especially when I expect most
invocations to use it (perhaps my experience is biased, but I use
cat-file --batch quite a lot in scripting, and it has almost always been
on the output of rev-list).

IOW, it seems like a poor default, and we are choosing it only because
of backwards compatibility. I guess another option is to switch the
default with the usual deprecation dance.

Yet another option is to consider what the check is doing, and
accomplish the same thing in a different way. The real pain is that we
are individually trying to resolve each object by hitting the filesystem
(and doing lots of silly checks on the refname format, when we know it
must be valid).

We don't actually care in this case if the ref list is up to date (we
are not trying to update or read a ref, but only know if it exists, and
raciness is OK). IOW, could we replace the dwim_ref call for the warning
with something that directly queries the ref cache?

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 2/2] run-command: dup_devnull(): guard against syscalls failing

2013-07-12 Thread Thomas Rast

dup_devnull() did not check the return values of open() and dup2().
Fix this omission.

Signed-off-by: Thomas Rast 
---
 run-command.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/run-command.c b/run-command.c
index aece872..1b7f88e 100644
--- a/run-command.c
+++ b/run-command.c
@@ -76,7 +76,10 @@ static inline void close_pair(int fd[2])
 static inline void dup_devnull(int to)
 {
int fd = open("/dev/null", O_RDWR);
-   dup2(fd, to);
+   if (fd < 0)
+   die_errno(_("open /dev/null failed"));
+   if (dup2(fd, to) < 0)
+   die_errno(_("dup2(%d,%d) failed"), fd, to);
close(fd);
 }
 #endif
-- 
1.8.3.2.998.g1d087bc

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 0/2] open() error checking

2013-07-12 Thread Thomas Rast

#1 is Dale's suggested change.  Dale, to include it we'd need your
Signed-off-by as per Documentation/SubmittingPatches.

#2 is a similar error-checking fix; I reviewed 'git grep "\bopen\b"'
and found one case where the return value was obviously not tested.
The corresponding Windows code path has the same problem, but I dare
not touch it; perhaps someone from the Windows side can look into it?

I originally had a four-patch series to open 0/1/2 from /dev/null, but
then I noticed that this was shot down in 2008:

  http://thread.gmane.org/gmane.comp.version-control.git/93605/focus=93896

Do you want to resurrect this?

The worst part about it is that because we don't have a stderr to rely
on, we can't simply die("stop playing mind games").


Dale R. Worley (1):
  git_mkstemps: correctly test return value of open()

Thomas Rast (1):
  run-command: dup_devnull(): guard against syscalls failing

 run-command.c | 5 -
 wrapper.c | 2 +-
 2 files changed, 5 insertions(+), 2 deletions(-)

-- 
1.8.3.2.998.g1d087bc

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 1/2] git_mkstemps: correctly test return value of open()

2013-07-12 Thread Thomas Rast

From: "Dale R. Worley" 

open() returns -1 on failure, and indeed 0 is a possible success value
if the user closed stdin in our process.  Fix the test.

Signed-off-by: Thomas Rast 
---
 wrapper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/wrapper.c b/wrapper.c
index dd7ecbb..6a015de 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -322,7 +322,7 @@ int git_mkstemps_mode(char *pattern, int suffix_len, int 
mode)
template[5] = letters[v % num_letters]; v /= num_letters;
 
fd = open(pattern, O_CREAT | O_EXCL | O_RDWR, mode);
-   if (fd > 0)
+   if (fd >= 0)
return fd;
/*
 * Fatal error (EPERM, ENOSPC etc).
-- 
1.8.3.2.998.g1d087bc

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/7] cat-file: disable object/refname ambiguity check for batch mode

2013-07-12 Thread Michael Haggerty

On 07/12/2013 08:20 AM, Jeff King wrote:
> A common use of "cat-file --batch-check" is to feed a list
> of objects from "rev-list --objects" or a similar command.
> In this instance, all of our input objects are 40-byte sha1
> ids. However, cat-file has always allowed arbitrary revision
> specifiers, and feeds the result to get_sha1().
> 
> Fortunately, get_sha1() recognizes a 40-byte sha1 before
> doing any hard work trying to look up refs, meaning this
> scenario should end up spending very little time converting
> the input into an object sha1. However, since 798c35f
> (get_sha1: warn about full or short object names that look
> like refs, 2013-05-29), when we encounter this case, we
> spend the extra effort to do a refname lookup anyway, just
> to print a warning. This is further exacerbated by ca91993
> (get_packed_ref_cache: reload packed-refs file when it
> changes, 2013-06-20), which makes individual ref lookup more
> expensive by requiring a stat() of the packed-refs file for
> each missing ref.
> 
> With no patches, this is the time it takes to run:
> 
>   $ git rev-list --objects --all >objects
>   $ time git cat-file --batch-check='%(objectname)'  
> on the linux.git repository:
> 
>   real1m13.494s
>   user0m25.924s
>   sys 0m47.532s
> 
> If we revert ca91993, the packed-refs up-to-date check, it
> gets a little better:
> 
>   real0m54.697s
>   user0m21.692s
>   sys 0m32.916s
> 
> but we are still spending quite a bit of time on ref lookup
> (and we would not want to revert that patch, anyway, which
> has correctness issues).  If we revert 798c35f, disabling
> the warning entirely, we get a much more reasonable time:
> 
>   real0m7.452s
>   user0m6.836s
>   sys 0m0.608s
> 
> This patch does the moral equivalent of this final case (and
> gets similar speedups). We introduce a global flag that
> callers of get_sha1() can use to avoid paying the price for
> the warning.
> 
> Signed-off-by: Jeff King 
> ---
> The solution feels a little hacky, but I'm not sure there is a better
> one short of reverting the warning entirely.
> 
> We could also tie it to warn_ambiguous_refs (or add another config
> variable), but I don't think that makes sense. It is not about "do I
> care about ambiguities in this repository", but rather "I am going to
> do a really large number of sha1 resolutions, and I do not want to pay
> the price in this particular code path for the warning").
> 
> There may be other sites that resolve a large number of refs and run
> into this, but I couldn't think of any. Doing for_each_ref would not
> have the same problem, as we already know they are refs there.

ISTM that callers of --batch-check would usually know whether they are
feeding it SHA-1s or arbitrary object specifiers.  (And in most cases
when there are a large number of them, they are probably all SHA-1s).

So instead of framing this change as "skip object refname ambiguity
check" (i.e., sacrifice some correctness for performance), perhaps it
would be less hacky to offer the following new feature:

To cat-file we could add an option like "--sha1-only" or "--literal" or
"--no-dwim" (... better names are failing me) which would skip *all*
dwimming of 40-character strings.  It would also assume that any shorter
strings are abbreviated SHA-1s and fail if they are not.  This would be
a nice feature by itself ("these are object names, dammit, and don't try
to tell me differently!") and would have the additional small advantage
of speeding up lookups of abbreviated SHA-1s, which (regardless of your
patch) otherwise go through the whole DWIM process.

I understand that such an option alone would leave some old scripts
suffering the performance regression caused by the check, but in most
cases it would be easy to fix them.

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] git-clone.txt: remove the restriction on pushing from a shallow clone

2013-07-12 Thread Duy Nguyen

On Fri, Jul 12, 2013 at 12:54 PM, Junio C Hamano  wrote:
> Nguyễn Thái Ngọc Duy   writes:
>
>> + number of limitations (you cannot clone or fetch from it, nor
>> + push into it), but is adequate if you are only interested in
>> + the recent history of a large project with a long history.
>
> Ahh, sorry for the noise.  You still say you cannot push _into_ it.

Yes, I actually removed that check in the first iteration of this
patch, reasoning that it can't cause any harm and it mostly works. But
"mostly" is not good enough. I will try remove it later if I can make
it always work by extending git protocol a bit (I think full clone
pushing to shallow one can always work. Shallow pushing to shallow,
probably no such guarantee)
--
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC PATCH] During a shallow fetch, prevent sending over unneeded objects

2013-07-12 Thread Matthijs Kooijman

Hi Junio,

> [administrivia: you seem to have mail-followup-to that points at you
> and the list; is that really needed???]
I'm not subscribed to the list, so yes :-)

> > This happens when a client issues a fetch with a depth bigger or equal
> > to the number of commits the server is ahead of the client.
> 
> Do you mean "smaller" (not "bigger")?
Yes, I meant smaller (reworded this first sentence a few times and then messed
up :-)

> > diff --git a/upload-pack.c b/upload-pack.c
> > index 59f43d1..5885f33 100644
> > --- a/upload-pack.c
> > +++ b/upload-pack.c
> > @@ -122,6 +122,14 @@ static int do_rev_list(int in, int out, void 
> > *user_data)
> > if (prepare_revision_walk(&revs))
> > die("revision walk setup failed");
> > mark_edges_uninteresting(revs.commits, &revs, show_edge);
> > +   /* In case we create a new shallow root, make sure that all
> > +* we don't send over objects that the client already has just
> > +* because their "have" revisions are no longer reachable from
> > +* the shallow root. */
> > +   for (i = 0; i < have_obj.nr; i++) {
> > +   struct commit *commit = (struct commit 
> > *)have_obj.objects[i].item;
> > +   mark_tree_uninteresting(commit->tree);
> > +   }
> 
> Hmph.
> 
> In your discussion (including the comment), you talk about "shallow
> root" (I think that is the same as what we call "shallow boundary"),
I think so, yes. I mean to refer to the commits referenced in
.git/shallow, that have their parents "hidden".

> but in this added block, there is nothing that checks CLIENT_SHALLOW
> or SHALLOW flags to special case that.
>
> Is it a good idea to unconditionally do this for all "have"
> revisions?
That's what I meant in my mail with "applying the fix unconditionally" -
there is probably some check needed (I discussed a few options in the
mail as well).

Note that this entire do_rev_list function is only called when there are
shallow revisions involved, so there is also a basic "only when shallow"
check in place.

> Also there is another loop that iterates over "have" revisions just
> above the precontext.  I wonder if this added code belongs in that
> loop.
I think we could add it there, yes. On the other hand, if we only want
to execute this code when there are shallow boundaries in the list of
revisions to send (as I suggested in my previous mail), then we can't
move this code up.

Gr.

Matthijs
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

94 matches

Mail list logo