Re: [RFC] - url-safe base64 commit-id's

2017-02-27 Thread Bryan Turner
On Mon, Feb 27, 2017 at 6:27 PM, G. Sylvie Davies
 wrote:
> Is there any appetite for base64'd commit-id's, using the url-safe
> variant (e.g. RFC 4648 [1] with padding removed)?
>
> And so this:
> 712bad335dfa9c410a83f9873614a19726acb3a8
>
> Becomes this:
> cSutM136nEEKg_mHNhShlyass6g
>
>
> Under the hood things cannot change (e.g., ".git/objects/71/") because
> file systems are not always case sensitive.
>
> But for "git log" and "git show" output it would be nice. And helps
> with ambiguous commit id's too if you only want to specify a 7
> character commit-id, since that encodes 42 bits instead of 28 bits.
> I've run into problems with maximum command length on windows (32767
> chars) because I was specifying so many commit-ids on the command-line
> that I blew past that limit. This would help there, too.

Depending on the command, have you considered using stdin instead? git log,
for example, is perfectly happy to read commit IDs from stdin instead of
the command line.

In general, I think the pattern of getting away from command line arguments
is better than trying to shoehorn more data into the same character limit.
Base64 encoding might help get a few more arguments into the available
limit, but in the end it's not going to solve the underlying problem.

>
> Might be particularly helpful with the transition to a new hash.
> e.g., a 43 character Base64 id instead of a 64 character hex id.
>
>
> - Sylvie
>
> [1] - https://tools.ietf.org/html/rfc4648#page-7


Git has been accepted as a GSoC 2017 mentor organization!

2017-02-27 Thread Christian Couder
Hi everyone,

I am happy to let you know that Git has been accepted as a GSoC mentor
organization again this year.

  https://summerofcode.withgoogle.com/organizations/

I invited Dscho and Stefan as potential mentors for Git. I also
invited Junio to give him access to students proposals and the
opportunity to comment on them.

Tell me if you want an invitation to mentor too. There is no
commitment to mentor anyone for now if you accept the invitation. This
will be decided later. You will just get access to the proposals and
be able to comment on them.

As a reminder, our GSoC related pages are:

  http://git.github.io/SoC-2017-Ideas/
  http://git.github.io/SoC-2017-Microprojects/

They could still be improved which can be done through:

https://github.com/git/git.github.io/

Thanks in advance,
Christian.

(PS: Matthieu, yeah I used your email from last year as a template for
this one, thanks!)


Typesafer git hash patch

2017-02-27 Thread Linus Torvalds
So because of the whole SHA1 discussion, I started looking at what it
would involve to turn

unsigned char *sha1

style arguments (and various structure members) in the git source into

   typedef struct { ... } hash_t;

   hash_t *hash;

The answer is that it's pretty painful - more so than I expected (I
looked at it _looong_ ago and decided it was painful - and it has
become more painful as git has grown).

But once I got started I just continued through the slog.

Having the hashes be more encapsulated does seem to make things better
in many ways. What I did was to also just unify the notion of "hash_t"
and "struct object_id", so the two are entirely interchangeable. That
actually can clean up some code, because right now we have duplicate
interfaces for some things that take an oid pointer and others take a
"const unsigned char *sha1", and that duplication essentially can go
away. I didn't do any of that, I tried to keep it as as brute-force
stupid conversion.

I saw that somebody is actually looking at doing this "well" - by
doing it in many smaller steps. I tried. I gave up. The "unsigned char
*sha1" model is so ingrained that doing it incrementally just seemed
like a lot of pain. But it might be the right approach.

It might particularly be the right approach considering the size of the patch:

 216 files changed, 4174 insertions(+), 4080 deletions(-)

which is pretty nasty. The good news is that my patch passes all the
tests, and while it's big it's mostly very very mindless, and a lot of
it looks like cleanups, and the lines are generally shorter, eg

-   const unsigned char *mb = result->item->object.oid.hash;
-   if (!hashcmp(mb, current_bad_oid->hash)) {

turns into

+   const hash_t *mb = >item->object.oid;
+   if (!hashcmp(mb, current_bad_oid)) {

but I ended up also renaming a lot of common "sha1" as "hash", which
adds to the noise in that patch.

NOTE! It doesn't actually _fix_ the SHA1-centricity in any way, but it
makes it a bit more obvious where the bigger problems are. Not that
anybody would be surprised by what they are, but as part of writing
the patch it did kind of pinpoint most of them, and about 30 of those
new lines are added

 /* FIXME! Hardcoded hash sizes */
 /* FIXME! Lots of fixed-size hashes */
 /* FIXME! Fixed 20-byte hash usage */

with the rest of the added lines being a few helper functions and
splitting cache.h up a bit.

And as part of the type safety, I do think I may have found a bug:

show_one_mergetag():

strbuf_addf(_message, "tag %s names a non-parent %s\n",
tag->tag, tag->tagged->oid.hash);

note how it prints out the "non-parent %s", but that's a SHA1 hash
that hasn't been converted to hex. Hmm?

Anyway, is anybody interested?  I warn you - it really is one big
patch on top of 2.12.

   Linus

PS. Just for fun, I tried to look what it does when you then merge pu
or next.. You do get a fair number of merge conflicts because there's
just a lot of changes all around, but they look manageable. So merging
something like that would be painful, but it appears to not entirely
break other development.


Re: [PATCH 4/4] ident: do not ignore empty config name/email

2017-02-27 Thread Christian Couder
On Mon, Feb 27, 2017 at 9:42 PM, Junio C Hamano  wrote:
> Dennis Kaarsemaker  writes:
>
>> On Thu, 2017-02-23 at 23:18 -0500, Jeff King wrote:
>>> On Thu, Feb 23, 2017 at 08:11:11PM -0800, Junio C Hamano wrote:
>>>
>>> > > So I dunno. I could really go either way on it. Feel free to drop it, or
>>> > > even move it into a separate topic to be cooked longer.
>>> >
>>> > If it were 5 years ago, it would have been different, but I do not
>>> > think cooking it longer in 'next' would smoke out breakages in
>>> > obscure scripts any longer.  Git is used by too many people who have
>>> > never seen its source these days.
>>>
>>> Yeah, I have noticed that, too. I wonder if it would be interesting to
>>> cut "weeklies" or something of "master" or even "next" that people could
>>> install with a single click.
>>>
>>> Of course it's not like we have a binary installer in the first place,
>>> so I guess that's a prerequisite.
>>
>> I provide daily[*] snapshots of git's master and next tree as packages
>> for Ubuntu, Debian, Fedora and CentOS on launchpad and SuSE's
>> openbuildservice. If there's sufficient interest in this (I know of
>> only a few users), I can try to put more effort into this.
>
> That sounds handy for people who do not build from the source
> themselves.
>
> Christian, perhaps rev-news can help advertising Dennis's effort to
> recruit like-minded souls?

Yeah, I had already noticed this thread and now Jakub has mentioned it on:

https://github.com/git/git.github.io/issues/231

so yeah we will advertise it one way or another.


[PATCH] http: attempt updating base URL only if no error

2017-02-27 Thread Jonathan Tan
http.c supports HTTP redirects of the form

  http://foo/info/refs?service=git-upload-pack
  -> http://anything
  -> http://bar/info/refs?service=git-upload-pack

(that is to say, as long as the Git part of the path and the query
string is preserved in the final redirect destination, the intermediate
steps can have any URL). However, if one of the intermediate steps
results in an HTTP exception, a confusing "unable to update url base
from redirection" message is printed instead of a Curl error message
with the HTTP exception code.

This was introduced by 2 commits. Commit c93c92f ("http: update base
URLs when we see redirects", 2013-09-28) introduced a best-effort
optimization that required checking if only the "base" part of the URL
differed between the initial request and the final redirect destination,
but it performed the check before any HTTP status checking was done. If
something went wrong, the normal code path was still followed, so this
did not cause any confusing error messages until commit 6628eb4 ("http:
always update the base URL for redirects", 2016-12-06), which taught
http to die if the non-"base" part of the URL differed.

Therefore, teach http to check the HTTP status before attempting to
check if only the "base" part of the URL differed. This commit teaches
http_request_reauth to return early without updating options->base_url
upon an error; the only invoker of this function that passes a non-NULL
"options" is remote-curl.c (through "http_get_strbuf"), which only uses
options->base_url for an informational message in the situations that
this commit cares about (that is, when the return value is not HTTP_OK).

The included test checks that the redirect scheme at the beginning of
this commit message works, and that returning a 502 in the middle of the
redirect scheme produces the correct result. Note that this is different
from the test in commit 6628eb4 ("http: always update the base URL for
redirects", 2016-12-06) in that this commit tests that a Git-shaped URL
(http://.../info/refs?service=git-upload-pack) works, whereas commit
6628eb4 tests that a non-Git-shaped URL
(http://.../info/refs/foo?service=git-upload-pack) does not work (even
though Git is processing that URL) and is an error that is fatal, not
silently swallowed.

Signed-off-by: Jonathan Tan 
---
 http.c | 3 +++
 t/lib-httpd/apache.conf| 9 +
 t/t5550-http-fetch-dumb.sh | 9 +
 3 files changed, 21 insertions(+)

diff --git a/http.c b/http.c
index 90a1c0f11..1df186894 100644
--- a/http.c
+++ b/http.c
@@ -1727,6 +1727,9 @@ static int http_request_reauth(const char *url,
 {
int ret = http_request(url, result, target, options);
 
+   if (ret != HTTP_OK && ret != HTTP_REAUTH)
+   return ret;
+
if (options && options->effective_url && options->base_url) {
if (update_url_from_redirect(options->base_url,
 url, options->effective_url)) {
diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
index 69174c6e3..0642ae7e6 100644
--- a/t/lib-httpd/apache.conf
+++ b/t/lib-httpd/apache.conf
@@ -133,6 +133,15 @@ RewriteRule ^/ftp-redir/(.*)$ ftp://localhost:1000/$1 
[R=302]
 RewriteRule ^/loop-redir/x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-x-(.*) /$1 
[R=302]
 RewriteRule ^/loop-redir/(.*)$ /loop-redir/x-$1 [R=302]
 
+# redir-to/502/x?y -> really-redir-to?path=502/x=y which returns 502
+# redir-to/x?y -> really-redir-to?path=x=y -> x?y
+RewriteCond %{QUERY_STRING} ^(.*)$
+RewriteRule ^/redir-to/(.*)$ /really-redir-to?path=$1=%1 [R=302]
+RewriteCond %{QUERY_STRING} ^path=502/(.*)=(.*)$
+RewriteRule ^/really-redir-to$ - [R=502,L]
+RewriteCond %{QUERY_STRING} ^path=(.*)=(.*)$
+RewriteRule ^/really-redir-to$ /%1?%2 [R=302]
+
 # The first rule issues a client-side redirect to something
 # that _doesn't_ look like a git repo. The second rule is a
 # server-side rewrite, so that it turns out the odd-looking
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index aeb3a63f7..2d3b1e9f9 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -378,5 +378,14 @@ test_expect_success 'http-alternates triggers 
not-from-user protocol check' '
clone $HTTPD_URL/dumb/evil.git evil-user
 '
 
+test_expect_success 'can redirect through 
non-"info/refs?service=git-upload-pack" URL' '
+   git clone "$HTTPD_URL/redir-to/dumb/repo.git"
+'
+
+test_expect_success 'print HTTP error when any intermediate redirect throws 
error' '
+   test_must_fail git clone "$HTTPD_URL/redir-to/502" 2> stderr &&
+   test_i18ngrep "unable to access.*/redir-to/502" stderr
+'
+
 stop_httpd
 test_done
-- 
2.11.0.483.g087da7b7c-goog



What's cooking in git.git (Feb 2017, #09; Mon, 27)

2017-02-27 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

The first batch post 2.12 is rather a big one (and intentionally
so).  Among the notable is that major part of "rebase -i" is now
driven by the sequencer backend (Thanks, Dscho), and the API
implementations of attribute subsystem and ref subsystem have also
been cleaned up (Thanks, Brandon & Michael).

The tip of 'next' has been rewound.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* bc/blame-doc-fix (2017-02-22) 1 commit
  (merged to 'next' on 2017-02-22 at 81c0ff2283)
 + Documentation: use brackets for optional arguments

 Doc update.


* bc/worktree-doc-fix-detached (2017-02-22) 1 commit
  (merged to 'next' on 2017-02-22 at 8257025363)
 + Documentation: correctly spell git worktree --detach

 Doc update.


* bw/attr (2017-02-01) 27 commits
  (merged to 'next' on 2017-02-14 at d35c1d7e4a)
 + attr: reformat git_attr_set_direction() function
 + attr: push the bare repo check into read_attr()
 + attr: store attribute stack in attr_check structure
 + attr: tighten const correctness with git_attr and match_attr
 + attr: remove maybe-real, maybe-macro from git_attr
 + attr: eliminate global check_all_attr array
 + attr: use hashmap for attribute dictionary
 + attr: change validity check for attribute names to use positive logic
 + attr: pass struct attr_check to collect_some_attrs
 + attr: retire git_check_attrs() API
 + attr: convert git_check_attrs() callers to use the new API
 + attr: convert git_all_attrs() to use "struct attr_check"
 + attr: (re)introduce git_check_attr() and struct attr_check
 + attr: rename function and struct related to checking attributes
 + attr.c: outline the future plans by heavily commenting
 + Documentation: fix a typo
 + attr.c: add push_stack() helper
 + attr: support quoting pathname patterns in C style
 + attr.c: plug small leak in parse_attr_line()
 + attr.c: tighten constness around "git_attr" structure
 + attr.c: simplify macroexpand_one()
 + attr.c: mark where #if DEBUG ends more clearly
 + attr.c: complete a sentence in a comment
 + attr.c: explain the lack of attr-name syntax check in parse_attr()
 + attr.c: update a stale comment on "struct match_attr"
 + attr.c: use strchrnul() to scan for one line
 + commit.c: use strchrnul() to scan for one line

 The gitattributes machinery is being taught to work better in a
 multi-threaded environment.


* cw/tag-reflog-message (2017-02-08) 1 commit
  (merged to 'next' on 2017-02-10 at 3968b3a58b)
 + tag: generate useful reflog message

 "git tag" did not leave useful message when adding a new entry to
 reflog; this was left unnoticed for a long time because refs/tags/*
 doesn't keep reflog by default.


* dr/doc-check-ref-format-normalize (2017-02-21) 1 commit
  (merged to 'next' on 2017-02-21 at 5e88b7a93d)
 + git-check-ref-format: clarify documentation for --normalize

 Doc update.


* dt/gc-ignore-old-gc-logs (2017-02-13) 1 commit
  (merged to 'next' on 2017-02-16 at 8f48e1b405)
 + gc: ignore old gc.log files

 A "gc.log" file left by a backgrounded "gc --auto" disables further
 automatic gc; it has been taught to run at least once a day (by
 default) by ignoring a stale "gc.log" file that is too old.


* gp/document-dotfiles-in-templates-are-not-copied (2017-02-17) 1 commit
  (merged to 'next' on 2017-02-21 at bbfa2bb7d4)
 + init: document dotfiles exclusion on template copy

 Doc update.


* jh/preload-index-skip-skip (2017-02-10) 1 commit
  (merged to 'next' on 2017-02-16 at 39077062f9)
 + preload-index: avoid lstat for skip-worktree items

 The preload-index code has been taught not to bother with the index
 entries that are paths that are not checked out by "sparse checkout".


* jk/alternate-ref-optim (2017-02-08) 11 commits
  (merged to 'next' on 2017-02-10 at f26f32cff6)
 + receive-pack: avoid duplicates between our refs and alternates
 + receive-pack: treat namespace .have lines like alternates
 + receive-pack: fix misleading namespace/.have comment
 + receive-pack: use oidset to de-duplicate .have lines
 + add oidset API
 + fetch-pack: cache results of for_each_alternate_ref
 + for_each_alternate_ref: replace transport code with for-each-ref
 + for_each_alternate_ref: pass name/oid instead of ref struct
 + for_each_alternate_ref: use strbuf for path allocation
 + for_each_alternate_ref: stop trimming trailing slashes
 + for_each_alternate_ref: handle failure from real_pathdup()

 Optimizes resource usage while enumerating refs from alternate
 object store, to help receiving end of "push" that hosts a
 repository with many "forks".


* jk/delta-chain-limit 

[RFC] - url-safe base64 commit-id's

2017-02-27 Thread G. Sylvie Davies
Is there any appetite for base64'd commit-id's, using the url-safe
variant (e.g. RFC 4648 [1] with padding removed)?

And so this:
712bad335dfa9c410a83f9873614a19726acb3a8

Becomes this:
cSutM136nEEKg_mHNhShlyass6g


Under the hood things cannot change (e.g., ".git/objects/71/") because
file systems are not always case sensitive.

But for "git log" and "git show" output it would be nice. And helps
with ambiguous commit id's too if you only want to specify a 7
character commit-id, since that encodes 42 bits instead of 28 bits.
I've run into problems with maximum command length on windows (32767
chars) because I was specifying so many commit-ids on the command-line
that I blew past that limit. This would help there, too.

Might be particularly helpful with the transition to a new hash.
e.g., a 43 character Base64 id instead of a 64 character hex id.


- Sylvie

[1] - https://tools.ietf.org/html/rfc4648#page-7


[PATCH 2/2] wrapper.c: remove unused gitmkstemps() function

2017-02-27 Thread Ramsay Jones

The last call to the mkstemps() function was removed in commit 659488326
("wrapper.c: delete dead function git_mkstemps()", 22-04-2016). In order
to support platforms without mkstemps(), this functionality was provided,
along with a Makefile build variable (NO_MKSTEMPS), by the gitmkstemps()
function. Remove the dead code, along with the defunct build machinery.

Signed-off-by: Ramsay Jones 
---
 Makefile  |  5 -
 config.mak.uname  | 17 -
 configure.ac  |  6 --
 git-compat-util.h |  5 -
 wrapper.c |  7 ---
 5 files changed, 40 deletions(-)

diff --git a/Makefile b/Makefile
index 8e4081e06..ca9f16d19 100644
--- a/Makefile
+++ b/Makefile
@@ -102,8 +102,6 @@ all::
 #
 # Define MKDIR_WO_TRAILING_SLASH if your mkdir() can't deal with trailing 
slash.
 #
-# Define NO_MKSTEMPS if you don't have mkstemps in the C library.
-#
 # Define NO_GECOS_IN_PWENT if you don't have pw_gecos in struct passwd
 # in the C library.
 #
@@ -1280,9 +1278,6 @@ ifdef MKDIR_WO_TRAILING_SLASH
COMPAT_CFLAGS += -DMKDIR_WO_TRAILING_SLASH
COMPAT_OBJS += compat/mkdir.o
 endif
-ifdef NO_MKSTEMPS
-   COMPAT_CFLAGS += -DNO_MKSTEMPS
-endif
 ifdef NO_UNSETENV
COMPAT_CFLAGS += -DNO_UNSETENV
COMPAT_OBJS += compat/unsetenv.o
diff --git a/config.mak.uname b/config.mak.uname
index 447f36ac2..7bdf86d2a 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -27,7 +27,6 @@ endif
 ifeq ($(uname_S),Linux)
HAVE_ALLOCA_H = YesPlease
NO_STRLCPY = YesPlease
-   NO_MKSTEMPS = YesPlease
HAVE_PATHS_H = YesPlease
LIBC_CONTAINS_LIBINTL = YesPlease
HAVE_DEV_TTY = YesPlease
@@ -41,7 +40,6 @@ endif
 ifeq ($(uname_S),GNU/kFreeBSD)
HAVE_ALLOCA_H = YesPlease
NO_STRLCPY = YesPlease
-   NO_MKSTEMPS = YesPlease
HAVE_PATHS_H = YesPlease
DIR_HAS_BSD_GROUP_SEMANTICS = YesPlease
LIBC_CONTAINS_LIBINTL = YesPlease
@@ -55,7 +53,6 @@ ifeq ($(uname_S),UnixWare)
SHELL_PATH = /usr/local/bin/bash
NO_IPV6 = YesPlease
NO_HSTRERROR = YesPlease
-   NO_MKSTEMPS = YesPlease
BASIC_CFLAGS += -Kthread
BASIC_CFLAGS += -I/usr/local/include
BASIC_LDFLAGS += -L/usr/local/lib
@@ -79,7 +76,6 @@ ifeq ($(uname_S),SCO_SV)
SHELL_PATH = /usr/bin/bash
NO_IPV6 = YesPlease
NO_HSTRERROR = YesPlease
-   NO_MKSTEMPS = YesPlease
BASIC_CFLAGS += -I/usr/local/include
BASIC_LDFLAGS += -L/usr/local/lib
NO_STRCASESTR = YesPlease
@@ -122,7 +118,6 @@ ifeq ($(uname_S),SunOS)
NO_STRCASESTR = YesPlease
NO_MEMMEM = YesPlease
NO_MKDTEMP = YesPlease
-   NO_MKSTEMPS = YesPlease
NO_REGEX = YesPlease
NO_MSGFMT_EXTENDED_OPTIONS = YesPlease
HAVE_DEV_TTY = YesPlease
@@ -168,7 +163,6 @@ ifeq ($(uname_O),Cygwin)
NO_D_TYPE_IN_DIRENT = YesPlease
NO_STRCASESTR = YesPlease
NO_MEMMEM = YesPlease
-   NO_MKSTEMPS = YesPlease
NO_SYMLINK_HEAD = YesPlease
NO_IPV6 = YesPlease
OLD_ICONV = UnfortunatelyYes
@@ -233,7 +227,6 @@ ifeq ($(uname_S),NetBSD)
BASIC_CFLAGS += -I/usr/pkg/include
BASIC_LDFLAGS += -L/usr/pkg/lib $(CC_LD_DYNPATH)/usr/pkg/lib
USE_ST_TIMESPEC = YesPlease
-   NO_MKSTEMPS = YesPlease
HAVE_PATHS_H = YesPlease
HAVE_BSD_SYSCTL = YesPlease
 endif
@@ -242,7 +235,6 @@ ifeq ($(uname_S),AIX)
NO_STRCASESTR = YesPlease
NO_MEMMEM = YesPlease
NO_MKDTEMP = YesPlease
-   NO_MKSTEMPS = YesPlease
NO_STRLCPY = YesPlease
NO_NSEC = YesPlease
FREAD_READS_DIRECTORIES = UnfortunatelyYes
@@ -263,7 +255,6 @@ ifeq ($(uname_S),GNU)
# GNU/Hurd
HAVE_ALLOCA_H = YesPlease
NO_STRLCPY = YesPlease
-   NO_MKSTEMPS = YesPlease
HAVE_PATHS_H = YesPlease
LIBC_CONTAINS_LIBINTL = YesPlease
 endif
@@ -272,7 +263,6 @@ ifeq ($(uname_S),IRIX)
NO_UNSETENV = YesPlease
NO_STRCASESTR = YesPlease
NO_MEMMEM = YesPlease
-   NO_MKSTEMPS = YesPlease
NO_MKDTEMP = YesPlease
# When compiled with the MIPSpro 7.4.4m compiler, and without pthreads
# (i.e. NO_PTHREADS is set), and _with_ MMAP (i.e. NO_MMAP is not set),
@@ -291,7 +281,6 @@ ifeq ($(uname_S),IRIX64)
NO_UNSETENV = YesPlease
NO_STRCASESTR = YesPlease
NO_MEMMEM = YesPlease
-   NO_MKSTEMPS = YesPlease
NO_MKDTEMP = YesPlease
# When compiled with the MIPSpro 7.4.4m compiler, and without pthreads
# (i.e. NO_PTHREADS is set), and _with_ MMAP (i.e. NO_MMAP is not set),
@@ -311,7 +300,6 @@ ifeq ($(uname_S),HP-UX)
NO_SETENV = YesPlease
NO_STRCASESTR = YesPlease
NO_MEMMEM = YesPlease
-   NO_MKSTEMPS = YesPlease
NO_STRLCPY = YesPlease
NO_MKDTEMP = YesPlease
NO_UNSETENV = YesPlease
@@ 

[PATCH 1/2] wrapper.c: remove unused git_mkstemp() function

2017-02-27 Thread Ramsay Jones

The last caller of git_mkstemp() was removed in commit 6fec0a89
("verify_signed_buffer: use tempfile object", 16-06-2016). Since
the introduction of the 'tempfile' APIs, along with git_mkstemp_mode,
it is unlikely that new callers will materialize. Remove the dead
code.

Signed-off-by: Ramsay Jones 
---
 cache.h   |  3 ---
 wrapper.c | 17 -
 2 files changed, 20 deletions(-)

diff --git a/cache.h b/cache.h
index 61fc86e6d..a575684a9 100644
--- a/cache.h
+++ b/cache.h
@@ -1045,9 +1045,6 @@ static inline int is_empty_tree_oid(const struct 
object_id *oid)
return !hashcmp(oid->hash, EMPTY_TREE_SHA1_BIN);
 }
 
-
-int git_mkstemp(char *path, size_t n, const char *template);
-
 /* set default permissions by passing mode arguments to open(2) */
 int git_mkstemps_mode(char *pattern, int suffix_len, int mode);
 int git_mkstemp_mode(char *pattern, int mode);
diff --git a/wrapper.c b/wrapper.c
index e7f197996..1a140639f 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -440,23 +440,6 @@ int xmkstemp(char *template)
return fd;
 }
 
-/* git_mkstemp() - create tmp file honoring TMPDIR variable */
-int git_mkstemp(char *path, size_t len, const char *template)
-{
-   const char *tmp;
-   size_t n;
-
-   tmp = getenv("TMPDIR");
-   if (!tmp)
-   tmp = "/tmp";
-   n = snprintf(path, len, "%s/%s", tmp, template);
-   if (len <= n) {
-   errno = ENAMETOOLONG;
-   return -1;
-   }
-   return mkstemp(path);
-}
-
 /* Adapted from libiberty's mkstemp.c. */
 
 #undef TMP_MAX
-- 
2.12.0


[PATCH 0/2] remove unused 'mkstemp(s)' code

2017-02-27 Thread Ramsay Jones

I promised the first of these patches on 18th June last year! ;-)
(In response to Jeff's 'jk/gpg-interface-cleanup' branch).

Ramsay Jones (2):
  wrapper.c: remove unused git_mkstemp() function
  wrapper.c: remove unused gitmkstemps() function

 Makefile  |  5 -
 cache.h   |  3 ---
 config.mak.uname  | 17 -
 configure.ac  |  6 --
 git-compat-util.h |  5 -
 wrapper.c | 24 
 6 files changed, 60 deletions(-)

-- 
2.12.0


Re: [BUG] branch renamed to 'HEAD'

2017-02-27 Thread Jeff King
On Mon, Feb 27, 2017 at 04:33:36PM -0800, Junio C Hamano wrote:

> A flag to affect the behaviour (as opposed to  as a secondary
> return value, like Peff's patch does) can be made to work.  Perhaps
> a flag that says "keep the input as is if the result is not a local
> branch name" would pass an input "@" intact and that may be
> sufficient to allow "git branch -m @" to rename the current branch
> to "@" (I do not think it is a sensible rename, though ;-).  But
> probably some callers need to keep the original input and compare
> with the result to see if we expanded anything if we go that route.
> At that point, I am not sure if there are much differences in the
> ease of use between the two approaches.

I just went into more detail in my reply to Jacob, but I do think this
is a workable approach (and fortunately we seem to have banned bare "@"
as a name, along with anything containing "@{}", so I think we would end
up rejecting these nonsense names).

I'll see if I can work up a patch. We'll still need to pass the flag
around through the various functions, but at least it will be a flag and
not a confusing negated out-parameter.

-Peff


Re: [PATCH] strbuf: add strbuf_add_real_path()

2017-02-27 Thread René Scharfe

Am 27.02.2017 um 19:22 schrieb Brandon Williams:

On 02/25, René Scharfe wrote:

+void strbuf_add_real_path(struct strbuf *sb, const char *path)
+{
+   if (sb->len) {
+   struct strbuf resolved = STRBUF_INIT;
+   strbuf_realpath(, path, 1);
+   strbuf_addbuf(sb, );
+   strbuf_release();
+   } else
+   strbuf_realpath(sb, path, 1);


I know its not required but I would have braces on the 'else' branch
since they were needed on the 'if' branch.  But that's up to you and
your style :)


Personally I'd actually prefer them as well, but the project's style has 
traditionally been to avoid braces on such trailing single-line branches 
to save lines.  The CodingGuidelines for this topic have been clarified 
recently, though, and seem to require them now.  Interesting.


René


Re: [BUG] branch renamed to 'HEAD'

2017-02-27 Thread Jacob Keller
On Mon, Feb 27, 2017 at 2:28 PM, Junio C Hamano  wrote:
> Jeff King  writes:
>
>> I guess something like the patch below works, but I wonder if there is a
>> less-horrible way to accomplish the same thing.
>
> I suspect that a less-horrible would be a lot more intrusive.  It
> would go like "interpret-branch-name only gives local branch name,
> and when it does not show it, the callers that know they do not
> necessarily need local branch name would call other at-mark things".
> As you pointed out with the @{upstream} that potentially point at a
> local branch, it will quickly get more involved, I would think, and
> I tend to think that this patch of yours is probably the least evil
> one among possible solutions.
>
> Perhaps with s/not_in_refs_heads/not_a_branch_name/ (or swapping
> polarity, "is_a_branch_name"), the resulting code may not be too
> hard to read?
>
> Thanks.

What about changing interpret-branch-name gains a flag to return a
fully qualified ref rather than returning just the name? That seems
like it would be more reasonable behavior.

Thanks,
Jake


Re: [BUG] branch renamed to 'HEAD'

2017-02-27 Thread Jeff King
On Mon, Feb 27, 2017 at 03:05:37PM -0800, Jacob Keller wrote:

> > Perhaps with s/not_in_refs_heads/not_a_branch_name/ (or swapping
> > polarity, "is_a_branch_name"), the resulting code may not be too
> > hard to read?
> 
> What about changing interpret-branch-name gains a flag to return a
> fully qualified ref rather than returning just the name? That seems
> like it would be more reasonable behavior.

That's not sufficient. If I feed "refs/remotes/origin/master" as a
branch name to git-branch, then as silly as that is, it is the branch
whose ref is "refs/heads/refs/remotes/origin/master".

Since interpret_branch_name() is not fully qualifying everything, but
rather just _sometimes_ replace @-marks with refnames, we cannot tell
from just the string the difference between "the user fed us
refs/remotes/foo" and "the @-mark expanded to a non-branch
refs/remotes/foo". We need one extra bit of information to know whether
an expansion occurred.

You could give a flag that says "do not expand to anything outside of
refs/heads/" that would suppress the @->HEAD mark, as well as
@{upstream} when upstream is outside of refs/heads. That would certainly
be less nasty than the out-parameter, but I wasn't sure that the error
handling was what we wanted.

In strbuf_branchname(), we quietly turn that error into a "0", which
causes us to retain the original text. We then feed that into
check_refname_format() in strbuf_check_branch_ref(). Which I think would
complain, and you'd get "not a valid refname: @{upstream}". If we have
the out-bit, we can say "I understand what @{upstream} means, but it
does not expand to a local branch". That's a more specific error, but
maybe it is not worth the hassle to produce it.

-Peff


Re: [PATCH 0/6] Use time_t

2017-02-27 Thread Junio C Hamano
Johannes Schindelin  writes:

> One notable fallout of this patch series is that on 64-bit Linux (and
> other platforms where `unsigned long` is 64-bit), we now limit the range
> of dates to LONG_MAX (i.e. the *signed* maximum value). This needs to be
> done as `time_t` can be signed (and indeed is at least on my Ubuntu
> setup).
>
> Obviously, I think that we can live with that, and I hope that all
> interested parties agree.

s/ulong/time_t/ is definintely a good change, and it will take us to
a place we would want to be in in some future.  

As long as there remains no platform we care about whose time_t and
long are still 32-bit signed integer, there will be a fallout to
them with this change.  Probably it is of a larger impact than
losing the upper half of a 64-bit timestamp range on larger boxes.
Hopefully those platforms have died out (or at least we don't mind
breaking them)?

It appears that we use uint64_t in many places in our code.  So
while philosophically time_t is the right type, uint64_t might be
practically a safer alternative type to use at the endgame patch in
this series.  I haven't seen it yet, but presumably the last one 6/6
is the endgame?




Re: [BUG] branch renamed to 'HEAD'

2017-02-27 Thread Jeff King
On Mon, Feb 27, 2017 at 02:28:09PM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > I guess something like the patch below works, but I wonder if there is a
> > less-horrible way to accomplish the same thing.
> 
> I suspect that a less-horrible would be a lot more intrusive.  It
> would go like "interpret-branch-name only gives local branch name,
> and when it does not show it, the callers that know they do not
> necessarily need local branch name would call other at-mark things".
> As you pointed out with the @{upstream} that potentially point at a
> local branch, it will quickly get more involved, I would think, and
> I tend to think that this patch of yours is probably the least evil
> one among possible solutions.  
> 
> Perhaps with s/not_in_refs_heads/not_a_branch_name/ (or swapping
> polarity, "is_a_branch_name"), the resulting code may not be too
> hard to read?

I actually started with not_a_branch_name, but I wanted specifically to
talk about refs_heads, because we sometimes refer to remote-tracking
branches as "branches" (and the function is called interpret_branch_name(),
after all).

I agree it would be easier to read if the logic were flipped, but I'm
not sure that would be correct. The function knows when it has done a
replacement that takes us outside of a normal branch name. But when it
hasn't, it doesn't really know how the result should be interpreted.

For our purposes in this caller it is enough to say that "foo" should be
treated as "refs/heads/foo", but I don't think that is generally true of
interpret_branch_name(), which might be called as part of get_sha1().

So one alternative is to leave the logic the same way but try to give it
a better name. E.g., call it something like "replaced_with_non_branch"
or something. But there, another negation snuck in. The correct
non-negated thing is really "replaced_with_HEAD_or_refs_remotes" but
that is rather awkward.

-Peff


Re: git diff --quiet exits with 1 on clean tree with CRLF conversions

2017-02-27 Thread Junio C Hamano
Torsten, you've been quite active in fixing various glitches around
the EOL conversion in the latter half of last year.  Have any
thoughts to share on this topic?

Thanks.

Mike Crowe  writes:

> On Monday 20 February 2017 at 13:25:01 -0800, Junio C Hamano wrote:
>> This almost makes me suspect that the place that checks lengths of
>> one and two in order to refrain from running more expensive content
>> comparison you found earlier need to ask would_convert_to_git()
>> before taking the short-cut, something along the lines of this (for
>> illustration purposes only, not even compile-tested).  The "almost"
>> comes to me because I do not offhand know the performance implications
>> of making calls to would_convert_to_git() here.
>> 
>>  diff.c | 18 ++
>>  1 file changed, 14 insertions(+), 4 deletions(-)
>> 
>> diff --git a/diff.c b/diff.c
>> index 051761be40..094d5913da 100644
>> --- a/diff.c
>> +++ b/diff.c
>> @@ -4921,9 +4921,10 @@ static int diff_filespec_check_stat_unmatch(struct 
>> diff_filepair *p)
>>   *differences.
>>   *
>>   * 2. At this point, the file is known to be modified,
>> - *with the same mode and size, and the object
>> - *name of one side is unknown.  Need to inspect
>> - *the identical contents.
>> + *with the same mode and size, the object
>> + *name of one side is unknown, or size comparison
>> + *cannot be depended upon.  Need to inspect the 
>> + *contents.
>>   */
>>  if (!DIFF_FILE_VALID(p->one) || /* (1) */
>>  !DIFF_FILE_VALID(p->two) ||
>> @@ -4931,7 +4932,16 @@ static int diff_filespec_check_stat_unmatch(struct 
>> diff_filepair *p)
>>  (p->one->mode != p->two->mode) ||
>>  diff_populate_filespec(p->one, CHECK_SIZE_ONLY) ||
>>  diff_populate_filespec(p->two, CHECK_SIZE_ONLY) ||
>> -(p->one->size != p->two->size) ||
>> +
>> +/* 
>> + * only if eol and other conversions are not involved,
>> + * we can say that two contents of different sizes
>> + * cannot be the same without checking their contents.
>> + */
>> +(!would_convert_to_git(p->one->path) &&
>> + !would_convert_to_git(p->two->path) &&
>> + (p->one->size != p->two->size)) ||
>> +
>>  !diff_filespec_is_identical(p->one, p->two)) /* (2) */
>>  p->skip_stat_unmatch_result = 1;
>>  return p->skip_stat_unmatch_result;
>> 
>> 
>
> Thanks for investigating this. I think you are correct that I was misguided
> in my previous "fix". However, your change above does fix the problem for
> me.
>
> It looks like the main cost of convert_to_git is in convert_attrs which
> ends up doing various path operations in attr.c. After that, both
> apply_filter and crlf_to_git return straight away if there's nothing to do.
>
> I experimented several times with running "git diff -quiet" after touching
> every file in Git's own worktree and any difference in total time was lost
> in the noise.
>
> I've further improved my test case. Tests 3 and 4 fail without the above
> change but pass with it. Unfortunately I'm still unable to get those tests
> to fail without the above fix unless the sleeps are present. I've tried
> using the "touch -r .datetime" technique from racy-git.txt but it doesn't
> help. It seems that I'm unable to stop Git from using its cache without
> sleeping. :(
>
> diff --git a/t/t4063-diff-converted.sh b/t/t4063-diff-converted.sh
> new file mode 100755
> index 000..31a730d
> --- /dev/null
> +++ b/t/t4063-diff-converted.sh
> @@ -0,0 +1,44 @@
> +#!/bin/sh
> +#
> +# Copyright (c) 2017 Mike Crowe
> +#
> +# These tests ensure that files changing line endings in the presence
> +# of .gitattributes to indicate that line endings should be ignored
> +# don't cause 'git diff' or 'git diff --quiet' to think that they have
> +# been changed.
> +#
> +# The sleeps are necessary to reproduce the problem for reasons that I
> +# don't understand.
> +
> +test_description='git diff with files that require CRLF conversion'
> +
> +. ./test-lib.sh
> +
> +test_expect_success setup '
> + echo "* text=auto" > .gitattributes &&
> + printf "Hello\r\nWorld\r\n" > crlf.txt &&
> + printf "Hello\nWorld\n" > lf.txt &&
> + git add .gitattributes crlf.txt lf.txt &&
> + git commit -m "initial" && echo three
> +'
> +test_expect_success 'noisy diff works on file that requires CRLF conversion' 
> '
> + git status >/dev/null &&
> + git diff >/dev/null &&
> + sleep 1 &&
> + touch crlf.txt lf.txt &&
> + git diff >/dev/null
> +'
> +test_expect_success 'quiet diff works on file that requires CRLF conversion 
> with no changes' '
> + git status &&
> + git diff --quiet &&
> + sleep 1 &&
> + touch crlf.txt lf.txt &&
> + git diff --quiet
> +'
> +
> +test_expect_success 'quiet diff works on file with line-ending change that 
> has no effect on repository' '
> + printf 

Re: [BUG] branch renamed to 'HEAD'

2017-02-27 Thread Junio C Hamano
Jacob Keller  writes:

> What about changing interpret-branch-name gains a flag to return a
> fully qualified ref rather than returning just the name? That seems
> like it would be more reasonable behavior.

There are two kinds of callers to i-b-n.  The ones that want a local
branch name because they are parsing special places on the command
line that using a local branch name makes difference (as opposed to
using any extended SHA-1 expression), like "git checkout master"
(which means different thing from "git checkout master^0").  And the
ones that can use any object name.

It depends on how your flag works, but if it means "add refs/heads/
when you got a local branch name", then that would not work well for
the former callers, as end-user inputs @{-1} and refs/heads/master
would become indistinguishable.  The former is expanded to 'master'
(if you were on that branch) and ends up being refs/heads/master.
"git checkout refs/heads/master" would be (unless you have a branch
with that name, i.e. refs/heads/refs/heads/master) a request to
detach HEAD at the commit, but the user wanted to be on the previous
branch.  And the latter iclass of callers are probably already happy
with or without the flag, so they won't be helped, either.

A flag to affect the behaviour (as opposed to  as a secondary
return value, like Peff's patch does) can be made to work.  Perhaps
a flag that says "keep the input as is if the result is not a local
branch name" would pass an input "@" intact and that may be
sufficient to allow "git branch -m @" to rename the current branch
to "@" (I do not think it is a sensible rename, though ;-).  But
probably some callers need to keep the original input and compare
with the result to see if we expanded anything if we go that route.
At that point, I am not sure if there are much differences in the
ease of use between the two approaches.


[PATCH 6/6] Use time_t where appropriate

2017-02-27 Thread Johannes Schindelin
Git's source code assumes that unsigned long is at least as precise as
time_t. That causes a lot of problems, in particular where unsigned long
is only 32-bit (notably on Windows, even in 64-bit versions).

So let's just use time_t instead.

Note that in some systems (most notably 32-bit Linux), time_t is *still*
only 32-bit.

Therefore, it might seem desirable to simply replace unsigned long by
int64_t when working with timestamps, but that comes with its own set of
problems, as we often interact with the system's date functions that
*do* use time_t.

So let's just stick with time_t.

By necessity, this is a very, very large patch, as it has to replace all
timestamps' data type in one go.

As `time_t` can be signed, we now have to switch to using LONG_MAX as
the maximum timestamp lest expressions like `timestamp < TIME_MAX`
evaluate to false. Technically, this introduces a limitation on
platforms where we used 64-bit unsigned longs to represent timestamps
before. Practically, however, this simply unifies the behavior with
platforms where `time_t` is signed (and where sending too large
timestamps to libc functions would fail).

Signed-off-by: Johannes Schindelin 
---
 Documentation/technical/api-parse-options.txt |  8 ++---
 builtin/am.c  |  2 +-
 builtin/blame.c   |  8 ++---
 builtin/fsck.c|  4 +--
 builtin/gc.c  |  2 +-
 builtin/log.c |  2 +-
 builtin/merge-base.c  |  2 +-
 builtin/name-rev.c|  6 ++--
 builtin/pack-objects.c|  4 +--
 builtin/prune.c   |  2 +-
 builtin/receive-pack.c|  6 ++--
 builtin/reflog.c  | 24 ++---
 builtin/show-branch.c |  4 +--
 builtin/worktree.c|  2 +-
 bundle.c  |  2 +-
 cache.h   | 14 
 commit.c  | 12 +++
 commit.h  |  2 +-
 config.mak.uname  |  2 ++
 credential-cache--daemon.c| 12 +++
 date.c| 50 +--
 fetch-pack.c  |  6 ++--
 git-compat-util.h |  2 +-
 http-backend.c|  4 +--
 parse-options-cb.c|  4 +--
 pretty.c  |  2 +-
 reachable.c   | 10 +++---
 reachable.h   |  4 +--
 ref-filter.c  |  2 +-
 reflog-walk.c |  8 ++---
 refs.c| 14 
 refs.h|  8 ++---
 refs/files-backend.c  |  4 +--
 revision.c|  6 ++--
 revision.h|  4 +--
 sha1_name.c   |  6 ++--
 t/helper/test-date.c  |  4 +--
 t/helper/test-parse-options.c |  2 +-
 tag.c |  2 +-
 tag.h |  2 +-
 upload-pack.c |  4 +--
 vcs-svn/fast_export.c |  4 +--
 vcs-svn/fast_export.h |  4 +--
 vcs-svn/svndump.c |  2 +-
 wt-status.c   |  2 +-
 45 files changed, 140 insertions(+), 140 deletions(-)

diff --git a/Documentation/technical/api-parse-options.txt 
b/Documentation/technical/api-parse-options.txt
index 27bd701c0d6..28c9a64fc11 100644
--- a/Documentation/technical/api-parse-options.txt
+++ b/Documentation/technical/api-parse-options.txt
@@ -178,13 +178,13 @@ There are some macros to easily define options:
scale the provided value by 1024, 1024^2 or 1024^3 respectively.
The scaled value is put into `unsigned_long_var`.
 
-`OPT_DATE(short, long, _var, description)`::
+`OPT_DATE(short, long, _t_var, description)`::
Introduce an option with date argument, see `approxidate()`.
-   The timestamp is put into `int_var`.
+   The timestamp is put into `time_t_var`.
 
-`OPT_EXPIRY_DATE(short, long, _var, description)`::
+`OPT_EXPIRY_DATE(short, long, _t_var, description)`::
Introduce an option with expiry date argument, see 
`parse_expiry_date()`.
-   The timestamp is put into `int_var`.
+   The timestamp is put into `time_t_var`.
 
 `OPT_CALLBACK(short, long, , arg_str, description, func_ptr)`::
Introduce an option with argument.
diff --git a/builtin/am.c b/builtin/am.c
index 

Re: [PATCH v7 4/6] stash: teach 'push' (and 'create_stash') to honor pathspec

2017-02-27 Thread Junio C Hamano
Thomas Gummerer  writes:

>   if test -z "$patch_mode"
>   then
> - git reset --hard ${GIT_QUIET:+-q}
> + if test $# != 0
> + then
> + git reset ${GIT_QUIET:+-q} -- "$@"
> + git checkout ${GIT_QUIET:+-q} HEAD -- $(git ls-files -z 
> --modified "$@")

"ls-files -z" on the command line?  

Apparently new tests do not cover the correctness of this codepath.

I wonder if this

git ls-files -z --modified "$@" |
git checkout-index -z --force --stdin

is what the above "checkout" wanted to do.  The "reset" in the
previous step presumably updated the index entries that match
specified pathspec to those of the HEAD, so checking out the paths
that match "$@" from the index would be the same as checking them
out from the HEAD (while updating the index with them).

Perhaps squash the following into an appropriate patch in the
series?

 git-stash.sh |  3 ++-
 t/t3903-stash.sh | 16 
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/git-stash.sh b/git-stash.sh
index 28d0624c75..9c70662cc8 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -300,7 +300,8 @@ push_stash () {
if test $# != 0
then
git reset ${GIT_QUIET:+-q} -- "$@"
-   git checkout ${GIT_QUIET:+-q} HEAD -- $(git ls-files -z 
--modified "$@")
+   git ls-files -z --modified -- "$@" |
+   git checkout-index -z --force --stdin
git clean --force ${GIT_QUIET:+-q} -d -- "$@"
else
git reset --hard ${GIT_QUIET:+-q}
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index f7733b4dd4..e868aafab2 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -891,4 +891,20 @@ test_expect_success 'stash without verb with pathspec' '
test_path_is_file bar
 '
 
+test_expect_success 'stash with pathspec matching multiple paths' '
+   echo original >file &&
+   echo original >other-file &&
+   git commit -m "two" file other-file &&
+   echo modified >file &&
+   echo modified >other-file &&
+   git stash -- "*file" &&
+   echo original >expect &&
+   test_cmp expect file &&
+   test_cmp expect other-file &&
+   git stash pop &&
+   echo modified >expect &&
+   test_cmp expect file &&
+   test_cmp expect other-file
+'
+
 test_done


Re: [PATCH] t6300: avoid creating refs/heads/HEAD

2017-02-27 Thread Jeff King
On Mon, Feb 27, 2017 at 01:44:26PM -0800, Junio C Hamano wrote:

> Junio C Hamano  writes:
> 
> > ...  I suspect that calling interpret_empty_at() from
> > that function is fundamentally flawed.  The "@" end user types never
> > means refs/heads/HEAD, and HEAD@{either reflog or -1} would not mean
> > anything that should be taken as a branch_name, either.  
> 
> The latter should read "HEAD@{either reflog or -1 or 'upstream'}"
> 
> Or do we make HEAD@{upstream} to mean "deref HEAD to learn the
> current branch name and then take its upstream"?  If so @@{upstream}
> might logically make sense, but I do not see why @{upstream} without
> HEAD or @ is not sufficient to begin with, so...

Yes, HEAD@{upstream} and @@{upstream} are both resolved to the actual
branch name. I also was puzzled whether there was any real use over just
@{upstream}. But it does work, and if you had a script which looked for,
say, $branch@{upstream}, you'd probably want branch=HEAD to keep
working.

The "branch=@" case I am less sympathetic to, as it was mainly supposed
to be a command-line convenience. But it _does_ work now.

-Peff


Re: [PATCH v7 4/6] stash: teach 'push' (and 'create_stash') to honor pathspec

2017-02-27 Thread Junio C Hamano
Thomas Gummerer  writes:

> On 02/27, Junio C Hamano wrote:
>> Thomas Gummerer  writes:
>> 
>> > +  test -n "$untracked" || git ls-files --error-unmatch -- "$@" >/dev/null 
>> > || exit 1
>> 
>> This silent "exit 1" made me scratch my head, but --error-unmatch
>> would have already given an error message, like
>> 
>> error: pathspec 'no such' did not match any file(s) known to git.
>> Did you forget to 'git add'?
>> 
>> so that would be OK.
>
> Yeah exactly, the plan was to let --error-unmatch print the error
> message, as it's better at giving a good error message than we could
> easily do here I think.
>
> Maybe this deserves a comment so others won't have the same doubts
> about this line?

Nah, I do not think so.  It probably is obvious for those who write
(and read) "--error-unmatch".  I was just being slow to realize that
the sole point of that option was to complain ;-)


Re: [PATCH 2/2] commit: don't check for space twice when looking for header

2017-02-27 Thread René Scharfe

Am 27.02.2017 um 23:27 schrieb Jakub Narębski:

W dniu 25.02.2017 o 20:27, René Scharfe pisze:

Both standard_header_field() and excluded_header_field() check if
there's a space after the buffer that's handed to them.  We already
check in the caller if that space is present.  Don't bother calling
the functions if it's missing, as they are guaranteed to return 0 in
that case, and remove the now redundant checks from them.

Signed-off-by: Rene Scharfe 
---
 commit.c | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/commit.c b/commit.c
index 173c6d3818..fab8269731 100644
--- a/commit.c
+++ b/commit.c
@@ -1308,11 +1308,11 @@ void for_each_mergetag(each_mergetag_fn fn, struct 
commit *commit, void *data)

 static inline int standard_header_field(const char *field, size_t len)
 {
-   return ((len == 4 && !memcmp(field, "tree ", 5)) ||
-   (len == 6 && !memcmp(field, "parent ", 7)) ||
-   (len == 6 && !memcmp(field, "author ", 7)) ||
-   (len == 9 && !memcmp(field, "committer ", 10)) ||
-   (len == 8 && !memcmp(field, "encoding ", 9)));
+   return ((len == 4 && !memcmp(field, "tree", 4)) ||
+   (len == 6 && !memcmp(field, "parent", 6)) ||
+   (len == 6 && !memcmp(field, "author", 6)) ||
+   (len == 9 && !memcmp(field, "committer", 9)) ||
+   (len == 8 && !memcmp(field, "encoding", 8)));


I agree (for what it is worth from me) with the rest of changes,
but I think current code is better self-documenting for this
function.


Having a function that is given a buffer/length pair and accessing the 
byte after it raises questions, though. :)


Nicer than keeping the space would be to use excluded_header_field() for 
standard headers as well as a next step, I think -- but that would be a 
bit slower.





 }

 static int excluded_header_field(const char *field, size_t len, const char 
**exclude)
@@ -1322,8 +1322,7 @@ static int excluded_header_field(const char *field, 
size_t len, const char **exc

while (*exclude) {
size_t xlen = strlen(*exclude);
-   if (len == xlen &&
-   !memcmp(field, *exclude, xlen) && field[xlen] == ' ')
+   if (len == xlen && !memcmp(field, *exclude, xlen))
return 1;
exclude++;
}
@@ -1357,9 +1356,8 @@ static struct commit_extra_header 
*read_commit_extra_header_lines(
eof = memchr(line, ' ', next - line);
if (!eof)
eof = next;
-
-   if (standard_header_field(line, eof - line) ||
-   excluded_header_field(line, eof - line, exclude))
+   else if (standard_header_field(line, eof - line) ||
+excluded_header_field(line, eof - line, exclude))
continue;

it = xcalloc(1, sizeof(*it));





Re: [PATCH 1/6] t0006 & t5000: prepare for 64-bit time_t

2017-02-27 Thread Junio C Hamano
Johannes Schindelin  writes:

> This quick fix, however, tests for *long* to be 64-bit or not. What we
> need, though, is a test that says whether *whatever data type we use for
> timestamps* is 64-bit or not.
>
> The same quick fix was used to handle the similar problem where Git's
> source code uses `unsigned long` to represent size, instead of `size_t`.
>
> So let's just add another prerequisite to test specifically whether
> timestamps are represented by a 64-bit data type or not. Later, when we
> will have switched to using `time_t` where appropriate, we can flip that
> prerequisite to test `time_t` instead of `long`.

The changes that move from LONG_IS to TIME_IS in this patch are all
about time (iow, LONG_IS_64BIT prereq is still used in the check on
sizes).  The patch looks sensible.



Re: [PATCH 2/6] Specify explicitly where we parse timestamps

2017-02-27 Thread Junio C Hamano
Junio C Hamano  writes:

>> -unsigned long number = strtoul(date, , 10);
>> +time_t number = parse_timestamp(date, , 10);
>
> This hunk does not belong to this step.  Everybody else in this step

obviously I meant "the left half of this hunk" ;-)

> still receives parse_timestamp()'s return value in ulong, not time_t.
> I presume that that will happen in the final step 6/6 (which could
> be a huge patch that exceeds 100k?)


Re: [PATCH 2/6] Specify explicitly where we parse timestamps

2017-02-27 Thread Junio C Hamano
Johannes Schindelin  writes:

> Currently, Git's source code represents all timestamps as `unsigned
> long`. In preparation for using `time_t` instead, let's introduce a
> symbol `parse_timestamp` (currently being defined to `strtoul`) where
> appropriate, so that we can later easily switch to use `strtoull()`
> instead.

This definitely is a very good thing to do as a separate step.

> diff --git a/date.c b/date.c
> index a996331f5b3..a8848f6e141 100644
> --- a/date.c
> +++ b/date.c
> ...
> @@ -1066,7 +1066,7 @@ static const char *approxidate_digit(const char *date, 
> struct tm *tm, int *num,
>time_t now)
>  {
>   char *end;
> - unsigned long number = strtoul(date, , 10);
> + time_t number = parse_timestamp(date, , 10);

This hunk does not belong to this step.  Everybody else in this step
still receives parse_timestamp()'s return value in ulong, not time_t.
I presume that that will happen in the final step 6/6 (which could
be a huge patch that exceeds 100k?)


Re: [PATCH v7 4/6] stash: teach 'push' (and 'create_stash') to honor pathspec

2017-02-27 Thread Junio C Hamano
Thomas Gummerer  writes:

>   if test -z "$patch_mode"
>   then
> - git reset --hard ${GIT_QUIET:+-q}
> + if test $# != 0
> + then
> + git reset ${GIT_QUIET:+-q} -- "$@"
> + git checkout ${GIT_QUIET:+-q} HEAD -- $(git ls-files -z 
> --modified "$@")

"ls-files -z" on the command line?  

Apparently new tests do not cover the correctness of this codepath.

I wonder if this

git ls-files -z --modified "$@" |
git checkout-index -z --stdin

is what the above "checkout" wanted to do.  The "reset" in the
previous step presumably updated the index entries that match
specified pathspec to those of the HEAD, so checking out the paths
that match "$@" from the index would be the same as checking them
out from the HEAD (while updating the index with them).

> + git clean --force ${GIT_QUIET:+-q} -d -- "$@"
> + else
> + git reset --hard ${GIT_QUIET:+-q}
> + fi

Thanks.


Re: Reference for quote "creating branch is not the issue, merging is", in context of Subversion/Git

2017-02-27 Thread Jakub Narębski
W dniu 26.02.2017 o 16:19, Igor Djordjevic pisze:
> Hello Michael,
> 
> On 26/02/2017 12:40, Michael Hüttermann wrote:
>> Linus Torvalds made a statement regarding merging/branching and stated
>> (as far as I know) that "creating branch is not the issue, merge is", in
>> context of Subversion/Git.
>> I do not find the origin source for that. Can you please help and point
>> me to a statement or article where Linus elaborated on this?
> 
> Could it be that you think of "Tech Talk: Linus Torvalds on Git"[1]
> (held on May 3, 2007)?
> 
> To give you some clue, here`s an excerpt from Linus' talk/presentation
> (taken from the transcript[2] containing the whole thing):
> 
>   "... Subversion for example, talks very loudly about how they do CVS
>   right by making branching really cheap. It's probably on their main
>   webpage where they probably say branching in subversion is O(1)
>   operation, you can do as many cheap branches as you want. Nevermind
>   that O(1) is actually with pretty large O I think, but even if it
>   takes a millionth of a second to do branching, who cares? It's the
>   wrong thing you are measuring. Nobody is interested in branching,
>   branches are completely useless unless you merge them, and CVS cannot
>   merge anything at all. You can merge things once, but because CVS
>   then forgets what you did, you can never ever merge anything again
>   without getting horrible horrible conflicts. Merging in subversion is
>   a complete disaster. The subversion people kind of acknowledge this
>   and they have a plan, and their plan sucks too. It is incredible how
>   stupid these people are. They've been looking at the wrong problem
>   all the time. Branching is not the issue, merging is..."
> 
> This specific branch/merge performance talk starts at 50:20[3], where
> the part quoted above comes at 51:34[4].

Note also that while "creating branch is not the issue, merge is"
remains true, modern Subversion (post 1.5) makes merging easy thanks
to svn:mergeinfo property.

Though it does it in completely different way than Git and other
"graph of commits" VCS-es, because of the "branch is directory"
philosophy, namely that it keeps information about what was merged
in, rather than finding common ancestor(s) and using this information
for resolving merge.

> 
> Please note that there`s more context before and after this excerpt
> that puts it all into the meant perspective, so you may really want
> to watch/listen/read the whole thing anyway.
> 
> Regards,
> Buga
> 
> [1] https://www.youtube.com/watch?v=4XpnKHJAok8
> [2] https://git.wiki.kernel.org/index.php/LinusTalk200705Transcript
> [3] https://youtu.be/4XpnKHJAok8?t=3020
> [4] https://youtu.be/4XpnKHJAok8?t=3094
> 



Re: [PATCH 05/10] submodule--helper: add is_active command

2017-02-27 Thread Brandon Williams
On 02/23, Stefan Beller wrote:
> On Thu, Feb 23, 2017 at 3:47 PM, Brandon Williams  wrote:
> > There are a lot of places where an explicit check for
> > submodule."".url is done to see if a submodule exists.  In order
> > to more easily facilitate the use of the submodule.active config option
> > to indicate active submodules, add a helper which can be used to query
> > if a submodule is active or not.
> >
> > Signed-off-by: Brandon Williams 
> > ---
> >  builtin/submodule--helper.c| 11 
> >  t/t7413-submodule-is-active.sh | 63 
> > ++
> >  2 files changed, 74 insertions(+)
> >  create mode 100755 t/t7413-submodule-is-active.sh
> >
> > diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> > index df0d9c166..dac02604d 100644
> > --- a/builtin/submodule--helper.c
> > +++ b/builtin/submodule--helper.c
> > @@ -1128,6 +1128,16 @@ static int absorb_git_dirs(int argc, const char 
> > **argv, const char *prefix)
> > return 0;
> >  }
> >
> > +static int is_active(int argc, const char **argv, const char *prefix)
> > +{
> > +   if (argc != 2)
> > +   die("submodule--helper is-active takes exactly 1 
> > arguments");
> > +
> > +   gitmodules_config();
> > +
> > +   return !is_submodule_initialized(argv[1]);
> > +}
> > +
> >  #define SUPPORT_SUPER_PREFIX (1<<0)
> >
> >  struct cmd_struct {
> > @@ -1147,6 +1157,7 @@ static struct cmd_struct commands[] = {
> > {"init", module_init, 0},
> > {"remote-branch", resolve_remote_submodule_branch, 0},
> > {"absorb-git-dirs", absorb_git_dirs, SUPPORT_SUPER_PREFIX},
> > +   {"is-active", is_active, 0},
> >  };
> >
> >  int cmd_submodule__helper(int argc, const char **argv, const char *prefix)
> > diff --git a/t/t7413-submodule-is-active.sh b/t/t7413-submodule-is-active.sh
> > new file mode 100755
> > index 0..683487020
> > --- /dev/null
> > +++ b/t/t7413-submodule-is-active.sh
> > @@ -0,0 +1,63 @@
> > +#!/bin/sh
> > +
> > +test_description='Test submodule--helper is-active
> > +
> > +This test verifies that `git submodue--helper is-active` correclty 
> > identifies
> > +submodules which are "active" and interesting to the user.
> > +'
> > +
> > +. ./test-lib.sh
> > +
> > +test_expect_success 'setup' '
> > +   git init sub &&
> > +   test_commit -C sub initial &&
> > +   git init super &&
> > +   test_commit -C super initial &&
> > +   git -C super submodule add ../sub sub1 &&
> > +   git -C super submodule add ../sub sub2 &&
> > +   git -C super commit -a -m "add 2 submodules at sub{1,2}"
> > +'
> > +
> > +test_expect_success 'is-active works with urls' '
> > +   git -C super submodule--helper is-active sub1 &&
> > +   git -C super submodule--helper is-active sub2 &&
> > +
> > +   git -C super config --unset submodule.sub1.URL &&
> > +   test_must_fail git -C super submodule--helper is-active sub1 &&
> > +   git -C super config submodule.sub1.URL ../sub &&
> > +   git -C super submodule--helper is-active sub1
> > +'
> > +
> > +test_expect_success 'is-active works with basic submodule.active config' '
> > +   git -C super config --add submodule.active "." &&
> > +   git -C super config --unset submodule.sub1.URL &&
> > +   git -C super config --unset submodule.sub2.URL &&
> 
> I think we'd want to unset only one of them here
> 
> > +
> > +   git -C super submodule--helper is-active sub1 &&
> > +   git -C super submodule--helper is-active sub2 &&
> 
> to test 2 different cases of one being active by config setting only and
> the other having both.

Will do.

> 
> I could not spot test for having the URL set but the config setting set, not
> including the submodule, e.g.
> 
> git -C super config  submodule.sub1.URL ../sub &&
> git -C super submodule.active  ":(exclude)sub1" &&
> 
> which would be expected to not be active, as once the configuration
> is there it takes precedence over any (no)URL setting?

The last test, tests this functionality as the URL settings for both
sub1 and sub2 are in the config.  You'll notice in the tests where I
unset the URL config, that they get added back at the end of the test so
that future tests have a clean state to work with.

-- 
Brandon Williams


Re: [PATCH 2/2] apply: handle assertion failure gracefully

2017-02-27 Thread Junio C Hamano
René Scharfe  writes:

> Am 27.02.2017 um 21:04 schrieb Junio C Hamano:
>> René Scharfe  writes:
>>
 diff --git a/apply.c b/apply.c
 index cbf7cc7f2..9219d2737 100644
 --- a/apply.c
 +++ b/apply.c
 @@ -3652,7 +3652,6 @@ static int check_preimage(struct apply_state *state,
if (!old_name)
return 0;

 -  assert(patch->is_new <= 0);
>>>
>>> 5c47f4c6 (builtin-apply: accept patch to an empty file) added that
>>> line. Its intent was to handle diffs that contain an old name even for
>>> a file that's created.  Citing from its commit message: "When we
>>> cannot be sure by parsing the patch that it is not a creation patch,
>>> we shouldn't complain when if there is no such a file."  Why not stop
>>> complaining also in case we happen to know for sure that it's a
>>> creation patch? I.e., why not replace the assert() with:
>>>
>>> if (patch->is_new == 1)
>>> goto is_new;
>>>
previous = previous_patch(state, patch, );
>>
>> When the caller does know is_new is true, old_name must be made/left
>> NULL.  That is the invariant this assert is checking to catch an
>> error in the calling code.
>
> There are some places in apply.c that set ->is_new to 1, but none of
> them set ->old_name to NULL at the same time.

I thought all of these are flipping ->is_new that used to be -1
(unknown) to (now we know it is new), and sets only new_name without
doing anything to old_name, because they know originally both names
are set to NULL.

> Having to keep these two members in sync sounds iffy anyway.  Perhaps
> accessors can help, e.g. a setter which frees old_name when is_new is
> set to 1, or a getter which returns NULL for old_name if is_new is 1.

Definitely, the setter would make it harder to make the mistake.


Re: [BUG] branch renamed to 'HEAD'

2017-02-27 Thread Junio C Hamano
Jeff King  writes:

> I guess something like the patch below works, but I wonder if there is a
> less-horrible way to accomplish the same thing.

I suspect that a less-horrible would be a lot more intrusive.  It
would go like "interpret-branch-name only gives local branch name,
and when it does not show it, the callers that know they do not
necessarily need local branch name would call other at-mark things".
As you pointed out with the @{upstream} that potentially point at a
local branch, it will quickly get more involved, I would think, and
I tend to think that this patch of yours is probably the least evil
one among possible solutions.  

Perhaps with s/not_in_refs_heads/not_a_branch_name/ (or swapping
polarity, "is_a_branch_name"), the resulting code may not be too
hard to read?

Thanks.


Re: [PATCH 2/2] commit: don't check for space twice when looking for header

2017-02-27 Thread Jakub Narębski
W dniu 25.02.2017 o 20:27, René Scharfe pisze:
> Both standard_header_field() and excluded_header_field() check if
> there's a space after the buffer that's handed to them.  We already
> check in the caller if that space is present.  Don't bother calling
> the functions if it's missing, as they are guaranteed to return 0 in
> that case, and remove the now redundant checks from them.
> 
> Signed-off-by: Rene Scharfe 
> ---
>  commit.c | 18 --
>  1 file changed, 8 insertions(+), 10 deletions(-)
> 
> diff --git a/commit.c b/commit.c
> index 173c6d3818..fab8269731 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -1308,11 +1308,11 @@ void for_each_mergetag(each_mergetag_fn fn, struct 
> commit *commit, void *data)
>  
>  static inline int standard_header_field(const char *field, size_t len)
>  {
> - return ((len == 4 && !memcmp(field, "tree ", 5)) ||
> - (len == 6 && !memcmp(field, "parent ", 7)) ||
> - (len == 6 && !memcmp(field, "author ", 7)) ||
> - (len == 9 && !memcmp(field, "committer ", 10)) ||
> - (len == 8 && !memcmp(field, "encoding ", 9)));
> + return ((len == 4 && !memcmp(field, "tree", 4)) ||
> + (len == 6 && !memcmp(field, "parent", 6)) ||
> + (len == 6 && !memcmp(field, "author", 6)) ||
> + (len == 9 && !memcmp(field, "committer", 9)) ||
> + (len == 8 && !memcmp(field, "encoding", 8)));

I agree (for what it is worth from me) with the rest of changes,
but I think current code is better self-documenting for this
function.

>  }
>  
>  static int excluded_header_field(const char *field, size_t len, const char 
> **exclude)
> @@ -1322,8 +1322,7 @@ static int excluded_header_field(const char *field, 
> size_t len, const char **exc
>  
>   while (*exclude) {
>   size_t xlen = strlen(*exclude);
> - if (len == xlen &&
> - !memcmp(field, *exclude, xlen) && field[xlen] == ' ')
> + if (len == xlen && !memcmp(field, *exclude, xlen))
>   return 1;
>   exclude++;
>   }
> @@ -1357,9 +1356,8 @@ static struct commit_extra_header 
> *read_commit_extra_header_lines(
>   eof = memchr(line, ' ', next - line);
>   if (!eof)
>   eof = next;
> -
> - if (standard_header_field(line, eof - line) ||
> - excluded_header_field(line, eof - line, exclude))
> + else if (standard_header_field(line, eof - line) ||
> +  excluded_header_field(line, eof - line, exclude))
>   continue;
>  
>   it = xcalloc(1, sizeof(*it));
> 



[PATCH 1/6] t0006 & t5000: prepare for 64-bit time_t

2017-02-27 Thread Johannes Schindelin
Git's source code refers to timestamps as unsigned longs. On 32-bit
platforms, as well as on Windows, unsigned long is not large enough to
capture dates that are "absurdly far in the future".

It is perfectly valid by the C standard, of course, for the `long` data
type to refer to 32-bit integers. That is why the `time_t` data type
exists: so that it can be 64-bit even if `long` is 32-bit. Git's source
code simply does not use `time_t`, is all.

The earlier quick fix 6b9c38e14cd (t0006: skip "far in the future" test
when unsigned long is not long enough, 2016-07-11) forced the test cases
to be skipped that require a 64-bit (or larger) data type to be used to
represent the time.

This quick fix, however, tests for *long* to be 64-bit or not. What we
need, though, is a test that says whether *whatever data type we use for
timestamps* is 64-bit or not.

The same quick fix was used to handle the similar problem where Git's
source code uses `unsigned long` to represent size, instead of `size_t`.

So let's just add another prerequisite to test specifically whether
timestamps are represented by a 64-bit data type or not. Later, when we
will have switched to using `time_t` where appropriate, we can flip that
prerequisite to test `time_t` instead of `long`.

Signed-off-by: Johannes Schindelin 
---
 t/helper/test-date.c | 5 -
 t/t0006-date.sh  | 4 ++--
 t/t5000-tar-tree.sh  | 6 +++---
 t/test-lib.sh| 2 ++
 4 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/t/helper/test-date.c b/t/helper/test-date.c
index 506054bcd5d..4727bea255c 100644
--- a/t/helper/test-date.c
+++ b/t/helper/test-date.c
@@ -4,7 +4,8 @@ static const char *usage_msg = "\n"
 "  test-date relative [time_t]...\n"
 "  test-date show: [time_t]...\n"
 "  test-date parse [date]...\n"
-"  test-date approxidate [date]...\n";
+"  test-date approxidate [date]...\n"
+"  test-date is64bit\n";
 
 static void show_relative_dates(const char **argv, struct timeval *now)
 {
@@ -93,6 +94,8 @@ int cmd_main(int argc, const char **argv)
parse_dates(argv+1, );
else if (!strcmp(*argv, "approxidate"))
parse_approxidate(argv+1, );
+   else if (!strcmp(*argv, "is64bit"))
+   return sizeof(unsigned long) == 8 ? 0 : 1;
else
usage(usage_msg);
return 0;
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index c0c910867d7..9539b425ffb 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -53,8 +53,8 @@ check_show unix-local "$TIME" '146600'
 
 # arbitrary time absurdly far in the future
 FUTURE="5758122296 -0400"
-check_show iso   "$FUTURE" "2152-06-19 18:24:56 -0400" LONG_IS_64BIT
-check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +" LONG_IS_64BIT
+check_show iso   "$FUTURE" "2152-06-19 18:24:56 -0400" TIME_IS_64BIT
+check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +" TIME_IS_64BIT
 
 check_parse() {
echo "$1 -> $2" >expect
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 886b6953e40..997aa9dea28 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -390,7 +390,7 @@ test_expect_success TAR_HUGE,LONG_IS_64BIT 'system tar can 
read our huge size' '
test_cmp expect actual
 '
 
-test_expect_success LONG_IS_64BIT 'set up repository with far-future commit' '
+test_expect_success TIME_IS_64BIT 'set up repository with far-future commit' '
rm -f .git/index &&
echo content >file &&
git add file &&
@@ -398,11 +398,11 @@ test_expect_success LONG_IS_64BIT 'set up repository with 
far-future commit' '
git commit -m "tempori parendum"
 '
 
-test_expect_success LONG_IS_64BIT 'generate tar with future mtime' '
+test_expect_success TIME_IS_64BIT 'generate tar with future mtime' '
git archive HEAD >future.tar
 '
 
-test_expect_success TAR_HUGE,LONG_IS_64BIT 'system tar can read our future 
mtime' '
+test_expect_success TAR_HUGE,TIME_IS_64BIT 'system tar can read our future 
mtime' '
echo 4147 >expect &&
tar_info future.tar | cut -d" " -f2 >actual &&
test_cmp expect actual
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 86d77c16dd3..6151a3d70f8 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1163,3 +1163,5 @@ build_option () {
 test_lazy_prereq LONG_IS_64BIT '
test 8 -le "$(build_option sizeof-long)"
 '
+
+test_lazy_prereq TIME_IS_64BIT 'test-date is64bit'
-- 
2.11.1.windows.1.379.g44ae0bc




Re: [PATCH v2] convert: add "status=delayed" to filter process protocol

2017-02-27 Thread Jakub Narębski
W dniu 27.02.2017 o 11:32, Lars Schneider pisze:
> 
>> On 27 Feb 2017, at 10:58, Jeff King  wrote:
>>
>> On Sun, Feb 26, 2017 at 07:48:16PM +0100, Lars Schneider wrote:
>>
>>> +If the request cannot be fulfilled within a reasonable amount of time
>>> +then the filter can respond with a "delayed" status and a flush packet.
>>> +Git will perform the same request at a later point in time, again. The
>>> +filter can delay a response multiple times for a single request.
>>> +
>>> +packet:  git< status=delayed
>>> +packet:  git< 
>>> +

Is it something that happens instead of filter process sending the contents
of file, or is it something that happens after sending some part of the
contents (maybe empty) instead of empty list to keep "status=success"
unchanged or instead of "status=error" if there was a problem processing
file?

>>> +
>>
>> So Git just asks for the same content again? I see two issues with that:
>>
>>  1. Does git have to feed the blob content again? That can be expensive
>> to access or to keep around in memory.
>>
>>  2. What happens when the item isn't ready on the second request? I can
>> think of a few options:
>>
>>   a. The filter immediately says "nope, still delayed". But then
>>  Git ends up busy-looping with "is this one ready yet?"
>>
>>   b. The filter blocks until the item is ready. But then if other
>>items _are_ ready, Git cannot work on processing them. We lose
>>parallelism.
>>
>>   c. You could do a hybrid: block until _some_ item is ready, and
>>  then issue "delayed" responses for everything that isn't
>>ready. Then if you assume that Git is looping over and over
>>through the set of objects, it will either block or pick up
>>_something_ on each loop.
>>
>>But it makes a quadratic number of requests in the worst case.
>>E.g., imagine you have N items and the last one is available
>>first, then the second-to-last, and so on. You'll ask N times,
>>then N-1, then N-2, and so on.

The current solution is a 'busy loop' one that I wrote about[1][2],
see below.

> 
> I completely agree - I need to change that. However, the goal of the v2
> iteration was to get the "convert" interface in an acceptable state.
> That's what I intended to say in the patch comment section:
> 
> "Please ignore all changes behind async_convert_to_working_tree() and 
>  async_filter_finish() for now as I plan to change the implementation 
>  as soon as the interface is in an acceptable state."

I think that it is more important to start with a good abstraction,
and the proposal for protocol, rather than getting bogged down in
implementation details that may change as the idea for protocol
extension changes.

>>
>> I think it would be much more efficient to do something like:
>>
>>  [Git issues a request and gives it an opaque index id]
>>  git> command=smudge
>>  git> pathname=foo
>>  git> index=0
>>  git> 
>>  git> CONTENT
>>  git> 
>>
>>  [The data isn't ready yet, so the filter tells us so...]
>>  git< status=delayed
>>  git< 

So is it only as replacement for "status=success" + contents or
"status=abort", that is upfront before sending any part of the file?

Or, as one can assume from the point of the paragraph with the
"status=delayed", it is about replacing null list for success or
"status=error" after sending some part (maybe empty) of a file,
that is:

[filter driver says that it can process contents]
git< status=success
git< 
git< PARTIAL_SMUDGED_CONTENT (maybe empty)
[there was some delay, for example one of shards is slow]
git< 
git< status=delayed
git< 

>>
>>  [Git may make other requests, that are either served or delayed]
>>  git> command=smudge
>>  git> pathname=foo
>>  git> index=1
>>  git> 
>>  git< status=success
>>  git< 
>>  git< CONTENT
>>  git< 
>>
>>  [Now Git has processed all of the items, and each one either has its
>>   final status, or has been marked as delayed. So we ask for a delayed
>>   item]
>>  git> command=wait
>>  git> 

In my proposal[2] I have called this "command=continue"... but at this
point it is bikeshedding.  I think "command=wait" (or "await" ;-))
might be better.

>>
>>  [Some time may pass if nothing is ready. But eventually we get...]
>>  git< status=success

Or

git< status=resumed

If it would not be undue burden on the filter driver process, we might
require for it to say where to continue at (in bytes), e.g.

git< from=16426

That should, of course, go below index/pathname line.

>>  git< index=0

Or a filter driver could have used pathname as an index, that is

git< pathname=path/testfile.dat

>>  git< 
>>  git< CONTENT
>>  git< 
>>
>> From Git's side, the loop is something like:
>>
>>  while (delayed_items > 0) {
>>  /* issue a wait, and get back the 

Re: [PATCH 2/2] apply: handle assertion failure gracefully

2017-02-27 Thread René Scharfe

Am 27.02.2017 um 21:04 schrieb Junio C Hamano:

René Scharfe  writes:


diff --git a/apply.c b/apply.c
index cbf7cc7f2..9219d2737 100644
--- a/apply.c
+++ b/apply.c
@@ -3652,7 +3652,6 @@ static int check_preimage(struct apply_state *state,
if (!old_name)
return 0;

-   assert(patch->is_new <= 0);


5c47f4c6 (builtin-apply: accept patch to an empty file) added that
line. Its intent was to handle diffs that contain an old name even for
a file that's created.  Citing from its commit message: "When we
cannot be sure by parsing the patch that it is not a creation patch,
we shouldn't complain when if there is no such a file."  Why not stop
complaining also in case we happen to know for sure that it's a
creation patch? I.e., why not replace the assert() with:

if (patch->is_new == 1)
goto is_new;


previous = previous_patch(state, patch, );


When the caller does know is_new is true, old_name must be made/left
NULL.  That is the invariant this assert is checking to catch an
error in the calling code.


There are some places in apply.c that set ->is_new to 1, but none of 
them set ->old_name to NULL at the same time.


Having to keep these two members in sync sounds iffy anyway.  Perhaps 
accessors can help, e.g. a setter which frees old_name when is_new is 
set to 1, or a getter which returns NULL for old_name if is_new is 1.


René


Re: [PATCH 1/2] apply: guard against renames of non-existant empty files

2017-02-27 Thread René Scharfe

Am 27.02.2017 um 21:10 schrieb Junio C Hamano:

René Scharfe  writes:


Would it make sense to mirror the previously existing condition and
check for is_new instead?  I.e.:

if ((!patch->is_delete && !patch->new_name) ||
(!patch->is_new&& !patch->old_name)) {



Yes, probably.


or

if (!(patch->is_delete || patch->new_name) ||
!(patch->is_new|| patch->old_name)) {


This happens after calling parse_git_header() so we should know the
actual value of is_delete and is_new by now (instead of mistaking
-1 aka "unknown" as true), so this rewrite would also be OK.


The two variants are logically equivalent -- (!a && !b) == !(a || b).  I 
wonder if the second one may be harder to read, though.


René


Re: [PATCH] t6300: avoid creating refs/heads/HEAD

2017-02-27 Thread Junio C Hamano
Jeff King  writes:

> The "other" stuff could sometimes be useful, I guess. It's not _always_
> wrong to do:
>
>   git branch -f @{upstream} foo
>
> It depends on what your @{upstream} resolves to. Switching to just using
> interpret_nth_prior_checkout() would break the case when it resolves to
> a local branch. I'm not sure if we're OK with that or not. If we want to
> keep all the existing cases working, I think we need something like the
> "not_in_refs_heads" patch I posted elsewhere.

I haven't seen that patch, but yes, telling the caller if the
returned value is meant to be used inside refs/heads/ is the right
approach and makes it possible for "@{upstream}" (and just "@") to
be handled sensibly in "git branch -m @{that at-mark thing}".



Re: [PATCH v7 4/6] stash: teach 'push' (and 'create_stash') to honor pathspec

2017-02-27 Thread Thomas Gummerer
On 02/27, Junio C Hamano wrote:
> Thomas Gummerer  writes:
> 
> > +   test -n "$untracked" || git ls-files --error-unmatch -- "$@" >/dev/null 
> > || exit 1
> 
> This silent "exit 1" made me scratch my head, but --error-unmatch
> would have already given an error message, like
> 
> error: pathspec 'no such' did not match any file(s) known to git.
> Did you forget to 'git add'?
> 
> so that would be OK.

Yeah exactly, the plan was to let --error-unmatch print the error
message, as it's better at giving a good error message than we could
easily do here I think.

Maybe this deserves a comment so others won't have the same doubts
about this line?


Re: [PATCH v7 4/6] stash: teach 'push' (and 'create_stash') to honor pathspec

2017-02-27 Thread Thomas Gummerer
On 02/27, Junio C Hamano wrote:
> Thomas Gummerer  writes:
> 
> > if test -z "$patch_mode"
> > then
> > -   git reset --hard ${GIT_QUIET:+-q}
> > +   if test $# != 0
> > +   then
> > +   git reset ${GIT_QUIET:+-q} -- "$@"
> > +   git checkout ${GIT_QUIET:+-q} HEAD -- $(git ls-files -z 
> > --modified "$@")
> 
> "ls-files -z" on the command line?  
> 
> Apparently new tests do not cover the correctness of this codepath.
> 
> I wonder if this
> 
>   git ls-files -z --modified "$@" |
>   git checkout-index -z --force --stdin
> 
> is what the above "checkout" wanted to do.  The "reset" in the
> previous step presumably updated the index entries that match
> specified pathspec to those of the HEAD, so checking out the paths
> that match "$@" from the index would be the same as checking them
> out from the HEAD (while updating the index with them).

Yes, you're completely right, this is exactly what it should have
done.  Sorry for being slow on getting this right.

> Perhaps squash the following into an appropriate patch in the
> series?

Thanks, I'll squash this in and re-roll.

>  git-stash.sh |  3 ++-
>  t/t3903-stash.sh | 16 
>  2 files changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/git-stash.sh b/git-stash.sh
> index 28d0624c75..9c70662cc8 100755
> --- a/git-stash.sh
> +++ b/git-stash.sh
> @@ -300,7 +300,8 @@ push_stash () {
>   if test $# != 0
>   then
>   git reset ${GIT_QUIET:+-q} -- "$@"
> - git checkout ${GIT_QUIET:+-q} HEAD -- $(git ls-files -z 
> --modified "$@")
> + git ls-files -z --modified -- "$@" |
> + git checkout-index -z --force --stdin
>   git clean --force ${GIT_QUIET:+-q} -d -- "$@"
>   else
>   git reset --hard ${GIT_QUIET:+-q}
> diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
> index f7733b4dd4..e868aafab2 100755
> --- a/t/t3903-stash.sh
> +++ b/t/t3903-stash.sh
> @@ -891,4 +891,20 @@ test_expect_success 'stash without verb with pathspec' '
>   test_path_is_file bar
>  '
>  
> +test_expect_success 'stash with pathspec matching multiple paths' '
> + echo original >file &&
> + echo original >other-file &&
> + git commit -m "two" file other-file &&
> + echo modified >file &&
> + echo modified >other-file &&
> + git stash -- "*file" &&
> + echo original >expect &&
> + test_cmp expect file &&
> + test_cmp expect other-file &&
> + git stash pop &&
> + echo modified >expect &&
> + test_cmp expect file &&
> + test_cmp expect other-file
> +'
> +
>  test_done


Re: [PATCH] t6300: avoid creating refs/heads/HEAD

2017-02-27 Thread Junio C Hamano
Junio C Hamano  writes:

> ...  I suspect that calling interpret_empty_at() from
> that function is fundamentally flawed.  The "@" end user types never
> means refs/heads/HEAD, and HEAD@{either reflog or -1} would not mean
> anything that should be taken as a branch_name, either.  

The latter should read "HEAD@{either reflog or -1 or 'upstream'}"

Or do we make HEAD@{upstream} to mean "deref HEAD to learn the
current branch name and then take its upstream"?  If so @@{upstream}
might logically make sense, but I do not see why @{upstream} without
HEAD or @ is not sufficient to begin with, so...



[PATCH 2/6] Specify explicitly where we parse timestamps

2017-02-27 Thread Johannes Schindelin
Currently, Git's source code represents all timestamps as `unsigned
long`. In preparation for using `time_t` instead, let's introduce a
symbol `parse_timestamp` (currently being defined to `strtoul`) where
appropriate, so that we can later easily switch to use `strtoull()`
instead.

Signed-off-by: Johannes Schindelin 
---
 builtin/am.c | 2 +-
 bundle.c | 2 +-
 commit.c | 4 ++--
 date.c   | 6 +++---
 fsck.c   | 2 +-
 git-compat-util.h| 2 ++
 pretty.c | 2 +-
 ref-filter.c | 2 +-
 t/helper/test-date.c | 2 +-
 tag.c| 2 +-
 upload-pack.c| 2 +-
 11 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 31fb60578f6..75e2d939036 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -882,7 +882,7 @@ static int hg_patch_to_mail(FILE *out, FILE *in, int 
keep_cr)
char *end;
 
errno = 0;
-   timestamp = strtoul(str, , 10);
+   timestamp = parse_timestamp(str, , 10);
if (errno)
return error(_("invalid timestamp"));
 
diff --git a/bundle.c b/bundle.c
index bbf4efa0a0a..f43bfcf5ff3 100644
--- a/bundle.c
+++ b/bundle.c
@@ -227,7 +227,7 @@ static int is_tag_in_date_range(struct object *tag, struct 
rev_info *revs)
line = memchr(line, '>', lineend ? lineend - line : buf + size - line);
if (!line++)
goto out;
-   date = strtoul(line, NULL, 10);
+   date = parse_timestamp(line, NULL, 10);
result = (revs->max_age == -1 || revs->max_age < date) &&
(revs->min_age == -1 || revs->min_age > date);
 out:
diff --git a/commit.c b/commit.c
index 2cf85158b48..7f56b643704 100644
--- a/commit.c
+++ b/commit.c
@@ -90,7 +90,7 @@ static unsigned long parse_commit_date(const char *buf, const 
char *tail)
if (buf >= tail)
return 0;
/* dateptr < buf && buf[-1] == '\n', so strtoul will stop at buf-1 */
-   return strtoul(dateptr, NULL, 10);
+   return parse_timestamp(dateptr, NULL, 10);
 }
 
 static struct commit_graft **commit_graft;
@@ -608,7 +608,7 @@ static void record_author_date(struct author_date_slab 
*author_date,
!ident.date_begin || !ident.date_end)
goto fail_exit; /* malformed "author" line */
 
-   date = strtoul(ident.date_begin, _end, 10);
+   date = parse_timestamp(ident.date_begin, _end, 10);
if (date_end != ident.date_end)
goto fail_exit; /* malformed date */
*(author_date_slab_at(author_date, commit)) = date;
diff --git a/date.c b/date.c
index a996331f5b3..a8848f6e141 100644
--- a/date.c
+++ b/date.c
@@ -510,7 +510,7 @@ static int match_digit(const char *date, struct tm *tm, int 
*offset, int *tm_gmt
char *end;
unsigned long num;
 
-   num = strtoul(date, , 10);
+   num = parse_timestamp(date, , 10);
 
/*
 * Seconds since 1970? We trigger on that for any numbers with
@@ -658,7 +658,7 @@ static int match_object_header_date(const char *date, 
unsigned long *timestamp,
 
if (*date < '0' || '9' < *date)
return -1;
-   stamp = strtoul(date, , 10);
+   stamp = parse_timestamp(date, , 10);
if (*end != ' ' || stamp == ULONG_MAX || (end[1] != '+' && end[1] != 
'-'))
return -1;
date = end + 2;
@@ -1066,7 +1066,7 @@ static const char *approxidate_digit(const char *date, 
struct tm *tm, int *num,
 time_t now)
 {
char *end;
-   unsigned long number = strtoul(date, , 10);
+   time_t number = parse_timestamp(date, , 10);
 
switch (*end) {
case ':':
diff --git a/fsck.c b/fsck.c
index 939792752bf..33a66e68a83 100644
--- a/fsck.c
+++ b/fsck.c
@@ -690,7 +690,7 @@ static int fsck_ident(const char **ident, struct object 
*obj, struct fsck_option
p++;
if (*p == '0' && p[1] != ' ')
return report(options, obj, FSCK_MSG_ZERO_PADDED_DATE, "invalid 
author/committer line - zero-padded date");
-   if (date_overflows(strtoul(p, , 10)))
+   if (date_overflows(parse_timestamp(p, , 10)))
return report(options, obj, FSCK_MSG_BAD_DATE_OVERFLOW, 
"invalid author/committer line - date causes integer overflow");
if ((end == p || *end != ' '))
return report(options, obj, FSCK_MSG_BAD_DATE, "invalid 
author/committer line - bad date");
diff --git a/git-compat-util.h b/git-compat-util.h
index ef6d560e156..5eff97bea2e 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -319,6 +319,8 @@ extern char *gitdirname(char *);
 #define PRIo32 "o"
 #endif
 
+#define parse_timestamp strtoul
+
 #ifndef PATH_SEP
 #define PATH_SEP ':'
 #endif
diff --git a/pretty.c b/pretty.c
index 5e683830d9d..6d1e1e87e7d 100644
--- a/pretty.c
+++ b/pretty.c
@@ -409,7 

[PATCH 0/6] Use time_t

2017-02-27 Thread Johannes Schindelin
Git v2.9.2 was released in a hurry to accomodate for platforms like
Windows, where the `unsigned long` data type is 32-bit even for 64-bit
setups.

The quick fix was to simply disable all the testing with "absurd" future
dates.

However, we can do much better than that, as `time_t` exists, and at
least on 64-bit Windows it is 64-bit. Meaning: we *can* support these
absurd future dates on those platforms.

So let's do this.

One notable fallout of this patch series is that on 64-bit Linux (and
other platforms where `unsigned long` is 64-bit), we now limit the range
of dates to LONG_MAX (i.e. the *signed* maximum value). This needs to be
done as `time_t` can be signed (and indeed is at least on my Ubuntu
setup).

Obviously, I think that we can live with that, and I hope that all
interested parties agree.


Johannes Schindelin (6):
  t0006 & t5000: prepare for 64-bit time_t
  Specify explicitly where we parse timestamps
  Introduce a new "printf format" for timestamps
  Prepare for timestamps to use 64-bit signed types
  ref-filter: avoid using `unsigned long` for catch-all data type
  Use time_t where appropriate

 Documentation/technical/api-parse-options.txt |  8 +--
 Makefile  |  4 ++
 archive-tar.c |  5 +-
 builtin/am.c  |  4 +-
 builtin/blame.c   | 14 ++---
 builtin/fsck.c|  6 +-
 builtin/gc.c  |  2 +-
 builtin/log.c |  4 +-
 builtin/merge-base.c  |  2 +-
 builtin/name-rev.c|  6 +-
 builtin/pack-objects.c|  4 +-
 builtin/prune.c   |  4 +-
 builtin/receive-pack.c| 10 +--
 builtin/reflog.c  | 24 +++
 builtin/show-branch.c |  4 +-
 builtin/worktree.c|  4 +-
 bundle.c  |  4 +-
 cache.h   | 14 ++---
 commit.c  | 16 ++---
 commit.h  |  2 +-
 config.mak.uname  |  2 +
 credential-cache--daemon.c| 12 ++--
 date.c| 90 +--
 fetch-pack.c  |  8 +--
 fsck.c|  2 +-
 git-compat-util.h | 10 +++
 http-backend.c|  4 +-
 parse-options-cb.c|  4 +-
 pretty.c  |  4 +-
 reachable.c   | 10 ++-
 reachable.h   |  4 +-
 ref-filter.c  | 22 +++
 reflog-walk.c |  8 +--
 refs.c| 14 ++---
 refs.h|  8 +--
 refs/files-backend.c  |  4 +-
 revision.c|  6 +-
 revision.h|  4 +-
 sha1_name.c   |  6 +-
 t/helper/test-date.c  | 11 ++--
 t/helper/test-parse-options.c |  4 +-
 t/t0006-date.sh   |  4 +-
 t/t5000-tar-tree.sh   |  6 +-
 t/test-lib.sh |  2 +
 tag.c |  4 +-
 tag.h |  2 +-
 upload-pack.c |  8 +--
 vcs-svn/fast_export.c |  8 +--
 vcs-svn/fast_export.h |  4 +-
 vcs-svn/svndump.c |  2 +-
 wt-status.c   |  2 +-
 51 files changed, 221 insertions(+), 199 deletions(-)


base-commit: e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7
Published-As: https://github.com/dscho/git/releases/tag/time_t-may-be-int64-v1
Fetch-It-Via: git fetch https://github.com/dscho/git time_t-may-be-int64-v1

-- 
2.11.1.windows.1.379.g44ae0bc



Re: [PATCH] t6300: avoid creating refs/heads/HEAD

2017-02-27 Thread Jeff King
On Mon, Feb 27, 2017 at 01:19:29PM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > I suspect there are a lot of other places that are less clear cut. E.g.,
> > I think just:
> >
> >   git branch foo bar
> >
> > will put "foo" through the same interpretation. So you could do:
> >
> >   git branch -f @{-1} bar
> >
> > Is that insane? Maybe. But it does work now.
> 
> No, it _is_ very sensible, so is "git checkout -B @{-1} "
> 
> Perhaps interpret-branch-name that does not error out when given "@"
> is what is broken?  I suspect that calling interpret_empty_at() from
> that function is fundamentally flawed.  The "@" end user types never
> means refs/heads/HEAD, and HEAD@{either reflog or -1} would not mean
> anything that should be taken as a branch_name, either.
> 
> So perhaps what interpret_empty_at() does is necessary for the "four
> capital letters is too many to type, so just type one key while
> holding a shift", but it should be called from somewhere else, and
> not from interpret_branch_name()?

I think _most_ of interpret_branch_name() is in the same boat. The
"@{upstream}" mark is not likely to give you a branch in refs/heads
either.

So in practice, I think strbuf_check_branch_ref() could probably get by
with just calling interpret_nth_prior_checkout(). Or if you prefer, to
rip everything out of interpret_branch_name() except that. :) But that
other stuff has to go somewhere, and there are some challenges with the
recursion from reinterpret().

The "other" stuff could sometimes be useful, I guess. It's not _always_
wrong to do:

  git branch -f @{upstream} foo

It depends on what your @{upstream} resolves to. Switching to just using
interpret_nth_prior_checkout() would break the case when it resolves to
a local branch. I'm not sure if we're OK with that or not. If we want to
keep all the existing cases working, I think we need something like the
"not_in_refs_heads" patch I posted elsewhere.

-Peff


Re: [PATCH v7 4/6] stash: teach 'push' (and 'create_stash') to honor pathspec

2017-02-27 Thread Junio C Hamano
Thomas Gummerer  writes:

> + test -n "$untracked" || git ls-files --error-unmatch -- "$@" >/dev/null 
> || exit 1

This silent "exit 1" made me scratch my head, but --error-unmatch
would have already given an error message, like

error: pathspec 'no such' did not match any file(s) known to git.
Did you forget to 'git add'?

so that would be OK.


[PATCH 5/6] ref-filter: avoid using `unsigned long` for catch-all data type

2017-02-27 Thread Johannes Schindelin
In its `atom_value` struct, the ref-filter source code wants to store
different values in a field called `ul` (for `unsigned long`), e.g.
timestamps.

However, as we are about to switch the data type of timestamps away from
`unsigned long` (because it may be 32-bit even when `time_t` is 64-bit),
that data type is not large enough.

Simply use `uintmax_t` instead.

Unfortunately, this patch is larger than that because the field's name
was tied to its data type.

Signed-off-by: Johannes Schindelin 
---
 ref-filter.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 07c1f372351..b8b34d4dd9e 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -236,7 +236,7 @@ struct atom_value {
struct align align;
} u;
void (*handler)(struct atom_value *atomv, struct ref_formatting_state 
*state);
-   unsigned long ul; /* used for sorting when not FIELD_STR */
+   uintmax_t value; /* used for sorting when not FIELD_STR */
 };
 
 /*
@@ -492,7 +492,7 @@ static void grab_common_values(struct atom_value *val, int 
deref, struct object
if (!strcmp(name, "objecttype"))
v->s = typename(obj->type);
else if (!strcmp(name, "objectsize")) {
-   v->ul = sz;
+   v->value = sz;
v->s = xstrfmt("%lu", sz);
}
else if (deref)
@@ -539,8 +539,8 @@ static void grab_commit_values(struct atom_value *val, int 
deref, struct object
v->s = xstrdup(oid_to_hex(>tree->object.oid));
}
else if (!strcmp(name, "numparent")) {
-   v->ul = commit_list_count(commit->parents);
-   v->s = xstrfmt("%lu", v->ul);
+   v->value = commit_list_count(commit->parents);
+   v->s = xstrfmt("%lu", (unsigned long)v->value);
}
else if (!strcmp(name, "parent")) {
struct commit_list *parents;
@@ -644,11 +644,11 @@ static void grab_date(const char *buf, struct atom_value 
*v, const char *atomnam
if ((tz == LONG_MIN || tz == LONG_MAX) && errno == ERANGE)
goto bad;
v->s = xstrdup(show_date(timestamp, tz, _mode));
-   v->ul = timestamp;
+   v->value = timestamp;
return;
  bad:
v->s = "";
-   v->ul = 0;
+   v->value = 0;
 }
 
 /* See grab_values */
@@ -1583,9 +1583,9 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct 
ref_array_item *a, stru
else if (cmp_type == FIELD_STR)
cmp = cmp_fn(va->s, vb->s);
else {
-   if (va->ul < vb->ul)
+   if (va->value < vb->value)
cmp = -1;
-   else if (va->ul == vb->ul)
+   else if (va->value == vb->value)
cmp = cmp_fn(a->refname, b->refname);
else
cmp = 1;
-- 
2.11.1.windows.1.379.g44ae0bc




[PATCH 3/6] Introduce a new "printf format" for timestamps

2017-02-27 Thread Johannes Schindelin
Currently, Git's source code treats all timestamps as if they were
unsigned longs. Therefore, it is okay to write "%lu" when printing them.

There is a substantial problem with that, though: at least on Windows,
time_t is *larger* than unsigned long, and hence we will want to switch
to using time_t instead.

So let's introduce the pseudo format "PRItime" (currently simply being
"lu") so that it is easy later on to change the data type to time_t.

Signed-off-by: Johannes Schindelin 
---
 builtin/blame.c   |  6 +++---
 builtin/fsck.c|  2 +-
 builtin/log.c |  2 +-
 builtin/receive-pack.c|  4 ++--
 date.c| 26 +-
 fetch-pack.c  |  2 +-
 git-compat-util.h |  1 +
 t/helper/test-date.c  |  2 +-
 t/helper/test-parse-options.c |  2 +-
 upload-pack.c |  2 +-
 vcs-svn/fast_export.c |  4 ++--
 11 files changed, 27 insertions(+), 26 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index cffc6265408..c9486dd580b 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -1736,11 +1736,11 @@ static int emit_one_suspect_detail(struct origin 
*suspect, int repeat)
get_commit_info(suspect->commit, , 1);
printf("author %s\n", ci.author.buf);
printf("author-mail %s\n", ci.author_mail.buf);
-   printf("author-time %lu\n", ci.author_time);
+   printf("author-time %"PRItime"\n", ci.author_time);
printf("author-tz %s\n", ci.author_tz.buf);
printf("committer %s\n", ci.committer.buf);
printf("committer-mail %s\n", ci.committer_mail.buf);
-   printf("committer-time %lu\n", ci.committer_time);
+   printf("committer-time %"PRItime"\n", ci.committer_time);
printf("committer-tz %s\n", ci.committer_tz.buf);
printf("summary %s\n", ci.summary.buf);
if (suspect->commit->object.flags & UNINTERESTING)
@@ -1853,7 +1853,7 @@ static const char *format_time(unsigned long time, const 
char *tz_str,
 
strbuf_reset(_buf);
if (show_raw_time) {
-   strbuf_addf(_buf, "%lu %s", time, tz_str);
+   strbuf_addf(_buf, "%"PRItime" %s", time, tz_str);
}
else {
const char *time_str;
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 1a5caccd0f5..5413c76e7a6 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -407,7 +407,7 @@ static void fsck_handle_reflog_sha1(const char *refname, 
unsigned char *sha1,
if (timestamp && name_objects)
add_decoration(fsck_walk_options.object_names,
obj,
-   xstrfmt("%s@{%ld}", refname, 
timestamp));
+   xstrfmt("%s@{%"PRItime"}", refname, 
timestamp));
obj->used = 1;
mark_object_reachable(obj);
} else {
diff --git a/builtin/log.c b/builtin/log.c
index 55d20cc2d88..24612c2299a 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -903,7 +903,7 @@ static void get_patch_ids(struct rev_info *rev, struct 
patch_ids *ids)
 static void gen_message_id(struct rev_info *info, char *base)
 {
struct strbuf buf = STRBUF_INIT;
-   strbuf_addf(, "%s.%lu.git.%s", base,
+   strbuf_addf(, "%s.%"PRItime".git.%s", base,
(unsigned long) time(NULL),

git_committer_info(IDENT_NO_NAME|IDENT_NO_DATE|IDENT_STRICT));
info->message_id = strbuf_detach(, NULL);
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 1dbb8a06922..4a878645847 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -456,12 +456,12 @@ static char *prepare_push_cert_nonce(const char *path, 
unsigned long stamp)
struct strbuf buf = STRBUF_INIT;
unsigned char sha1[20];
 
-   strbuf_addf(, "%s:%lu", path, stamp);
+   strbuf_addf(, "%s:%"PRItime, path, stamp);
hmac_sha1(sha1, buf.buf, buf.len, cert_nonce_seed, 
strlen(cert_nonce_seed));;
strbuf_release();
 
/* RFC 2104 5. HMAC-SHA1-80 */
-   strbuf_addf(, "%lu-%.*s", stamp, 20, sha1_to_hex(sha1));
+   strbuf_addf(, "%"PRItime"-%.*s", stamp, 20, sha1_to_hex(sha1));
return strbuf_detach(, NULL);
 }
 
diff --git a/date.c b/date.c
index a8848f6e141..97ab5fcc349 100644
--- a/date.c
+++ b/date.c
@@ -100,41 +100,41 @@ void show_date_relative(unsigned long time, int tz,
diff = now->tv_sec - time;
if (diff < 90) {
strbuf_addf(timebuf,
-Q_("%lu second ago", "%lu seconds ago", diff), diff);
+Q_("%"PRItime" second ago", "%"PRItime" seconds ago", 
diff), diff);
return;
}
/* Turn it into minutes */
diff = (diff + 30) / 60;
if (diff < 90) {
strbuf_addf(timebuf,
-

[PATCH 4/6] Prepare for timestamps to use 64-bit signed types

2017-02-27 Thread Johannes Schindelin
Currently, Git's source code uses the unsigned long type to represent
timestamps. However, this type is limited to 32-bit e.g. on 64-bit
Windows. Hence it is a suboptimal type for this use case.

In any case, we need to use the time_t type to represent timestamps
since we often send those values to system functions which are declared
to accept time_t parameters.

So let's prepare for the case where timestamps are represented as 64-bit
signed integers by introducing the Makefile option TIME_T_IS_INT64.

As we have to resort to using `strtoull()` (and casting the parsed,
unsigned value to an `int64_t`), the check in the `date_overflows()`
helper has to be relaxed: a value of ULLONG_MAX (cast to `int64_t`)
now *also* indicates an overflow.

Signed-off-by: Johannes Schindelin 
---
 Makefile   | 4 
 archive-tar.c  | 5 -
 builtin/name-rev.c | 2 +-
 builtin/prune.c| 2 +-
 builtin/worktree.c | 2 +-
 credential-cache--daemon.c | 2 +-
 date.c | 8 
 git-compat-util.h  | 7 +++
 ref-filter.c   | 2 +-
 9 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/Makefile b/Makefile
index 8e4081e0619..0232cf62d33 100644
--- a/Makefile
+++ b/Makefile
@@ -1518,6 +1518,10 @@ ifdef HAVE_GETDELIM
BASIC_CFLAGS += -DHAVE_GETDELIM
 endif
 
+ifdef TIME_T_IS_INT64
+   BASIC_CFLAGS += -DTIME_T_IS_INT64
+endif
+
 ifeq ($(TCLTK_PATH),)
 NO_TCLTK = NoThanks
 endif
diff --git a/archive-tar.c b/archive-tar.c
index 380e3aedd23..695339a2369 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -27,9 +27,12 @@ static int write_tar_filter_archive(const struct archiver 
*ar,
  */
 #if ULONG_MAX == 0x
 #define USTAR_MAX_SIZE ULONG_MAX
-#define USTAR_MAX_MTIME ULONG_MAX
 #else
 #define USTAR_MAX_SIZE 0777UL
+#endif
+#if TIME_MAX == 0x
+#define USTAR_MAX_MTIME TIME_MAX
+#else
 #define USTAR_MAX_MTIME 0777UL
 #endif
 
diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index cd89d48b65e..a0f16407b93 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -145,7 +145,7 @@ static int name_ref(const char *path, const struct 
object_id *oid, int flags, vo
struct name_ref_data *data = cb_data;
int can_abbreviate_output = data->tags_only && data->name_only;
int deref = 0;
-   unsigned long taggerdate = ULONG_MAX;
+   unsigned long taggerdate = TIME_MAX;
 
if (data->tags_only && !starts_with(path, "refs/tags/"))
return 0;
diff --git a/builtin/prune.c b/builtin/prune.c
index 8f4f0522856..1e5eb0292b1 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -111,7 +111,7 @@ int cmd_prune(int argc, const char **argv, const char 
*prefix)
};
char *s;
 
-   expire = ULONG_MAX;
+   expire = TIME_MAX;
save_commit_buffer = 0;
check_replace_refs = 0;
ref_paranoia = 1;
diff --git a/builtin/worktree.c b/builtin/worktree.c
index 831fe058a53..3df95e112e5 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -131,7 +131,7 @@ static int prune(int ac, const char **av, const char 
*prefix)
OPT_END()
};
 
-   expire = ULONG_MAX;
+   expire = TIME_MAX;
ac = parse_options(ac, av, prefix, options, worktree_usage, 0);
if (ac)
usage_with_options(worktree_usage, options);
diff --git a/credential-cache--daemon.c b/credential-cache--daemon.c
index 46c5937526a..b298ac01e4f 100644
--- a/credential-cache--daemon.c
+++ b/credential-cache--daemon.c
@@ -52,7 +52,7 @@ static int check_expirations(void)
static unsigned long wait_for_entry_until;
int i = 0;
unsigned long now = time(NULL);
-   unsigned long next = (unsigned long)-1;
+   unsigned long next = TIME_MAX;
 
/*
 * Initially give the client 30 seconds to actually contact us
diff --git a/date.c b/date.c
index 97ab5fcc349..23dee2964c1 100644
--- a/date.c
+++ b/date.c
@@ -659,7 +659,7 @@ static int match_object_header_date(const char *date, 
unsigned long *timestamp,
if (*date < '0' || '9' < *date)
return -1;
stamp = parse_timestamp(date, , 10);
-   if (*end != ' ' || stamp == ULONG_MAX || (end[1] != '+' && end[1] != 
'-'))
+   if (*end != ' ' || stamp == TIME_MAX || (end[1] != '+' && end[1] != 
'-'))
return -1;
date = end + 2;
ofs = strtol(date, , 10);
@@ -762,7 +762,7 @@ int parse_expiry_date(const char *date, unsigned long 
*timestamp)
 * of the past, and there is nothing from the future
 * to be kept.
 */
-   *timestamp = ULONG_MAX;
+   *timestamp = TIME_MAX;
else
*timestamp = approxidate_careful(date, );
 
@@ -1184,8 +1184,8 @@ int date_overflows(unsigned long t)
 {
time_t sys;
 
-   /* If we overflowed our unsigned long, that's bad... */
-  

Re: [PATCH] t6300: avoid creating refs/heads/HEAD

2017-02-27 Thread Junio C Hamano
Jeff King  writes:

> I suspect there are a lot of other places that are less clear cut. E.g.,
> I think just:
>
>   git branch foo bar
>
> will put "foo" through the same interpretation. So you could do:
>
>   git branch -f @{-1} bar
>
> Is that insane? Maybe. But it does work now.

No, it _is_ very sensible, so is "git checkout -B @{-1} "

Perhaps interpret-branch-name that does not error out when given "@"
is what is broken?  I suspect that calling interpret_empty_at() from
that function is fundamentally flawed.  The "@" end user types never
means refs/heads/HEAD, and HEAD@{either reflog or -1} would not mean
anything that should be taken as a branch_name, either.  

So perhaps what interpret_empty_at() does is necessary for the "four
capital letters is too many to type, so just type one key while
holding a shift", but it should be called from somewhere else, and
not from interpret_branch_name()?







Re: [PATCH 4/4] ident: do not ignore empty config name/email

2017-02-27 Thread Junio C Hamano
Dennis Kaarsemaker  writes:

> On Thu, 2017-02-23 at 23:18 -0500, Jeff King wrote:
>> On Thu, Feb 23, 2017 at 08:11:11PM -0800, Junio C Hamano wrote:
>> 
>> > > So I dunno. I could really go either way on it. Feel free to drop it, or
>> > > even move it into a separate topic to be cooked longer.
>> > 
>> > If it were 5 years ago, it would have been different, but I do not
>> > think cooking it longer in 'next' would smoke out breakages in
>> > obscure scripts any longer.  Git is used by too many people who have
>> > never seen its source these days.
>> 
>> Yeah, I have noticed that, too. I wonder if it would be interesting to
>> cut "weeklies" or something of "master" or even "next" that people could
>> install with a single click.
>> 
>> Of course it's not like we have a binary installer in the first place,
>> so I guess that's a prerequisite.
>
> I provide daily[*] snapshots of git's master and next tree as packages
> for Ubuntu, Debian, Fedora and CentOS on launchpad and SuSE's
> openbuildservice. If there's sufficient interest in this (I know of
> only a few users), I can try to put more effort into this.

That sounds handy for people who do not build from the source
themselves.

Christian, perhaps rev-news can help advertising Dennis's effort to
recruit like-minded souls?


Re: show all merge conflicts

2017-02-27 Thread Jeff King
On Mon, Feb 27, 2017 at 11:45:31AM -0800, Junio C Hamano wrote:

> Michael J Gruber  writes:
> 
> > If you're curious, I kept rebasing Thomas' remerge-diff (on top of our
> > next) so far. You can find it at
> >
> > https://github.com/mjg/git/tree/remerge-diff
> 
> ;-).
> Yes, this was a good one.

FWIW, I have also been carrying it forward. It's not a tool I reach for
often, but a couple of times it has come in very handy (mostly helping
somebody to track down a mistake that somebody made in a merge, like
accidentally using "checkout --ours" on top of a conflict).

> > if you're interested. I don't know what problems were found back then,
> > or what it would take to get this in-tree now.
> 
> If I recall correctly, everybody was in favor of what it does (or at
> least attempted to do), but was leaky and not ready for "log -p" to
> be used on a long stretch of history or something?

The last round was at:

  http://public-inbox.org/git/cover.1409860234.git...@thomasrast.ch/

I think. I think the leakiness was dealt with by rebasing onto the
name_hash refactoring. But it looks like there are a lot of little
issues, and maybe one bigger one: it turns "log" from a read-only
operation into that writes into the object database.

-Peff


Re: git-clone --config order & fetching extra refs during initial clone

2017-02-27 Thread Jeff King
On Mon, Feb 27, 2017 at 11:16:35AM -0800, Junio C Hamano wrote:

> >> TL;DR: git-clone ignores any fetch specs passed via --config.
> >
> > I agree that this is a bug. There's some previous discussion and an RFC
> > patch from lat March (author cc'd):
> >
> >   
> > http://public-inbox.org/git/1457313062-10073-1-git-send-email-sze...@ira.uka.de/
> >
> > That discussion veered off into alternatives, but I think the v2 posted
> > in that thread is taking a sane approach.
> 
> Let's see how well it fares by cooking it in 'next' ;-) 
> 
> I think it was <1459349623-16443-1-git-send-email-sze...@ira.uka.de>,
> which needs a bit of massaging to apply to the current codebase.

Yeah, that is the most recent one I found.

I didn't actually review it very carefully before, but I'll do so now
(spoiler: a few nits, but it looks fine).

>  static struct ref *wanted_peer_refs(const struct ref *refs,
> - struct refspec *refspec)
> + struct refspec *refspec, unsigned int refspec_count)

Most of the changes here and elsewhere are just about passing along
multiple refspecs instead of a single, which makes sense.

> @@ -856,7 +861,9 @@ int cmd_clone(int argc, const char **argv, const char 
> *prefix)
>   int submodule_progress;
>  
>   struct refspec *refspec;
> - const char *fetch_pattern;
> + unsigned int refspec_count = 1;
> + const char **fetch_patterns;
> + const struct string_list *config_fetch_patterns;

This "1" seems funny to up here by itself, as it must be kept in sync
with the later logic that feeds exactly one non-configured refspec into
our list. The current code isn't wrong, but it would be nice to have it
all together. I.e., replacing:

> + if (config_fetch_patterns)
> + refspec_count = 1 + config_fetch_patterns->nr;
> + fetch_patterns = xcalloc(refspec_count, sizeof(*fetch_patterns));
> + fetch_patterns[0] = value.buf;

with:

  refspec_count = 1;
  if (config_fetch_patterns)
refspec_count += config_fetch_patterns->nr;
  ...

Though if I'm bikeshedding, I'd probably have written the whole thing
with an argv_array and avoided counting at all.

> + refspec = parse_fetch_refspec(refspec_count, fetch_patterns);
>  
> + strbuf_reset();
>   strbuf_reset();
>  
>   remote = remote_get(option_origin);

I do also notice that right _after_ this parsing, we use remote_get(),
which is supposed to give us this config anyway. Which makes me wonder
if we could just reorder this to put remote_get() first, and then read
the resulting refspecs from remote->fetch.

> diff --git a/t/t5611-clone-config.sh b/t/t5611-clone-config.sh
> index e4850b778c..3bed17783b 100755
> --- a/t/t5611-clone-config.sh
> +++ b/t/t5611-clone-config.sh
> @@ -37,6 +37,30 @@ test_expect_success 'clone -c config is available during 
> clone' '
>   test_cmp expect child/file
>  '
>  
> +test_expect_success 'clone -c remote.origin.fetch= works' '
> + rm -rf child &&
> + git update-ref refs/grab/it refs/heads/master &&
> + git update-ref refs/keep/out refs/heads/master &&
> + git clone -c "remote.origin.fetch=+refs/grab/*:refs/grab/*" . child &&
> + (
> + cd child &&
> + git for-each-ref --format="%(refname)" refs/grab/ >../actual
> + ) &&
> + echo refs/grab/it >expect &&
> + test_cmp expect actual
> +'
> +
> +test_expect_success 'git -c remote.origin.fetch= clone works' '
> + rm -rf child &&
> + git -c "remote.origin.fetch=+refs/grab/*:refs/grab/*" clone . child &&
> + (
> + cd child &&
> + git for-each-ref --format="%(refname)" refs/grab/ >../actual
> + ) &&
> + echo refs/grab/it >expect &&
> + test_cmp expect actual
> +'

These look reasonable. Using "git -C for-each-ref" would save a
subshell, but that's minor.

If we wanted to be thorough, we could also check that the feature works
correctly with "--origin" (I think it does).

-Peff


Re: [PATCH] t6300: avoid creating refs/heads/HEAD

2017-02-27 Thread Jeff King
On Mon, Feb 27, 2017 at 11:33:23AM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > This comes originally from Junio's 84679d470. I cannot see how naming
> > the new branch HEAD would make any difference to the test, but perhaps I
> > am missing something.
> 
> Nah, I think it was just a random string that came to mind and the
> topic being "ah we blindly dereference something when showing %(HEAD)"
> it was plausible I thought of "H E A D" as that random string before
> I used my usual other random strings like frotz ;-)

OK, thanks for confirming.

> > I noticed this while digging on a nearby issue around "git branch -m @".
> > This does happen to be the only test that checks that we can make a
> > branch called refs/heads/HEAD, and I found it because it triggers if you
> > try to disallow "git branch -m HEAD". :)
> 
> About that "nearby" one, does it even make sense to do the interpret
> thing on the  name?  I can understand "please rename the branch
> I was previously on to this new name" wanting to say @{-1} when the
> user does not recall the exact spelling of a long name, but I do not
> quite see how "to this new name" part benefits by the "interpret
> branch name" magic in the first place.

Yeah, it's arguable whether the "new" side of a rename should do any
interpretation at all. At the same time, the bug is in the underlying
function that assumes you can slap "refs/heads/" in front of the results
of interpret_branch_name(). And that function gets used in a lot of
places, including the "old" side of a rename. So:

  git branch @{-1} foo

should clearly work. Doing:

  git branch @{upstream} foo

is more debatable. It _does_ work, but only if your upstream is actually
a local branch (otherwise it tries to rename refs/heads/origin/master or
some such nonsense. It happens to fail most of the time because you
probably don't have such a branch, but it's still wrong to even look at
that).

I suspect there are a lot of other places that are less clear cut. E.g.,
I think just:

  git branch foo bar

will put "foo" through the same interpretation. So you could do:

  git branch -f @{-1} bar

Is that insane? Maybe. But it does work now.

-Peff


Re: Transition plan for git to move to a new hash function

2017-02-27 Thread Tony Finch
Ian Jackson  wrote:

A few questions and one or two suggestions...

> TEXTUAL SYNTAX
> ==
>
> We also reserve the following syntax for private experiments:
>   E[A-Z]+[0-9a-z]+
> We declare that public releases of git will never accept such
> object names.

Instead of this I would suggest that experimental hash names should have
multi-character prefixes and an easy registration process - rationale:
https://tools.ietf.org/html/rfc6648

> A single object may refer to other objects the hash function which
> names the object itself, or by other hash functions, in any
> combination.

If I understand it correctly, this freedom is greatly restricted later on
in this document, depending on the object type in question. If so, it's
probably worth saying so at this point.

> Commits
> ---
>
> The hash function naming an origin commit is controlled by the hint
> left in .git for the ref named by HEAD (or for HEAD itself, if HEAD is
> detached) by git checkout --orphan or git init.

This confused me for a while - I think you mean "root commit"?

> TRANSITION PLAN
> ===
>
> Y4: BLAKE by default for new projects.
>
> When creating a new working tree, it starts using BLAKE.
>
> Servers which have been updated will accept BLAKE.

Why not allow newhash pushes before making it the default for new
projects? Wouldn't it make sense to get the server side ready some time
before projects start actively using new hashes?

Or is the idea that newhash upgrade is driven from the server?

What's the upgrade process for send-email patch exchange?

Tony.
-- 
f.anthony.n.finch    http://dotat.at/  -  I xn--zr8h punycode
Fair Isle: Southwest 6 to gale 8, backing east 5 or 6, backing north 6 to gale
8 later. Rough or very rough. Rain or showers. Moderate or good.


Re: [PATCH 1/2] apply: guard against renames of non-existant empty files

2017-02-27 Thread Junio C Hamano
René Scharfe  writes:

> Would it make sense to mirror the previously existing condition and
> check for is_new instead?  I.e.:
>
>   if ((!patch->is_delete && !patch->new_name) ||
>   (!patch->is_new&& !patch->old_name)) {
>

Yes, probably.

> or
>
>   if (!(patch->is_delete || patch->new_name) ||
>   !(patch->is_new|| patch->old_name)) {

This happens after calling parse_git_header() so we should know the
actual value of is_delete and is_new by now (instead of mistaking
-1 aka "unknown" as true), so this rewrite would also be OK.



Re: [PATCH 2/2] apply: handle assertion failure gracefully

2017-02-27 Thread Junio C Hamano
René Scharfe  writes:

>> diff --git a/apply.c b/apply.c
>> index cbf7cc7f2..9219d2737 100644
>> --- a/apply.c
>> +++ b/apply.c
>> @@ -3652,7 +3652,6 @@ static int check_preimage(struct apply_state *state,
>>  if (!old_name)
>>  return 0;
>>
>> -assert(patch->is_new <= 0);
>
> 5c47f4c6 (builtin-apply: accept patch to an empty file) added that
> line. Its intent was to handle diffs that contain an old name even for
> a file that's created.  Citing from its commit message: "When we
> cannot be sure by parsing the patch that it is not a creation patch,
> we shouldn't complain when if there is no such a file."  Why not stop
> complaining also in case we happen to know for sure that it's a
> creation patch? I.e., why not replace the assert() with:
>
>   if (patch->is_new == 1)
>   goto is_new;
>
>>  previous = previous_patch(state, patch, );

When the caller does know is_new is true, old_name must be made/left
NULL.  That is the invariant this assert is checking to catch an
error in the calling code.

Errors in the patches fed as its input are caught by "if we do not
know if the patch is to add a new path yet, then declare it is, but
if we do know the patch is _NOT_ adding a new path, barf if that
path is not there" and other checks in this function, and changing
the assert to "if already new, then make it a no-op" defeats the
whole point of having an assert (and just removing it is even worse).

Thanks.


RE: Unconventional roles of git

2017-02-27 Thread Randall S. Becker
> -Original Message-
> From: git-ow...@vger.kernel.org [mailto:git-ow...@vger.kernel.org] On
> Behalf Of ankostis
> Sent: February 26, 2017 6:52 AM
> To: Git Mailing List 
> Cc: Jason Cooper 
> Subject: Unconventional roles of git
> 
> On 26 February 2017 at 02:13, Jason Cooper  wrote:
> > As someone looking to deploy (and having previously deployed) git in
> > unconventional roles, I'd like to add ...
> 
> We are developing a distributed storage for type approval files regarding all
> vehicles registered in Europe.[1]  To ensure integrity even after 10 or 30
> years, the hash of a commit of these files (as contained in a tag) are to be
> printed on the the paper certificates.
> 
> - Can you provide some hints for other similar unconventional roles of git?
> - Any other comment on the above usage of git are welcomed.

I am involved in managing manufacturing designs and parts configurations and 
approvals with git being intimately involved in the process of developing and 
deploying tested designs to computerized manufacturing environments. It's 
pretty cool actually to see things become real.

Cheers,
Randall

-- Brief whoami: NonStop developer since approximately 
UNIX(421664400)/NonStop(2112884442)
-- In my real life, I talk too much.





Re: show all merge conflicts

2017-02-27 Thread Junio C Hamano
Michael J Gruber  writes:

> If you're curious, I kept rebasing Thomas' remerge-diff (on top of our
> next) so far. You can find it at
>
> https://github.com/mjg/git/tree/remerge-diff

;-).
Yes, this was a good one.  


> if you're interested. I don't know what problems were found back then,
> or what it would take to get this in-tree now.

If I recall correctly, everybody was in favor of what it does (or at
least attempted to do), but was leaky and not ready for "log -p" to
be used on a long stretch of history or something?


Re: [PATCH] interpret_branch_name(): handle auto-namelen for @{-1}

2017-02-27 Thread Junio C Hamano
Jeff King  writes:

> On Mon, Feb 27, 2017 at 04:25:40AM -0500, Jeff King wrote:
>
>> However, before we do that auto-namelen magic, we call
>> interpret_nth_prior_checkout(), which gets fed the bogus
>> "0". This was broken by 8cd4249c4 (interpret_branch_name:
>> always respect "namelen" parameter, 2014-01-15).  Though to
>> be fair to that commit, it was broken in the _opposite_
>> direction before, where we would always treat "name" as a
>> string even if a length was passed.
>
> That commit is mine, by the way. More embarrassing than introducing the
> bug is that I _noticed_ the problem at the time and wrote a paragraph in
> the commit message rationalizing why it was OK, rather than just doing
> this trivial fix.

Thanks, I should also be embarrased since I didn't even notice the
issue when we queued it ;-)


Re: [PATCH 1/2] commit: be more precise when searching for headers

2017-02-27 Thread Junio C Hamano
René Scharfe  writes:

> Search for a space character only within the current line in
> read_commit_extra_header_lines() instead of searching in the whole
> buffer (and possibly beyond, if it's not NUL-terminated) and then
> discarding any results after the end of the current line.
>
> Signed-off-by: Rene Scharfe 
> ---
>  commit.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Makes sense.

> diff --git a/commit.c b/commit.c
> index 2cf85158b4..173c6d3818 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -1354,8 +1354,8 @@ static struct commit_extra_header 
> *read_commit_extra_header_lines(
>   strbuf_reset();
>   it = NULL;
>  
> - eof = strchr(line, ' ');
> - if (next <= eof)
> + eof = memchr(line, ' ', next - line);
> + if (!eof)
>   eof = next;
>  
>   if (standard_header_field(line, eof - line) ||


Re: [PATCH] cvs tests: When root, skip tests that call "cvs commit"

2017-02-27 Thread Junio C Hamano
Thanks, makes sense.


Re: 'git submodules update' ignores credential.helper config of the parent repository

2017-02-27 Thread Stefan Beller
On Mon, Feb 27, 2017 at 5:33 AM, Dmitry Neverov
 wrote:>
>   git -c credential.helper= submodule update
>
> Is it by design?

A similar question came up w.r.t. submodule configuration
recently. It is about url..insteadOf[1] that is set
in the super project and is expected to work in the submodules.
More reading on some background there, as it is the very same
problem: Which configuration should propagate to the submodules,
how do we tell the users and can the user influence if some
articular settings are propagated?

For both these settings (url...insteadOf and the credentialHelper)
one might think that they absolutely should be propagated
to the submodules, but that may not be true; e.g. a submodule
might be hosted at a different hosting provider, needing a different
credentials setup. (The submodule might be an open source library
that you use, which may even require no credentials at all)

So I think we have to come up with a generic solution to respect
certain settings of the superproject instead of e.g. hard coding
credential.helper to be looked up in the superproject.

So the current proposal (in that mentioned thread) is
to borrow the idea from worktrees that have a similar problem:
split up the config into multiple files and each file applies to
a different worktree or in our case we would have
(A) a config file that applies to the superproject;
(B) a config file that applies to both superproject
 and all submodules
(C) and each submodule has its own config file as well.

---
For worktrees these multiple config files sounded like
the obvious solution, but I wonder if there was also
some bike shedding about other solutions?

I could imagine that we would want to have attributes
for specific configuration, e.g.:

--8<--
[core]
repositoryformatversion = 0
filemode = true
bare = false
logallrefupdates = true
[remote "origin"]
url = git://github.com/gitster/git
fetch = +refs/heads/*:refs/remotes/origin/*
[attribute "submodules"]
read = true
# this will be read and respected by submodules as well:
[url."internal-git-miror"]
insteadOf = github.com
[attribute "submodules"]
read = false
# This (and the beginning of this file) will not be respected
# by submodules
[credential]
helper =
-->8--

This would change the semantics of a config file as the attribute for
each setting depends on the location (was attribute.FOO.read =
{true, false} read before).

This would be read-compatible with older versions of Git, and it seems
as if it were write compatible as well. Just writing a new value with a specifc
attribute would be interesting to implement.

Thanks,
Stefan


[1] 
https://public-inbox.org/git/84fcb0bd-85dc-0142-dd58-47a04eaa7...@durchholz.org/


Re: [PATCH] t6300: avoid creating refs/heads/HEAD

2017-02-27 Thread Junio C Hamano
Jeff King  writes:

> This comes originally from Junio's 84679d470. I cannot see how naming
> the new branch HEAD would make any difference to the test, but perhaps I
> am missing something.

Nah, I think it was just a random string that came to mind and the
topic being "ah we blindly dereference something when showing %(HEAD)"
it was plausible I thought of "H E A D" as that random string before
I used my usual other random strings like frotz ;-)

> I noticed this while digging on a nearby issue around "git branch -m @".
> This does happen to be the only test that checks that we can make a
> branch called refs/heads/HEAD, and I found it because it triggers if you
> try to disallow "git branch -m HEAD". :)

About that "nearby" one, does it even make sense to do the interpret
thing on the  name?  I can understand "please rename the branch
I was previously on to this new name" wanting to say @{-1} when the
user does not recall the exact spelling of a long name, but I do not
quite see how "to this new name" part benefits by the "interpret
branch name" magic in the first place.

> If we care about that, though, I think we should make an explicit test
> for "git branch HEAD". But I'm not sure we _do_ care about that. Making
> a branch called HEAD is moderately insane, and I don't think it would be
> unreasonable for us to outlaw it at some point.

Yeah, at that point we would have "test_must_fail git branch HEAD".

>  t/t6300-for-each-ref.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
> index aea1dfc71..a468041c5 100755
> --- a/t/t6300-for-each-ref.sh
> +++ b/t/t6300-for-each-ref.sh
> @@ -558,7 +558,7 @@ test_expect_success 'do not dereference NULL upon %(HEAD) 
> on unborn branch' '
>   test_when_finished "git checkout master" &&
>   git for-each-ref --format="%(HEAD) %(refname:short)" refs/heads/ 
> >actual &&
>   sed -e "s/^\* /  /" actual >expect &&
> - git checkout --orphan HEAD &&
> + git checkout --orphan orphaned-branch &&
>   git for-each-ref --format="%(HEAD) %(refname:short)" refs/heads/ 
> >actual &&
>   test_cmp expect actual
>  '


Re: git-clone --config order & fetching extra refs during initial clone

2017-02-27 Thread Junio C Hamano
Jeff King  writes:

> [Re-sending, as I used an old address for Gábor on the original]
>
> On Sat, Feb 25, 2017 at 07:12:38PM +, Robin H. Johnson wrote:
>
>> TL;DR: git-clone ignores any fetch specs passed via --config.
>
> I agree that this is a bug. There's some previous discussion and an RFC
> patch from lat March (author cc'd):
>
>   
> http://public-inbox.org/git/1457313062-10073-1-git-send-email-sze...@ira.uka.de/
>
> That discussion veered off into alternatives, but I think the v2 posted
> in that thread is taking a sane approach.

Let's see how well it fares by cooking it in 'next' ;-) 

I think it was <1459349623-16443-1-git-send-email-sze...@ira.uka.de>,
which needs a bit of massaging to apply to the current codebase.

-- >8 --
From: SZEDER Gábor 
Date: Wed, 30 Mar 2016 16:53:43 +0200
Subject: [PATCH] clone: respect configured fetch respecs during initial fetch

Conceptually 'git clone' should behave as if the following commands
were run:

  git init
  git config ... # set default configuration and origin remote
  git fetch
  git checkout   # unless '--bare' is given

However, that initial 'git fetch' behaves differently from any
subsequent fetches, because it takes only the default fetch refspec
into account and ignores all other fetch refspecs that might have
been explicitly specified on the command line (e.g. 'git -c
remote.origin.fetch= clone' or 'git clone -c ...').

Check whether there are any fetch refspecs configured for the origin
remote and take all of them into account during the initial fetch as
well.

Signed-off-by: SZEDER Gábor 
---
 builtin/clone.c | 36 
 t/t5611-clone-config.sh | 24 
 2 files changed, 52 insertions(+), 8 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 3f63edbbf9..97229268b6 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -516,7 +516,7 @@ static struct ref *find_remote_branch(const struct ref 
*refs, const char *branch
 }
 
 static struct ref *wanted_peer_refs(const struct ref *refs,
-   struct refspec *refspec)
+   struct refspec *refspec, unsigned int refspec_count)
 {
struct ref *head = copy_ref(find_ref_by_name(refs, "HEAD"));
struct ref *local_refs = head;
@@ -537,13 +537,18 @@ static struct ref *wanted_peer_refs(const struct ref 
*refs,
warning(_("Could not find remote branch %s to clone."),
option_branch);
else {
-   get_fetch_map(remote_head, refspec, , 0);
+   unsigned int i;
+   for (i = 0; i < refspec_count; i++)
+   get_fetch_map(remote_head, [i], , 
0);
 
/* if --branch=tag, pull the requested tag explicitly */
get_fetch_map(remote_head, tag_refspec, , 0);
}
-   } else
-   get_fetch_map(refs, refspec, , 0);
+   } else {
+   unsigned int i;
+   for (i = 0; i < refspec_count; i++)
+   get_fetch_map(refs, [i], , 0);
+   }
 
if (!option_mirror && !option_single_branch)
get_fetch_map(refs, tag_refspec, , 0);
@@ -856,7 +861,9 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
int submodule_progress;
 
struct refspec *refspec;
-   const char *fetch_pattern;
+   unsigned int refspec_count = 1;
+   const char **fetch_patterns;
+   const struct string_list *config_fetch_patterns;
 
packet_trace_identity("clone");
argc = parse_options(argc, argv, prefix, builtin_clone_options,
@@ -1002,9 +1009,21 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
if (option_required_reference.nr || option_optional_reference.nr)
setup_reference();
 
-   fetch_pattern = value.buf;
-   refspec = parse_fetch_refspec(1, _pattern);
+   strbuf_addf(, "remote.%s.fetch", option_origin);
+   config_fetch_patterns = git_config_get_value_multi(key.buf);
+   if (config_fetch_patterns)
+   refspec_count = 1 + config_fetch_patterns->nr;
+   fetch_patterns = xcalloc(refspec_count, sizeof(*fetch_patterns));
+   fetch_patterns[0] = value.buf;
+   if (config_fetch_patterns) {
+   struct string_list_item *fp;
+   unsigned int i = 1;
+   for_each_string_list_item(fp, config_fetch_patterns)
+   fetch_patterns[i++] = fp->string;
+   }
+   refspec = parse_fetch_refspec(refspec_count, fetch_patterns);
 
+   strbuf_reset();
strbuf_reset();
 
remote = remote_get(option_origin);
@@ -1058,7 +1077,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
refs = transport_get_remote_refs(transport);
 
if (refs) {
-   mapped_refs = 

Re: [PATCH] http: add an "auto" mode for http.emptyauth

2017-02-27 Thread Junio C Hamano
Jeff King  writes:

> The auto mode may incur an extra round-trip over setting
> http.emptyauth=true, because part of the emptyauth hack is
> to feed this blank password to curl even before we've made a
> single request.

IOW, people who care about an extra round-trip have this workaround,
which is good.

This, along with the possible security implications, may want to be
added to the documentation but that is outside the topic of this
change, and I think we would want to see such an update come from
those who actually use NTLM (or Kerberos, but they know they have
minimum security implications).

> +#ifndef LIBCURL_CAN_HANDLE_AUTH_ANY
> + /*
> +  * Our libcurl is too old to do AUTH_ANY in the first place;
> +  * just default to turning the feature off.
> +  */
> +#else
> + /*
> +  * In the automatic case, kick in the empty-auth
> +  * hack as long as we would potentially try some
> +  * method more exotic than "Basic" or "Digest".
> +  *
> +  * But only do this when this is our second or
> +  * subsequent * request, as by then we know what

I'll drop the '*' that you left while line-wrapping ;-)

> +  * methods are available.
> +  */

Thanks.  This looks good.


[PATCH v4 21/22] Documentation/config: add splitIndex.sharedIndexExpire

2017-02-27 Thread Christian Couder
Signed-off-by: Christian Couder 
---
 Documentation/config.txt | 12 
 1 file changed, 12 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 8e745bda52..0e9982c5e3 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2844,6 +2844,18 @@ splitIndex.maxPercentChange::
than 20 percent of the total number of entries.
See linkgit:git-update-index[1].
 
+splitIndex.sharedIndexExpire::
+   When the split index feature is used, shared index files that
+   were not modified since the time this variable specifies will
+   be removed when a new shared index file is created. The value
+   "now" expires all entries immediately, and "never" suppresses
+   expiration altogether.
+   The default value is "2.weeks.ago".
+   Note that a shared index file is considered modified (for the
+   purpose of expiration) each time a new split-index file is
+   created based on it.
+   See linkgit:git-update-index[1].
+
 status.relativePaths::
By default, linkgit:git-status[1] shows paths relative to the
current directory. Setting this variable to `false` shows paths
-- 
2.12.0.22.g0672473d40



[PATCH v4 15/22] read-cache: touch shared index files when used

2017-02-27 Thread Christian Couder
When a split-index file is created, let's update the mtime of the
shared index file that the split-index file is referencing.

In a following commit we will make shared index file expire
depending on their mtime, so updating the mtime makes sure that
the shared index file will not be deleted soon.

Signed-off-by: Christian Couder 
---
 read-cache.c | 29 ++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index aeb413a508..5f295af4c6 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1674,6 +1674,19 @@ int do_read_index(struct index_state *istate, const char 
*path, int must_exist)
die("index file corrupt");
 }
 
+/*
+ * Signal that the shared index is used by updating its mtime.
+ *
+ * This way, shared index can be removed if they have not been used
+ * for some time.
+ */
+static void freshen_shared_index(char *base_sha1_hex, int warn)
+{
+   const char *shared_index = git_path("sharedindex.%s", base_sha1_hex);
+   if (!check_and_freshen_file(shared_index, 1) && warn)
+   warning("Could not freshen shared index '%s'", shared_index);
+}
+
 int read_index_from(struct index_state *istate, const char *path)
 {
struct split_index *split_index;
@@ -2245,6 +2258,7 @@ static int too_many_not_shared_entries(struct index_state 
*istate)
 int write_locked_index(struct index_state *istate, struct lock_file *lock,
   unsigned flags)
 {
+   int new_shared_index, ret;
struct split_index *si = istate->split_index;
 
if (!si || alternate_index_output ||
@@ -2261,13 +2275,22 @@ int write_locked_index(struct index_state *istate, 
struct lock_file *lock,
}
if (too_many_not_shared_entries(istate))
istate->cache_changed |= SPLIT_INDEX_ORDERED;
-   if (istate->cache_changed & SPLIT_INDEX_ORDERED) {
-   int ret = write_shared_index(istate, lock, flags);
+
+   new_shared_index = istate->cache_changed & SPLIT_INDEX_ORDERED;
+
+   if (new_shared_index) {
+   ret = write_shared_index(istate, lock, flags);
if (ret)
return ret;
}
 
-   return write_split_index(istate, lock, flags);
+   ret = write_split_index(istate, lock, flags);
+
+   /* Freshen the shared index only if the split-index was written */
+   if (!ret && !new_shared_index)
+   freshen_shared_index(sha1_to_hex(si->base_sha1), 1);
+
+   return ret;
 }
 
 /*
-- 
2.12.0.22.g0672473d40



[PATCH v4 17/22] read-cache: unlink old sharedindex files

2017-02-27 Thread Christian Couder
Everytime split index is turned on, it creates a "sharedindex."
file in the git directory. This change makes sure that shared index
files that haven't been used for a long time are removed when a new
shared index file is created.

The new "splitIndex.sharedIndexExpire" config variable is created
to tell the delay after which an unused shared index file can be
deleted. It defaults to "2.weeks.ago".

A previous commit made sure that each time a split index file is
created the mtime of the shared index file it references is updated.
This makes sure that recently used shared index file will not be
deleted.

Signed-off-by: Christian Couder 
---
 read-cache.c | 64 +++-
 1 file changed, 63 insertions(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index 5f295af4c6..45fc831010 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2199,6 +2199,65 @@ static int write_split_index(struct index_state *istate,
return ret;
 }
 
+static const char *shared_index_expire = "2.weeks.ago";
+
+static unsigned long get_shared_index_expire_date(void)
+{
+   static unsigned long shared_index_expire_date;
+   static int shared_index_expire_date_prepared;
+
+   if (!shared_index_expire_date_prepared) {
+   git_config_get_expiry("splitindex.sharedindexexpire",
+ _index_expire);
+   shared_index_expire_date = approxidate(shared_index_expire);
+   shared_index_expire_date_prepared = 1;
+   }
+
+   return shared_index_expire_date;
+}
+
+static int can_delete_shared_index(const char *shared_index_path)
+{
+   struct stat st;
+   unsigned long expiration;
+
+   /* Check timestamp */
+   expiration = get_shared_index_expire_date();
+   if (!expiration)
+   return 0;
+   if (stat(shared_index_path, ))
+   return error_errno(_("could not stat '%s"), shared_index_path);
+   if (st.st_mtime > expiration)
+   return 0;
+
+   return 1;
+}
+
+static int clean_shared_index_files(const char *current_hex)
+{
+   struct dirent *de;
+   DIR *dir = opendir(get_git_dir());
+
+   if (!dir)
+   return error_errno(_("unable to open git dir: %s"), 
get_git_dir());
+
+   while ((de = readdir(dir)) != NULL) {
+   const char *sha1_hex;
+   const char *shared_index_path;
+   if (!skip_prefix(de->d_name, "sharedindex.", _hex))
+   continue;
+   if (!strcmp(sha1_hex, current_hex))
+   continue;
+   shared_index_path = git_path("%s", de->d_name);
+   if (can_delete_shared_index(shared_index_path) > 0 &&
+   unlink(shared_index_path))
+   error_errno(_("unable to unlink: %s"), 
shared_index_path);
+   }
+   closedir(dir);
+
+   return 0;
+}
+
 static struct tempfile temporary_sharedindex;
 
 static int write_shared_index(struct index_state *istate,
@@ -2220,8 +2279,11 @@ static int write_shared_index(struct index_state *istate,
}
ret = rename_tempfile(_sharedindex,
  git_path("sharedindex.%s", 
sha1_to_hex(si->base->sha1)));
-   if (!ret)
+   if (!ret) {
hashcpy(si->base_sha1, si->base->sha1);
+   clean_shared_index_files(sha1_to_hex(si->base->sha1));
+   }
+
return ret;
 }
 
-- 
2.12.0.22.g0672473d40



Re: [PATCH 0/5] recursing submodules with relative pathspec (grep and ls-files)

2017-02-27 Thread Brandon Williams
On 02/24, Brandon Williams wrote:
> It was discovered that when using the --recurse-submodules flag with `git 
> grep`
> and `git ls-files` and specifying a relative path when not at the root causes
> the child processes spawned to error out with an error like:
> 
> fatal: ..: '..' is outside repository
> 
> While true that ".." is outside the scope of the submodule repository, it
> probably doesn't make much sense to the user who gave that pathspec with
> respect to the superproject.  Since the child processes that are spawned to
> handle the submodules have some context that they are executing underneath a
> superproject (via the 'super_prefix'), they should be able to prevent dying
> under this circumstance.
> 
> This series fixes this bug in both git grep and git ls-files as well as
> correctly formatting the output from submodules to handle the relative paths
> with "..".
> 
> One of the changes made to fix this was to add an additional flag for the
> parse_pathspec() function in order to treat all paths provided as being from
> the root of the repository.  I hesitantly selected the name 
> 'PATHSPEC_FROMROOT'
> but I'm not fond of it since its too similar to the pathspec magic define
> 'PATHSPEC_FROMTOP'.  So I'm open for naming suggestions.
> 
> Brandon Williams (5):
>   grep: illustrate bug when recursing with relative pathspec
>   pathspec: add PATHSPEC_FROMROOT flag
>   grep: fix bug when recuring with relative pathspec
>   ls-files: illustrate bug when recursing with relative pathspec
>   ls-files: fix bug when recuring with relative pathspec
> 
>  builtin/grep.c |  8 --
>  builtin/ls-files.c |  8 --
>  pathspec.c |  2 +-
>  pathspec.h |  2 ++
>  t/t3007-ls-files-recurse-submodules.sh | 50 
> ++
>  t/t7814-grep-recurse-submodules.sh | 42 
>  6 files changed, 107 insertions(+), 5 deletions(-)
> 

Turns out that this series doesn't address all of the issues.  This
series also seems to introduce broken behavior when recursing from a
subdirectory.  So I need to think about this problem a little bit more
and reroll.

-- 
Brandon Williams


Re: [PATCH] gitweb tests: Skip tests when we don't have Time::HiRes

2017-02-27 Thread Ævar Arnfjörð Bjarmason
On Mon, Feb 27, 2017 at 6:48 PM, Jakub Narębski  wrote:
> W dniu 27.02.2017 o 13:37, Ævar Arnfjörð Bjarmason pisze:
>> Change the gitweb tests to skip when we can't load the Time::HiRes
>> module.
>
> Could you tell us in the commit message why this module is needed?
> Is it because gitweb loads it unconditionally, or does that at least
> in the default configuration, or is it used in tests, or...?
>
> [I see it is somewhat addressed below]
>
>>
>> This module has bee in perl core since v5.8, which is the oldest
>
> s/bee/been/

I'll clarify that in a re-roll & fix the typo, pending any other
comments. Thanks!

>> version we support, however CentOS (and perhaps some other
>> distributions) carve it into its own non-core-perl package that's not
>> installed along with /usr/bin/perl by default. Without this we'll hard
>> fail the gitweb tests when trying to load the module.
>
> I see that it because gitweb.perl as the following at line 20:
>
> use Time::HiRes qw(gettimeofday tv_interval);
>
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason 
>
> Good catch (if a strange one...).

This and the associated cvs tests failing as root patch I submitted
were discovered when trying to build git in the standard mock build
environment on CentOS. It creates a chroot and rpm installs just the
packages you declare, so issues like these crop up.

>> ---
>>  t/gitweb-lib.sh | 5 +
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/t/gitweb-lib.sh b/t/gitweb-lib.sh
>> index d5dab5a94f..116c3890e6 100644
>> --- a/t/gitweb-lib.sh
>> +++ b/t/gitweb-lib.sh
>> @@ -114,4 +114,9 @@ perl -MCGI -MCGI::Util -MCGI::Carp -e 0 >/dev/null 2>&1 
>> || {
>>   test_done
>>  }
>>
>> +perl -mTime::HiRes -e 0  >/dev/null 2>&1 || {
>> + skip_all='skipping gitweb tests, Time::HiRes module unusable'
>
> Is "unusable" a good description, instead of "not found"?

Yeah it's odd, but I just copied the several lines above that use that phrasing.

>> + test_done
>> +}
>> +
>>  gitweb_init
>>
>


[PATCH v4 13/22] Documentation/config: add splitIndex.maxPercentChange

2017-02-27 Thread Christian Couder
Signed-off-by: Christian Couder 
---
 Documentation/config.txt | 13 +
 1 file changed, 13 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 61a863adeb..8e745bda52 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2831,6 +2831,19 @@ showbranch.default::
The default set of branches for linkgit:git-show-branch[1].
See linkgit:git-show-branch[1].
 
+splitIndex.maxPercentChange::
+   When the split index feature is used, this specifies the
+   percent of entries the split index can contain compared to the
+   total number of entries in both the split index and the shared
+   index before a new shared index is written.
+   The value should be between 0 and 100. If the value is 0 then
+   a new shared index is always written, if it is 100 a new
+   shared index is never written.
+   By default the value is 20, so a new shared index is written
+   if the number of entries in the split index would be greater
+   than 20 percent of the total number of entries.
+   See linkgit:git-update-index[1].
+
 status.relativePaths::
By default, linkgit:git-status[1] shows paths relative to the
current directory. Setting this variable to `false` shows paths
-- 
2.12.0.22.g0672473d40



Re: [PATCH v5 1/1] config: add conditional include

2017-02-27 Thread Junio C Hamano
Jeff King  writes:

> I don't think driving that with a two-entry table is the right thing
> here. We are as likely to add another "foobar:" entry as we are to add
> another modifier "/i" modifier to "gitdir:", and it is unclear whether
> that modifier would be mutually exclusive with "/i".

OK, I didn't take /i as something that was meant as a modifier; I
took the "gitdir:" and "gitdir/i:" are totally different tests that
are spelled similarly, but for the implementation expediency, called
into a single helper function without having a layer that presents
the same function signature in the middle to make it drivable by a
table.

Let's leave it to the review of a future patch that wants to add a
third condition then.  At that time, we will have more things to
look at to make a better decision.





Re: [PATCH 10/10] submodule--helper clone: check for configured submodules using helper

2017-02-27 Thread Brandon Williams
On 02/23, Stefan Beller wrote:
> On Thu, Feb 23, 2017 at 3:47 PM, Brandon Williams  wrote:
> 
> > @@ -795,14 +794,11 @@ static int prepare_to_clone_next_submodule(const 
> > struct cache_entry *ce,
> > }
> >
> > /*
> > -* Looking up the url in .git/config.
> > +* Check if the submodule has been initialized.
> >  * We must not fall back to .gitmodules as we only want
> >  * to process configured submodules.
> 
> This sentence "we must not ..." is also no longer accurate,
> as we do exactly that when using sub->url instead of just url
> below.

You're right, I'll fix that.

-- 
Brandon Williams


[PATCH v4 18/22] t1700: test shared index file expiration

2017-02-27 Thread Christian Couder
Signed-off-by: Christian Couder 
---
 t/t1700-split-index.sh | 44 
 1 file changed, 44 insertions(+)

diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index 21f43903f8..480d3a8dc3 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -310,4 +310,48 @@ test_expect_success 'check splitIndex.maxPercentChange set 
to 0' '
test_cmp expect actual
 '
 
+test_expect_success 'shared index files expire after 2 weeks by default' '
+   : >ten &&
+   git update-index --add ten &&
+   test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+   just_under_2_weeks_ago=$((5-14*86400)) &&
+   test-chmtime =$just_under_2_weeks_ago .git/sharedindex.* &&
+   : >eleven &&
+   git update-index --add eleven &&
+   test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+   just_over_2_weeks_ago=$((-1-14*86400)) &&
+   test-chmtime =$just_over_2_weeks_ago .git/sharedindex.* &&
+   : >twelve &&
+   git update-index --add twelve &&
+   test $(ls .git/sharedindex.* | wc -l) = 1
+'
+
+test_expect_success 'check splitIndex.sharedIndexExpire set to 16 days' '
+   git config splitIndex.sharedIndexExpire "16.days.ago" &&
+   test-chmtime =$just_over_2_weeks_ago .git/sharedindex.* &&
+   : >thirteen &&
+   git update-index --add thirteen &&
+   test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+   just_over_16_days_ago=$((-1-16*86400)) &&
+   test-chmtime =$just_over_16_days_ago .git/sharedindex.* &&
+   : >fourteen &&
+   git update-index --add fourteen &&
+   test $(ls .git/sharedindex.* | wc -l) = 1
+'
+
+test_expect_success 'check splitIndex.sharedIndexExpire set to "never" and 
"now"' '
+   git config splitIndex.sharedIndexExpire never &&
+   just_10_years_ago=$((-365*10*86400)) &&
+   test-chmtime =$just_10_years_ago .git/sharedindex.* &&
+   : >fifteen &&
+   git update-index --add fifteen &&
+   test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+   git config splitIndex.sharedIndexExpire now &&
+   just_1_second_ago=-1 &&
+   test-chmtime =$just_1_second_ago .git/sharedindex.* &&
+   : >sixteen &&
+   git update-index --add sixteen &&
+   test $(ls .git/sharedindex.* | wc -l) = 1
+'
+
 test_done
-- 
2.12.0.22.g0672473d40



Re: [PATCH] strbuf: add strbuf_add_real_path()

2017-02-27 Thread Brandon Williams
On 02/25, René Scharfe wrote:
> Add a function for appending the canonized absolute pathname of a given
> path to a strbuf.  It keeps the existing contents intact, as expected of
> a function of the strbuf_add() family, while avoiding copying the result
> if the given strbuf is empty.  It's more consistent with the rest of the
> strbuf API than strbuf_realpath(), which it's wrapping.
> 
> Also add a semantic patch demonstrating its intended usage and apply it
> to the current tree.  Using strbuf_add_real_path() instead of calling
> strbuf_addstr() and real_path() avoids an extra copy to a static buffer.
> 

Seems like a reasonable thing to do.  When I wrote strbuf_realpath() I
think I looked at the strbuf_getcwd() function for what it did (since it
handled paths) and it simply uses the provided buffer disregarding what
is already stored in it.

> Signed-off-by: Rene Scharfe 
> ---
>  contrib/coccinelle/strbuf.cocci |  6 ++
>  setup.c |  2 +-
>  strbuf.c| 11 +++
>  strbuf.h| 14 ++
>  4 files changed, 32 insertions(+), 1 deletion(-)
> 
> diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci
> index 63995f22ff..1d580e49b0 100644
> --- a/contrib/coccinelle/strbuf.cocci
> +++ b/contrib/coccinelle/strbuf.cocci
> @@ -38,3 +38,9 @@ expression E1, E2, E3;
>  @@
>  - strbuf_addstr(E1, find_unique_abbrev(E2, E3));
>  + strbuf_add_unique_abbrev(E1, E2, E3);
> +
> +@@
> +expression E1, E2;
> +@@
> +- strbuf_addstr(E1, real_path(E2));
> ++ strbuf_add_real_path(E1, E2);
> diff --git a/setup.c b/setup.c
> index 967f289f1e..f14cbcd338 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -254,7 +254,7 @@ int get_common_dir_noenv(struct strbuf *sb, const char 
> *gitdir)
>   if (!is_absolute_path(data.buf))
>   strbuf_addf(, "%s/", gitdir);
>   strbuf_addbuf(, );
> - strbuf_addstr(sb, real_path(path.buf));
> + strbuf_add_real_path(sb, path.buf);
>   ret = 1;
>   } else {
>   strbuf_addstr(sb, gitdir);
> diff --git a/strbuf.c b/strbuf.cq
> index 8fec6579f7..ace58e7367 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -707,6 +707,17 @@ void strbuf_add_absolute_path(struct strbuf *sb, const 
> char *path)
>   strbuf_addstr(sb, path);
>  }
>  
> +void strbuf_add_real_path(struct strbuf *sb, const char *path)
> +{
> + if (sb->len) {
> + struct strbuf resolved = STRBUF_INIT;
> + strbuf_realpath(, path, 1);
> + strbuf_addbuf(sb, );
> + strbuf_release();
> + } else
> + strbuf_realpath(sb, path, 1);

I know its not required but I would have braces on the 'else' branch
since they were needed on the 'if' branch.  But that's up to you and
your style :)

> +}
> +
>  int printf_ln(const char *fmt, ...)
>  {
>   int ret;
> diff --git a/strbuf.h b/strbuf.h
> index cf1b5409e7..cf8e4bf532 100644
> --- a/strbuf.h
> +++ b/strbuf.h
> @@ -441,6 +441,20 @@ extern int strbuf_getcwd(struct strbuf *sb);
>   */
>  extern void strbuf_add_absolute_path(struct strbuf *sb, const char *path);
>  
> +/**
> + * Canonize `path` (make it absolute, resolve symlinks, remove extra
> + * slashes) and append it to `sb`.  Die with an informative error
> + * message if there is a problem.
> + *
> + * The directory part of `path` (i.e., everything up to the last
> + * dir_sep) must denote a valid, existing directory, but the last
> + * component need not exist.
> + *
> + * Callers that don't mind links should use the more lightweight
> + * strbuf_add_absolute_path() instead.
> + */
> +extern void strbuf_add_real_path(struct strbuf *sb, const char *path);
> +
>  
>  /**
>   * Normalize in-place the path contained in the strbuf. See
> -- 
> 2.12.0
> 

-- 
Brandon Williams


Re: [PATCH 1/5] grep: illustrate bug when recursing with relative pathspec

2017-02-27 Thread Brandon Williams
On 02/26, Duy Nguyen wrote:
> On Sat, Feb 25, 2017 at 6:50 AM, Brandon Williams  wrote:
> > When using the --recurse-submodules flag with a relative pathspec which
> > includes "..", an error is produced inside the child process spawned for
> > a submodule.  When creating the pathspec struct in the child, the ".."
> > is interpreted to mean "go up a directory" which causes an error stating
> > that the path ".." is outside of the repository.
> >
> > While it is true that ".." is outside the scope of the submodule, it is
> > confusing to a user who originally invoked the command where ".." was
> > indeed still inside the scope of the superproject.  Since the child
> > process luanched for the submodule has some context that it is operating
> 
> s/luanched/launched/
> 
> I would prefer 1/5 t to be merged with 3/5 though. The problem

We can definitely merge them and I agree that it may be easier for
looking through the logs if they are merged.

> description is very light there, and the test demonstration in the
> diff is simply switching from failure to success, which forces the
> reader to come back here. It's easier to find here now, but it'll be a
> bit harder when it enters master and we have to read it from git-log,
> I think.
> 
> I'm still munching through the super-prefix patches. From how you
> changed match_pathspec call in 0281e487fd (grep: optionally recurse
> into submodules - 2016-12-16), I guess pathspecs should be handled
> with super-prefix instead of the submodule's prefix (which is empty
> anyway, I guess). The right solution wrt. handling relative paths may
> be teach pathspec about super-prefix (and even original super's cwd)
> then let it processes path in supermodule's context.
> 
> Does it handle relative paths with wildcards correctly btw? Ones that
> cross submodules? I have a feeling it doesn't, but I haven't seen how
> exactly super-prefix works yet.

I'm not 100% sure about the relative paths with wildcards.  I did notice
that this series solves one problem and introduces another (recursing
from a subdirectory doesn't work with this series) so I need to
rethink the solution a little bit.  And I'll take into account wildcards
as well.

> 
> There's another problem with passing pathspec from one process to
> another. The issue with preserving the prefix, see 233c3e6c59
> (parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN -
> 2013-07-14). :(icase) needs this because given a path
> "/foobar", only the "foobar" part is considered case
> insensitive, the prefix part is always case-sensitive. For example, if
> you have 4 paths "abc/def", "abc/DEF", "ABC/def" and "ABC/DEF" and are
> standing at "abc", you would want ":(icase)def" to match the first two
> only, not all of them.

Hmm...yeah its a really difficult thing to get 100% correct (as I'm now
noticing).  It also makes it a little more challenging because you don't
have the same state in both processes (child and parent).  I'm thinking
I may have to a little bit more work, which is more involved, to
properly handle the pathspecs passed to the children.  As in we may not
be able to pass the raw pathspec used in the parent to the child and
instead may have to do a little pre-processing on it.

> > underneath a superproject, this error could be avoided.
> >
> > Signed-off-by: Brandon Williams 
> > ---
> >  t/t7814-grep-recurse-submodules.sh | 42 
> > ++
> >  1 file changed, 42 insertions(+)
> >
> > diff --git a/t/t7814-grep-recurse-submodules.sh 
> > b/t/t7814-grep-recurse-submodules.sh
> > index 67247a01d..418ba68fe 100755
> > --- a/t/t7814-grep-recurse-submodules.sh
> > +++ b/t/t7814-grep-recurse-submodules.sh
> > @@ -227,6 +227,48 @@ test_expect_success 'grep history with moved 
> > submoules' '
> > test_cmp expect actual
> >  '
> >
> > +test_expect_failure 'grep using relative path' '
> > +   test_when_finished "rm -rf parent sub" &&
> > +   git init sub &&
> > +   echo "foobar" >sub/file &&
> > +   git -C sub add file &&
> > +   git -C sub commit -m "add file" &&
> > +
> > +   git init parent &&
> > +   echo "foobar" >parent/file &&
> > +   git -C parent add file &&
> > +   mkdir parent/src &&
> > +   echo "foobar" >parent/src/file2 &&
> > +   git -C parent add src/file2 &&
> > +   git -C parent submodule add ../sub &&
> > +   git -C parent commit -m "add files and submodule" &&
> > +
> > +   # From top works
> > +   cat >expect <<-\EOF &&
> > +   file:foobar
> > +   src/file2:foobar
> > +   sub/file:foobar
> > +   EOF
> > +   git -C parent grep --recurse-submodules -e "foobar" >actual &&
> > +   test_cmp expect actual &&
> > +
> > +   # Relative path to top errors out
> 
> After 3/5, it's not "errors out" any more, is it?
> 

Correct, will fix the comment.

> > +   cat >expect <<-\EOF &&
> > +   ../file:foobar
> > +   file2:foobar
> 

[PATCH v4 20/22] read-cache: use freshen_shared_index() in read_index_from()

2017-02-27 Thread Christian Couder
This way a share index file will not be garbage collected if
we still read from an index it is based from.

As we need to read the current index before creating a new
one, the tests have to be adjusted, so that we don't expect
an old shared index file to be deleted right away when we
create a new one.

Signed-off-by: Christian Couder 
---
 read-cache.c   |  1 +
 t/t1700-split-index.sh | 14 +++---
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 3ea20701a3..d1152161f6 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1719,6 +1719,7 @@ int read_index_from(struct index_state *istate, const 
char *path)
base_sha1_hex, base_path,
sha1_to_hex(split_index->base->sha1));
 
+   freshen_shared_index(base_sha1_hex, 0);
merge_base_index(istate);
post_read_index_from(istate);
return ret;
diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index 480d3a8dc3..ea1aeacff0 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -313,17 +313,17 @@ test_expect_success 'check splitIndex.maxPercentChange 
set to 0' '
 test_expect_success 'shared index files expire after 2 weeks by default' '
: >ten &&
git update-index --add ten &&
-   test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+   test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
just_under_2_weeks_ago=$((5-14*86400)) &&
test-chmtime =$just_under_2_weeks_ago .git/sharedindex.* &&
: >eleven &&
git update-index --add eleven &&
-   test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+   test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
just_over_2_weeks_ago=$((-1-14*86400)) &&
test-chmtime =$just_over_2_weeks_ago .git/sharedindex.* &&
: >twelve &&
git update-index --add twelve &&
-   test $(ls .git/sharedindex.* | wc -l) = 1
+   test $(ls .git/sharedindex.* | wc -l) -le 2
 '
 
 test_expect_success 'check splitIndex.sharedIndexExpire set to 16 days' '
@@ -331,12 +331,12 @@ test_expect_success 'check splitIndex.sharedIndexExpire 
set to 16 days' '
test-chmtime =$just_over_2_weeks_ago .git/sharedindex.* &&
: >thirteen &&
git update-index --add thirteen &&
-   test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+   test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
just_over_16_days_ago=$((-1-16*86400)) &&
test-chmtime =$just_over_16_days_ago .git/sharedindex.* &&
: >fourteen &&
git update-index --add fourteen &&
-   test $(ls .git/sharedindex.* | wc -l) = 1
+   test $(ls .git/sharedindex.* | wc -l) -le 2
 '
 
 test_expect_success 'check splitIndex.sharedIndexExpire set to "never" and 
"now"' '
@@ -345,13 +345,13 @@ test_expect_success 'check splitIndex.sharedIndexExpire 
set to "never" and "now"
test-chmtime =$just_10_years_ago .git/sharedindex.* &&
: >fifteen &&
git update-index --add fifteen &&
-   test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+   test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
git config splitIndex.sharedIndexExpire now &&
just_1_second_ago=-1 &&
test-chmtime =$just_1_second_ago .git/sharedindex.* &&
: >sixteen &&
git update-index --add sixteen &&
-   test $(ls .git/sharedindex.* | wc -l) = 1
+   test $(ls .git/sharedindex.* | wc -l) -le 2
 '
 
 test_done
-- 
2.12.0.22.g0672473d40



[PATCH v4 16/22] config: add git_config_get_expiry() from gc.c

2017-02-27 Thread Christian Couder
This function will be used in a following commit to get the expiration
time of the shared index files from the config, and it is generic
enough to be put in "config.c".

Signed-off-by: Christian Couder 
---
 builtin/gc.c | 15 ++-
 cache.h  |  3 +++
 config.c | 13 +
 3 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 331f219260..66dff6a8af 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -62,17 +62,6 @@ static void report_pack_garbage(unsigned seen_bits, const 
char *path)
string_list_append(_garbage, path);
 }
 
-static void git_config_date_string(const char *key, const char **output)
-{
-   if (git_config_get_string_const(key, output))
-   return;
-   if (strcmp(*output, "now")) {
-   unsigned long now = approxidate("now");
-   if (approxidate(*output) >= now)
-   git_die_config(key, _("Invalid %s: '%s'"), key, 
*output);
-   }
-}
-
 static void process_log_file(void)
 {
struct stat st;
@@ -111,8 +100,8 @@ static void gc_config(void)
git_config_get_int("gc.auto", _auto_threshold);
git_config_get_int("gc.autopacklimit", _auto_pack_limit);
git_config_get_bool("gc.autodetach", _auto);
-   git_config_date_string("gc.pruneexpire", _expire);
-   git_config_date_string("gc.worktreepruneexpire", 
_worktrees_expire);
+   git_config_get_expiry("gc.pruneexpire", _expire);
+   git_config_get_expiry("gc.worktreepruneexpire", 
_worktrees_expire);
git_config(git_default_config, NULL);
 }
 
diff --git a/cache.h b/cache.h
index 6b25b50aab..8994e7d373 100644
--- a/cache.h
+++ b/cache.h
@@ -1888,6 +1888,9 @@ extern int git_config_get_untracked_cache(void);
 extern int git_config_get_split_index(void);
 extern int git_config_get_max_percent_split_change(void);
 
+/* This dies if the configured or default date is in the future */
+extern int git_config_get_expiry(const char *key, const char **output);
+
 /*
  * This is a hack for test programs like test-dump-untracked-cache to
  * ensure that they do not modify the untracked cache when reading it.
diff --git a/config.c b/config.c
index 35b6f02960..f20d7d88f7 100644
--- a/config.c
+++ b/config.c
@@ -1712,6 +1712,19 @@ int git_config_get_pathname(const char *key, const char 
**dest)
return ret;
 }
 
+int git_config_get_expiry(const char *key, const char **output)
+{
+   int ret = git_config_get_string_const(key, output);
+   if (ret)
+   return ret;
+   if (strcmp(*output, "now")) {
+   unsigned long now = approxidate("now");
+   if (approxidate(*output) >= now)
+   git_die_config(key, _("Invalid %s: '%s'"), key, 
*output);
+   }
+   return ret;
+}
+
 int git_config_get_untracked_cache(void)
 {
int val = -1;
-- 
2.12.0.22.g0672473d40



[PATCH v4 22/22] Documentation/git-update-index: explain splitIndex.*

2017-02-27 Thread Christian Couder
Signed-off-by: Christian Couder 
---
 Documentation/config.txt   |  2 +-
 Documentation/git-update-index.txt | 37 +
 2 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 0e9982c5e3..2ccb053d92 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2853,7 +2853,7 @@ splitIndex.sharedIndexExpire::
The default value is "2.weeks.ago".
Note that a shared index file is considered modified (for the
purpose of expiration) each time a new split-index file is
-   created based on it.
+   either created based on it or read from it.
See linkgit:git-update-index[1].
 
 status.relativePaths::
diff --git a/Documentation/git-update-index.txt 
b/Documentation/git-update-index.txt
index e091b2a409..1579abf3c3 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -163,14 +163,10 @@ may not support it yet.
 
 --split-index::
 --no-split-index::
-   Enable or disable split index mode. If enabled, the index is
-   split into two files, $GIT_DIR/index and $GIT_DIR/sharedindex..
-   Changes are accumulated in $GIT_DIR/index while the shared
-   index file contains all index entries stays unchanged. If
-   split-index mode is already enabled and `--split-index` is
-   given again, all changes in $GIT_DIR/index are pushed back to
-   the shared index file. This mode is designed for very large
-   indexes that take a significant amount of time to read or write.
+   Enable or disable split index mode. If split-index mode is
+   already enabled and `--split-index` is given again, all
+   changes in $GIT_DIR/index are pushed back to the shared index
+   file.
 +
 These options take effect whatever the value of the `core.splitIndex`
 configuration variable (see linkgit:git-config[1]). But a warning is
@@ -394,6 +390,31 @@ Although this bit looks similar to assume-unchanged bit, 
its goal is
 different from assume-unchanged bit's. Skip-worktree also takes
 precedence over assume-unchanged bit when both are set.
 
+Split index
+---
+
+This mode is designed for repositories with very large indexes, and
+aims at reducing the time it takes to repeatedly write these indexes.
+
+In this mode, the index is split into two files, $GIT_DIR/index and
+$GIT_DIR/sharedindex.. Changes are accumulated in
+$GIT_DIR/index, the split index, while the shared index file contains
+all index entries and stays unchanged.
+
+All changes in the split index are pushed back to the shared index
+file when the number of entries in the split index reaches a level
+specified by the splitIndex.maxPercentChange config variable (see
+linkgit:git-config[1]).
+
+Each time a new shared index file is created, the old shared index
+files are deleted if their modification time is older than what is
+specified by the splitIndex.sharedIndexExpire config variable (see
+linkgit:git-config[1]).
+
+To avoid deleting a shared index file that is still used, its
+modification time is updated to the current time everytime a new split
+index based on the shared index file is either created or read from.
+
 Untracked cache
 ---
 
-- 
2.12.0.22.g0672473d40



[PATCH v4 19/22] read-cache: refactor read_index_from()

2017-02-27 Thread Christian Couder
It looks better and is simpler to review when we don't compute
the same things many times in the function.

It will also help make the following commit simpler.

Signed-off-by: Christian Couder 
---
 read-cache.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 45fc831010..3ea20701a3 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1691,6 +1691,8 @@ int read_index_from(struct index_state *istate, const 
char *path)
 {
struct split_index *split_index;
int ret;
+   char *base_sha1_hex;
+   const char *base_path;
 
/* istate->initialized covers both .git/index and .git/sharedindex.xxx 
*/
if (istate->initialized)
@@ -1708,15 +1710,15 @@ int read_index_from(struct index_state *istate, const 
char *path)
discard_index(split_index->base);
else
split_index->base = xcalloc(1, sizeof(*split_index->base));
-   ret = do_read_index(split_index->base,
-   git_path("sharedindex.%s",
-sha1_to_hex(split_index->base_sha1)), 1);
+
+   base_sha1_hex = sha1_to_hex(split_index->base_sha1);
+   base_path = git_path("sharedindex.%s", base_sha1_hex);
+   ret = do_read_index(split_index->base, base_path, 1);
if (hashcmp(split_index->base_sha1, split_index->base->sha1))
die("broken index, expect %s in %s, got %s",
-   sha1_to_hex(split_index->base_sha1),
-   git_path("sharedindex.%s",
-sha1_to_hex(split_index->base_sha1)),
+   base_sha1_hex, base_path,
sha1_to_hex(split_index->base->sha1));
+
merge_base_index(istate);
post_read_index_from(istate);
return ret;
-- 
2.12.0.22.g0672473d40



[PATCH v4 14/22] sha1_file: make check_and_freshen_file() non static

2017-02-27 Thread Christian Couder
This function will be used in a commit soon, so let's make
it available globally.

Signed-off-by: Christian Couder 
---
 cache.h | 3 +++
 sha1_file.c | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index 955e80913e..6b25b50aab 100644
--- a/cache.h
+++ b/cache.h
@@ -1229,6 +1229,9 @@ extern int has_pack_index(const unsigned char *sha1);
 
 extern void assert_sha1_type(const unsigned char *sha1, enum object_type 
expect);
 
+/* Helper to check and "touch" a file */
+extern int check_and_freshen_file(const char *fn, int freshen);
+
 extern const signed char hexval_table[256];
 static inline unsigned int hexval(unsigned char c)
 {
diff --git a/sha1_file.c b/sha1_file.c
index ec957db5e1..192073ea95 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -593,7 +593,7 @@ static int freshen_file(const char *fn)
  * either does not exist on disk, or has a stale mtime and may be subject to
  * pruning).
  */
-static int check_and_freshen_file(const char *fn, int freshen)
+int check_and_freshen_file(const char *fn, int freshen)
 {
if (access(fn, F_OK))
return 0;
-- 
2.12.0.22.g0672473d40



[PATCH v4 12/22] t1700: add tests for splitIndex.maxPercentChange

2017-02-27 Thread Christian Couder
Signed-off-by: Christian Couder 
---
 t/t1700-split-index.sh | 72 ++
 1 file changed, 72 insertions(+)

diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index df19b812fd..21f43903f8 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -238,4 +238,76 @@ test_expect_success 'set core.splitIndex config variable 
to false' '
test_cmp expect actual
 '
 
+test_expect_success 'set core.splitIndex config variable to true' '
+   git config core.splitIndex true &&
+   : >three &&
+   git update-index --add three &&
+   BASE=$(test-dump-split-index .git/index | grep "^base") &&
+   test-dump-split-index .git/index | sed "/^own/d" >actual &&
+   cat >expect <<-EOF &&
+   $BASE
+   replacements:
+   deletions:
+   EOF
+   test_cmp expect actual &&
+   : >four &&
+   git update-index --add four &&
+   test-dump-split-index .git/index | sed "/^own/d" >actual &&
+   cat >expect <<-EOF &&
+   $BASE
+   100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0   four
+   replacements:
+   deletions:
+   EOF
+   test_cmp expect actual
+'
+
+test_expect_success 'check behavior with splitIndex.maxPercentChange unset' '
+   git config --unset splitIndex.maxPercentChange &&
+   : >five &&
+   git update-index --add five &&
+   BASE=$(test-dump-split-index .git/index | grep "^base") &&
+   test-dump-split-index .git/index | sed "/^own/d" >actual &&
+   cat >expect <<-EOF &&
+   $BASE
+   replacements:
+   deletions:
+   EOF
+   test_cmp expect actual &&
+   : >six &&
+   git update-index --add six &&
+   test-dump-split-index .git/index | sed "/^own/d" >actual &&
+   cat >expect <<-EOF &&
+   $BASE
+   100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0   six
+   replacements:
+   deletions:
+   EOF
+   test_cmp expect actual
+'
+
+test_expect_success 'check splitIndex.maxPercentChange set to 0' '
+   git config splitIndex.maxPercentChange 0 &&
+   : >seven &&
+   git update-index --add seven &&
+   BASE=$(test-dump-split-index .git/index | grep "^base") &&
+   test-dump-split-index .git/index | sed "/^own/d" >actual &&
+   cat >expect <<-EOF &&
+   $BASE
+   replacements:
+   deletions:
+   EOF
+   test_cmp expect actual &&
+   : >eight &&
+   git update-index --add eight &&
+   BASE=$(test-dump-split-index .git/index | grep "^base") &&
+   test-dump-split-index .git/index | sed "/^own/d" >actual &&
+   cat >expect <<-EOF &&
+   $BASE
+   replacements:
+   deletions:
+   EOF
+   test_cmp expect actual
+'
+
 test_done
-- 
2.12.0.22.g0672473d40



[PATCH v4 11/22] read-cache: regenerate shared index if necessary

2017-02-27 Thread Christian Couder
When writing a new split-index and there is a big number of cache
entries in the split-index compared to the shared index, it is a
good idea to regenerate the shared index.

By default when the ratio reaches 20%, we will push back all
the entries from the split-index into a new shared index file
instead of just creating a new split-index file.

The threshold can be configured using the
"splitIndex.maxPercentChange" config variable.

We need to adjust the existing tests in t1700 by setting
"splitIndex.maxPercentChange" to 100 at the beginning of t1700,
as the existing tests are assuming that the shared index is
regenerated only when `git update-index --split-index` is used.

Signed-off-by: Christian Couder 
---
 read-cache.c   | 32 
 t/t1700-split-index.sh |  1 +
 2 files changed, 33 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index 99bc274b8d..aeb413a508 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2212,6 +2212,36 @@ static int write_shared_index(struct index_state *istate,
return ret;
 }
 
+static const int default_max_percent_split_change = 20;
+
+static int too_many_not_shared_entries(struct index_state *istate)
+{
+   int i, not_shared = 0;
+   int max_split = git_config_get_max_percent_split_change();
+
+   switch (max_split) {
+   case -1:
+   /* not or badly configured: use the default value */
+   max_split = default_max_percent_split_change;
+   break;
+   case 0:
+   return 1; /* 0% means always write a new shared index */
+   case 100:
+   return 0; /* 100% means never write a new shared index */
+   default:
+   break; /* just use the configured value */
+   }
+
+   /* Count not shared entries */
+   for (i = 0; i < istate->cache_nr; i++) {
+   struct cache_entry *ce = istate->cache[i];
+   if (!ce->index)
+   not_shared++;
+   }
+
+   return (int64_t)istate->cache_nr * max_split < (int64_t)not_shared * 
100;
+}
+
 int write_locked_index(struct index_state *istate, struct lock_file *lock,
   unsigned flags)
 {
@@ -2229,6 +2259,8 @@ int write_locked_index(struct index_state *istate, struct 
lock_file *lock,
if ((v & 15) < 6)
istate->cache_changed |= SPLIT_INDEX_ORDERED;
}
+   if (too_many_not_shared_entries(istate))
+   istate->cache_changed |= SPLIT_INDEX_ORDERED;
if (istate->cache_changed & SPLIT_INDEX_ORDERED) {
int ret = write_shared_index(istate, lock, flags);
if (ret)
diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index 1659986d8d..df19b812fd 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -8,6 +8,7 @@ test_description='split index mode tests'
 sane_unset GIT_TEST_SPLIT_INDEX
 
 test_expect_success 'enable split index' '
+   git config splitIndex.maxPercentChange 100 &&
git update-index --split-index &&
test-dump-split-index .git/index >actual &&
indexversion=$(test-index-version <.git/index) &&
-- 
2.12.0.22.g0672473d40



[PATCH v4 00/22] Add configuration options for split-index

2017-02-27 Thread Christian Couder
Goal


We want to make it possible to use the split-index feature
automatically by just setting a new "core.splitIndex" configuration
variable to true.

This can be valuable as split-index can help significantly speed up
`git rebase` especially along with the work to libify `git apply`
that has been merged to master
(see https://github.com/git/git/commit/81358dc238372793b1590efa149cc1581d1fbd98)
and is now in v2.11.

Design
~~

The design is similar as the previous work that introduced
"core.untrackedCache". 

The new "core.splitIndex" configuration option can be either true,
false or undefined which is the default.

When it is true, the split index is created, if it does not already
exists, when the index is read. When it is false, the split index is
removed if it exists, when the index is read. Otherwise it is left as
is.

Along with this new configuration variable, the two following options
are also introduced:

- splitIndex.maxPercentChange

This is to avoid having too many changes accumulating in the split
index while in split index mode. The git-update-index
documentation says:

If split-index mode is already enabled and `--split-index` is
given again, all changes in $GIT_DIR/index are pushed back to
the shared index file.

but it is probably better to not expect the user to think about it
and to have a mechanism that pushes back all changes to the shared
index file automatically when some threshold is reached.

The default threshold is when the number of entries in the split
index file reaches 20% of the number of entries in the shared
index file. The new "splitIndex.maxPercentChange" config option
lets people tweak this value.

- splitIndex.sharedIndexExpire

To make sure that old sharedindex files are eventually removed
when a new one has been created, we "touch" the shared index file
every time a split index file using the shared index file is
either created or read from. Then we can delete shared indexes
with an mtime older than one week (by default), when we create a
new shared index file. The new "splitIndex.sharedIndexExpire"
config option lets people tweak this grace period.

This idea was suggested by Duy in:


https://public-inbox.org/git/cacsjy8bqmfashf5kjguh+bd7xg98cafnyde964vjypxz-em...@mail.gmail.com/

and after some experiments, I agree that it is much simpler than
what I thought could be done during our discussion.

Junio also thinks that we have to do "time-based GC" in:
 
https://public-inbox.org/git/xmqqeg33ccjj@gitster.mtv.corp.google.com/

Note that this patch series doesn't address a leak when removing a
split-index, but Duy wrote that he has a patch to fix this leak:

https://public-inbox.org/git/CACsJy8AisF2ZVs7JdnVFp5wdskkbVQQQ=dbq5uze1moscfb...@mail.gmail.com/

Highlevel view of the patches in the series
~~~

Except for patch 1/22 and 1/22, there are 3 big steps, one for each
new configuration variable introduced.

There only small differences between this patch series and the v3
patch series sent a few months ago.

- Patch 1/22 marks a message for translation. It is not new and
  can be applied separately.

- Patch 2/22 improves the existing indentation style of t1700 by
  using different here document style. It is a new preparatory
  patch suggested by Junio.

Step 1 is:

- Patches 3/22 to 6/22 introduce the functions that are reading
  the "core.splitIndex" configuration variable and tweaking the
  split index depending on its value.

- Patch 7/22 adds a few tests for the new feature.

- Patches 8/22 and 9/22 add some documentation for the new
  feature.

There is no change since v3 in this step.

Step 2 is:

- Patches 10/22 and 11/22 introduce the functions that are reading
  the "splitIndex.maxPercentChange" configuration variable and
  regenerating a new shared index file depending on its value.

  Patch 11/12 has a few changes suggested by Junio since v3, see
  
https://public-inbox.org/git/CAP8UFD3_1EN=0esd12cew1muw8yhtpazw0m_g3wmvkfk-ug...@mail.gmail.com/

- Patch 12/22 adds a few tests for the new feature.

- Patch 13/22 add some documentation for the new feature. The
  added documentation has been reworded a little bit since v3 as
  suggested by Junio.

Step 3 is:

- Patches 14/22 to 17/22 introduce the functions that are reading
  the "splitIndex.sharedIndexExpire" configuration variable and
  expiring old shared index files depending on its value.

  Patch 15/22 has a few changes suggested by Ramsay and Junio in
  
http://public-inbox.org/git/a1a44640-ff6c-2294-72ac-46322eff8...@ramsayjones.plus.com/
  http://public-inbox.org/git/xmqqbmunq6mg@gitster.mtv.corp.google.com/

  The main change is that the new freshen_shared_index() will now
  warn if the split-index file 

[PATCH v4 02/22] t1700: change here document style

2017-02-27 Thread Christian Couder
This improves test indentation by getting rid of the outdated
here document style.

Signed-off-by: Christian Couder 
---
 t/t1700-split-index.sh | 170 -
 1 file changed, 85 insertions(+), 85 deletions(-)

diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index 292a0720fc..cb68b8dc1e 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -19,12 +19,12 @@ test_expect_success 'enable split index' '
own=8299b0bcd1ac364e5f1d7768efb62fa2da79a339
base=39d890139ee5356c7ef572216cebcd27aa41f9df
fi &&
-   cat >expect one &&
git update-index --add one &&
git ls-files --stage >ls-files.actual &&
-   cat >ls-files.expect actual &&
-   cat >expect ls-files.actual &&
-   cat >ls-files.expect actual &&
-   cat >expect ls-files.actual &&
-   cat >ls-files.expect actual &&
-   cat >expect one &&
git update-index one &&
git ls-files --stage >ls-files.actual &&
-   cat >ls-files.expect actual &&
-   q_to_tab >expect two &&
git update-index --add two &&
git ls-files --stage >ls-files.actual &&
-   cat >ls-files.expect actual &&
-   q_to_tab >expect ls-files.actual &&
-   cat >ls-files.expect actual &&
-   q_to_tab >expect ls-files.actual &&
-   cat >ls-files.expect actual &&
-   cat >expect 

[PATCH v4 10/22] config: add git_config_get_max_percent_split_change()

2017-02-27 Thread Christian Couder
This new function will be used in a following commit to get the
value of the "splitIndex.maxPercentChange" config variable.

Signed-off-by: Christian Couder 
---
 cache.h  |  1 +
 config.c | 15 +++
 2 files changed, 16 insertions(+)

diff --git a/cache.h b/cache.h
index 014aa7ea11..955e80913e 100644
--- a/cache.h
+++ b/cache.h
@@ -1883,6 +1883,7 @@ extern int git_config_get_maybe_bool(const char *key, int 
*dest);
 extern int git_config_get_pathname(const char *key, const char **dest);
 extern int git_config_get_untracked_cache(void);
 extern int git_config_get_split_index(void);
+extern int git_config_get_max_percent_split_change(void);
 
 /*
  * This is a hack for test programs like test-dump-untracked-cache to
diff --git a/config.c b/config.c
index 2a97696be7..35b6f02960 100644
--- a/config.c
+++ b/config.c
@@ -1746,6 +1746,21 @@ int git_config_get_split_index(void)
return -1; /* default value */
 }
 
+int git_config_get_max_percent_split_change(void)
+{
+   int val = -1;
+
+   if (!git_config_get_int("splitindex.maxpercentchange", )) {
+   if (0 <= val && val <= 100)
+   return val;
+
+   return error(_("splitIndex.maxPercentChange value '%d' "
+  "should be between 0 and 100"), val);
+   }
+
+   return -1; /* default value */
+}
+
 NORETURN
 void git_die_config_linenr(const char *key, const char *filename, int linenr)
 {
-- 
2.12.0.22.g0672473d40



[PATCH v4 06/22] update-index: warn in case of split-index incoherency

2017-02-27 Thread Christian Couder
When users are using `git update-index --(no-)split-index`, they
may expect the split-index feature to be used or not according to
the option they just used, but this might not be the case if the
new "core.splitIndex" config variable has been set. In this case
let's warn about what will happen and why.

Signed-off-by: Christian Couder 
---
 builtin/update-index.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 24fdadfa4b..d74d72cc7f 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1099,12 +1099,21 @@ int cmd_update_index(int argc, const char **argv, const 
char *prefix)
}
 
if (split_index > 0) {
+   if (git_config_get_split_index() == 0)
+   warning(_("core.splitIndex is set to false; "
+ "remove or change it, if you really want to "
+ "enable split index"));
if (the_index.split_index)
the_index.cache_changed |= SPLIT_INDEX_ORDERED;
else
add_split_index(_index);
-   } else if (!split_index)
+   } else if (!split_index) {
+   if (git_config_get_split_index() == 1)
+   warning(_("core.splitIndex is set to true; "
+ "remove or change it, if you really want to "
+ "disable split index"));
remove_split_index(_index);
+   }
 
switch (untracked_cache) {
case UC_UNSPECIFIED:
-- 
2.12.0.22.g0672473d40



[PATCH v4 05/22] read-cache: add and then use tweak_split_index()

2017-02-27 Thread Christian Couder
This will make us use the split-index feature or not depending
on the value of the "core.splitIndex" config variable.

Signed-off-by: Christian Couder 
---
 read-cache.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index 9054369dd0..99bc274b8d 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1558,10 +1558,27 @@ static void tweak_untracked_cache(struct index_state 
*istate)
}
 }
 
+static void tweak_split_index(struct index_state *istate)
+{
+   switch (git_config_get_split_index()) {
+   case -1: /* unset: do nothing */
+   break;
+   case 0: /* false */
+   remove_split_index(istate);
+   break;
+   case 1: /* true */
+   add_split_index(istate);
+   break;
+   default: /* unknown value: do nothing */
+   break;
+   }
+}
+
 static void post_read_index_from(struct index_state *istate)
 {
check_ce_order(istate);
tweak_untracked_cache(istate);
+   tweak_split_index(istate);
 }
 
 /* remember to discard_cache() before reading a different cache! */
-- 
2.12.0.22.g0672473d40



[PATCH v4 08/22] Documentation/config: add information for core.splitIndex

2017-02-27 Thread Christian Couder
Signed-off-by: Christian Couder 
---
 Documentation/config.txt | 4 
 1 file changed, 4 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 015346c417..61a863adeb 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -334,6 +334,10 @@ core.trustctime::
crawlers and some backup systems).
See linkgit:git-update-index[1]. True by default.
 
+core.splitIndex::
+   If true, the split-index feature of the index will be used.
+   See linkgit:git-update-index[1]. False by default.
+
 core.untrackedCache::
Determines what to do about the untracked cache feature of the
index. It will be kept, if this variable is unset or set to
-- 
2.12.0.22.g0672473d40



[PATCH v4 07/22] t1700: add tests for core.splitIndex

2017-02-27 Thread Christian Couder
Signed-off-by: Christian Couder 
---
 t/t1700-split-index.sh | 37 +
 1 file changed, 37 insertions(+)

diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index cb68b8dc1e..1659986d8d 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -200,4 +200,41 @@ test_expect_success 'unify index, two files remain' '
test_cmp expect actual
 '
 
+test_expect_success 'set core.splitIndex config variable to true' '
+   git config core.splitIndex true &&
+   : >three &&
+   git update-index --add three &&
+   git ls-files --stage >ls-files.actual &&
+   cat >ls-files.expect <<-EOF &&
+   100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0   one
+   100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0   three
+   100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0   two
+   EOF
+   test_cmp ls-files.expect ls-files.actual &&
+   BASE=$(test-dump-split-index .git/index | grep "^base") &&
+   test-dump-split-index .git/index | sed "/^own/d" >actual &&
+   cat >expect <<-EOF &&
+   $BASE
+   replacements:
+   deletions:
+   EOF
+   test_cmp expect actual
+'
+
+test_expect_success 'set core.splitIndex config variable to false' '
+   git config core.splitIndex false &&
+   git update-index --force-remove three &&
+   git ls-files --stage >ls-files.actual &&
+   cat >ls-files.expect <<-EOF &&
+   100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0   one
+   100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0   two
+   EOF
+   test_cmp ls-files.expect ls-files.actual &&
+   test-dump-split-index .git/index | sed "/^own/d" >actual &&
+   cat >expect <<-EOF &&
+   not a split index
+   EOF
+   test_cmp expect actual
+'
+
 test_done
-- 
2.12.0.22.g0672473d40



[PATCH v4 09/22] Documentation/git-update-index: talk about core.splitIndex config var

2017-02-27 Thread Christian Couder
Signed-off-by: Christian Couder 
---
 Documentation/git-update-index.txt | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/git-update-index.txt 
b/Documentation/git-update-index.txt
index 7386c93162..e091b2a409 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -171,6 +171,12 @@ may not support it yet.
given again, all changes in $GIT_DIR/index are pushed back to
the shared index file. This mode is designed for very large
indexes that take a significant amount of time to read or write.
++
+These options take effect whatever the value of the `core.splitIndex`
+configuration variable (see linkgit:git-config[1]). But a warning is
+emitted when the change goes against the configured value, as the
+configured value will take effect next time the index is read and this
+will remove the intended effect of the option.
 
 --untracked-cache::
 --no-untracked-cache::
-- 
2.12.0.22.g0672473d40



[PATCH v4 04/22] split-index: add {add,remove}_split_index() functions

2017-02-27 Thread Christian Couder
Also use the functions in cmd_update_index() in
builtin/update-index.c.

These functions will be used in a following commit to tweak
our use of the split-index feature depending on the setting
of a configuration variable.

Signed-off-by: Christian Couder 
---
 builtin/update-index.c | 18 ++
 split-index.c  | 22 ++
 split-index.h  |  2 ++
 3 files changed, 30 insertions(+), 12 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index d530e89368..24fdadfa4b 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1099,18 +1099,12 @@ int cmd_update_index(int argc, const char **argv, const 
char *prefix)
}
 
if (split_index > 0) {
-   init_split_index(_index);
-   the_index.cache_changed |= SPLIT_INDEX_ORDERED;
-   } else if (!split_index && the_index.split_index) {
-   /*
-* can't discard_split_index(_index); because that
-* will destroy split_index->base->cache[], which may
-* be shared with the_index.cache[]. So yeah we're
-* leaking a bit here.
-*/
-   the_index.split_index = NULL;
-   the_index.cache_changed |= SOMETHING_CHANGED;
-   }
+   if (the_index.split_index)
+   the_index.cache_changed |= SPLIT_INDEX_ORDERED;
+   else
+   add_split_index(_index);
+   } else if (!split_index)
+   remove_split_index(_index);
 
switch (untracked_cache) {
case UC_UNSPECIFIED:
diff --git a/split-index.c b/split-index.c
index 615f4cac05..f519e60f87 100644
--- a/split-index.c
+++ b/split-index.c
@@ -317,3 +317,25 @@ void replace_index_entry_in_base(struct index_state 
*istate,
istate->split_index->base->cache[new->index - 1] = new;
}
 }
+
+void add_split_index(struct index_state *istate)
+{
+   if (!istate->split_index) {
+   init_split_index(istate);
+   istate->cache_changed |= SPLIT_INDEX_ORDERED;
+   }
+}
+
+void remove_split_index(struct index_state *istate)
+{
+   if (istate->split_index) {
+   /*
+* can't discard_split_index(_index); because that
+* will destroy split_index->base->cache[], which may
+* be shared with the_index.cache[]. So yeah we're
+* leaking a bit here.
+*/
+   istate->split_index = NULL;
+   istate->cache_changed |= SOMETHING_CHANGED;
+   }
+}
diff --git a/split-index.h b/split-index.h
index c1324f521a..df91c1bda8 100644
--- a/split-index.h
+++ b/split-index.h
@@ -31,5 +31,7 @@ void merge_base_index(struct index_state *istate);
 void prepare_to_write_split_index(struct index_state *istate);
 void finish_writing_split_index(struct index_state *istate);
 void discard_split_index(struct index_state *istate);
+void add_split_index(struct index_state *istate);
+void remove_split_index(struct index_state *istate);
 
 #endif
-- 
2.12.0.22.g0672473d40



[PATCH v4 01/22] config: mark an error message up for translation

2017-02-27 Thread Christian Couder
Signed-off-by: Christian Couder 
---
 config.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/config.c b/config.c
index c6b874a7bf..2ac1aa19b0 100644
--- a/config.c
+++ b/config.c
@@ -1728,8 +1728,8 @@ int git_config_get_untracked_cache(void)
if (!strcasecmp(v, "keep"))
return -1;
 
-   error("unknown core.untrackedCache value '%s'; "
- "using 'keep' default value", v);
+   error(_("unknown core.untrackedCache value '%s'; "
+   "using 'keep' default value"), v);
return -1;
}
 
-- 
2.12.0.22.g0672473d40



[PATCH v4 03/22] config: add git_config_get_split_index()

2017-02-27 Thread Christian Couder
This new function will be used in a following commit to know
if we want to use the split index feature or not.

Signed-off-by: Christian Couder 
---
 cache.h  |  1 +
 config.c | 10 ++
 2 files changed, 11 insertions(+)

diff --git a/cache.h b/cache.h
index 61fc86e6d7..014aa7ea11 100644
--- a/cache.h
+++ b/cache.h
@@ -1882,6 +1882,7 @@ extern int git_config_get_bool_or_int(const char *key, 
int *is_bool, int *dest);
 extern int git_config_get_maybe_bool(const char *key, int *dest);
 extern int git_config_get_pathname(const char *key, const char **dest);
 extern int git_config_get_untracked_cache(void);
+extern int git_config_get_split_index(void);
 
 /*
  * This is a hack for test programs like test-dump-untracked-cache to
diff --git a/config.c b/config.c
index 2ac1aa19b0..2a97696be7 100644
--- a/config.c
+++ b/config.c
@@ -1736,6 +1736,16 @@ int git_config_get_untracked_cache(void)
return -1; /* default value */
 }
 
+int git_config_get_split_index(void)
+{
+   int val;
+
+   if (!git_config_get_maybe_bool("core.splitindex", ))
+   return val;
+
+   return -1; /* default value */
+}
+
 NORETURN
 void git_die_config_linenr(const char *key, const char *filename, int linenr)
 {
-- 
2.12.0.22.g0672473d40



Re: [PATCH] gitweb tests: Skip tests when we don't have Time::HiRes

2017-02-27 Thread Jakub Narębski
W dniu 27.02.2017 o 13:37, Ævar Arnfjörð Bjarmason pisze:
> Change the gitweb tests to skip when we can't load the Time::HiRes
> module.

Could you tell us in the commit message why this module is needed?
Is it because gitweb loads it unconditionally, or does that at least
in the default configuration, or is it used in tests, or...?

[I see it is somewhat addressed below]

> 
> This module has bee in perl core since v5.8, which is the oldest

s/bee/been/

> version we support, however CentOS (and perhaps some other
> distributions) carve it into its own non-core-perl package that's not
> installed along with /usr/bin/perl by default. Without this we'll hard
> fail the gitweb tests when trying to load the module.

I see that it because gitweb.perl as the following at line 20:

use Time::HiRes qw(gettimeofday tv_interval);

> 
> Signed-off-by: Ævar Arnfjörð Bjarmason 

Good catch (if a strange one...).

> ---
>  t/gitweb-lib.sh | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/t/gitweb-lib.sh b/t/gitweb-lib.sh
> index d5dab5a94f..116c3890e6 100644
> --- a/t/gitweb-lib.sh
> +++ b/t/gitweb-lib.sh
> @@ -114,4 +114,9 @@ perl -MCGI -MCGI::Util -MCGI::Carp -e 0 >/dev/null 2>&1 
> || {
>   test_done
>  }
>  
> +perl -mTime::HiRes -e 0  >/dev/null 2>&1 || {
> + skip_all='skipping gitweb tests, Time::HiRes module unusable'

Is "unusable" a good description, instead of "not found"?

> + test_done
> +}
> +
>  gitweb_init
> 



Re: [ANNOUNCE] Git v2.12.0

2017-02-27 Thread Ævar Arnfjörð Bjarmason
On Fri, Feb 24, 2017 at 8:28 PM, Junio C Hamano  wrote:
> The latest feature release Git v2.12.0 is now available at the
> usual places.  It is comprised of 517 non-merge commits since
> v2.11.0, contributed by 80 people, 24 of which are new faces.

Yay, some explanations / notes / elaborations:

>  * "git diff" learned diff.interHunkContext configuration variable
>that gives the default value for its --inter-hunk-context option.

This is really cool. Now if you have e.g. lots of changed lines each
10 lines apart --inter-hunk-context=10 will show those all as one big
hunk, instead of needing to specify -U10 as you had to before, which
would give all hunks a context of 10 lines.

>  * An ancient repository conversion tool left in contrib/ has been
>removed.

I thought "what tool?" so here's what this is. git.git was born on
April 7, 2005. For the first 13 days we'd hash the contents of
*compressed* blobs, not their uncompressed contents. Linus changed
this in: https://github.com/git/git/commit/d98b46f8d9

This tool was the ancient tool to convert these old incompatible
repositories from the old format. If someone hasn't gotten around to
this since 2005 they probably aren't ever going to do it :)

>  * Some people feel the default set of colors used by "git log --graph"
>rather limiting.  A mechanism to customize the set of colors has
>been introduced.

This is controlled via the log.graphColors variable. E.g.:

git -c log.graphColors="red, green, yellow" log --graph HEAD~100..

Does anyone have a prettier invocation?

>  * "git diff" and its family had two experimental heuristics to shift
>the contents of a hunk to make the patch easier to read.  One of
>them turns out to be better than the other, so leave only the
>"--indent-heuristic" option and remove the other one.

... the other one being --compaction-heuristic.


Re: Why BLAKE2?

2017-02-27 Thread Ian Jackson
Markus Trippelsdorf writes ("Re: Why BLAKE2?"):
> On 2017.02.27 at 13:00 +, Ian Jackson wrote:
> > For brevity I will write `SHA' for hashing with SHA-1, using current
> > unqualified object names, and `BLAKE' for hasing with BLAKE2b, using
> > H object names.
> 
> Why do you choose BLAKE2? SHA-2 is generally considered still fine and
> would be the obvious choice. And if you want to be adventurous then
> SHA-3 (Keccak) would be the next logical candidate.

I don't have a strong opinion.  Keccak would be fine too.
We should probably avoid SHA-2.

The main point of my posting was not to argue in favour of a
particular hash function :-).

Ian.

-- 
Ian Jackson    These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.


Re: Why BLAKE2?

2017-02-27 Thread Markus Trippelsdorf
On 2017.02.27 at 13:00 +, Ian Jackson wrote:
> 
> For brevity I will write `SHA' for hashing with SHA-1, using current
> unqualified object names, and `BLAKE' for hasing with BLAKE2b, using
> H object names.

Why do you choose BLAKE2? SHA-2 is generally considered still fine and
would be the obvious choice. And if you want to be adventurous then
SHA-3 (Keccak) would be the next logical candidate.

-- 
Markus


Re: [PATCH 4/4] ident: do not ignore empty config name/email

2017-02-27 Thread Dennis Kaarsemaker
On Thu, 2017-02-23 at 23:18 -0500, Jeff King wrote:
> On Thu, Feb 23, 2017 at 08:11:11PM -0800, Junio C Hamano wrote:
> 
> > > So I dunno. I could really go either way on it. Feel free to drop it, or
> > > even move it into a separate topic to be cooked longer.
> > 
> > If it were 5 years ago, it would have been different, but I do not
> > think cooking it longer in 'next' would smoke out breakages in
> > obscure scripts any longer.  Git is used by too many people who have
> > never seen its source these days.
> 
> Yeah, I have noticed that, too. I wonder if it would be interesting to
> cut "weeklies" or something of "master" or even "next" that people could
> install with a single click.
> 
> Of course it's not like we have a binary installer in the first place,
> so I guess that's a prerequisite.

I provide daily[*] snapshots of git's master and next tree as packages
for Ubuntu, Debian, Fedora and CentOS on launchpad and SuSE's
openbuildservice. If there's sufficient interest in this (I know of
only a few users), I can try to put more effort into this.

-- 
Dennis Kaarsemaker
http://www.kaarsemaker.net

[*]When the tooling isn't broken for some reason.


Bug: "git worktree add" Unable to checkout a branch with no local ref

2017-02-27 Thread Alexander Grigoriev

git version 2.10.2.windows.1:
If a remote branch has never been checked out locally (its ref only exists 
in remotes// directory), "git worktree add" command is unable to 
check it out by its normal short name (not prefixed by remotes/), 
while "git checkout" command has been able to handle such a branch and 
properly convert it to a local branch. 



Re: show all merge conflicts

2017-02-27 Thread Michael J Gruber
G. Sylvie Davies venit, vidit, dixit 29.01.2017 07:45:
> On Sat, Jan 28, 2017 at 6:28 AM, Jeff King  wrote:
>> On Fri, Jan 27, 2017 at 09:42:41PM -0800, G. Sylvie Davies wrote:
>>
>>> Aside from the usual "git log -cc", I think this should work (replace
>>> HEAD with whichever commit you are analyzing):
>>>
>>> git diff --name-only HEAD^2...HEAD^1 > m1
>>> git diff --name-only HEAD^1...HEAD^2 > b1
>>> git diff --name-only HEAD^1..HEAD> m2
>>> git diff --name-only HEAD^2..HEAD> b2
>>>
>>> If files listed between m1 and b2 differ, then the merge is dirty.
>>> Similarly for m2 and b1.
>>>
>>> More information here:
>>>
>>> http://stackoverflow.com/questions/27683077/how-do-you-detect-an-evil-merge-in-git/41356308#41356308
>>
>> I don't think that can reliably find evil merges, since it looks at the
>> file level. If you had one hunk resolved for "theirs" and one hunk for
>> "ours" in a given file, then the file will be listed in each diff,
>> whether it has evil hunks or not.
>>
> 
> Well, you have to do both.  Do "git show -c" to catch that one
> ("theirs" for one hunk, "ours" for the other, same file).
> 
> And then do that sequence of the 4 "git diff" commands to identify
> dirty merges where "theirs" or "ours" was applied to entire files, and
> thus not showing up in the "git show -c".
> 
>> I don't think this is just about evil merges, though. For instance,
>> try:
>>
>>   seq 1 10 >file
>>   git add file
>>   git commit -m base
>>
>>   sed s/4/master/ tmp && mv tmp file
>>   git commit -am master
>>
>>   git checkout -b other HEAD^
>>   sed s/4/other/ tmp && mv tmp file
>>   git commit -am other
>>
>>   git merge master
>>   git checkout --ours file
>>   git commit -am merged
>>
>>   merge=$(git rev-parse HEAD)
>>
>> The question is: were there conflicts in $merge, and how were they
>> resolved?
>>
>> That isn't an evil merge, but there's still something interesting to
>> show that "git log --cc" won't display.
>>
>> Replaying the merge like:
>>
>>   git checkout $merge^1
>>   git merge $merge^2
>>   git diff -R $merge
>>
>> shows you the patch to go from the conflict state to the final one.
>>
> 
> I know the stackoverflow question asks "how to detect evil merges",
> and I go along with that in my answer.  But honestly I prefer to call
> them dirty rather than evil, and by "dirty" I just mean merges that
> did not resolve cleanly via "git merge", and had some form of user
> intervention, be it conflict resolution, or other strange things.
> 
> The trick I propose with the sequence of 4 "git diff" commands
> identifies that merge from your example as dirty:
> 
> $ cat b1 m2
> file
> 
> $ cat b2 m1
> file
> file
> 
> The trick doesn't really tell you much except that the merge is dirty.
> If you notice that the "m2" file is empty, I think that's one way to
> realize that master's edit was dropped, and therefore "other" won.
> 
> Maybe it even merged cleanly but someone did a "git commit --amend" to
> make it the merge dirty after the fact.
> 
> I do like your approach, it's very simple and reliable.  But in my
> situation I'm writing pre-receive hooks for bare repos, so I don't
> think I can actually do "git merge"!
> 
> I think my suggestion would work for OP, as long as they also run "git
> show -c" alongside it.   (And your suggestion would work, too, of
> course).

If you're curious, I kept rebasing Thomas' remerge-diff (on top of our
next) so far. You can find it at

https://github.com/mjg/git/tree/remerge-diff

if you're interested. I don't know what problems were found back then,
or what it would take to get this in-tree now.

Michael



  1   2   >