Re: [PATCH v2 00/21] Read `packed-refs` using mmap()

2017-09-25 Thread Johannes Schindelin
Hi Peff,


On Wed, 20 Sep 2017, Jeff King wrote:

> On Tue, Sep 19, 2017 at 08:22:08AM +0200, Michael Haggerty wrote:
> 
> > This branch is also available from my fork on GitHub as branch
> > `mmap-packed-refs`.
> > 
> > My main uncertainties are:
> > 
> > 1. Does this code actually work on Windows?
> > 
> 
> I can't really answer (1) due to lack of knowledge.

Sorry, I have only a couple of minutes per day to look through the flood
of emails from the Git mailing list and to answer them, so I forgot to say
that I had a VM build and test Michael's patches, and they worked on
Windows.

Ciao,
Dscho


Re: [PATCH v2 00/21] Read `packed-refs` using mmap()

2017-09-20 Thread Jeff King
On Tue, Sep 19, 2017 at 08:22:08AM +0200, Michael Haggerty wrote:

> This branch is also available from my fork on GitHub as branch
> `mmap-packed-refs`.
> 
> My main uncertainties are:
> 
> 1. Does this code actually work on Windows?
> 
> 2. Did I implement the new compile-time option correctly? (I just
>cargo-culted some existing options.) Is there some existing option
>that I could piggy-back off of instead of adding a new one?
> 
> 3. Is a compile-time option sufficient, or would the `mmap()` option
>need to be configurable at runtime, or even tested at repository
>create time as is done for some other filesystem properties in
>`create_default_files()`?

I can't really answer (1) due to lack of knowledge. For (2), I think
it's mostly good, though I raised a small nit. For (3), I think this is
really an OS restriction and not a filesystem one, so probably
build-time is OK (though of course Windows people may be able to correct
me).

I left a few comments, but overall it looks quite good to me.

-Peff


Re: [PATCH v2 00/21] Read `packed-refs` using mmap()

2017-09-19 Thread Johannes Schindelin
Hi Michael,

On Tue, 19 Sep 2017, Michael Haggerty wrote:

> This is v2 of a patch series that changes the reading and caching of the
> `packed-refs` file to use `mmap()`. Thanks to Junio, Stefan, and
> Johannes for their comments about v1 [1].

Thank you for the new iteration.

> The main change since v1 is to accommodate Windows, which doesn't let
> you replace a file using `rename()` if the file is currently mmapped.
> This is unfortunate, because it means that Windows will never get the
> O(N) → O(lg N) improvement for reading single references that more
> capable systems can now enjoy.

Triggered by your enquiry, I looked into passing the FILE_SHARE_DELETE
flag which I hoped would let us delete the file even if it still is open
(and mapped). In my tests, this did not work. If anybody wants to have a
look at what I did (and whether they can make it work):
https://github.com/dscho/git/tree/replace-wopen

> The background was discussed on the mailing list [2]. The bottom line
> is that on Windows, keeping the `packed-refs` lock mmapped would be
> tantamount to holding reader lock on that file, preventing anybody
> (even unrelated processes) from changing the `packed-refs` file while
> it is mmapped. This is even worse than the situation for packfiles
> (which is solved using `close_all_packs()`), because a packfile, once
> created, never needs to be replaced—every packfile has a filename that
> is determined from its contents. The worst that can happen if a
> packfile is locked is that another process cannot remove it, but that
> is not critical for correctness. The `packed-refs` file, on the other
> hand, always has the same filename and needs to be overwritten for
> correctness.
> 
> So the approach taken here is that a new compile-time option,
> `MMAP_PREVENTS_DELETE`, is introduced. When this option is set, then
> the `packed-refs` file is read quickly into memory then closed.

Another approach would be to imitate close_all_packs() and rely on the
Windows-specific code that retries renames in a staggered fashion, waiting
a little longer and longer before retrying, and finally telling the user
that some file cannot be overwritten:
https://github.com/git-for-windows/git/blob/v2.14.1.windows.1/compat/mingw.c#L2439-L2441

This is not a new problem, by the way. If a file is in use while you try
to run `git checkout` with a different version of that file, we have the
exact same problem on Windows. And we deal with it using that
retry_ask_yes_no() function.

For this to work, the current process really would need to be able to
release all snapshots in one go (for simplicity, I would not even check
the filename but simply blow them all away when we want to overwrite
packed-refs).

I guess I should set aside some time to implement that on top of your
series (I *really* want our in-house users to benefit from that O(lg n)
improvement). In the meantime, I think this can go forward with the
current design.

Ciao,
Dscho

Re: [PATCH v2 00/21]

2017-05-09 Thread Junio C Hamano
Duy Nguyen  writes:

> On Sun, May 7, 2017 at 11:20 AM, Junio C Hamano  wrote:
>> On Thu, May 4, 2017 at 2:45 PM, Junio C Hamano  wrote:
>>>
>>> Nguyễn Thái Ngọc Duy   writes:
>>>
>>> > Changes since v1:
>>> >
>>> >  - fopen_or_warn() and warn_on_fopen_errors() are introduced. The
>>> >latter's name is probably not great...
>>> >  - A new patch (first one) to convert a bunch to using xfopen()
>>> >  - Fix test failure on Windows, found by Johannes Sixt
>>> >  - Fix the memory leak in log.c, found by Jeff
>>> >
>>> > There are still a couple of fopen() remained, but I'm getting slow
>>> > again so let's get these in first and worry about the rest when
>>> > somebody has time.
>>
>> Hmm, is this topic responsible for recent breakage Travis claims on MacOS X?
>>
>> https://travis-ci.org/git/git/jobs/229585098#L3091
>>
>> seems to expect an error when test-config is fed a-directory but we are
>> not getting the expected warning and error?
>
> Sounds about right. Let me see if defining FREAD_READS_DIRECTORIES on
> MacOS X makes travis happy.

Thanks.  I should have suspected that myself to save a round-trip.


Re: [PATCH v2 00/21]

2017-05-09 Thread Duy Nguyen
On Sun, May 7, 2017 at 11:20 AM, Junio C Hamano  wrote:
> On Thu, May 4, 2017 at 2:45 PM, Junio C Hamano  wrote:
>>
>> Nguyễn Thái Ngọc Duy   writes:
>>
>> > Changes since v1:
>> >
>> >  - fopen_or_warn() and warn_on_fopen_errors() are introduced. The
>> >latter's name is probably not great...
>> >  - A new patch (first one) to convert a bunch to using xfopen()
>> >  - Fix test failure on Windows, found by Johannes Sixt
>> >  - Fix the memory leak in log.c, found by Jeff
>> >
>> > There are still a couple of fopen() remained, but I'm getting slow
>> > again so let's get these in first and worry about the rest when
>> > somebody has time.
>
> Hmm, is this topic responsible for recent breakage Travis claims on MacOS X?
>
> https://travis-ci.org/git/git/jobs/229585098#L3091
>
> seems to expect an error when test-config is fed a-directory but we are
> not getting the expected warning and error?

Sounds about right. Let me see if defining FREAD_READS_DIRECTORIES on
MacOS X makes travis happy.
-- 
Duy


Re: [PATCH v2 00/21]

2017-05-07 Thread Junio C Hamano
On Thu, May 4, 2017 at 2:45 PM, Junio C Hamano  wrote:
>
> Nguyễn Thái Ngọc Duy   writes:
>
> > Changes since v1:
> >
> >  - fopen_or_warn() and warn_on_fopen_errors() are introduced. The
> >latter's name is probably not great...
> >  - A new patch (first one) to convert a bunch to using xfopen()
> >  - Fix test failure on Windows, found by Johannes Sixt
> >  - Fix the memory leak in log.c, found by Jeff
> >
> > There are still a couple of fopen() remained, but I'm getting slow
> > again so let's get these in first and worry about the rest when
> > somebody has time.

Hmm, is this topic responsible for recent breakage Travis claims on MacOS X?

https://travis-ci.org/git/git/jobs/229585098#L3091

seems to expect an error when test-config is fed a-directory but we are
not getting the expected warning and error?


Re: [PATCH v2 00/21]

2017-05-03 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> Changes since v1:
>
>  - fopen_or_warn() and warn_on_fopen_errors() are introduced. The
>latter's name is probably not great...
>  - A new patch (first one) to convert a bunch to using xfopen()
>  - Fix test failure on Windows, found by Johannes Sixt
>  - Fix the memory leak in log.c, found by Jeff
>
> There are still a couple of fopen() remained, but I'm getting slow
> again so let's get these in first and worry about the rest when
> somebody has time.
>
> Nguyễn Thái Ngọc Duy (21):
>   Use xfopen() in more places
>   clone: use xfopen() instead of fopen()
>   config.mak.uname: set FREAD_READS_DIRECTORIES for Linux and FreeBSD
>   wrapper.c: add warn_on_fopen_errors()
>   wrapper.c: add fopen_or_warn()
>   attr.c: use fopen_or_warn()
>   ident.c: use fopen_or_warn()
>   bisect: report on fopen() error
>   blame: report error on open if graft_file is a directory
>   log: report errno on failure to fopen() a file
>   log: fix memory leak in open_next_file()
>   commit.c: report error on failure to fopen() the graft file
>   remote.c: report error on failure to fopen()
>   rerere.c: report error on failure to fopen()
>   rerere.c: report correct errno
>   sequencer.c: report error on failure to fopen()
>   server-info: report error on failure to fopen()
>   wt-status.c: report error on failure to fopen()
>   xdiff-interface.c: report errno on failure to stat() or fopen()
>   config.c: handle error on failing to fopen()
>   t1308: add a test case on open a config directory

Thanks.  If the number of parts affected by this series were
smaller, it may have made the review easier to have the introduction
of a helper and its use in a single larger patch, but there are
spread across many, some with files that are touched by different
in-flight topics, and these "collection of smaller patches" makes it
easier to manage both while reviewing and also merging.

All looked good, even though I do share the doubt on the name
"warn-on-fopen-errors"; when something applies equally to fopen(3)
and underlying open(2), I would tend to call that with open not
fopen myself.  But that is a minor point.



Re: [PATCH v2 00/21] object_id part 7

2017-03-28 Thread Jeff King
On Tue, Mar 28, 2017 at 12:40:29PM -0700, Junio C Hamano wrote:

> > Here's that minor tweak, in case anybody is interested. It's less useful
> > without that follow-on that touches "eol" more, but perhaps it increases
> > readability on its own.
> 
> Yup, the only thing that the original (with Brian's fix) appears to
> be more careful about is it tries very hard to avoid setting boc
> past eoc.  As we are not checking "boc != eoc" but doing the
> comparison, that "careful" appearance does not give us any benefit
> in practice, other than having to do an extra "eol ? eol+1 : eoc";
> the result of this patch is easier to read.
> 
> By the way, eoc is "one past the end" of the array that begins at
> boc, so setting a pointer to eoc+1 may technically be in violation.
> I do not know how much it matters, though ;-)

I think that is OK. We are reading a strbuf, so eoc must either be the
first character of the PGP signature, or the terminating NUL if there
was no signature block[1]. So it's actually _inside_ the array, and
eoc+1 is our "one past".

-Peff

[1] Arguably we should bail when parse_signature() does not find a PGP
signature at all. We already bail with "malformed push certificate"
when there are other syntactic anomalies.


Re: [PATCH v2 00/21] object_id part 7

2017-03-28 Thread Junio C Hamano
Jeff King  writes:

> Here's that minor tweak, in case anybody is interested. It's less useful
> without that follow-on that touches "eol" more, but perhaps it increases
> readability on its own.

Yup, the only thing that the original (with Brian's fix) appears to
be more careful about is it tries very hard to avoid setting boc
past eoc.  As we are not checking "boc != eoc" but doing the
comparison, that "careful" appearance does not give us any benefit
in practice, other than having to do an extra "eol ? eol+1 : eoc";
the result of this patch is easier to read.

By the way, eoc is "one past the end" of the array that begins at
boc, so setting a pointer to eoc+1 may technically be in violation.
I do not know how much it matters, though ;-)

> -- >8 --
> Subject: [PATCH] receive-pack: simplify eol handling in cert parsing
>
> The queue_commands_from_cert() function wants to handle each
> line of the cert individually. It looks for "\n" in the
> to-be-parsed bytes, and special-cases each use of eol (the
> end-of-line variable) when we didn't find one.  Instead, we
> can just set the end-of-line variable to end-of-cert in the
> latter case.
>
> For advancing to the next line, it's OK for us to move our
> pointer past end-of-cert, because our loop condition just
> checks for pointer inequality. And it doesn't even violate
> the ANSI C "no more than one past the end of an array" rule,
> because we know in the worst case we've hit the terminating
> NUL of the strbuf.
>
> Signed-off-by: Jeff King 
> ---
>  builtin/receive-pack.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
> index 5d9e4da0a..58de2a1a9 100644
> --- a/builtin/receive-pack.c
> +++ b/builtin/receive-pack.c
> @@ -1524,8 +1524,10 @@ static void queue_commands_from_cert(struct command 
> **tail,
>  
>   while (boc < eoc) {
>   const char *eol = memchr(boc, '\n', eoc - boc);
> - tail = queue_command(tail, boc, eol ? eol - boc : eoc - boc);
> - boc = eol ? eol + 1 : eoc;
> + if (!eol)
> + eol = eoc;
> + tail = queue_command(tail, boc, eol - boc);
> + boc = eol + 1;
>   }
>  }


Re: [PATCH v2 00/21] object_id part 7

2017-03-28 Thread Jeff King
On Tue, Mar 28, 2017 at 01:35:36PM -0400, Jeff King wrote:

> I thought I'd knock this out quickly before I forgot about it. But it
> actually isn't so simple.
> 
> The main caller in read_head_info() does indeed just pass strlen(line)
> as the length in each case. But the cert parser really does need us to
> respect the line length. So we either have to pass it in, or tie off the
> string.
> 
> The latter looks something like the patch below (on top of a minor
> tweak around "eol" handling). It's sufficiently ugly that it may not
> count as an actual cleanup, though. I'm OK if we just drop the idea.

Here's that minor tweak, in case anybody is interested. It's less useful
without that follow-on that touches "eol" more, but perhaps it increases
readability on its own.

-- >8 --
Subject: [PATCH] receive-pack: simplify eol handling in cert parsing

The queue_commands_from_cert() function wants to handle each
line of the cert individually. It looks for "\n" in the
to-be-parsed bytes, and special-cases each use of eol (the
end-of-line variable) when we didn't find one.  Instead, we
can just set the end-of-line variable to end-of-cert in the
latter case.

For advancing to the next line, it's OK for us to move our
pointer past end-of-cert, because our loop condition just
checks for pointer inequality. And it doesn't even violate
the ANSI C "no more than one past the end of an array" rule,
because we know in the worst case we've hit the terminating
NUL of the strbuf.

Signed-off-by: Jeff King 
---
 builtin/receive-pack.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 5d9e4da0a..58de2a1a9 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1524,8 +1524,10 @@ static void queue_commands_from_cert(struct command 
**tail,
 
while (boc < eoc) {
const char *eol = memchr(boc, '\n', eoc - boc);
-   tail = queue_command(tail, boc, eol ? eol - boc : eoc - boc);
-   boc = eol ? eol + 1 : eoc;
+   if (!eol)
+   eol = eoc;
+   tail = queue_command(tail, boc, eol - boc);
+   boc = eol + 1;
}
 }
 
-- 
2.12.2.845.g55fcf8b10



Re: [PATCH v2 00/21] object_id part 7

2017-03-28 Thread Jeff King
On Tue, Mar 28, 2017 at 11:13:15AM +, brian m. carlson wrote:

> > I suggested an additional cleanup around "linelen" in one patch. In the
> > name of keeping the number of re-rolls sane, I'm OK if we skip that for
> > now (the only reason I mentioned it at all is that you have to justify
> > the caveat in the commit message; with the fix, that justification can
> > go away).
> 
> Let's leave it as it is, assuming Junio's okay with it.  I can send in a
> few more patches to clean that up and use skip_prefix that we can drop
> on top and graduate separately.
> 
> I think the justification is useful as it is, since it explains why we
> no longer want to check that particular value for historical reasons.

I thought I'd knock this out quickly before I forgot about it. But it
actually isn't so simple.

The main caller in read_head_info() does indeed just pass strlen(line)
as the length in each case. But the cert parser really does need us to
respect the line length. So we either have to pass it in, or tie off the
string.

The latter looks something like the patch below (on top of a minor
tweak around "eol" handling). It's sufficiently ugly that it may not
count as an actual cleanup, though. I'm OK if we just drop the idea.

---
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 58de2a1a9..561a982e7 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1483,13 +1483,10 @@ static void execute_commands(struct command *commands,
 }
 
 static struct command **queue_command(struct command **tail,
- const char *line,
- int linelen)
+ const char *line)
 {
struct object_id old_oid, new_oid;
struct command *cmd;
-   const char *refname;
-   int reflen;
const char *p;
 
if (parse_oid_hex(line, _oid, ) ||
@@ -1498,9 +1495,7 @@ static struct command **queue_command(struct command 
**tail,
*p++ != ' ')
die("protocol error: expected old/new/ref, got '%s'", line);
 
-   refname = p;
-   reflen = linelen - (p - line);
-   FLEX_ALLOC_MEM(cmd, ref_name, refname, reflen);
+   FLEX_ALLOC_STR(cmd, ref_name, p);
oidcpy(>old_oid, _oid);
oidcpy(>new_oid, _oid);
*tail = cmd;
@@ -1510,7 +1505,7 @@ static struct command **queue_command(struct command 
**tail,
 static void queue_commands_from_cert(struct command **tail,
 struct strbuf *push_cert)
 {
-   const char *boc, *eoc;
+   char *boc, *eoc;
 
if (*tail)
die("protocol error: got both push certificate and unsigned 
commands");
@@ -1523,10 +1518,17 @@ static void queue_commands_from_cert(struct command 
**tail,
eoc = push_cert->buf + parse_signature(push_cert->buf, push_cert->len);
 
while (boc < eoc) {
-   const char *eol = memchr(boc, '\n', eoc - boc);
+   char *eol = memchr(boc, '\n', eoc - boc);
+   char tmp;
+
if (!eol)
eol = eoc;
-   tail = queue_command(tail, boc, eol - boc);
+
+   tmp = *eol;
+   *eol = '\0';
+   tail = queue_command(tail, boc);
+   *eol = tmp;
+
boc = eol + 1;
}
 }
@@ -1590,7 +1592,7 @@ static struct command *read_head_info(struct oid_array 
*shallow)
continue;
}
 
-   p = queue_command(p, line, linelen);
+   p = queue_command(p, line);
}
 
if (push_cert.len)


Re: [PATCH v2 00/21] object_id part 7

2017-03-28 Thread Junio C Hamano
Jeff King  writes:

> On Sun, Mar 26, 2017 at 04:01:22PM +, brian m. carlson wrote:
>
>> This is part 7 in the continuing transition to use struct object_id.
>> 
>> This series focuses on two main areas: adding two constants for the
>> maximum hash size we'll be using (which will be suitable for allocating
>> memory) and converting struct sha1_array to struct oid_array.
>
> Both changes are very welcome. I do think it's probably worth changing
> the name of sha1-array.[ch], but it doesn't need to happen immediately.
>
> I read through the whole series and didn't find anything objectionable.
> The pointer-arithmetic fix should perhaps graduate separately.

I didn't see anything incorrect when I queued the series, either,
and after I re-read it I saw a few minor readability issues, but
modulo that this looks ready.  I did split the push-cert parsing fix
and applied to an older base independently, though.

> I suggested an additional cleanup around "linelen" in one patch. In the
> name of keeping the number of re-rolls sane, I'm OK if we skip that for
> now (the only reason I mentioned it at all is that you have to justify
> the caveat in the commit message; with the fix, that justification can
> go away).

A follow-up after the dust settles could also mention "we earlier
mentioned this caveat but with this fix we no longer have to worry
about it", no?


Thanks both, anyways.


Re: [PATCH v2 00/21] object_id part 7

2017-03-28 Thread brian m. carlson
On Tue, Mar 28, 2017 at 03:31:59AM -0400, Jeff King wrote:
> I read through the whole series and didn't find anything objectionable.
> The pointer-arithmetic fix should perhaps graduate separately.

Junio's welcome to take that patch separately if he likes.

> I suggested an additional cleanup around "linelen" in one patch. In the
> name of keeping the number of re-rolls sane, I'm OK if we skip that for
> now (the only reason I mentioned it at all is that you have to justify
> the caveat in the commit message; with the fix, that justification can
> go away).

Let's leave it as it is, assuming Junio's okay with it.  I can send in a
few more patches to clean that up and use skip_prefix that we can drop
on top and graduate separately.

I think the justification is useful as it is, since it explains why we
no longer want to check that particular value for historical reasons.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH v2 00/21] object_id part 7

2017-03-28 Thread Jeff King
On Sun, Mar 26, 2017 at 04:01:22PM +, brian m. carlson wrote:

> This is part 7 in the continuing transition to use struct object_id.
> 
> This series focuses on two main areas: adding two constants for the
> maximum hash size we'll be using (which will be suitable for allocating
> memory) and converting struct sha1_array to struct oid_array.

Both changes are very welcome. I do think it's probably worth changing
the name of sha1-array.[ch], but it doesn't need to happen immediately.

I read through the whole series and didn't find anything objectionable.
The pointer-arithmetic fix should perhaps graduate separately.

I suggested an additional cleanup around "linelen" in one patch. In the
name of keeping the number of re-rolls sane, I'm OK if we skip that for
now (the only reason I mentioned it at all is that you have to justify
the caveat in the commit message; with the fix, that justification can
go away).

-Peff


Re: [PATCH v2 00/21] Add configuration options for split-index

2016-12-26 Thread Christian Couder
On Mon, Dec 19, 2016 at 1:02 PM, Duy Nguyen  wrote:
> On Sat, Dec 17, 2016 at 03:55:26PM +0100, Christian Couder wrote:
>> Goal
>> 
>>
>> We want to make it possible to use the split-index feature
>> automatically by just setting a new "core.splitIndex" configuration
>> variable to true.
>>
>> This can be valuable as split-index can help significantly speed up
>> `git rebase` especially along with the work to libify `git apply`
>> that has been merged to master
>> (see 
>> https://github.com/git/git/commit/81358dc238372793b1590efa149cc1581d1fbd98)
>> and is now in v2.11.
>
> I've read through the series (*) and I think it looks good, just a few
> minor comments here and there.

Thanks for your review.

I think I addressed all the minor points left in the v3 and the emails
I just sent.

> (*) guiltily admit that I only skimmed through tests, not giving them
> as much attention as I should have


Re: [PATCH v2 00/21] Add configuration options for split-index

2016-12-19 Thread Junio C Hamano
Duy Nguyen  writes:

> I've read through the series (*) and I think it looks good, just a few
> minor comments here and there.
>
> (*) guiltily admit that I only skimmed through tests, not giving them
> as much attention as I should have

OK.  I'd still want to see them get reviewed, though.  Perhaps I'll
do so myself once I run out of things to do, but hopefully somebody
else gets there first ;-)

Thanks.


Re: [PATCH v2 00/21] Add configuration options for split-index

2016-12-19 Thread Duy Nguyen
On Sat, Dec 17, 2016 at 03:55:26PM +0100, Christian Couder wrote:
> Goal
> 
> 
> We want to make it possible to use the split-index feature
> automatically by just setting a new "core.splitIndex" configuration
> variable to true.
> 
> This can be valuable as split-index can help significantly speed up
> `git rebase` especially along with the work to libify `git apply`
> that has been merged to master
> (see 
> https://github.com/git/git/commit/81358dc238372793b1590efa149cc1581d1fbd98)
> and is now in v2.11.

I've read through the series (*) and I think it looks good, just a few
minor comments here and there.

(*) guiltily admit that I only skimmed through tests, not giving them
as much attention as I should have
--
Duy


Re: [PATCH v2 00/21] Add configuration options for split-index

2016-12-17 Thread Christian Couder
> The previous versions were:
>
>   RFC: https://github.com/chriscool/git/commits/config-split-index7
>   v1:  https://github.com/chriscool/git/commits/config-split-index72

The diff since v1 is:

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 8fbef25cb1..52a3cac4ff 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2782,9 +2782,9 @@ splitIndex.sharedIndexExpire::
 index file is created. The value "now" expires all entries
 immediately, and "never" suppresses expiration altogether.
 The default value is "one.week.ago".
-Note that each time a new split-index file is created, the
-mtime of the related shared index file is updated to the
-current time.
+Note that each time a split index based on a shared index file
+is either created or read from, the mtime of the shared index
+file is updated to the current time.
 See linkgit:git-update-index[1].

 status.relativePaths::
diff --git a/Documentation/git-update-index.txt
b/Documentation/git-update-index.txt
index 635d1574b2..46c953b2f2 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -407,10 +407,14 @@ specified by the splitIndex.maxPercentChange
config variable (see
 linkgit:git-config[1]).

 Each time a new shared index file is created, the old shared index
-files are deleted if they are older than what is specified by the
-splitIndex.sharedIndexExpire config variable (see
+files are deleted if their mtime is older than what is specified by
+the splitIndex.sharedIndexExpire config variable (see
 linkgit:git-config[1]).

+To avoid deleting a shared index file that is still used, its mtime is
+updated to the current time everytime a new split index based on the
+shared index file is either created or read from.
+
 Untracked cache
 ---

diff --git a/builtin/gc.c b/builtin/gc.c
index c1e9602892..1e40d45aa2 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -100,8 +100,8 @@ static void gc_config(void)
 git_config_get_int("gc.auto", _auto_threshold);
 git_config_get_int("gc.autopacklimit", _auto_pack_limit);
 git_config_get_bool("gc.autodetach", _auto);
-git_config_get_date_string("gc.pruneexpire", _expire);
-git_config_get_date_string("gc.worktreepruneexpire",
_worktrees_expire);
+git_config_get_expiry("gc.pruneexpire", _expire);
+git_config_get_expiry("gc.worktreepruneexpire", _worktrees_expire);
 git_config(git_default_config, NULL);
 }

diff --git a/builtin/update-index.c b/builtin/update-index.c
index a14dbf2612..dc1fd0d44d 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1099,18 +1099,18 @@ int cmd_update_index(int argc, const char
**argv, const char *prefix)

 if (split_index > 0) {
 if (git_config_get_split_index() == 0)
-warning("core.splitIndex is set to false; "
-"remove or change it, if you really want to "
-"enable split index");
+warning(_("core.splitIndex is set to false; "
+  "remove or change it, if you really want to "
+  "enable split index"));
 if (the_index.split_index)
 the_index.cache_changed |= SPLIT_INDEX_ORDERED;
 else
 add_split_index(_index);
 } else if (!split_index) {
 if (git_config_get_split_index() == 1)
-warning("core.splitIndex is set to true; "
-"remove or change it, if you really want to "
-"disable split index");
+warning(_("core.splitIndex is set to true; "
+  "remove or change it, if you really want to "
+  "disable split index"));
 remove_split_index(_index);
 }

diff --git a/cache.h b/cache.h
index 8e26aaf05e..279415afbd 100644
--- a/cache.h
+++ b/cache.h
@@ -1823,11 +1823,13 @@ extern int git_config_get_bool(const char
*key, int *dest);
 extern int git_config_get_bool_or_int(const char *key, int *is_bool,
int *dest);
 extern int git_config_get_maybe_bool(const char *key, int *dest);
 extern int git_config_get_pathname(const char *key, const char **dest);
-extern int git_config_get_date_string(const char *key, const char **output);
 extern int git_config_get_untracked_cache(void);
 extern int git_config_get_split_index(void);
 extern int git_config_get_max_percent_split_change(void);

+/* This dies if the configured or default date is in the future */
+extern int git_config_get_expiry(const char *key, const char **output);
+
 /*
  * This is a hack for test programs like test-dump-untracked-cache to
  * ensure that they do not modify the untracked cache when reading it.
diff --git a/config.c b/config.c
index f88c61bb30..5c52cefd78 100644
--- a/config.c
+++ b/config.c
@@ -1685,7 +1685,7 @@ int git_config_get_pathname(const char *key,
const char **dest)
 return ret;
 }

-int git_config_get_date_string(const char *key, const char **output)
+int git_config_get_expiry(const char *key, const 

Re: [PATCH v2 00/21] Support multiple worktrees

2013-12-27 Thread Duy Nguyen
On Fri, Dec 27, 2013 at 12:12 AM, Junio C Hamano gits...@pobox.com wrote:
 Duy Nguyen pclo...@gmail.com writes:

 On Sun, Dec 22, 2013 at 1:38 PM, Junio C Hamano gits...@pobox.com wrote:

 Do we even need to expose them as ref-like things as a part of the
 external API/UI in the first place?  For end-user scripts that want
 to operate in a real or borrowing worktree, there should be no
 reason to touch this other repository directly.  Things like if
 one of the wortrees tries to check out a branch that is already
 checked out elsewhere, error out policy may need to consult the
 other worktrees via $GIT_COMMON_DIR mechanism but at that level we
 have all the control without contaminating end-user facing ref
 namespace in a way main/FETCH_HEAD... do.

 No, external API/UI is just extra bonus. We need to (or should) do so
 in order to handle $GIT_COMMON_DIR/HEAD exactly like how we do normal
 refs.

 And that is what I consider a confusion-trigger, not a nice bonus.

 I do not think it is particularly a good idea to contaminate
 end-user namespace for this kind of peek another repository
 special operation.

 In order to handle your multiple worktrees manipulating the same
 branch case sanely, you need to be aware of not just the real
 repository your worktree is borrowing from, but also _all_ the other
 worktrees that borrow from that same real repository, so a single
 main virtual namespace will not cut it. You will be dealing with
 an unbounded set of HEADs, one for each such worktree.

Yes. My problem is, while all secondary worktrees are in
$GIT_DIR/repos and their HEADs can be accessed there with
repos/xxx/HEAD, the first worktree's HEAD can't be accessed this way
because HEAD in a linked checkouts means repos/my worktree/HEAD.

 Can't we do this by adding a with this real repository layer,
 e.g. resolve HEAD wrt that repository, somewhat similar to how we
 peek into submodule namespaces?

Hmm.. never thought of it like a submodule. Thanks for the idea.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 00/21] Support multiple worktrees

2013-12-26 Thread Junio C Hamano
Duy Nguyen pclo...@gmail.com writes:

 On Sun, Dec 22, 2013 at 1:38 PM, Junio C Hamano gits...@pobox.com wrote:

 Do we even need to expose them as ref-like things as a part of the
 external API/UI in the first place?  For end-user scripts that want
 to operate in a real or borrowing worktree, there should be no
 reason to touch this other repository directly.  Things like if
 one of the wortrees tries to check out a branch that is already
 checked out elsewhere, error out policy may need to consult the
 other worktrees via $GIT_COMMON_DIR mechanism but at that level we
 have all the control without contaminating end-user facing ref
 namespace in a way main/FETCH_HEAD... do.

 No, external API/UI is just extra bonus. We need to (or should) do so
 in order to handle $GIT_COMMON_DIR/HEAD exactly like how we do normal
 refs.

And that is what I consider a confusion-trigger, not a nice bonus.

I do not think it is particularly a good idea to contaminate
end-user namespace for this kind of peek another repository
special operation.

In order to handle your multiple worktrees manipulating the same
branch case sanely, you need to be aware of not just the real
repository your worktree is borrowing from, but also _all_ the other
worktrees that borrow from that same real repository, so a single
main virtual namespace will not cut it. You will be dealing with
an unbounded set of HEADs, one for each such worktree.

Can't we do this by adding a with this real repository layer,
e.g. resolve HEAD wrt that repository, somewhat similar to how we
peek into submodule namespaces?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 00/21] Support multiple worktrees

2013-12-22 Thread Duy Nguyen
On Sun, Dec 22, 2013 at 1:38 PM, Junio C Hamano gits...@pobox.com wrote:
 Duy Nguyen pclo...@gmail.com writes:

 I am not happy with the choice of main/HEAD that would squat on a
 good name for remote-tracking branch (i.e. s/origin/main/), though.
 $GIT_DIR/COMMON_HEAD perhaps?

 It's not just about HEAD. Anything worktree-specific of the main
 checkout can be accessed this way, e.g. main/index,
 main/FETCH_HEAD and it's not exactly common because it's
 worktree info. Maybe 1ST_ as the prefix (e.g. 1ST_HEAD, 1ST_index...)
 ?

 Do we even need to expose them as ref-like things as a part of the
 external API/UI in the first place?  For end-user scripts that want
 to operate in a real or borrowing worktree, there should be no
 reason to touch this other repository directly.  Things like if
 one of the wortrees tries to check out a branch that is already
 checked out elsewhere, error out policy may need to consult the
 other worktrees via $GIT_COMMON_DIR mechanism but at that level we
 have all the control without contaminating end-user facing ref
 namespace in a way main/FETCH_HEAD... do.

No, external API/UI is just extra bonus. We need to (or should) do so
in order to handle $GIT_COMMON_DIR/HEAD exactly like how we do normal
refs. Given any ref, git_path(ref) gives the path to that ref,
git_path(logs/%s, ref) gives the path of its reflog. By mapping
special names to real paths behind git_path(), We can feed
$GIT_COMMON_DIR/HEAD (under special names) to refs.c and it'll handle
correctly without any changes for special cases.

 You said

 This makes it possible for a linked checkout to detach HEAD of
 the main one.

 but I think that is misguided---it just makes it easier to confuse
 users, if done automatically and without any policy knob. It instead
 should error out, while saying which worktree has the branch in
 question checked out. After all, checking out a branch that is
 checked out in another worktree (not the common one) needs the same
 caution, so main/HEAD is not the only special one.

 And if your updated git checkout 'frotz' gives a clear report of
 which worktree has the branch 'frotz' checked out, the user can go
 there to detach the HEAD in that worktree to detach with

 git -C $the_other_one checkout HEAD^0

 if he chooses to.

Jonathan mentions about the checkout in portable device case that
would make the above a bit unnatural as you just can't cd there (git
update-ref still works).

I don't see any problems with checking out a branch multiple times. I
may want to try modifying something in the branch that will be thrown
away in the end. It's when the user updates a branch that we should
either error+reject or detach other checkouts. I guess it's up to the
user to decide which way they want. The error+reject way may make the
user hunt through dead checkouts waiting to be pruned. But we can
start with error+reject then add an option to auto-detach.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 00/21] Support multiple worktrees

2013-12-21 Thread Junio C Hamano
Duy Nguyen pclo...@gmail.com writes:

 I am not happy with the choice of main/HEAD that would squat on a
 good name for remote-tracking branch (i.e. s/origin/main/), though.
 $GIT_DIR/COMMON_HEAD perhaps?

 It's not just about HEAD. Anything worktree-specific of the main
 checkout can be accessed this way, e.g. main/index,
 main/FETCH_HEAD and it's not exactly common because it's
 worktree info. Maybe 1ST_ as the prefix (e.g. 1ST_HEAD, 1ST_index...)
 ?

Do we even need to expose them as ref-like things as a part of the
external API/UI in the first place?  For end-user scripts that want
to operate in a real or borrowing worktree, there should be no
reason to touch this other repository directly.  Things like if
one of the wortrees tries to check out a branch that is already
checked out elsewhere, error out policy may need to consult the
other worktrees via $GIT_COMMON_DIR mechanism but at that level we
have all the control without contaminating end-user facing ref
namespace in a way main/FETCH_HEAD... do.  You said

This makes it possible for a linked checkout to detach HEAD of
the main one.

but I think that is misguided---it just makes it easier to confuse
users, if done automatically and without any policy knob. It instead
should error out, while saying which worktree has the branch in
question checked out. After all, checking out a branch that is
checked out in another worktree (not the common one) needs the same
caution, so main/HEAD is not the only special one.

And if your updated git checkout 'frotz' gives a clear report of
which worktree has the branch 'frotz' checked out, the user can go
there to detach the HEAD in that worktree to detach with

git -C $the_other_one checkout HEAD^0

if he chooses to.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 00/21] Support multiple worktrees

2013-12-20 Thread Junio C Hamano
Duy Nguyen pclo...@gmail.com writes:

 I've got a better version [1] that fixes everything I can think of
 (there's still some room for improvements). I'm going to use it a bit
 longer before reposting again. But here's its basic design without
 going down to code

 New .git file format includes two lines:
 -- 8 --
 gitid: id
 gitdir: path
 -- 8 --

 Which would set $GIT_COMMON_DIR to path and $GIT_DIR to
 path/repos/id. Repository split is the same as before, worktree
 stuff in $GIT_DIR, the rest in $GIT_COMMON_DIR. This .git file format
 takes precedence over core.worktree but can still be overriden with
 $GIT_WORK_TREE. The main interface to create new worktree is git
 checkout --to.

 repos belongs to $GIT_COMMON_DIR (i.e. shared across all checkouts).
 The new worktrees (which I call linked checkouts) can also access
 HEAD of the original worktree via a virtual path main/HEAD. This
 makes it possible for a linked checkout to detach HEAD of the main
 one.

I am not happy with the choice of main/HEAD that would squat on a
good name for remote-tracking branch (i.e. s/origin/main/), though.
$GIT_DIR/COMMON_HEAD perhaps?

 The interesting thing is support for third party scripts (or hooks,
 maybe) so that they could work with both old and new git versions
 without some sort of git version/feature detection. Of course old git
 versions will only work with ordinary worktrees. To that end, git
 rev-parse --git-dir behavior could be changed by two environment
 variables. $GIT_ONE_PATH makes 'rev-parse --git-dir' return the .git
 _file_ in this case, which makes it much easier to pass the repo's
 checkout view around with git --git-dir=... .$GIT_COMMON_DIR_PATH
 makes 'rev-parse --git-dir' return $GIT_COMMON_DIR if it's from a
 linked checkout, or $GIT_DIR otherwise.

I do not understand why you need to go such a route.

Existing scripts that works only in a real repository will only know
git rev-parse --git-dir as the way to get the real GIT_DIR and
would not care about the common thing.  Scripts updated to work
well with the common thing needs to be aware of the common thing
anyway, so adding git rev-parse --common-git-dir or somesuch that
only these updated knows would be sufficient, no?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 00/21] Support multiple worktrees

2013-12-20 Thread Duy Nguyen
On Sat, Dec 21, 2013 at 3:32 AM, Junio C Hamano gits...@pobox.com wrote:
 Duy Nguyen pclo...@gmail.com writes:

 I've got a better version [1] that fixes everything I can think of
 (there's still some room for improvements). I'm going to use it a bit
 longer before reposting again. But here's its basic design without
 going down to code

 New .git file format includes two lines:
 -- 8 --
 gitid: id
 gitdir: path
 -- 8 --

 Which would set $GIT_COMMON_DIR to path and $GIT_DIR to
 path/repos/id. Repository split is the same as before, worktree
 stuff in $GIT_DIR, the rest in $GIT_COMMON_DIR. This .git file format
 takes precedence over core.worktree but can still be overriden with
 $GIT_WORK_TREE. The main interface to create new worktree is git
 checkout --to.

 repos belongs to $GIT_COMMON_DIR (i.e. shared across all checkouts).
 The new worktrees (which I call linked checkouts) can also access
 HEAD of the original worktree via a virtual path main/HEAD. This
 makes it possible for a linked checkout to detach HEAD of the main
 one.

 I am not happy with the choice of main/HEAD that would squat on a
 good name for remote-tracking branch (i.e. s/origin/main/), though.
 $GIT_DIR/COMMON_HEAD perhaps?

It's not just about HEAD. Anything worktree-specific of the main
checkout can be accessed this way, e.g. main/index,
main/FETCH_HEAD and it's not exactly common because it's
worktree info. Maybe 1ST_ as the prefix (e.g. 1ST_HEAD, 1ST_index...)
?

 The interesting thing is support for third party scripts (or hooks,
 maybe) so that they could work with both old and new git versions
 without some sort of git version/feature detection. Of course old git
 versions will only work with ordinary worktrees. To that end, git
 rev-parse --git-dir behavior could be changed by two environment
 variables. $GIT_ONE_PATH makes 'rev-parse --git-dir' return the .git
 _file_ in this case, which makes it much easier to pass the repo's
 checkout view around with git --git-dir=... .$GIT_COMMON_DIR_PATH
 makes 'rev-parse --git-dir' return $GIT_COMMON_DIR if it's from a
 linked checkout, or $GIT_DIR otherwise.

 I do not understand why you need to go such a route.

 Existing scripts that works only in a real repository will only know
 git rev-parse --git-dir as the way to get the real GIT_DIR and
 would not care about the common thing.  Scripts updated to work
 well with the common thing needs to be aware of the common thing
 anyway, so adding git rev-parse --common-git-dir or somesuch that
 only these updated knows would be sufficient, no?

It simplifies the changes, if the new script is to work with both old
and new git versions it may have to write

DIR=`git rev-parse --git-common-dir 2/dev/null || git rev-parse --git-dir`

the env way makes it

DIR=`GIT_COMMON_DIR=1 git rev-parse --git-dir`
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 00/21] Support multiple worktrees

2013-12-19 Thread Duy Nguyen
I've got a better version [1] that fixes everything I can think of
(there's still some room for improvements). I'm going to use it a bit
longer before reposting again. But here's its basic design without
going down to code

New .git file format includes two lines:
-- 8 --
gitid: id
gitdir: path
-- 8 --

Which would set $GIT_COMMON_DIR to path and $GIT_DIR to
path/repos/id. Repository split is the same as before, worktree
stuff in $GIT_DIR, the rest in $GIT_COMMON_DIR. This .git file format
takes precedence over core.worktree but can still be overriden with
$GIT_WORK_TREE. The main interface to create new worktree is git
checkout --to.

repos belongs to $GIT_COMMON_DIR (i.e. shared across all checkouts).
The new worktrees (which I call linked checkouts) can also access
HEAD of the original worktree via a virtual path main/HEAD. This
makes it possible for a linked checkout to detach HEAD of the main
one.

There are three entries in repos/id: gitdir should point to the
.git file that points it back here. Every time a linked checkout is
used, git should update mtime of this gitdir file to help pruning.
It should update the file content too if the repo is moved. link is
a hardlink to .git file, if supported, again for pruning support.
locked, if exists, means no automatic pruning (e.g. the linked
checkout is on a portable device).

The interesting thing is support for third party scripts (or hooks,
maybe) so that they could work with both old and new git versions
without some sort of git version/feature detection. Of course old git
versions will only work with ordinary worktrees. To that end, git
rev-parse --git-dir behavior could be changed by two environment
variables. $GIT_ONE_PATH makes 'rev-parse --git-dir' return the .git
_file_ in this case, which makes it much easier to pass the repo's
checkout view around with git --git-dir=... .$GIT_COMMON_DIR_PATH
makes 'rev-parse --git-dir' return $GIT_COMMON_DIR if it's from a
linked checkout, or $GIT_DIR otherwise. This makes 'rev-parse
--git-dir' falls back safely when running using old git versions. The
last patch in [1] that updates git-completion.bash could demonstrate
how it's used.

[1] https://github.com/pclouds/git.git checkout-new-worktree
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 00/21] Support multiple worktrees

2013-12-14 Thread Duy Nguyen
On Sat, Dec 14, 2013 at 5:54 PM, Nguyễn Thái Ngọc Duy pclo...@gmail.com wrote:
 Known issues

Scripts that expand $GIT_DIR/objects and are not aware about the new
env variable. I introduced test-path-utils --git-path to test
git_path(). I could move it to git rev-parse --git-path for use in
scripts, but there'll be more changes. git-new-workdir's symlink
approach shines here.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html